Performance Profiling your Functions

Well, the MSO gets its fair share of "why is this mission killing my server?"!!
So I've written a method to "profile" my functions, by measuring how much time they are spending executing and being about to identify the "hogs" of my scripting.
I'll publish here what I've done and feel free to re-use (credits are nice) but in all honesty, its really is dead simple.
To use it, this is what you do.

private ["_profilerid","my_other_private_declarations"];
_profilerid = ["start","A name to identify my function"] call mso_core_fnc_profiler;

// this is
// where my
//. CPU intensive
// code goes

["stop",_profilerid] call mso_core_fnc_profiler;

Here's the code that does everything.The code that makes up mso_core_fnc_profiler is this:

private ["_result","_param","_status","_start"];
_status = _this select 0;
_param = _this select 1;
_result = 0;

if(isNil "CRB_PROFILER") then {
        diag_log "CRB_PROFILER,NAME,TIME";
        CRB_PROFILER = [];
};

switch(toLower(_status)) do {
        case "start": {
                _start = diag_tickTime;
                _result = count CRB_PROFILER;
                CRB_PROFILER set [_result, [_param, _start]];
        };

        case "stop": {
                private ["_id","_functime"];
                _id = _param;
                _param = CRB_PROFILER select _id;
                _start = _param select 1;
                _functime = diag_tickTime - _start;
                _param set [1,  _functime];
                CRB_PROFILER set [_id, _param];
                diag_log format ["CRB_PROFILER,%1,%2", _param select 0, _param select 1];
        };
};

_result;

So what it does is, it produces a whole bunch of lines in your RPT file, which identify how many times a function was called, and how long its taking.

"CRB_PROFILER,NAME,TIME"
"CRB_PROFILER,mso_core_fnc_initLocations,103.1 59"
"CRB_PROFILER,Nou Caching (triggerUnits),0.000976563"
"CRB_PROFILER,Nou Caching (closestUnit),0.000976563"
"CRB_PROFILER,Nou Caching (cacheGroup),0.000976563"
"CRB_PROFILER,Nou Caching (triggerUnits),0.337891"
"CRB_PROFILER,Nou Caching (closestUnit),0.000976563"
...
"CRB_PROFILER,Nou Caching (closestUnit),0.000976563"
"CRB_PROFILER,Nou Caching (cacheGroup),0.000976563"
"CRB_PROFILER,Nou Caching (triggerUnits),0.000976563"
"CRB_PROFILER,Nou Caching (closestUnit),0.140137"
"CRB_PROFILER,Nou Caching (cacheGroup),0.000976563"
"CRB_PROFILER,Nou Caching (triggerUnits),0.00195313"
"CRB_PROFILER,Nou Caching (closestUnit),0"

You can throw this info into a spreadsheet (MS Excel or OpenOffice Calc) and identify the functions that a chewing the most performance.

In the example above, I ended up with the following figures (sorry about the formatting - stupid CSS).





























Average Function Time
NAME AVERAGE TIME
mso_core_fnc_initLocations 103.59
Nou Caching cacheGroup 0.0005138401
Nou Caching closestUnit 0.0061846531
Nou Caching server 13.61749
Nou Caching triggerUnits 0.1968103143
Nou Caching uncacheGroup 0.00505464
Total Result 0.2756398595




























NAME AVERAGE TIME
Total Function Time
mso_core_fnc_initLocations 103.59
Nou Caching cacheGroup 0.08221442
Nou Caching closestUnit 1.558532572
Nou Caching server 54.46996
Nou Caching triggerUnits 49.5961992
Nou Caching uncacheGroup 0.465026878
Total Result 209.76193307

The result - I know which function I need to work on improving (this example Nou Caching (triggerUnits)).

So I've continued work on

So I've continued work on this profiler. I've created two new files, crbprofiler-on.hpp and crbprofiler-off.hpp.


crbprofiler-on.hpp

#define CRBPROFILERSTART(name) private ["_profilerid"]; \
    _profilerid = ["start",##name] call mso_core_fnc_profiler;

#define CRBPROFILERSTOP         ["stop",_profilerid] call mso_core_fnc_profiler;

crbprofiler-off.hpp

#define CRBPROFILERSTART(name)

#define CRBPROFILERSTOP

These files are preprocessor macros, or shortcuts. This allows me to modify my files with minimal text, to insert the profiler. To make it even simpler, I rename the "active" crbprofiler.hpp so I don't have to change my code everytime. For example:

#include <crbprofiler.hpp>// Code by Tupolov
private ["_pos","_radius","_units"];CRBPROFILERSTART("mso_core_fnc_getUnitsInArea")_pos = _this select 0;
_radius = _this select 1;_units = [];
{
 if(_pos distance _x < _radius) then {
  _units set [count _units, _x];
 };
} forEach allUnits;CRBPROFILERSTOP_units;

 

The next thing I did was

The next thing I did was continued improving the Nou caching system. Here are the results of the improvements found by using this method to identify inefficiencies in code.



In the graph above, I changed the function Nou Caching triggerUnits to instead of iterating through every unit  and check if they're a leader (allUnits), to iterate through every group (allGroups) and check leaders.


You can see almost 4x improvement in the speed. 

When this same system is applied to the MSO, I gained what appear to be substantial improvements, but found it did not equate directly to a Server FPS improvement. The improvement is still identifyable (some players have said its at least 3x faster than the previous version, but the numbers do not correlate exactly for the 5 minutes testing that was performed.


The graph below shows % improvement, for example Wild dogs take up 42% less CPU time than they did previously.



This image below shows profiling results when applied to MSO 3.40. The FPS Avg & FPS Min are scaled on the left side, while the rest are scaled on the right side. What can be seen is, as max units hit the 330 mark, average server FPS hit around 24, while minimum server FPS averaged about 5 for the duration of the test.



In comparison, the image below shows profiling results when applied to MSO 3.41. Again, the FPS Avg & FPS Min are scaled on the left side, while the rest are scaled on the right side. What can be seen is, as max units hit the 30 mark, average server FPS is again around 24 (which appears to be less efficient than 3.40), but the minimum server FPS averages around 8 for the duration of the test.


Conclusion: Well, the improvements in the functions appear to be statistcally there but until a more thorough testing is performed on server FPS, I'll have to be satistfied that the perceived improvement is true.

 
Original Design by Gary Storey