
Baptiste Lesquoy edited this page Sep 26, 2024 · 4 revisions

Analysing code performance

A common mistake when optimizing any piece of code is to go in head first and try to optimize everything possible. The problem with this approach is that almost everything in a program could be optimized, and the task of optimizing is itself very time consuming. So before even optimizing a model, it is important to analyse it, both to know which parts have the most impact on the simulation duration and to have precise data on your initial execution time.

Some general concepts and tips before starting

Randomness

In GAML, many operators make implicit use of randomness (for example the one_of operator), and this can of course impact the simulation duration. Depending on your case, you may need to fix the random generator's seed during the analysis phase, to make sure that every time the simulation is run it performs exactly the same data accesses and operations, so that comparisons of execution times are fairer and more stable.
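As a minimal sketch, the seed can be fixed directly in the global block (the value 123.0 here is arbitrary):

global {
    // fixing the seed makes every run use the same random sequence
    float seed <- 123.0;
}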

In other cases you may want to take that randomness into account in your optimization; the right approach is then to repeat each test a number of times you consider sufficient to "neutralize" the effect of randomness.

In any case, it is important to keep in mind that randomness exists in GAMA and to be mindful of it when analysing execution times, as it can have an impact.

Exponential processes

In practice, most real (unexpected) execution-time problems are caused by exponential processes, and those are usually what we are looking for when analysing code. By "exponential process" we mean, loosely, anything whose execution time grows faster than linearly (i.e. faster than proportionally) as a function of its parameters.

For example, if my simulation contains 100 agents of the same species and I run it again with 200 agents, I would expect the simulation time to approximately double too. If instead the simulation takes 10 times longer to execute, then there is probably a reflex and/or an action somewhere in the model whose processing time grows exponentially with the number of agents, and that reflex/action is something you really want to monitor. Of course, in this case it could also be something else, like a part of the code that only triggers when the number of agents is greater than 150.

Sometimes it is possible to replace an algorithm of exponential complexity with a linear one, but sometimes it is simply not possible, as the nature of the problem being addressed is exponential. Even so, knowing that some parts of the code have an exponential growth is still very important, as it can help mitigate their effect on the overall execution time. Going back to the previous example, maybe we can determine a number of agents under which the execution time is still reasonable and find ways to keep the number of agents in the simulation under that threshold.

The different ways to get elapsed time in GAML

Analysing the performance of a model or a piece of code always comes down to measuring the time it took to do something. To do so, we have a few options in GAML:

The duration variable

The duration variable holds the time that the previous step took to execute. At the first step of the simulation it is 0; at the next step it holds the time the first step took, and so on. It is very useful to quickly get an idea of the duration of steps, but to measure the duration of the initialization phase, or of some piece of code inside a single step, we need another tool.
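As a minimal sketch, the previous step's duration can simply be written to the console each cycle:

global {
    reflex show_step_duration {
        // duration holds the execution time of the previous step (in ms)
        write "Previous step duration (ms): " + duration;
    }
}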

Using gama.machine_time

The gama.machine_time variable is a more versatile tool: it simply returns the current time, in milliseconds, on the computer running the simulation. Using it, we can compute the duration of anything we want in the simulation.

For example to get (an approximation of) the duration of the initialization of a model, we could write:

global {
    float init_time <- gama.machine_time;
    // plenty of other variable declarations
    init {
        // init code
        // ...
        write "Initialization took: " + (gama.machine_time - init_time) + "ms";
    }
}
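The same pattern works inside a step, for example to time one specific reflex (the reflex name and its content are placeholders):

reflex expensive_behavior {
    float t0 <- gama.machine_time;
    // ... the code to measure ...
    write "expensive_behavior took: " + (gama.machine_time - t0) + "ms";
}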

Using the benchmark statement

In addition to this, GAML provides a statement, benchmark, that executes a block of code a given number of times and prints in the console the minimum, maximum and average duration of those executions. This is useful to quickly test different parts of the code, or alternative implementations, one after another. Executing the code multiple times is a common technique in code optimization, as it is sometimes necessary to eliminate (or assess) the impact of randomness in the code or on the computer: for example, a hidden background task may start at the same moment and degrade the performance of one execution. Here is an example of the benchmark statement used to compare two alternative implementations:

benchmark message: "concatenation standard" repeat: 10 {
    string result <- '';
    loop times: nb_concat {
        result <- result + rnd(0, 10);
    }
}

benchmark message: "concatenation optimized" repeat: 10 {
    list<string> content;
    loop times: nb_concat {
        content <+ string(rnd(0, 10));
    }
    string result <- concatenate(content);
}

and the result in the console would look something like this:

concatenation standard (over 10 iteration(s)): min = 591.1828 ms (iteration #4) | max = 681.7818 ms (iteration #0) | average = 607.1486699999999ms

concatenation optimized (over 10 iteration(s)): min = 15.3127 ms (iteration #3) | max = 16.3495 ms (iteration #8) | average = 15.753850000000003ms

Using the benchmark facet of an experiment

This is a more advanced use: experiments in GAML have a facet called benchmark which, when set to true, makes GAMA record every executed line of code in a CSV file along with its number of calls and cumulated execution time.
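As a sketch, the facet is simply added to the experiment declaration (the experiment name and type here are placeholders):

experiment my_experiment type: gui benchmark: true {
    // displays, parameters, etc.
}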

How to proceed in practice

Having an overview of the model's execution time step by step

When you want to optimize a model but don't know where to start, a good (and easy to implement) starting point is to look at the overall shape of the execution time during your simulation. To do so, we are going to record each step's duration and plot it to get an idea of what is going on during the execution.

To do so, simply add this reflex into your global block:

reflex save_step_duration when: cycle > 0 {
    save [cycle, duration] to: "durations.csv" format: "csv" rewrite: false;
}

This way, every step's duration is saved to a CSV file that you can analyse later on. Once you have the file, open it in your favourite spreadsheet application to visualise it. From there, there are a few possibilities:

  1. The durations are always pretty much the same, no big variation
  2. The durations globally increase during the whole simulation
  3. The durations have a baseline but sometimes there are steps that are significantly longer than that baseline
  4. The durations are chaotic, once low then high, there's no real baseline

Each of these cases calls for a different approach.
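As a hypothetical sketch for case 3 (the spike threshold is an assumption, and in practice the values would be read from the durations.csv file produced above), a few lines of Python can separate a baseline from outlier steps:

```python
import statistics

def find_spikes(durations, factor=3.0):
    # Steps longer than `factor` times the median are flagged as spikes;
    # the median is a robust estimate of the baseline step duration.
    baseline = statistics.median(durations)
    return [i for i, d in enumerate(durations) if d > factor * baseline]

# A stable ~10 ms baseline with two significantly longer steps:
steps = [10.2, 11.0, 9.8, 10.5, 120.4, 10.1, 12.3, 95.7, 11.2]
print(find_spikes(steps))  # -> [4, 7]
```

The cycle numbers returned this way tell you exactly which steps to investigate in the model.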

Stable duration throughout the simulation

Congratulations, your model is already stable, which means it is probably already well designed! It may now be harder for you to optimize, as there is no real clue as to what to improve first, but the good news is that almost any improvement anywhere in the code will be reflected in the model's overall execution time. See the section below about the things that often cost a lot of execution time.

Step duration increases during the simulation

That kind of result points toward an increase in the simulation's complexity over its lifespan: as the simulation progresses, there are more and more things to process. This is typical of either the number of agents continuously increasing, or cumulative data being stored and used for processing.
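To confirm which one it is, a sketch like the following logs the population size and the size of a growing data structure alongside each step's duration; my_species and my_history are placeholders for your own names:

reflex track_complexity when: cycle > 0 {
    // correlate the step duration with the suspected sources of growth
    save [cycle, duration, length(my_species), length(my_history)] to: "complexity.csv" format: "csv" rewrite: false;
}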

Some steps are significantly longer than the rest

This means that under certain conditions, your model executes intensive code that is normally not executed. To find exactly what it is, you can inspect the when: facets of your reflexes, and the if and loop while: conditions in your reflexes and actions.
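One way to catch the culprit is to log the cycle and duration every time a suspect conditional reflex fires (the reflex name and condition are placeholders):

reflex heavy_update when: some_condition {
    float t0 <- gama.machine_time;
    // ... the intensive code ...
    write "heavy_update fired at cycle " + cycle + ", took: " + (gama.machine_time - t0) + "ms";
}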

Chaotic step durations

This may be the hardest case to analyse, as the complexity seems to come from a combination of "unstable" things. A good approach here is to try to isolate the different parts of the code. Let's say that your model has two main dynamics that interact with each other; in that case you want to analyse each one separately. To do so, you will probably need to create two new models, each containing only one of those dynamics. If the two dynamics cannot function without each other, you can mimic the missing one with a very simple function that, for example, always returns the same thing, or returns values that you already computed in a previous simulation, so that it costs almost nothing in execution time.

If your model cannot be split into separate dynamics, you can go a step further and log, for each reflex and/or action, when it is called and how long the call took. This way you can identify the cost of each action/reflex on the overall simulation duration, and work from there to optimize only the actions/reflexes that matter the most.
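A minimal sketch of such per-reflex logging, using gama.machine_time as before (the reflex name and log file are placeholders):

reflex move {
    float t0 <- gama.machine_time;
    // ... the reflex's normal code ...
    save ["move", cycle, gama.machine_time - t0] to: "reflex_log.csv" format: "csv" rewrite: false;
}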

Alternatively, you could use the benchmark facet of the experiment, as described previously, but that may give you more detail than you need at this stage.

Identifying the problem in the code

Isolating the code in a simpler model

Looking for exponential behaviour

Benchmarking

Comparing to alternative code

What to pay the most attention to

  1. every kind of loop (including ask statements)
  2. number of agents and interactions
  3. accumulating data
  4. reading and writing files
  5. big string concatenation
  6. any process that has an exponential nature
  7. network code
  8. displays/charts
  9. platform settings