
Baptiste Lesquoy edited this page Sep 26, 2024 · 4 revisions

Analysing code performance

A common mistake when optimizing any piece of code is to go in head first and try to optimize everything possible. The problem with this approach is that almost everything in a program could be optimized, and the task of optimizing is itself very time consuming. So before even optimizing a model, it is important to analyse it, both to know which parts have the most impact on the simulation duration and to have precise data on your initial execution time.

Some general concepts and tips before starting

Randomness

In GAML, many operators make implicit use of randomness (for example the one_of operator), and this can of course impact the simulation duration. Depending on your case, you may need to fix the random generator's seed during the analysis phase, to make sure that every time the simulation is run it performs exactly the same data accesses and operations, so that comparisons of execution times are fairer and more stable.
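As a minimal sketch, the seed can be fixed directly in the global block (the value 123.0 here is arbitrary):

global {
    // fixing the seed makes every run use the same random sequence
    float seed <- 123.0;
}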

In other cases you may want to take that randomness into account in your optimization; the right approach is then to repeat each test a number of times you consider sufficient to "neutralize" the effect of randomness.

In any case, it is important to keep in mind that randomness exists in GAMA and to be mindful of it when analysing execution times, as it can have an impact.

Exponential processes

In practice, most real (unexpected) execution-time problems are caused by exponential processes, and those are usually what we are looking for when analysing code. By "exponential process" we mean, loosely, anything whose execution time grows faster than linearly (i.e. faster than proportionally) as a function of its parameters.

For example, if my simulation contains 100 agents of the same species and I run it again with 200 agents, I would expect the simulation time to approximately double too. If instead the simulation takes 10 times longer to execute, then there is probably a reflex and/or an action somewhere in the model whose processing time grows exponentially with the number of agents, and that reflex/action is something you really want to monitor. Of course, in this case it could also be something else, like a part of the code that only triggers when the number of agents is greater than 150.

Sometimes it is possible to replace an algorithm of exponential complexity with a linear one, but sometimes it is simply not possible, as the nature of the problem being addressed is exponential. Even so, knowing that some parts of the code have an exponential growth is still very important, as it can help mitigate their effect on the overall execution time. Going back to the previous example, maybe we can determine a number of agents under which the execution time is still reasonable and find ways to keep the number of agents in the simulation under that threshold.

The different ways to get elapsed time in GAML

Analysing the performance of a model or a piece of code always comes down to measuring the time it took to do something. To do so, we have a few options in GAML:

The duration variable

The duration variable holds the time that the previous step took to execute. At the first step of the simulation it is 0; at the next step it holds the time the first step took, and so on. It is very useful to quickly get an idea of the duration of steps, but to measure the duration of the initialization phase, or of some piece of code inside a single step, we need another tool.
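As a minimal sketch, the previous step's duration can simply be written to the console each cycle:

global {
    reflex show_step_duration {
        // duration holds the execution time of the previous step (in ms)
        write "Previous step duration (ms): " + duration;
    }
}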

Using gama.machine_time

The gama.machine_time variable is a more versatile tool: it simply returns the current time, in milliseconds, on the computer running the simulation. Using it, we can compute the duration of anything we want in the simulation.

For example to get (an approximation of) the duration of the initialization of a model, we could write:

global {
    float init_time <- gama.machine_time;
    // plenty of other variable declarations
    init {
        // init code
        // ...
        write "Initialization took: " + (gama.machine_time - init_time) + "ms";
    }
}
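The same pattern works inside a step, for example to time one specific reflex (the reflex name and its content are placeholders):

reflex expensive_behavior {
    float t0 <- gama.machine_time;
    // ... the code to measure ...
    write "expensive_behavior took: " + (gama.machine_time - t0) + "ms";
}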

Using the benchmark statement

In addition to this, GAML provides a statement, benchmark, that executes a block of code a given number of times and prints in the console the minimum, maximum and average duration of those executions. This is useful to quickly test different parts of the code, or alternative implementations, one after another. Executing the code multiple times is a common technique in code optimization, as it is sometimes necessary to eliminate (or assess) the impact of randomness in the code or on the computer: for example, a hidden background task may start at the same moment and degrade the performance of one execution. Here is an example of the benchmark statement used to compare two alternative implementations:

benchmark message: "concatenation standard" repeat: 10 {
    string result <- '';
    loop times: nb_concat {
        result <- result + rnd(0, 10);
    }
}

benchmark message: "concatenation optimized" repeat: 10 {
    list<string> content;
    loop times: nb_concat {
        content <+ string(rnd(0, 10));
    }
    string result <- concatenate(content);
}

and the result in the console would look something like this:

concatenation standard (over 10 iteration(s)): min = 591.1828 ms (iteration #4) | max = 681.7818 ms (iteration #0) | average = 607.1486699999999ms

concatenation optimized (over 10 iteration(s)): min = 15.3127 ms (iteration #3) | max = 16.3495 ms (iteration #8) | average = 15.753850000000003ms

Using the benchmark facet of an experiment

This is a more advanced use: experiments in GAML have a facet called benchmark which, when set to true, makes GAMA record every executed line of code in a CSV file along with its number of calls and cumulated execution time.
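As a sketch, the facet is simply added to the experiment declaration (the experiment name and type here are placeholders):

experiment my_experiment type: gui benchmark: true {
    // displays, parameters, etc.
}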

How to proceed in practice

Having an overview of the model's execution time step by step

When you want to optimize a model but don't know where to start, a good (and easy to implement) starting point is to look at the overall shape of the execution time during your simulation. To do so, we are going to record each step's duration and plot it to get an idea of what is going on during the execution.

To do so, simply add this reflex into your global block:

reflex save_step_duration when: cycle > 0 {
    save [cycle, duration] to: "durations.csv" format: "csv" rewrite: false;
}

This way, every step's duration is saved to a CSV file that you can analyse later on. Once you have the file, open it in your favourite spreadsheet application to visualise it. From there, there are a few possibilities:

  1. The durations are always pretty much the same, no big variation
  2. The durations globally increase during the whole simulation
  3. The durations have a baseline but sometimes there are steps that are significantly longer than that baseline
  4. The durations are chaotic, once low then high, there's no real baseline

Each of these cases calls for a different approach.
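As a hypothetical sketch for case 3 (the spike threshold is an assumption, and in practice the values would be read from the durations.csv file produced above), a few lines of Python can separate a baseline from outlier steps:

```python
import statistics

def find_spikes(durations, factor=3.0):
    # Steps longer than `factor` times the median are flagged as spikes;
    # the median is a robust estimate of the baseline step duration.
    baseline = statistics.median(durations)
    return [i for i, d in enumerate(durations) if d > factor * baseline]

# A stable ~10 ms baseline with two significantly longer steps:
steps = [10.2, 11.0, 9.8, 10.5, 120.4, 10.1, 12.3, 95.7, 11.2]
print(find_spikes(steps))  # -> [4, 7]
```

The cycle numbers returned this way tell you exactly which steps to investigate in the model.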

Stable duration throughout the simulation

Congratulations, your model is already stable, which means it is probably already well designed! It may now be harder for you to optimize, as there is no real clue as to what to improve first, but the good news is that almost any improvement anywhere in the code will be reflected in the model's overall execution time. See the section below about the things that often cost a lot of execution time.

Step duration increases during the simulation

That kind of result points toward an increase in the simulation's complexity over its lifespan: as the simulation progresses, there are more and more things to process. This is typical of either the number of agents continuously increasing, or cumulative data being stored and used for processing.
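To confirm which one it is, a sketch like the following logs the population size and the size of a growing data structure alongside each step's duration; my_species and my_history are placeholders for your own names:

reflex track_complexity when: cycle > 0 {
    // correlate the step duration with the suspected sources of growth
    save [cycle, duration, length(my_species), length(my_history)] to: "complexity.csv" format: "csv" rewrite: false;
}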

Some steps are significantly longer than the rest

This means that under certain conditions, your model executes intensive code that is normally not executed. To find exactly what it is, you can inspect the when: facets of your reflexes, and the if and loop while: conditions in your reflexes and actions.
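One way to catch the culprit is to log the cycle and duration every time a suspect conditional reflex fires (the reflex name and condition are placeholders):

reflex heavy_update when: some_condition {
    float t0 <- gama.machine_time;
    // ... the intensive code ...
    write "heavy_update fired at cycle " + cycle + ", took: " + (gama.machine_time - t0) + "ms";
}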

Chaotic step durations

This may be the hardest case to analyse, as the complexity seems to come from a combination of "unstable" things. A good approach here is to try to isolate the different parts of the code. Let's say that your model has two main dynamics that interact with each other; in that case you want to analyse each one separately. To do so, you will probably need to create two new models, each containing only one of those dynamics. If the two dynamics cannot function without each other, you can mimic the missing one with a very simple function that, for example, always returns the same thing, or returns values that you already computed in a previous simulation, so that it costs almost nothing in execution time.

If your model cannot be split into separate dynamics, you can go a step further and log, for each reflex and/or action, when it is called and how long the call took. This way you can identify the cost of each action/reflex on the overall simulation duration, and work from there to optimize only the actions/reflexes that matter the most.
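A minimal sketch of such per-reflex logging, using gama.machine_time as before (the reflex name and log file are placeholders):

reflex move {
    float t0 <- gama.machine_time;
    // ... the reflex's normal code ...
    save ["move", cycle, gama.machine_time - t0] to: "reflex_log.csv" format: "csv" rewrite: false;
}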

Alternatively, you could use the benchmark facet of the experiment, as described previously, but that may give you more detail than you need at this stage.

Identifying the problem in the code

Isolating the code in a simpler model

Looking for exponential behaviour

Benchmarking

Comparing to alternative code

What to pay the most attention to

  1. every kind of loop (including ask statements)
  2. number of agents and interactions
  3. accumulating data
  4. reading and writing files
  5. big string concatenation
  6. any process that has an exponential nature
  7. network code
  8. displays/charts
  9. platform settings