Faster agent attribute collection #576
Only significantly faster if all agent reporters are attributes
Okay, forget this implementation! I found a much better way with virtually no overhead! Instead of evaluating the reporter functions directly, we can use the map function to create a lazy iterator and only calculate the values once we build up the DataFrame. Something like:

This will also work with custom reporter functions, not just string-based reporters. I think I could still speed up the creation of the DataFrame for pure string reporters, but that should be less critical.
Oh gosh, I wish I could trust my eyes more... I somehow thought that map would auto-magically track each agent's current state at the cost of increased memory usage. I quickly checked it and everything seemed fine, but I must have looked at the wrong data or something. Unfortunately (but logically), no auto-magic is happening. Because map is evaluated lazily, the values are only read when the DataFrame is created, so the values for all steps actually come from the current state (the last step). Guess I was too excited 😢 I will look into this further and probably revert to the previous implementation (which speeds things up, but still with some overhead). Until then, please don't merge anything just yet.
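A minimal sketch of the pitfall, using a hypothetical stand-in `Agent` class rather than Mesa's own: the lazy `map` object created at step 1 reads the attributes only when it is consumed, so it reports the state from after step 2, while an eager list keeps the step-1 values.

```python
class Agent:
    """Hypothetical stand-in for a model agent (not Mesa's Agent class)."""

    def __init__(self, unique_id, wealth):
        self.unique_id = unique_id
        self.wealth = wealth


agents = [Agent(0, 1), Agent(1, 1)]

# Step 1: build a lazy iterator; no attribute is read yet.
lazy_step_1 = map(lambda agent: agent.wealth, agents)
# For comparison, an eager snapshot reads the values immediately.
eager_step_1 = [agent.wealth for agent in agents]

# Step 2: the agents' state changes.
for agent in agents:
    agent.wealth += 10

# The lazy iterator is consumed only now, so it sees the step-2 state.
lazy_values = list(lazy_step_1)

print(eager_step_1)  # values as they were at step 1
print(lazy_values)   # values as they are now, not as they were at step 1
```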
Ok, new implementation online, again. And since I am spamming this thread already, here is some explanation of how the current implementation is faster (for those interested). In the original implementation we had to access each agent 2 × (number of reporters) times, because each reporter would also store the agent's unique id. For 2000 agents and 5 reporters this would result in 20000 memory accesses and operations. Now in the current implementation I have "outsourced" the attribute collection into a function called

NB: collect now stores agent records in a dictionary with model steps as the keys. That means it would also be trivial to add a "step" keyword to the
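As a rough illustration of the idea in this comment, the sketch below reads `unique_id` and all requested attributes in a single call per agent and stores the resulting records in a dictionary keyed by model step. The helper name `collect_agent_records`, the `Agent` class, and the use of `operator.attrgetter` are assumptions made for this example; they are not taken from the PR's code.

```python
from operator import attrgetter


class Agent:
    """Hypothetical stand-in for a model agent (not Mesa's Agent class)."""

    def __init__(self, unique_id, wealth, age):
        self.unique_id = unique_id
        self.wealth = wealth
        self.age = age


def collect_agent_records(agents, attribute_names):
    """Hypothetical helper: one (unique_id, attr1, attr2, ...) tuple per agent.

    Assumes at least one attribute name, so attrgetter returns a tuple.
    """
    getter = attrgetter("unique_id", *attribute_names)
    return [getter(agent) for agent in agents]


agents = [Agent(i, wealth=2 * i, age=30 + i) for i in range(3)]

# Records are stored per model step; unique_id appears once per record.
agent_records = {}
for step in range(2):
    agent_records[step] = collect_agent_records(agents, ["wealth", "age"])
    for agent in agents:
        agent.wealth += 1  # state changes between steps

print(agent_records[0])
print(agent_records[1])
```

Because the records are built eagerly inside the step loop, each step keeps its own snapshot, and looking up a single step is a plain dictionary access.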
Addresses #575
This implementation adds a fast track for collecting agent attributes if all reporters are string-based (no custom or lambda functions). If no string-based reporters are present, or if the reporters are a mixture of both kinds, it falls back to an only slightly improved version of the current implementation.
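The dispatch described above could be sketched roughly as follows. This is an illustration under assumptions, not the PR's code: `make_collector` and the `Agent` class are hypothetical, and the reporters dictionary is assumed to be non-empty.

```python
from operator import attrgetter


class Agent:
    """Hypothetical stand-in for a model agent (not Mesa's Agent class)."""

    def __init__(self, unique_id, wealth):
        self.unique_id = unique_id
        self.wealth = wealth


def make_collector(reporters):
    """Hypothetical helper: return a function mapping an agent to one record.

    `reporters` maps a column name to an attribute-name string or to a
    callable taking the agent. Assumed to be non-empty.
    """
    values = list(reporters.values())
    if all(isinstance(reporter, str) for reporter in values):
        # Fast track: a single attrgetter fetches the id and all attributes.
        return attrgetter("unique_id", *values)

    # Fallback: wrap string reporters so every reporter is callable.
    funcs = [attrgetter(r) if isinstance(r, str) else r for r in values]

    def collect(agent):
        return (agent.unique_id, *(func(agent) for func in funcs))

    return collect


agent = Agent(unique_id=7, wealth=5)

fast = make_collector({"Wealth": "wealth"})
mixed = make_collector({"Wealth": "wealth", "Double": lambda a: a.wealth * 2})

print(fast(agent))
print(mixed(agent))
```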
I also changed the structure of the class variable agent_vars to store less data: it now stores agent.unique_id only once per agent instead of once for every reporter. This breaks code that relies on that variable, although none of the examples do.
One idea to encourage the use of string-based agent reporters would be to emit a warning if an agent reporter is not string-based, telling users that they could benefit from switching to string-based reporters (or even that support for non-string-based reporters will be dropped in a much later version of Mesa).
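A possible shape for such a warning, using only the standard `warnings` module; the function name `check_agent_reporters` and the message text are hypothetical and only meant to show the idea.

```python
import warnings


def check_agent_reporters(reporters):
    """Hypothetical helper: warn about agent reporters that are not strings.

    Returns the names of the non-string reporters so callers can inspect them.
    """
    non_string = [
        name for name, reporter in reporters.items()
        if not isinstance(reporter, str)
    ]
    if non_string:
        warnings.warn(
            "Agent reporters that are not attribute-name strings are "
            f"collected more slowly: {non_string}. Consider switching to "
            "string-based reporters.",
            UserWarning,
            stacklevel=2,
        )
    return non_string


# A mixed set of reporters triggers the warning once.
slow_reporters = check_agent_reporters(
    {"Wealth": "wealth", "Double": lambda agent: agent.wealth * 2}
)
print(slow_reporters)
```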
Hope you like it