Feature Request: Agent DataCollection Can't Handle Different Attributes in ActivationByType #1419

tpike3 · 2022-09-24T11:55:24Z

Describe the Feature
Receive attribute error when running an agent reporter through datacollector when using RandomActivationByType

Desired behavior
I can collect against different attributes based on agent type

To Reproduce
Create a model with different agent types and attributes and try to collect against that agent attribute

Additional context
Feature would need to update at line 170 of datacollection.py agent_records = map(get_reports, model.schedule.agents)

The text was updated successfully, but these errors were encountered:

jackiekazil · 2022-09-24T20:08:16Z

@tpike3 I am unable to recreate the bug -- it looks like sugarscape_cg uses RandomActivationByType. Can you add it to that example & recreate it?

tpike3 · 2022-09-24T22:06:24Z

If you use this colab you will get the error https://github.com/edgeofchaos42/ComplexityExplorer/blob/main/Session_18_Data_Collector_agent.ipynb Attached is the landscape you will need to load into your google drive. or update the sugar_distribution = line

Per the dev session this morning I should be able to add a custom function to filter out non-traders or at least return None, but so far unsuccessful on that code.

sugar-map.txt

rht · 2022-09-25T09:55:00Z

I think a temporary workaround is to put the trade partners in the model reporter instead.

lambda m: {k: list(a.trade_partners) for k, a in m.schedule.agents_by_type[Trader].items()}

I used list() to copy the a.trade_partners.

EwoutH · 2022-10-07T22:22:44Z

This is indeed a (in my opinion big) limitation. I wrote a proof of concept implementation for it in #1142, and used it in a personal project. If anyone ones to built on it, feel free to use it in any way!

rht · 2022-10-09T10:06:09Z

agent_reporters = {Cat: {"Weight": "weight", "Size": "size"}, Dog: {"Weight": "weight"}}

While this looks more concise for RandomActivationByType, it is a pattern that only works for this scheduler. We can't just modify DataCollector to expect more different types of schedulers. DataCollector will become more bloated. And this is not much more concise than the workaround I specified in my earlier comment. The lack of the feature is not a blocker in @tpike3's situation.

Either:

Create a DataCollectorRABT (shorthand for DataCollectorRandomActivationByType) class that handles types for agent collector
Tell people to put their type-specific data in the model collector, like in Feature Request: Agent DataCollection Can't Handle Different Attributes in ActivationByType #1419 (comment). Put the guide in in best-practices.rst

We should look at how other ABM libraries choose their trade-off here.

EwoutH · 2022-10-18T11:43:05Z

Could we create a new switch for the DataCollector, multiple_agent_types, for example, which is False by default but can be set to True if you want to collect data from multiple agent types? That way it's still contained to a single class, which can reduce the maintenance effort.

rht · 2022-10-19T04:38:47Z

The maintenance effort is about the same with DataCollectorRandomActivationByType, since it will inherit from DataCollector, and only _record_agents needs to be replaced to handle agents by type. Adding an extra switch and if clause inside DataCollector doesn't scale, because what if there is a scheduler with 2-level type hierarchy (Dict[str, Dict[str, Dict[int, Agent]]], see Multi-level Mesa)? The switch and the if's keep growing.

EwoutH · 2022-10-19T09:09:52Z

That's a good point.

Would you be interested in creating an (draft) implementation?

rht · 2022-10-19T09:33:30Z

My plate is rather full right now.

EwoutH · 2022-10-19T09:37:25Z

Thanks for the honest reply, let's discuss at the next development meeting how we can move this project forward.

EwoutH · 2022-10-20T13:41:11Z

What if you would restructure the DataCollector in such way that it's split in an agent-data and model-data part?

Maybe a DataCollector superclass with a ModelDataCollector and a AgentDataCollector subclasses? In that case the ModelDataCollector always expects a single model as input, and a AgentDataCollector a single agent type. If you want to collect data for more that one agent type, you simply call the AgentDataCollector multiple times.

Many of the code is now duplicate such as _new_model_reporter and _new_agent_reporter, so why not split those? And everything that's shared can sit in the DataCollector superclass.

jacob-thrackle · 2022-10-27T01:00:58Z

I'm also running into this issue with respect to the agent attributes. After reading through the code, I wonder if it would be possible to just pass a specific schedule to the DataCollector constructor instead of assuming (hard-coding) that there exists an object in the model called schedule. That would seem to limit the bloat of DataCollector and remove one of the implicit assumptions in the class.

So something like this in the Model:

self.datacollector_my_agents = mesa.DataCollector(
    schedule = self.schedule_my_agents, # arbitrary schedule with an .agents attribute that returns a list of agents
    agent_reporters = {...},
    model_reporters = {...}
)

self.datacollector_your_agents = mesa.DataCollector(
    schedule = self.schedule_your_agents, # arbitrary schedule with an .agents attribute that returns a list of agents
    agent_reporters = {...},
    model_reporters = {...}
)

Then to get the data:

my_agent_data = model.datacollector_my_agents.get_agent_vars_dataframe()
your_agent_data = model.datacollector_your_agents.get_agent_vars_dataframe()

rht · 2022-10-27T04:51:25Z

Maybe a DataCollector superclass with a ModelDataCollector and a AgentDataCollector subclasses?
...
If you want to collect data for more that one agent type, you simply call the AgentDataCollector multiple times.

This definitely could provide ingredients for people who want to build their own data collector class. But I don't see it to be cleaner than just inheriting from the current DataCollector class and modifying the method you want to change as needed.

I think for now, for trading Sugarscape use case, we should implement the solution within the examples folder. Then wait for a while for it to become a part of the library.

rht · 2022-10-27T12:01:35Z

@jacob-thrackle your solution could work. Though we also want to make sure the >80% of the use case stays simple and convenient, where the scheduler is assigned to the model attribute schedule anyway. So that users don't have to specify the scheduler object every time.

jacob-thrackle · 2022-10-28T00:26:53Z

Absolutely. I started hacking around with it last night but the following section makes it difficult because of the string type for model.schedule.steps in prefix (line 158):

mesa/mesa/datacollection.py

Lines 156 to 166 in 373637e

    
           rep_funcs = self.agent_reporters.values() 
        
           if all(hasattr(rep, "attribute_name") for rep in rep_funcs): 
        
               prefix = ["model.schedule.steps", "unique_id"] 
        
               attributes = [func.attribute_name for func in rep_funcs] 
        
               get_reports = attrgetter(*prefix + attributes) 
        
           else: 
        
               def get_reports(agent): 
        
                   _prefix = (agent.model.schedule.steps, agent.unique_id) 
        
                   reports = tuple(rep(agent) for rep in rep_funcs) 
        
                   return _prefix + reports

If I'm understanding the code correctly, the condition at line 157 is only True if all the rep_funcs are partial which is only true if the reporter is a str:

mesa/mesa/datacollection.py

Lines 137 to 141 in 373637e

    
           if type(reporter) is str: 
        
               attribute_name = reporter 
        
               reporter = partial(self._getattr, reporter) 
        
               reporter.attribute_name = attribute_name 
        
           self.agent_reporters[name] = reporter

So having a reporter be a lambda or some other arbitrary function would make this not true and evaluate the else statement at line 161. I assume that it is done this way so that get_reports = attrgetter(*prefix + attributes) is really fast if the reporters are just attributes of the agents but this gain is nullified once you use anything other than a string in the reporter dictionary in the DataCollector itself. Since the DataCollector.collect method is typically only called once per model-step, and unless there are a huge number of agents in which one is only interested in getting attributes, it may be worthwhile to consider removing the condition and expression in lines 157-160 and just using the function at 163.

Using the function has the added advantage of calling self._getattr() which has a default value of None, meaning that attributes which do not exist return None instead of throwing an AttributeError. The code in lines 158-160 never call self._getattr() and just use the attribute_name which is passed to attrgetter().

This is all to say that I was able to make my suggestion work by simply removing lines 157-161, only using the function get_reports at line 163 with some minor adjustments, including a condition for backwards compatibility where the scheduler is assigned to the model attribute schedule:

if self.schedule:
    schedule = self.schedule # new behavior
else:
    schedule = model.schedule # old behavior, self.schedule defaults to None in the constructor

This conditional has to be placed, unfortunately, in both _record_agents and collect, although one could limit it to one call and just pass in the schedule to _record_agents which is called from inside collect. Thoughts?

rht · 2022-10-29T04:53:10Z

That L157-161 seems to be written by @Corvince in aaa4c54. The original commit contains this comment for the attrgetter

# Fast path if all reporters are attribute collecters

but had since been removed in the subsequent iterations.

But the original PR that did the specialization of the code when all the reporters are string-based, is #576. #575 contains some benchmark results

Corvince · 2022-10-29T05:42:32Z

Thanks for digging that up @rht

And yes, @jacob-thrackle you understood exactly right how it works.
Basically, attrgetter gives the best possible performance, but only works if we have only string collectors.
This is a pure performance optimization. But, as you figured out, does not allow a default value.

Is is probably better to give up a tiny bit of performance by using getattr instead of attrgetter to have default values

…eudule and to return a default value of None if no such attribute exists.

…creation (defaults to model.schedule otherwise) and will return None if an attribute is not found instead of throwing an AttributeError.

woolgathering · 2022-10-29T21:23:51Z

@Corvince Cool, I opened a PR from my personal account (this is @jacob-thrackle). Let me know when you guys get around to reviewing it and what changes, if any, you'd like me to make. I forgot to link it in the commit but this also fixes #976.

I made another version that kept the optimization but it only happened if two things were true: that self.schedule == None and if all the reporters in the dictionary are strings. I'm honestly not sure how often people use that (a lot?) but I can throw it back in if it's wanted. I removed it since the behavior would have been different if one had passed in model.schedule versus letting self.schedule be None. Felt inconsistent.

…() now accepts a scheudle instead of a model. Attributes that do not exist return None instead of an AttributeError.

tpike3 changed the title ~~Agent DataCollection Can't Handle Different Attributes in ActivationByType~~ Feature Request: Agent DataCollection Can't Handle Different Attributes in ActivationByType Oct 2, 2022

EwoutH mentioned this issue Oct 20, 2022

DataCollector requires all agents to have the same Attributes #976

Closed

woolgathering added a commit to woolgathering/mesa that referenced this issue Oct 29, 2022

Fix projectmesa#1419. Updated DataCollector to accept an arbitary sch…

9cdf3da

…eudule and to return a default value of None if no such attribute exists.

woolgathering mentioned this issue Oct 29, 2022

Fix #1419. DataCollector accepts an arbitrary schedule at creation (d… #1481

Closed

tpike3 added this to the v1.2.0 Taylor milestone Oct 31, 2022

woolgathering added a commit to woolgathering/mesa that referenced this issue Nov 2, 2022

Fix projectmesa#1419 as discussed in projectmesa#1481. _record_agents…

4fcf6a0

…() now accepts a scheudle instead of a model. Attributes that do not exist return None instead of an AttributeError.

tpike3 modified the milestones: v1.2.0 Taylor, Mesa 2.0 Dec 8, 2022

tpike3 modified the milestones: Mesa 2.0, Milestone 1.2 Jan 8, 2023

tpike3 removed this from the Milestone 1.2 milestone Feb 5, 2023

tpike3 added this to the Mesa 2.0 milestone Feb 5, 2023

rht mentioned this issue May 2, 2023

Fix bug in tutorial to calculate MRS with potential trade projectmesa/mesa-examples#33

Merged

tpike3 mentioned this issue May 19, 2023

Collecting data from multiple differentiable classes of agents #1701

Closed

rht mentioned this issue May 21, 2023

feat: Implement exclude_none_values in DataCollector #1702

Merged

tpike3 closed this as completed in #1702 May 27, 2023

Corvince mentioned this issue Sep 17, 2023

Multi-Agent data collection #348

Closed

EwoutH mentioned this issue Oct 24, 2023

DataCollector: Allow agent reporters to take class methods and functions with parameter lists #1838

Merged

2 tasks

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Feature Request: Agent DataCollection Can't Handle Different Attributes in ActivationByType #1419

Feature Request: Agent DataCollection Can't Handle Different Attributes in ActivationByType #1419

tpike3 commented Sep 24, 2022 •

edited

Loading

jackiekazil commented Sep 24, 2022

tpike3 commented Sep 24, 2022

rht commented Sep 25, 2022

EwoutH commented Oct 7, 2022

rht commented Oct 9, 2022

EwoutH commented Oct 18, 2022

rht commented Oct 19, 2022 •

edited

Loading

EwoutH commented Oct 19, 2022

rht commented Oct 19, 2022

EwoutH commented Oct 19, 2022

EwoutH commented Oct 20, 2022

jacob-thrackle commented Oct 27, 2022 •

edited

Loading

rht commented Oct 27, 2022

rht commented Oct 27, 2022

jacob-thrackle commented Oct 28, 2022 •

edited

Loading

rht commented Oct 29, 2022

Corvince commented Oct 29, 2022

woolgathering commented Oct 29, 2022 •

edited

Loading

Feature Request: Agent DataCollection Can't Handle Different Attributes in ActivationByType #1419

Feature Request: Agent DataCollection Can't Handle Different Attributes in ActivationByType #1419

Comments

tpike3 commented Sep 24, 2022 • edited Loading

jackiekazil commented Sep 24, 2022

tpike3 commented Sep 24, 2022

rht commented Sep 25, 2022

EwoutH commented Oct 7, 2022

rht commented Oct 9, 2022

EwoutH commented Oct 18, 2022

rht commented Oct 19, 2022 • edited Loading

EwoutH commented Oct 19, 2022

rht commented Oct 19, 2022

EwoutH commented Oct 19, 2022

EwoutH commented Oct 20, 2022

jacob-thrackle commented Oct 27, 2022 • edited Loading

rht commented Oct 27, 2022

rht commented Oct 27, 2022

jacob-thrackle commented Oct 28, 2022 • edited Loading

rht commented Oct 29, 2022

Corvince commented Oct 29, 2022

woolgathering commented Oct 29, 2022 • edited Loading

tpike3 commented Sep 24, 2022 •

edited

Loading

rht commented Oct 19, 2022 •

edited

Loading

jacob-thrackle commented Oct 27, 2022 •

edited

Loading

jacob-thrackle commented Oct 28, 2022 •

edited

Loading

woolgathering commented Oct 29, 2022 •

edited

Loading