refactor: Remove dependence on model.schedule, add clock to Model #1942

rht · 2024-01-07T03:01:09Z

The current AgentSet API allows users to define the structure and order of agent actions without initializing a scheduler object. However, a scheduler object is still required, because steps and time are tracked only in the scheduler. This pull request removes that requirement by adding the steps and time attributes directly to the model.

A complication arises for users who still use the scheduler object: they would now need to call model.advance_time() manually within the model's step() function. To avoid this, the pull request injects the model.advance_time() call into the scheduler's step() function (a solution drafted with assistance from ChatGPT). This keeps existing scheduler-based models working without changes.

quaquel · 2024-01-07T07:48:02Z

The use of model.agents looks good to me.

One quick question: you also changed how time is handled. It was retrieved from the schedule, and now you track it internally. This is because you don't want to depend on schedule, which makes sense. However, from a conceptual point of view, is this the nicest, longer-term way of handling time?

rht · 2024-01-07T09:25:13Z

It was retrieved from the schedule, and now you track it internally.
This is because you don't want to depend on schedule, which makes sense. However, from a conceptual point of view, is this the nicest, longer-term way of handling time?

Thank you for raising the issue. In this PR, I track it within the datacollector object, which signifies the observer's clock. This is not ideal, and I am still looking for a better solution. One alternative would be to define a method in mesa.Model:

def advance_time(self):
    self.time += 1

Then, in the user's model step:

def step(self):
    self.agents.shuffle().do("step")
    self.datacollector.collect(self)
    self.advance_time()

This is how it is done in abcEconomics.

quaquel · 2024-01-07T09:37:32Z

I think tying time to model.step is indeed the way to go. The simple solution would be to handle it through a super call or through some kind of annotation. To be discussed separately at some point.

EwoutH · 2024-01-07T09:40:23Z

I think tying time to model.step is indeed the way to go.

At least by default. I think some concept of actual time would also be very useful. See #1912 (reply in thread)

rht · 2024-01-07T09:50:15Z

The simple solution would be to handle it through a super call or through some kind of annotation. To be discussed separately at some point.

I had thought about super().step(). The problem is that it is very easy to forget to do so, and the call does not directly convey that it increments the time.
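
For illustration, a minimal sketch of the pitfall described here. The base class below is a hypothetical stand-in (not mesa.Model) whose step() does nothing but advance the clock; the subclass names are also made up for the example.

```python
class Model:
    """Hypothetical stand-in for mesa.Model; not the real class."""

    def __init__(self):
        self.steps = 0
        self.time = 0

    def step(self):
        # The base step only advances the clock.
        self.steps += 1
        self.time += 1


class MyModel(Model):
    def step(self):
        # ... agent activation would go here ...
        super().step()  # nothing in this call says "advance time"


class ForgetfulModel(Model):
    def step(self):
        pass  # no super().step(): the clock silently stays at 0


model, forgetful = MyModel(), ForgetfulModel()
for _ in range(3):
    model.step()
    forgetful.step()

print(model.time, forgetful.time)  # 3 0
```

The forgetful variant raises no error, which is what makes the omission hard to spot.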

At least by default. I think some concept of actual time would also be very useful. See #1912 (reply in thread)

There would be the ContinuousSpace equivalent of time in that the system may advance a float-type amount of time, and indeed steps would need to remain separate.

rht · 2024-01-07T10:08:19Z

Regarding the term: advance_time is more general than increment_time or step_time, as the latter two are specific to discrete time steps. But at the same time, the term also needs to encompass incrementing the model's step count. As such, advance_time is not sufficient.

rht · 2024-01-07T10:10:52Z

To be pedantic, step_and_advance_time would be the term for incrementing both time and steps.

quaquel · 2024-01-07T10:11:37Z

There would be the ContinuousSpace equivalent of time in that the system may advance a float-type amount of time

In my view, the moment you allow for this, you should go all the way and have a discrete event-style event list at the heart of everything. If you want traditional ABM behavior, you just schedule evenly-spaced events. Each event would be then a call to model.step. If you want full-blown discrete event behavior, you can schedule events (i.e., a combination of a time instant and callable) at any other non-discrete time instant. In fact, you can simply hybridize this by allowing both. Time then is held by the event list.

Building on @rht's point about advance_time, increment_time, and step_time: with an event list, all you would have is advance, which means go to the next scheduled event and update time to the time of this event. So, the entire problem simply disappears.

No idea what a resulting clear API could look like.
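
One possible shape, sketched under stated assumptions: the EventList class, its schedule() and advance() methods, and the event names below are all hypothetical and not part of Mesa. The sketch only shows time being held by the event list, with evenly spaced "step" events and one event at a non-integer instant sharing the same queue.

```python
import heapq
import itertools


class EventList:
    """Hypothetical event list that owns the simulation time."""

    def __init__(self):
        self.time = 0.0
        self._queue = []
        self._counter = itertools.count()  # tie-breaker for events at equal times

    def schedule(self, time, fn):
        if time < self.time:
            raise ValueError("cannot schedule an event in the past")
        heapq.heappush(self._queue, (time, next(self._counter), fn))

    def advance(self):
        """Jump to the next scheduled event, update time, and run it."""
        time, _, fn = heapq.heappop(self._queue)
        self.time = time
        fn()

    def __len__(self):
        return len(self._queue)


events = EventList()
log = []

# Traditional ABM behaviour: evenly spaced calls to a step function.
for tick in range(1, 4):
    events.schedule(float(tick), lambda: log.append(("step", events.time)))

# Discrete-event behaviour: an extra event at a non-integer instant.
events.schedule(1.5, lambda: log.append(("infection", events.time)))

while len(events):
    events.advance()

print(log)
# [('step', 1.0), ('infection', 1.5), ('step', 2.0), ('step', 3.0)]
```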

EwoutH · 2024-01-07T10:31:31Z

Another thought I had was giving Model.step() a timestep argument. Default would be timestep=1, but you can change that (once or every step if you want).

This could integrate the advance time part into the step, right?
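
A minimal sketch of this idea, assuming a hypothetical stand-in Model class rather than mesa.Model. Keeping steps and time as separate attributes is also an assumption here, so that a call always counts as one step regardless of how much time it covers.

```python
class Model:
    """Hypothetical stand-in for mesa.Model with a timestep-aware step()."""

    def __init__(self):
        self.steps = 0
        self.time = 0.0

    def step(self, timestep=1):
        # One call is always one step, however much time it covers.
        self.steps += 1
        self.time += timestep


model = Model()
model.step()              # default: one unit of time
model.step(timestep=0.5)  # a shorter step
model.step(timestep=2)    # a longer step

print(model.steps, model.time)  # 3 3.5
```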

quaquel · 2024-01-07T10:58:24Z

There is a way to track the number of calls to model.step without relying on super. It involves the use of metaclasses.

from functools import wraps


def count_calls(func):
    """Wrap func so that each call increments the instance's time counter."""

    @wraps(func)
    def wrapper(self, *args, **kwargs):
        # Create the instance counter on first use, then increment it.
        counter = getattr(self, "time", 0)
        self.time = counter + 1
        return func(self, *args, **kwargs)

    wrapper._is_count_call_wrapper = True
    return wrapper


class CountStep(type):
    def __new__(cls, name, bases, attrs):
        # Skip the base class itself. Compare against the literal name, since
        # Model is not defined yet while this metaclass is creating it.
        if name != "Model":
            try:
                step_method = attrs["step"]
            except KeyError:
                pass
            else:
                attrs["step"] = count_calls(step_method)
        return super().__new__(cls, name, bases, attrs)


class Model(metaclass=CountStep):
    pass


class MyModel(Model):
    def step(self):
        print(self.time)

If we run this

model = MyModel()
for _ in range(10):
    model.step()

we nicely get 1 ... 10.

quaquel · 2024-01-07T11:02:11Z

Another thought I had was giving Model.step() a timestep argument. Default would be timestep=1, but you can change that (once or every step if you want).

This could integrate the advance time part into the step, right?

Not sure how to read something like this. What would it mean if you say step(timestep=3.1415)? Do you mean to take the current time and add the timestep to it and execute all events scheduled between the current time and the new endtime? Or something else?

rht · 2024-01-07T11:06:11Z

(I'm probably digressing too much here on DES...)

Considering #1912 (reply in thread)

In case of ABM, there are only fixed time intervals, or ticks, between sets of events.
MESA lacks an eventlist. Instead, it is up to the user to advance the eventlist by one tick at a time by calling model.step().

I'm drawing an example from the Eurace@Unibi model, one of the most elaborate macroeconomic models that has existed.

Taking an excerpt from the paper on Eurace@Unibi

Concerning the activation of agents, the actions can be calendar-based (time-driven) or event-based, where the former can follow either subjective or objective time schedules (agent-time vs. clock-time). Furthermore, the economic activities take place on a hierarchy of time-scales: yearly, monthly, weekly and daily activities all take place following the calendar-time or subjective agent-time. Agents are activated asynchronously according to their subjective time schedules that are anchored on an individual activation day. These activation days are uniformly and randomly distributed among the agents at the start of the simulation but may change endogenously.

If we were to model a reduced version of Eurace@Unibi in Mesa for pedagogical purposes, extending Mesa to describe events would be necessary.

quaquel · 2024-01-07T11:17:36Z

Colleagues of mine have been doing pandemic modeling with models involving between 150 thousand and 25 million agents. The only way to make this computationally feasible was by switching from calendar-based to event-based activation. So at some point figuring out how to support this in MESA would be great.

In the meantime, however, there is still the issue of tracking the time of the simulation. Would it make sense to do it along the lines of the metaclass example I have given above?

rht · 2024-01-07T11:28:39Z

In the meantime, however, there is still the issue of tracking the time of the simulation. Would it make sense to do it along the lines of the metaclass example I have given above?

While it is more convenient to the user, I find the implementation to be too complex for the reader of Mesa code. The library code needs to be simple enough that one does not have to spend effort deciphering the implementation of what amounts to tracking the time automatically.

rht · 2024-01-07T11:47:09Z

Not sure how to read something like this. What would it mean if you say step(timestep=3.1415)? Do you mean to take the current time and add the timestep to it and execute all events scheduled between the current time and the new endtime? Or something else?

I see it as the step period that is not necessarily an integer. All the events that happen within the step are sorted by their activation times and executed in order. For example, event1 fires at time 0.06674, event2 at 0.1054, and event3 at 2.9979; this information is used to decide their execution order.
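
A rough sketch of that reading, assuming a hypothetical stand-in Model (not mesa.Model) with a made-up schedule() method and a plain list of pending events. Events due within the step period are sorted by activation time and executed in that order; anything later stays pending.

```python
class Model:
    """Hypothetical model whose step() covers a float-valued period."""

    def __init__(self):
        self.steps = 0
        self.time = 0.0
        self.pending = []   # (activation_time, name) pairs, in no particular order
        self.executed = []

    def schedule(self, activation_time, name):
        self.pending.append((activation_time, name))

    def step(self, timestep=1.0):
        end = self.time + timestep
        # Events due within this step, ordered by activation time.
        due = sorted(e for e in self.pending if e[0] <= end)
        self.pending = [e for e in self.pending if e[0] > end]
        for _, name in due:
            self.executed.append(name)  # stand-in for actually firing the event
        self.steps += 1
        self.time = end


model = Model()
model.schedule(2.9979, "event3")
model.schedule(0.06674, "event1")
model.schedule(0.1054, "event2")
model.schedule(3.5, "event4")  # falls outside the step below

model.step(timestep=3.1415)
print(model.executed)  # ['event1', 'event2', 'event3']
```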

Edit:
It seems that the AgentSet implementation in #1916 has yet to be able to replace DiscreteEventScheduler.

rht · 2024-01-07T12:16:41Z

What do you think of this approach instead?

import mesa


class Clock:
    def __init__(self):
        self.steps = 0
        self.time = 0

    def step(self, deltat=1):  # deltat is more mnemonic than timestep
        self.steps += 1
        self.time += deltat


class MyModel(mesa.Model):
    def __init__(self, *args, **kwargs):
        super().__init__(*args, **kwargs)
        self.clock = Clock()

    def step(self):
        self.agents.shuffle().do("step")
        self.datacollector.collect(self)
        # This is sufficiently mnemonic, as a replacement of self.schedule.step()
        self.clock.step()

And so, we reuse the existing ABM terms without having to invent new terms. It's FSM all the way down.

EwoutH · 2024-01-07T12:29:44Z

That's exactly what I had in mind. Only instead of creating a new class, I would just integrate it in the Model class.

Edit: and don't have to call the clock explicitly, that should

quaquel · 2024-01-07T12:37:43Z

Why do I use a library for something? To avoid having to write boilerplate code. All models need to track time, so if I use a library, the library should handle this for me. With this suggestion, I must add Clock myself and remember to advance it. It also adds two lines of code to any model I make. Also, speaking from experience teaching MESA for the last 3 years at the MSc level, this is something that will easily trip up new users. So, no, I don't like this suggestion.

While it is more convenient to the user, I find the implementation to be too complex for the reader of Mesa code. The library code needs to be simple enough that one does not have to spend effort deciphering the implementation of what amounts to tracking the time automatically.

So, here I have a different view. For me, the cleanliness of the API and the use of the library come first. So be it if a clean and easy-to-use API requires some more obscure Python machinery. Because, who is going to read the source code? Only users who are invested in the library and already have some programming background. So, as long as the code is well documented and explained at a high level (i.e., what does it do) and with some detail on how this is achieved, I prefer such a solution over forcing my user always to add boilerplate code.

rht · 2024-01-07T13:13:18Z

I appreciate the criticism and see the point regarding the annoyance of having to manually carry the Clock object along to wherever the dynamics of the model happen. But I do think that having a simple library encourages users to read the source code, to extend and experiment with it, and to be more likely to contribute back to the code. From the maintainers' perspective, obscure code only works if there are only a small number of maintainers who understand the code but no one else does. If #1942 (comment) were to be incorporated, from the perspective of an uninitiated developer new to this section of the code, the Git archaeology would have been more involved than the issue I encountered with __new__ and __init__ in the model initialization.

At the very least, model.clock simply replaces model.schedule in the previous code. And the students may understand what is going on under the hood, instead of using a library that "just works".

Regarding PEP 20:

Simple is better than complex.

I interpret it as overall simplicity, instead of simple API but obscure implementation.

rht · 2024-01-07T13:24:50Z

That said, I still think there might be a solution where a model.clock or model.steps & model.time is not needed, depending on how the definition of model.datacollector, as an observer of the system, should be modified.

rht · 2024-01-07T13:49:27Z

That's exactly what I had in mind. Only instead of creating a new class, I would just integrate it in the Model class.

This seems the simplest approach for now. model.steps and model.time are automatically initialized during super().__init__(). Users then just need to remember to do self.step_and_advance_time() inside the model step(), as the only boilerplate line.

quaquel · 2024-01-07T14:59:43Z

I am fine with a simple solution, although I would advocate calling super over adding another method to the model class. It is still only one additional line, but at least for me and in my teaching, I always advocated calling super anyway.

Some other thoughts

The discussion for me is not about simplicity versus complexity. I agree that PEP 20 should guide all Python projects. Here, however, there is a trade-off between the simplicity of the API and the simplicity of the implementation. I am personally also doubtful whether using metaclasses for something trivial like tracking time is defensible. However, it is the only solution I could devise which avoids forcing the user to write additional boilerplate code.
The argument that implementation simplicity would stimulate users to contribute back to MESA makes no sense to me. In my experience, other factors drive that choice. WRT mesa, the fact that ruff is not automated, that there are many open issues where it is unclear whether the maintainers actually want to address them (see, e.g., my discussion with @Corvince in Add system state tracking #1933 on Model state format #574), the lack of milestones, and the many stale pull requests are much more important in my decision on whether to continue to contribute than readability. In fact, having spent the better part of yesterday getting my head around the current space implementation, there are more important parts of the current code base that are hard to comprehend than a relatively small part of the code that could be quite easily explained (also PEP 20: If the implementation is easy to explain, it may be a good idea.) with some comments in the code (i.e., the suggestion is a combination of an annotation (count_calls) and a metaclass to automatically assign this annotation to the step method of the user's model).
Yes, datacollection is in need of an overhaul. However, I believe the data collector should retrieve time from the model rather than be responsible for maintaining time. Because, as @rht stated, the data collector observes the model, and a model should run fine without any data collection.

rht · 2024-01-07T21:23:59Z

The argument that implementation simplicity would stimulate users to contribute back to MESA makes no sense to me. In my experience, other factors drive that choice. WRT mesa, the fact that ruff is not automated, that there are many open issues where it is unclear whether the maintainers actually want to address them ...

I think the emphasis on the implementation simplicity of the code should be orthogonal to the questioning of the maintainers' time commitment. It is definitely possible for the code to be simple and readable while also being actively maintained. (On my end, #1933 on #574 is definitely on my radar; I just need some time to digest them.)

To localize the discussion on the system clock: #1942 (comment) handles the model.steps update by counting the number of step() calls, but it hasn't taken the model.time update into account. You could specify the timestep at model __init__, but there is an implicitness in this design choice.

tests/test_batch_run.py

EwoutH · 2024-01-07T22:59:11Z

Let’s separate some issues here:

1. Tracking time in the model
2. Decoupling the data collector from the scheduler
3. State tracking
4. Event based activation
5. Complexity in implementation vs user API
6. Mesa maintenance

1 and 2 are implementation discussions. I think everyone agrees they should be done, so let’s (continue) discussing how. Maybe in separate issues or PRs though, and I think it might be useful to do 1 first and then 2.

3 and 4 are long term and conceptual. In any case, you probably want a central clock in the model, right? So it doesn’t block 1 or 2, and we can continue discussing 3 and 4 in their respective discussions.

5 important, but can quickly get very broad. If it’s not about this specific implementation anymore I would say spin off into a new discussion.

6 also important, but can get personal, and thus maybe face to face stuff (or very well-thought out written out).

(might still be missing some stuff)

In general, I would suggest issues and PRs to be atomic, and only focus on one coherent issue. Of course it can touch other stuff, and therefore spin-off new discussions (which is great in general, also about meta things like user/contributor friendliness), but let’s try to spin-off those discussions in separate threads, or discuss them in a face to face dev meeting. That helps to keep the PRs on topic.

Corvince · 2024-01-08T07:48:40Z

Thanks a lot for summarizing the sometimes confusing discussion @EwoutH !

However I disagree on

Decoupling the data collector from the scheduler
[...]
1 and 2 are implementation discussions. I think everyone agrees they should be done, so let’s (continue) discussing how. Maybe in separate issues or PRs though, and I think it might be useful to do 1 first and then 2.

From the discussion in #1912 I think it is still unclear if we want to actually get rid of schedulers or not. I think we need to continue to discuss this first before we continue this path here. Because if we want to keep schedulers I think they are the right place to keep track of time and so there is no need to decouple the logic. I mean the whole discussion can be viewed as an advantage of schedulers - it is clear that they need to track time. For example, if we remove schedulers, but add a Clock instance we don't really gain anything. Same for tracking time inside the model instance - we just further clutter the model namespace, but keep a tight coupling. This isn't necessarily bad, but we should really first discuss the future of schedulers before arguing about implementation details. And the best place for this discussion is #1912.

rht · 2024-01-08T08:44:30Z

For example, if we remove schedulers, but add a Clock instance we don't really gain anything.

The gain: model.agents.shuffle().do("step") and model.agents.shuffle().do("advance") are conceptually clearer than the term StagedActivation with lots of boilerplate code. The Clock instance has a very specific purpose and is easy to conceptualize and explain.

EwoutH · 2024-01-26T07:55:31Z

Right, I now understand the complication:

The data collector depended on the schedule.
If an agent is only removed from the schedule (and not from the model) the data collection used to stop.
Now it doesn’t, and data keeps being collected.

While it isn’t best practice, there might be models out there that rely on this behavior.

EwoutH · 2024-01-26T07:58:12Z

(sorry misclicked)

Let me think a bit about possible solutions. Maybe we can get away with throwing a clear warning in the right place.

Edit: I feel that in both the old and new implementation we make assumptions about which agents data will be collected for. In the new one we definitely have to make that explicit.

EwoutH · 2024-01-26T08:08:52Z

Another option could be adding some switch like old_datacollector_behaviour which we flip on 3.0.

rht · 2024-01-26T08:13:52Z

I am removing the commit "time: Remove agent.remove in remove" so that this PR is ready to merge as is.

EwoutH · 2024-01-26T08:28:26Z

Sorry but for me this doesn't solve the issue:

There might be models which remove the agent after removing it from the schedule. Those will now crash, since the agent is already removed.
There might be models that remove it from the schedule temporarily and then add it back again.
There might be models that remove it from the schedule but keep it on the grid

I'm not happy about altering time module behavior. I would like to solve it in the datacollector. Some kind of flag or switch that "if an agent is removed from model.schedule, we will stop collecting data. With Mesa 3.0 this might change. We recommend explicitly removing your agent from the model with agent.remove() if you want to completely remove the agent."

If you want, I can try to come up with an implementation in the weekend.

Corvince · 2024-01-26T09:25:56Z

I think the simplest solution would be to check if model.schedule exists and if it does collect data from its agents. Otherwise use model.agents. This would be backwards compatible, but allow removing the scheduler. And agree that in a future datacollector it should be made explicit which agent data is collected.

/edit and of course don't remove agents from model.agents if they are only removed from the scheduler
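
A sketch of that fallback, under assumptions: agents_to_collect is a hypothetical helper name, and the SimpleNamespace objects are stand-ins for a model and its scheduler, not real Mesa objects. It only shows the selection logic: use the scheduler's agents when a scheduler exists, otherwise model.agents.

```python
from types import SimpleNamespace


def agents_to_collect(model):
    """Return the agents whose data should be recorded (hypothetical helper)."""
    schedule = getattr(model, "schedule", None)
    if schedule is not None:
        # Backwards compatible: only agents still on the scheduler are collected.
        return list(schedule.agents)
    # No scheduler: fall back to every agent registered with the model.
    return list(model.agents)


# Stand-in objects; real Mesa models and schedulers carry much more state.
with_schedule = SimpleNamespace(
    agents=["a", "b", "c"],
    schedule=SimpleNamespace(agents=["a", "b"]),  # "c" was removed from the schedule
)
without_schedule = SimpleNamespace(agents=["a", "b", "c"])

print(agents_to_collect(with_schedule))     # ['a', 'b']
print(agents_to_collect(without_schedule))  # ['a', 'b', 'c']
```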

quaquel · 2024-01-26T09:30:35Z

I agree with @Corvince's proposed solution.

EwoutH · 2024-01-26T10:22:45Z

Good idea, also agreed. @rht would you like to implement it?

rht · 2024-01-26T10:53:35Z

Done. You can check the last commit of this PR.

mesa/datacollection.py

EwoutH

Good to go! Thanks a lot, we churned out a lot of conceptual things together with this PR. I will try to do a quick write up tomorrow.

I can merge later today if preferred. I recommend either squashing or cleaning up the commits.

* refactor: Remove dependence on model.schedule
* model: Implement internal clock
* time: Call self.model.advance_time() in step()
  This ensures that the scheduler's clock and the model's clock are updated and are in sync.
* Ensure advance_time call in schedulers happen only once in a model step
* Turn model steps and time to be private attribute
* Rename advance_time to _advance_time
* Annotate model._steps
* Remove _advance_time from tests
  This is because schedule.step already calls _advance_time under the hood.
* model: Rename _time to time_
* Rename _steps to steps_
* Revert applying _advance_time in schedulers step
* feat: Automatically call _advance_time right after model step()
  Solution drafted by and partially attributed to ChatGPT: https://chat.openai.com/share/d9b9c6c6-17d0-4eb9-9eae-484402bed756
* fix: Make sure agent removes itself in schedule.remove
* fix: Do step() wrapping in scheduler instead of model
* fix: JupyterViz: replace model.steps with model.steps_
* Rename steps_ -> _steps, time_ -> _time
* agent_records: Use model.agents only when model has no scheduler

---------

Co-authored-by: Ewout ter Hoeven <E.M.terHoeven@student.tudelft.nl>

EwoutH · 2024-07-03T14:21:01Z

The only thing I currently still dislike is that time and step have to be explicitly increased (by using _advance_time). It should just keep track of the number of steps by default, and optionally can be overwritten if you want to do something else.

This is still the case in the current codebase, right? If so, it's a bit weird that users should call a private function to increase the time.
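
One way such default tracking could look, as a sketch only: the Model class below is a hypothetical stand-in, and the _user_step and _wrapped_step names are assumptions made for the example, not a description of Mesa's code. The base __init__ swaps the subclass's step() for a wrapper that increments the counter first, so the user never calls a private method.

```python
class Model:
    """Hypothetical stand-in for mesa.Model that counts steps automatically."""

    def __init__(self):
        self.steps = 0
        # Keep the subclass's step() and expose a counting wrapper instead.
        self._user_step = self.step
        self.step = self._wrapped_step

    def _wrapped_step(self, *args, **kwargs):
        self.steps += 1
        return self._user_step(*args, **kwargs)

    def step(self):
        pass


class MyModel(Model):
    def __init__(self):
        super().__init__()
        self.seen = []

    def step(self):
        self.seen.append(self.steps)


model = MyModel()
for _ in range(3):
    model.step()

print(model.steps, model.seen)  # 3 [1, 2, 3]
```

This still relies on the subclass calling super().__init__(), but it needs neither a metaclass nor an extra line in step().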

rht mentioned this pull request Jan 7, 2024

Boltzmann wealth model: Use the new Mesa 2.2 API & some refactors projectmesa/mesa-examples#79

Open

1 task

rht force-pushed the rm_schedule branch from eb3d001 to 3fec08c on January 7, 2024 14:18

rht force-pushed the rm_schedule branch from 3fec08c to c7799a8 on January 7, 2024 15:04

EwoutH reviewed Jan 7, 2024

tests/test_batch_run.py (outdated)

EwoutH closed this Jan 26, 2024

EwoutH reopened this Jan 26, 2024

rht force-pushed the rm_schedule branch from 40dce90 to 8e7754f on January 26, 2024 08:14

agent_records: Use model.agents only when model has no scheduler

eae90a6

EwoutH reviewed Jan 26, 2024

mesa/datacollection.py

quaquel approved these changes Jan 26, 2024

EwoutH changed the title ~~refactor: Remove dependence on model.schedule~~ refactor: Remove dependence on model.schedule, add time andto Model Jan 26, 2024

EwoutH changed the title ~~refactor: Remove dependence on model.schedule, add time andto Model~~ refactor: Remove dependence on model.schedule, add clock to Model Jan 26, 2024

EwoutH approved these changes Jan 26, 2024

EwoutH added the enhancement Release notes label label Jan 26, 2024

rht merged commit 003cbe3 into projectmesa:main Jan 26, 2024
12 of 13 checks passed

rht deleted the rm_schedule branch January 26, 2024 16:36

This was referenced Jan 30, 2024

model.get_agents_of_type() still contains agents that have been removed both from the grid and the scheduler #2019

Closed

fix: Initialize model _steps and _time during __new__ #2026

Merged

EwoutH mentioned this pull request Mar 9, 2024

support for discrete event scheduling #2066

Merged

EwoutH mentioned this pull request May 5, 2024

Issue with data collection of PropertyLayer objects #2128

Closed

This was referenced Aug 17, 2024

Automatic time and step incrementing #2222

Closed

model: Automatically increase steps counter #2223

Merged
