Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Improve AgentSet method chaining performance #2264

Open
quaquel opened this issue Sep 1, 2024 · 4 comments
Open

Improve AgentSet method chaining performance #2264

quaquel opened this issue Sep 1, 2024 · 4 comments

Comments

@quaquel
Copy link
Member

quaquel commented Sep 1, 2024

While adding groupby and updating select, there was some discussion on the performance of method chaining. The current code loops over all agents in each method. However, if you are chaining methods, this can become quite inefficient. Because each loop involves turning weakrefs back into hard refs, applying the operation, and creating a new weakref. Moreover, you are looping over all agents multiple times.

@EwoutH mentioned exploring deferred execution, while @Corvince suggested using generators. The main advance of chaining generators is that it is possible to let each operation in the chain operate on a single agent. So, you are effectively looping over all agents only once. Moreover, it might be possible to pass the actual agent from operation to operation so that only at the beginning the weakref is resolved and only at the end of all operations a new weakref is created.

Exactly how chaining generators would work, what the API implications would be, and what the performance gains are remains to be explored. As a first step, it is probably a good idea to develop a proof of principle. Next, we can discuss how this proof of principle can be used with the existing AgentSet, assuming real performance benefits exist.

@Corvince
Copy link
Contributor

Corvince commented Sep 2, 2024

Before jumping to a POC, maybe let's think about this a bit more.

What should this do

agentset.do("foo").do("bar")

It should do foo and then bar. But what if bar depends on foo? So should it be

# A)
for agent in agentset:
    agent.foo()
    agent.bar()

# B)
for agent in agentset:
    agent.foo()

for agent in agentset:
    agent.bar()

I think B) is more common in abms - it's interactions what makes abm interesting.

But for that case we can't use generators - we have to evaluate first.

For agentset.shuffle() it also doesn't make sense to use generators (reasoning left as an exercise for the reader ;) )

That leaves agentset.select. Conceptually it makes sense to return an iterator here. But I don't have an idea on how this may look like. Maybe an LazyAgentset

Or we scrap this all together and users just need to use an good old for loop for best performance. For which the semantics are already good

@quaquel
Copy link
Member Author

quaquel commented Sep 2, 2024

While thinking about this over the weekend, I also realized it makes no sense for shuffle. Your point about do is also very clear. It might make sense for set and get in addition to select. I, however, doubt the hassle, code and API wise, is worth the performance gains.

@EwoutH
Copy link
Member

EwoutH commented Sep 6, 2024

@EwoutH
Copy link
Member

EwoutH commented Sep 19, 2024

I was thinking, I preferably don't want to change the user API too much with specialty functions. So after brainstorming a bit (conversation), adding a _pending_shuffle flag could help:

class AgentSet:
    def __init__(self, agents, random: Random = None):
        self._agents = agents  # Assuming _agents is a dict or similar structure
        self.random = random or Random()
        self._pending_shuffle = False  # Internal flag to track shuffle

    def shuffle(self):
        self._pending_shuffle = True  # Set the flag
        return self  # Allow method chaining

    def do(self, method: str | Callable, *args, **kwargs) -> 'AgentSet':
        if self._pending_shuffle:
            # If shuffle was called before do, use shuffle_do
            self._pending_shuffle = False  # Reset the flag
            return self.shuffle_do(method, *args, **kwargs)
        
        # Original do implementation
        if isinstance(method, str):
            for agent in self._agents:
                getattr(agent, method)(*args, **kwargs)
        else:
            for agent in self._agents:
                method(agent, *args, **kwargs)
        return self

    def shuffle_do(self, method: str | Callable, *args, **kwargs) -> 'AgentSet':
        # Optimized shuffle_do implementation
        agents = list(self._agents.keys())  # Assuming _agents is a dict
        self.random.shuffle(agents)

        if isinstance(method, str):
            for agent in agents:
                getattr(agent, method)(*args, **kwargs)
        else:
            for agent in agents:
                method(agent, *args, **kwargs)

        return self

However, this will break calling shuffle() on itself. So, maybe we add a argument to shuffle(), like delay=True, which will delay the shuffle to the next function. Then functions like do, map and select could accept this delayed shuffle.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

3 participants