Delegation fixes #6165
@openhands-agent Read the diff of this PR carefully. Understand what it tries to achieve. Then, we have two things to do:
Important: |
Hi! I started running the integration tests on your PR. You will receive a comment with the results shortly.
Trigger by: Pull Request (integration-test label on PR #6165)
Integration Tests Report (DeepSeek)
Integration Tests Report Delegator (Haiku). Total cost: USD 0.00
Integration Tests Report Delegator (DeepSeek). Total cost: USD 0.00
Testing outputs (both Haiku and DeepSeek results) are available for download.
Thanks for working on this fix!
@@ -165,7 +169,11 @@ async def close(self) -> None:
        )

        # unsubscribe from the event stream
        self.event_stream.unsubscribe(EventStreamSubscriber.AGENT_CONTROLLER, self.id)
        # only the root parent controller subscribes to the event stream
        if not self.is_delegate:
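To make the intent of this hunk easier to discuss, here is a minimal sketch of "only the root controller subscribes, so only the root unsubscribes on close". `ToyEventStream` and `ToyController` are hypothetical stand-ins written for this comment, not the real OpenHands classes, and the string `'agent_controller'` stands in for `EventStreamSubscriber.AGENT_CONTROLLER`.

```python
from typing import Callable, Dict, Tuple


class ToyEventStream:
    """Hypothetical stand-in for an event stream (assumption, not the real class)."""

    def __init__(self) -> None:
        self._subscribers: Dict[Tuple[str, str], Callable[[str], None]] = {}

    def subscribe(
        self, subscriber: str, callback: Callable[[str], None], callback_id: str
    ) -> None:
        self._subscribers[(subscriber, callback_id)] = callback

    def unsubscribe(self, subscriber: str, callback_id: str) -> None:
        # The toy raises so that an unmatched unsubscribe is visible in the example.
        if (subscriber, callback_id) not in self._subscribers:
            raise KeyError(f'{subscriber}/{callback_id} is not subscribed')
        del self._subscribers[(subscriber, callback_id)]

    def subscriber_count(self) -> int:
        return len(self._subscribers)


class ToyController:
    """Hypothetical controller that mirrors the is_delegate guard in the diff."""

    def __init__(
        self, sid: str, stream: ToyEventStream, is_delegate: bool = False
    ) -> None:
        self.id = sid
        self.event_stream = stream
        self.is_delegate = is_delegate
        self.closed = False
        # only the root parent controller subscribes to the event stream
        if not self.is_delegate:
            self.event_stream.subscribe('agent_controller', self.on_event, self.id)

    def on_event(self, event: str) -> None:
        # Event handling is out of scope for this sketch.
        pass

    def close(self) -> None:
        self.closed = True
        # a delegate never subscribed, so it has nothing to unsubscribe
        if not self.is_delegate:
            self.event_stream.unsubscribe('agent_controller', self.id)


stream = ToyEventStream()
root = ToyController('session-1', stream)
delegate = ToyController('session-1', stream, is_delegate=True)
delegate.close()  # leaves the root's subscription untouched
root.close()      # removes the only subscription
```

Without the guard, closing the delegate in this toy would either raise or remove the root's subscription, depending on whether the two controllers share an id.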
Question: does this design work for multi-layer delegation?
No. I thought about keeping that, but I don't know how useful it really is in practice, so I'd leave it for a follow-up. I think it did work before, but I don't think we ever tested for it? Do you find it useful?
A use case that keeps coming up is planner/executor workflows (3770 is one example, and only one). Those don't need it, but I would love for us to support them well.
It would be nice to have a more generalized solution that can handle a tree of multi-level delegation.
We don't use multi-level right now, but it would be annoying if we had to hack up the framework to make it work later and diverge the agent_controller implementation.
> I think it did work before, but I don't think we ever tested for it?
Yeah, it did work before, and I think we had a test for it; the dummy agent used to delegate to itself, then itself, and then itself again...
> Do you find it useful?
Not really 😁 In my imaginary setting, it wasn't useful because LLMs were not powerful enough to be used for super long-horizon tasks (e.g. founding a company and releasing a product). Could it be? Maybe, maybe not.
I am just babbling: maybe a multi-layer delegation setting would be more useful in robotics or industrial engineering? There, "agents" really don't care about the past and future, and each one follows a very narrow but different action space.
Or maybe this: a very intelligent yet expensive agent makes the decisions and hands over to good, not-too-expensive agents, which sometimes need some work done by mediocre and cheap agents.
If CodeAct ever decides to delegate to BrowsingAgent again, or to delegate to a micro agent, then we will need multi-layer delegation for anything that uses planner -> CodeAct -> Browsing/Micro.
It doesn't need to be that crazy deep, but needing to support 2 to 3 layers doesn't seem that unlikely.
Those are good points! I think it will work fine. This PR changes the behavior at the edges, so that the delegated agent is exactly like a non-delegated agent: no special handling is necessary to become a delegate or to delegate, they just use the event stream.
For example, if you want to use CodeAct, you send a delegate action asking for CodeAct, and you don't have to modify the agent for it to work. If an agent knows about the delegate tool, it can use it and become a parent; if not, it's just a kid. 😅
I think that makes it easier to add depth, not harder. I kept it simple to see and test the flow, but it should just be a matter of making use of the level counter. (famous last words? 🤣) Will look!
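For the planner -> CodeAct -> Browsing case discussed in this thread, a rough sketch of how depth could fall out of a level counter is below. This is an illustration under assumptions, not the code in this PR: `Controller`, `start_delegate`, `end_delegate`, and the string events are all hypothetical. The idea shown is that only the root receives events and each parent forwards them to its active delegate, so the deepest active delegate handles them.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class Controller:
    """Hypothetical controller node in a delegation chain (assumption)."""

    name: str
    level: int = 0
    parent: Controller | None = None
    delegate: Controller | None = None
    received: list[str] = field(default_factory=list)

    @property
    def is_delegate(self) -> bool:
        return self.parent is not None

    def start_delegate(self, name: str) -> Controller:
        # The child sits one level deeper than its parent.
        self.delegate = Controller(name=name, level=self.level + 1, parent=self)
        return self.delegate

    def end_delegate(self) -> None:
        # Once the delegate is done, this controller handles events again.
        self.delegate = None

    def on_event(self, event: str) -> None:
        # Forward to the active delegate if there is one; otherwise handle here.
        if self.delegate is not None:
            self.delegate.on_event(event)
        else:
            self.received.append(event)


planner = Controller('planner')
codeact = planner.start_delegate('codeact')
browsing = codeact.start_delegate('browsing')

planner.on_event('open the docs page')   # handled by browsing (level 2)
codeact.end_delegate()
planner.on_event('now edit the file')    # handled by codeact (level 1)
```

Nothing in the forwarding step depends on how deep the chain is, which is why two or three layers should not need special handling in a design like this one.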
End-user friendly description of the problem this fixes or functionality that this introduces
Fix agent delegation; use events for communication between parent and delegates.
Fix the lockup when the model returned a message.
Give a summary of what the PR does, explaining any non-trivial design decisions
Delegation was broken after we made the agent loop rely exclusively on a controller-as-observer logic. This PR proposes to fix it in a simple way: by forwarding to the delegate
- should_step on both MessageActions from 'user' and 'agent', except when waiting for user input is explicitly set
- should_step on DelegateAction too; it will create a MessageAction to kickstart the delegate
Also:
The code is ready for review - or this logic of delegation.
(please ignore the print() stuff, will clean up later)
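As a reading aid for the should_step rules in the summary above, here is a small standalone sketch. It is not the controller code from this PR: the dataclasses are simplified stand-ins, the `task` field, the `awaiting_user_input` parameter, and the `kickstart_message` helper are hypothetical names, and applying the "waiting for user input" exception only to agent messages is my assumption about the intended reading. Events the summary does not mention return False here purely to keep the example small.

```python
from dataclasses import dataclass


@dataclass
class MessageAction:
    """Simplified stand-in for a message event (assumption)."""

    content: str
    source: str  # 'user' or 'agent'


@dataclass
class DelegateAction:
    """Simplified stand-in for a delegation request (assumption)."""

    agent: str
    task: str  # hypothetical field carrying the delegated task text


def should_step(event: object, awaiting_user_input: bool = False) -> bool:
    """Return True when the controller should take another agent step."""
    if isinstance(event, MessageAction):
        if event.source == 'user':
            return True
        if event.source == 'agent':
            # Assumed reading: an agent message steps unless we are
            # explicitly waiting for the user.
            return not awaiting_user_input
        return False
    if isinstance(event, DelegateAction):
        return True
    # Other event types are outside the scope of this sketch.
    return False


def kickstart_message(action: DelegateAction) -> MessageAction:
    """Build the message that starts the delegate (source is an assumption)."""
    return MessageAction(content=action.task, source='user')


delegate_request = DelegateAction(agent='BrowsingAgent', task='find the release notes')
first_message = kickstart_message(delegate_request)
```

In this sketch the kickstart message is itself a MessageAction, so the delegate steps on it through the same rule as any other message.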
Link of any specific issues this addresses
Fix #6162
To run this PR locally, use the following command: