[Bug]: Regression in AgentController broke AgentDelegationAction #6162

diwu-sf · 2025-01-09T02:02:27Z

Is there an existing issue for the same bug?

I have checked the existing issues.

Describe the bug and reproduction steps

#5868 seems to have broken agent delegation and multi-agent setups.

    def should_step(self, event: Event) -> bool:
        if isinstance(event, Action):
            if isinstance(event, MessageAction) and event.source == EventSource.USER:
                return True
            return False
        if isinstance(event, Observation):
            if isinstance(event, NullObservation) or isinstance(
                event, AgentStateChangedObservation
            ):
                return False
            return True
        return False

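`should_step` is too restrictive: it returns False for every Action except a user MessageAction, so an AgentDelegateAction never triggers a step and no agent delegation can happen.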
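To make the problem concrete, here is a minimal, self-contained sketch of one way the check could be relaxed. The classes below are simplified stand-ins for the OpenHands event types (not the real imports), and the extra `AgentDelegateAction` branch is an assumption about a possible fix, not necessarily what the eventual patch does.

```python
from enum import Enum


class EventSource(Enum):
    USER = "user"
    AGENT = "agent"


# Simplified stand-ins for the real event classes (assumption: the real
# hierarchy is richer; only the parts needed for the check are modeled).
class Event:
    def __init__(self, source: EventSource = EventSource.AGENT):
        self.source = source


class Action(Event): ...
class Observation(Event): ...
class MessageAction(Action): ...
class AgentDelegateAction(Action): ...
class NullObservation(Observation): ...
class AgentStateChangedObservation(Observation): ...


def should_step(event: Event) -> bool:
    if isinstance(event, Action):
        if isinstance(event, MessageAction) and event.source == EventSource.USER:
            return True
        # Hypothetical relaxation: let a delegate action trigger a step so
        # the controller gets a chance to start the delegate.
        if isinstance(event, AgentDelegateAction):
            return True
        return False
    if isinstance(event, Observation):
        # Same behavior as the snippet above, written as a single check.
        return not isinstance(
            event, (NullObservation, AgentStateChangedObservation)
        )
    return False
```

With this version, `should_step(AgentDelegateAction())` is true, while agent-sourced messages and the two ignored observation types still do not trigger a step.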

I think there should be tests introduced to catch future regressions with AgentDelegation.

I'm curious how the CodeActAgent -> BrowsingAgent delegation is working right now on HEAD, because according to the code an AgentDelegateAction never triggers step().

OpenHands Installation

Docker command in README

OpenHands Version

No response

Operating System

None

Logs, Errors, Screenshots, and Additional Context

No response


li-boxuan · 2025-01-09T04:53:47Z

First, I didn't read the bug report you filed; sorry! Hopefully someone else could take a peek. I'd like to comment on two things:

I think there should be tests introduced to catch future regressions with AgentDelegation.

History is a bit complicated here. We had integration tests for agent delegation at some point (which were introduced because a refactoring PR broke this functionality). Those tests were later removed because of 1) non-determinism introduced by LLM-based editing and 2) the daily pipelines that run a mini evaluation (I am not sure whether those suites include delegation).

I'm curious how the CodeActAgent -> BrowsingAgent delegation is working right now on HEAD

On HEAD, CodeActAgent doesn't delegate to BrowsingAgent. It handles browsing by itself.

diwu-sf · 2025-01-09T05:14:56Z

The DelegatorAgent on HEAD should also be equally broken at the moment.
Without delegation working, all multi-agent configurations are broken. I think we should bring back some level of simple agent delegation with deterministic simple tasks back into the daily tests.
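A rough sketch of what such a deterministic delegation test could look like, with no LLM involved. Everything here (`ScriptedAgent`, `DelegateAction`, `FinishAction`, `run`) is a hypothetical toy, not the OpenHands API; the point is only that scripted agents make the delegate round trip repeatable.

```python
from dataclasses import dataclass


@dataclass
class DelegateAction:
    agent: str  # name of the sub-agent to run (hypothetical)
    task: str


@dataclass
class FinishAction:
    output: str


class ScriptedAgent:
    """Replays a fixed script instead of calling an LLM.

    Each script item is either an action, or a callable that receives the
    last delegate result and returns an action.
    """

    def __init__(self, script):
        self._script = list(script)

    def step(self, last_result):
        item = self._script.pop(0)
        return item(last_result) if callable(item) else item


def run(agent, registry, max_steps=10):
    """Toy controller loop: steps an agent and runs delegates to completion."""
    last_result = None
    for _ in range(max_steps):
        action = agent.step(last_result)
        if isinstance(action, FinishAction):
            return action.output
        if isinstance(action, DelegateAction):
            child = registry[action.agent](action.task)
            last_result = run(child, registry, max_steps)
    raise RuntimeError("agent did not finish within max_steps")


def test_delegation_round_trip():
    # The child agent finishes immediately with a result derived from its task.
    registry = {
        "echo": lambda task: ScriptedAgent([FinishAction(f"done: {task}")]),
    }
    parent = ScriptedAgent([
        DelegateAction(agent="echo", task="count files"),
        lambda result: FinishAction(f"child said: {result}"),
    ])
    assert run(parent, registry) == "child said: done: count files"
```

A real test would drive the actual controller and event stream rather than this toy loop, but the same idea applies: if the delegate action is never stepped, the parent never sees the child's result and the assertion fails.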

li-boxuan · 2025-01-09T05:17:32Z

I think we should bring back some level of simple agent delegation with deterministic simple tasks back into the daily tests.

Yeah I agree, that's what I wanted to achieve with #6049

FWIW we used to run integration tests with mocked prompts & responses from LLMs, which was not very dev-friendly whenever someone made even a small change to the prompt.

yufansong · 2025-01-09T05:35:03Z

I think we should bring back some level of simple agent delegation with deterministic simple tasks back into the daily tests.

Yeah I agree, that's what I wanted to achieve with #6049

FWIW we used to run integration tests with mocked prompts & responses from LLMs, which was not very dev-friendly whenever someone made even a small change to the prompt.

@li-boxuan do we still delegate actions to other agents? I thought we moved all the needed actions into the CodeAct agent. 🤔

neubig · 2025-01-09T05:56:58Z

Yeah, CodeActAgent doesn't use delegation anymore, which is also probably why we didn't notice that this broke. I think the core use case of OpenHands doesn't need this, but if other users need it it should probably be fixed (as long as it doesn't cause too much maintenance burden).

diwu-sf · 2025-01-09T16:46:02Z

"I think the core use case of OpenHands doesn't need this, but if other users need it it should probably be fixed (as long as it doesn't cause too much maintenance burden)."

I think this is a core use case for OpenHands as a usable "multi" agent framework, even if CodeAct is a non-delegating agent.

Is OpenHands a multi agent framework or is it just a UI on top of CodeAct?

enyst · 2025-01-09T17:30:43Z

I think there should be tests introduced to catch future regressions with AgentDelegation.

As @li-boxuan said, we have deeply changed the integration tests, and we haven't yet given the new tests enough love. Sorry about that. Specifically, the new integration tests only run CodeAct, which means all other agents have not been tested on the full execution flow for a while now. That includes DelegatorAgent, and with it, the core functionality of delegation.

I intended to add tests for the other agents. I added a couple for DelegatorAgent in the linked PR, to see how the PR changes work.

enyst · 2025-01-09T17:50:23Z

Is OpenHands a multi agent framework or is it just a UI on top of CodeAct?

This is a fair question for more reasons than one, and a good discussion to have.

To be clear, my fixing of this issue is separate from that discussion. It bothers me that we have core functionality that hasn't been tested for a couple of months, and now it broke, as it obviously would have someday!

On a related note, may I ask how you are using delegation? Are you using the micro-agents we have now, or just Delegator with your own agents, or not even Delegator? How are you finding the delegation feature, and was it working for you?

diwu-sf · 2025-01-09T18:13:42Z

We built our own domain-specific agent. There are 3+ different roles implemented using AgentDelegateAction, and it's more reliable at staying on track than the CodeAct agent when tasks take more than 5 minutes of iteration. The CodeAct agent is used as one of the sub-agents that coding tasks get delegated to, but we also have non-coding agents that "focus" on other parts of the workflows.

We are not using the DelegatorAgent (https://github.com/All-Hands-AI/OpenHands/tree/main/openhands/agenthub/delegator_agent), but it served as an example for us to learn how agent delegation should be implemented in the OpenHands framework.

I suspect that if integration tests are introduced to ensure that DelegatorAgent always executes correctly, they would cover our architecture as well.

Thanks for the bug fix.

diwu-sf added the bug Something isn't working label Jan 9, 2025

kevin-support-bot bot mentioned this issue Jan 9, 2025

[Bug]: Regression in AgentController broke AgentDelegationAction SmartManoj/Kevin#195

Open

mamoodi added the severity:medium Affecting multiple users label Jan 9, 2025

enyst linked a pull request Jan 9, 2025 that will close this issue

Delegation fixes #6165

Open

3 tasks
