
Human intervention abstraction #121

Open
mmurad2 opened this issue Oct 30, 2024 · 19 comments · May be fixed by #255
Assignees
Labels
enhancement New feature or request

Comments

@mmurad2
Member

mmurad2 commented Oct 30, 2024

Is your feature request related to a problem? Please describe.
For certain use cases, the agent requires additional information or steering from the user in order to execute correctly.

Describe the solution you'd like
An abstraction that calls for human intervention during a single agent's workflow. This could be considered a "tool" that can be configured for a certain use case. I would assume the agent builder would need to define when this "tool" should be used.

@mmurad2 mmurad2 added the enhancement New feature or request label Oct 30, 2024
@prattyushmangal

prattyushmangal commented Nov 7, 2024

Hi Maya, this ticket is super interesting and might be a good first iteration. This could be part of a larger "epic" of work enabling human-in-the-loop interaction.

The human-as-a-tool approach is valid for cases where the agent determines it needs to call out to a human for more information, but a complementary paradigm for framework developers might be building agents with predefined points of validation and correction by the end user.

So, similar to the emitter for observing agent behaviour, it might be useful to have a concept that allows developers to pause and resume agent behaviour with a human interaction in the middle.

Eg:

Validate Step/Tool to call
Validate Tool Input to use

In a more complex case, the end user may also be able to rerun the previous prompt with adapted tool inputs that they observed were used by the agent last time.
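To make the idea of predefined validation points a bit more concrete, here is a minimal, self-contained TypeScript sketch of a developer-defined checkpoint that a runner could call before executing each tool. PlannedStep, HumanCheckpoint, and consoleCheckpoint are hypothetical names, not existing framework APIs.

```ts
// Hypothetical sketch only: none of these names exist in the framework today.
// It shows a developer-defined checkpoint that pauses before a tool call so a
// human can approve the step, adjust its input, or abort the run.
import * as readline from "node:readline/promises";
import { stdin, stdout } from "node:process";

interface PlannedStep {
  tool: string; // tool the agent wants to call
  input: Record<string, unknown>; // input the agent wants to pass to it
}

type HumanCheckpoint = (step: PlannedStep) => Promise<PlannedStep | "abort">;

const consoleCheckpoint: HumanCheckpoint = async (step) => {
  const rl = readline.createInterface({ input: stdin, output: stdout });
  try {
    console.log(`Agent wants to call "${step.tool}" with input:`, step.input);
    const answer = (await rl.question("Approve (y), edit input (e), or abort (a)? "))
      .trim()
      .toLowerCase();
    if (answer === "a") return "abort";
    if (answer === "e") {
      const edited = await rl.question("New tool input as JSON: ");
      return { ...step, input: JSON.parse(edited) };
    }
    return step; // approved unchanged
  } finally {
    rl.close();
  }
};

// A runner could then call the checkpoint before executing each tool:
//   const decision = await consoleCheckpoint(plannedStep);
//   if (decision !== "abort") { /* execute decision.tool with decision.input */ }
```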

@mmurad2
Member Author

mmurad2 commented Nov 12, 2024

thanks for the feedback @prattyushmangal - would you be open to contributing this?

@matiasmolinas

matiasmolinas commented Nov 17, 2024

Hi @mmurad2 , I find this issue very interesting and would love to contribute! I'm new to the Bee Framework but have experience developing an agent-based product with a custom framework, which I'm now migrating to Bee. This feature aligns closely with patterns I've worked on, and I'd be thrilled to help implement it.

Is someone already working on this, or can I take the issue?

Thanks
Matias

@mmurad2
Member Author

mmurad2 commented Nov 19, 2024

@matiasmolinas just assigned to you :) @prattyushmangal has been thinking about human intervention and can help provide feedback. Also @Tomas2D if you have any specific input on this, please do share!

@prattyushmangal

Hey everyone, yep, this works for me too. Apologies, I was not able to commit to contributing this feature in full.

Happy to help @matiasmolinas where you need. Once we have a pattern for "pausing execution" to collect inputs from humans by implementing the Human Tool, we should be able to replicate that for the human-in-the-loop interceptor/verifier interactions too 😄

@matiasmolinas

Hi everyone!

Thank you for the feedback and support on this issue. I’ve brainstormed a solution using GitHub Copilot Workspace to address the need for human intervention during an agent's workflow. Below is the proposed approach, incorporating feedback and outlining how I plan to implement the feature:

Current Observations

  • There is no abstraction in the current framework that allows for human intervention during an agent's workflow.
  • Tools, emitters, and agent runners currently lack the mechanisms for pausing and resuming workflows to collect human inputs.

Proposed Solution

I propose introducing a HumanTool class to enable human intervention. Here are the key components and design decisions:

  1. Triggering the Human Intervention Tool

    • Selected Approach: define a specific tool name, such as "HumanTool".
    • Update the BeeAgentRunner class to recognize "HumanTool" in the parser regex.
    • Implement logic in the tool method to handle human intervention and pause the agent's workflow.
  2. Expected User Inputs

    • Tool Name: "HumanTool".
    • Tool Inputs: any relevant data required for the tool to function.
    • Validation Points: allow users to validate or correct outputs at predefined points in the workflow.
    • Optional Rerun Prompt: enable rerunning the previous prompt with updated inputs.
  3. Design of the Human Intervention Process

    • Selected Approach: Human Tool integration.
    • Define "HumanTool" as part of the BeeAgent's tools.
    • Implement pausing and resuming workflows when the tool is invoked.
    • Collect and validate user inputs during the pause, then continue the workflow.
  4. Configuration of the Human Intervention Tool

    • Add "HumanTool" to the BeeAgentRunner parser regex.
    • Update the BeeAgent class to integrate this tool seamlessly into the workflow.
  5. Workflow Resumption

    • The workflow will resume from where it was paused after collecting human inputs.
    • This ensures continuity and allows the agent to proceed with validated or updated inputs.
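To illustrate the pause-and-resume flow above, here is a simplified, self-contained TypeScript sketch. It is not the actual BeeAgentRunner code; AgentMemory, nextAction, runTool, and askHuman are illustrative assumptions.

```ts
// Simplified sketch of the pause-and-resume idea, not the actual BeeAgentRunner code.
// AgentMemory, nextAction, runTool, and askHuman are illustrative assumptions.
import * as readline from "node:readline/promises";
import { stdin, stdout } from "node:process";

interface Action {
  tool: string;
  input: { message?: string; [key: string]: unknown };
}

interface AgentMemory {
  add(entry: { role: string; text: string }): void;
}

// Pause the loop and collect a natural-language answer from the user.
async function askHuman(message: string): Promise<string> {
  const rl = readline.createInterface({ input: stdin, output: stdout });
  try {
    return await rl.question(`${message}\n> `);
  } finally {
    rl.close();
  }
}

async function runLoop(
  memory: AgentMemory,
  nextAction: () => Promise<Action | null>, // parses the LLM output into the next action
  runTool: (action: Action) => Promise<string>, // executes a regular tool
) {
  for (let action = await nextAction(); action !== null; action = await nextAction()) {
    if (action.tool === "HumanTool") {
      // The workflow pauses here until the user responds; the answer goes into
      // memory and the loop continues exactly where it left off.
      const answer = await askHuman(action.input.message ?? "Please provide more information:");
      memory.add({ role: "tool", text: answer });
      continue;
    }
    memory.add({ role: "tool", text: await runTool(action) });
  }
}
```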
Next Steps
While I await feedback, I’m working on a first version following this approach. This will allow me to test the design and identify potential challenges early.

Before I move too far ahead, I’d appreciate your feedback, especially on:

  • The selected approaches for triggering human intervention and resuming workflows.
  • Any additional use cases or edge cases to consider for the HumanTool.
Let me know if this aligns with the project’s vision or if there are any adjustments needed. Once confirmed, I’ll move forward with refining the implementation and providing updates as I progress.

@prattyushmangal

Hi Matias,

Thank you for your thoughts on this issue already. Here are my thoughts.

Step 1: Define the 'HumanTool' as a tool which can be equipped to any Bee and invoked by it when it deems that some information required for the following steps is missing and it must call out to the user to collect it. So I agree with you on the following:

  • Tool Name: 'HumanTool'
  • Tool Input: For a Bee to use this tool, it should generate an NL message asking the user for some specific information.
  • Tool Output: NL response from the end user
  • BeeAgentRunner 'HumanTool' Execution: The Runner should then execute the HumanTool by sending the NL message to the end user and waiting for a response. Once the user has responded, the tool output (the message from the user) is appended to the Agent memory and the Action/ActionInput loop can continue.

Step 2: Repurpose this HumanTool pattern for other, developer-defined interventions. I propose that we defer this to a secondary issue once we have implemented a HumanTool pattern which the Bee Agents can use for info gathering.
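To make Step 1 concrete, a minimal sketch of a HumanTool along these lines could look like the following. The framework's real Tool base class and output types are simplified away here; the zod schema and the console prompt are assumptions.

```ts
// Minimal, self-contained sketch of the HumanTool described above. The framework's
// real Tool base class and output types are simplified away here.
import * as readline from "node:readline/promises";
import { stdin, stdout } from "node:process";
import { z } from "zod";

const humanToolInput = z.object({
  // Natural-language question the agent wants to ask the end user.
  message: z.string().min(1, "The question for the user must not be empty"),
});

class HumanTool {
  readonly name = "HumanTool";
  readonly description =
    "Asks the user a question in natural language and returns their answer. " +
    "Use it when required information is missing or ambiguous.";

  async run(rawInput: unknown): Promise<string> {
    const { message } = humanToolInput.parse(rawInput);
    const rl = readline.createInterface({ input: stdin, output: stdout });
    try {
      // The agent's loop is effectively paused while we wait for the answer.
      return await rl.question(`${message}\n> `);
    } finally {
      rl.close();
    }
  }
}

// The runner would call `await new HumanTool().run({ message: "..." })` and append
// the returned string to the agent's memory before continuing the loop.
```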

@prattyushmangal

prattyushmangal commented Nov 20, 2024

For the technical details on:

The selected approaches for triggering human intervention and resuming workflows.

Any additional use cases or edge cases to consider for the HumanTool.

  • I think for now we can consider just a single use case where the Bee agent has deemed it needs to get some additional info from the user in the form of an NL message.

@matiasmolinas

matiasmolinas commented Nov 20, 2024

Thank you for the detailed guidelines! I’ll share updates as I progress!

matiasmolinas added a commit to matiasmolinas/bee-agent-framework that referenced this issue Nov 22, 2024
@matiasmolinas

Hi all,

I’ve published a branch containing the first working draft, which I’m currently refactoring and refining. You can find it here:

draft

Key Points and Observations:

  1. Prompt Dependencies

    • I'm using OpenAI's GPT-4 for generating responses and have been defining/modifying prompts as part of this draft.
    • One concern is that these prompts might work well with GPT-4 but could potentially fail with smaller models, such as LLaMA's smaller versions. This might introduce compatibility risks, which I'll need to address in later iterations.
  2. HumanTool Invocation

    • I noticed that the conditions for calling HumanTool aren't well-defined. My current assumption is that it should be triggered when additional information is required to achieve a goal, but this needs further refinement to ensure consistency and reliability.
  3. Initial Testing

    • The draft works for a very simple use case in its "happy path." This provides a starting point for making the solution more generic and identifying and fixing bugs.
  4. Class and Prompt Modifications

    • I edited some existing classes and prompts for this draft. While this approach is acceptable for internal testing, I plan to refactor these changes in the final version to ensure a cleaner implementation. The final version will use my own extensions and classes to maintain modularity and readability.
Request for Feedback:
I’m sharing this early progress to gather partial feedback during development. This will help me align my approach with the project’s expectations and make necessary adjustments before the final commit.

Thanks! Let me know your thoughts.

@matiasmolinas

This is a sample output of the current draft:

User 👤 : Can you write the formula to calculate the are of a triangle?
Agent (thought) 🤖 :  The user is asking for the formula to calculate the area of a triangle, which I can provide directly.
Agent (final_answer) 🤖 :  The formula to calculate the area of a triangle is:

\[ \text{Area} = \frac{1}{2} \times \text{base} \times \text{height} \]

Where "base" is the length of the base of the triangle, and "height" is the perpendicular height from the base to the opposite vertex.
Agent 🤖 :  The formula to calculate the area of a triangle is:

\[ \text{Area} = \frac{1}{2} \times \text{base} \times \text{height} \]

Where "base" is the length of the base of the triangle, and "height" is the perpendicular height from the base to the opposite vertex.
User 👤 : I need help to calculate an are of a shape.
Agent (thought) 🤖 :  The user's request is too general. I need to ask for more specifics about the shape.
Agent (tool_name) 🤖 :  HumanTool
Agent (tool_input) 🤖 :  { "message": "Could you please specify the type of shape you need help with? For example, is it a triangle, rectangle, circle, etc.?" }
HumanTool Could you please specify the type of shape you need help with? For example, is it a triangle, rectangle, circle, etc.?
Interactive session has started. To escape, input 'q' and submit.
Please provide the required information: a circle

Agent (tool_output) 🤖 :  a circle
Agent (thought) 🤖 :  The user needs help calculating the area of a circle. I will provide the formula for that.

\[ \text{Area} = \pi \times r^2 \]

Where \( r \) is the radius of the circle.
Agent 🤖 :  The formula to calculate the area of a circle is:

\[ \text{Area} = \pi \times r^2 \]

Where \( r \) is the radius of the circle.

@matiasmolinas

Summary of work in progress

Key Updates

  1. Introduction of HumanTool:

    • Adds the ability for the agent to pause workflows and request human input dynamically.
    • Facilitates interactive back-and-forth communication during task execution.
  2. Interactive Input Handling:

    • Utilizes createConsoleReader for real-time user interaction.
    • Ensures a smooth and responsive dialog to gather additional information when needed.
  3. Prompt Template Enhancements:

    • Prompts now explicitly require the use of HumanTool for cases where user clarification is needed.
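As an illustration only, such a prompt-template addition might look roughly like the snippet below; the exact wording and how it is wired into the system prompt in the actual branch differ.

```ts
// Illustrative only: an extra instruction appended to the agent's system prompt so the
// model asks via HumanTool instead of guessing. The real prompt text in the branch differs.
export const humanToolInstruction = `
If the user's request is missing information you need (for example a location, a date,
or which option they prefer), do not guess. Call the tool named "HumanTool" with a short,
specific question in its "message" input, wait for the answer, and then continue.
`.trim();
```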

Workflow Comparison: Before and After the Update

| Aspect | Before Update | After Update |
| --- | --- | --- |
| Clarification Handling | No mechanism for user input. | Uses HumanTool for interactive clarification. |
| Workflow Continuity | Terminated workflows with incomplete inputs. | Pauses and resumes workflows after user input. |
| Agent-User Interaction | Limited to static outputs. | Enables dynamic and conversational interactions. |
| Tool Integration | Relied on predefined tools. | Adds flexibility with HumanTool. |
| Prompt Behavior | Static prompts guiding responses. | Prompts enforce human intervention when needed. |

Benefits of the Update

  1. Enhanced User Experience:

    • The agent is more conversational and adaptable to user needs.
  2. Improved Workflow Continuity:

    • Prevents workflow termination due to missing or unclear data.
  3. Support for Complex Tasks:

    • Handles ambiguous scenarios or multi-step workflows requiring user guidance.

Sequence Diagrams

Before Update

sequenceDiagram
    participant User
    participant ConsoleReader
    participant BeeAgent
    participant LLM
    participant Tools
    participant Memory

    User->>ConsoleReader: Provide Input
    ConsoleReader->>BeeAgent: Submit Prompt
    BeeAgent->>LLM: Generate Initial Response
    LLM-->>BeeAgent: Provide Output
    BeeAgent->>Tools: Invoke Tools (if required)
    Tools-->>BeeAgent: Return Results
    BeeAgent->>Memory: Update Memory
    BeeAgent-->>ConsoleReader: Deliver Response

After Update

sequenceDiagram
    participant User
    participant ConsoleReader
    participant BeeAgent
    participant LLM
    participant HumanTool
    participant Tools
    participant Memory

    User->>ConsoleReader: Provide Input
    ConsoleReader->>BeeAgent: Submit Prompt
    BeeAgent->>LLM: Generate Initial Response
    LLM-->>BeeAgent: Decide to Use HumanTool
    BeeAgent->>HumanTool: Request User Input
    HumanTool-->>User: Display Message and Await Response
    User->>HumanTool: Provide Input
    HumanTool-->>BeeAgent: Return User Input
    BeeAgent->>Tools: Invoke Tools (if required)
    Tools-->>BeeAgent: Return Results
    BeeAgent->>Memory: Update Memory
    BeeAgent-->>ConsoleReader: Deliver Final Response

@matiasmolinas

Hi everyone,

I’ve implemented the first version of the HumanTool and integrated it into the BeeAgent framework. You can find my work in my fork here: human-tool branch.

Key Updates:

  • HumanTool Implementation: Allows the agent to dynamically request human input during workflows and resume once input is received.
  • Shared Input/Output Management: Uses a shared console reader (sharedConsoleReader) for consistent handling of user interactions.
  • Improved Session Continuity: Ensures workflows are paused and resumed correctly without unexpected termination.
  • Prompt Enforcement: Updated prompts to ensure the agent reliably invokes HumanTool when encountering missing or unclear data.
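For reference, here is a minimal sketch of what a shared console reader along these lines might look like; it is an assumption about the shape of sharedConsoleReader, not the actual helper in the branch.

```ts
// Sketch of the "shared console reader" idea: one readline interface that both the
// example's main loop and the HumanTool reuse, so the tool never competes with the
// outer prompt for stdin. The actual sharedConsoleReader helper may be shaped differently.
import * as readline from "node:readline/promises";
import { stdin, stdout } from "node:process";

let shared: readline.Interface | null = null;

// Lazily create a single interface and hand the same instance to every caller.
export function getSharedReader(): readline.Interface {
  shared ??= readline.createInterface({ input: stdin, output: stdout });
  return shared;
}

export async function ask(prompt: string): Promise<string> {
  return await getSharedReader().question(`${prompt}\n> `);
}

export function closeSharedReader(): void {
  shared?.close();
  shared = null;
}
```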

I plan to review and refine this implementation tomorrow. If everything works as expected during testing, I will create a pull request to merge these changes into the main repository.

If there are any specific areas you'd like me to double-check, or if you have additional feedback before the PR, feel free to let me know.

Thank you for the guidance and support on this issue!

matiasmolinas added a commit to matiasmolinas/bee-agent-framework that referenced this issue Nov 29, 2024
Refactored `human.ts` to remove unused imports and variables:
- Removed unused `ToolOutput` import.
- Renamed `options` to `_options` to comply with linting rules for unused variables.

Refactored `io.ts` to eliminate console statements:
- Replaced `console.error` with comments for fallback handling.

These changes ensure that `yarn lint` runs without errors or warnings.

Ref: i-am-bee#121
matiasmolinas added a commit to matiasmolinas/bee-agent-framework that referenced this issue Dec 3, 2024
matiasmolinas added a commit to matiasmolinas/bee-agent-framework that referenced this issue Dec 3, 2024
matiasmolinas added a commit to matiasmolinas/bee-agent-framework that referenced this issue Dec 3, 2024
matiasmolinas added a commit to matiasmolinas/bee-agent-framework that referenced this issue Dec 3, 2024
matiasmolinas added a commit to matiasmolinas/bee-agent-framework that referenced this issue Dec 3, 2024
matiasmolinas added a commit to matiasmolinas/bee-agent-framework that referenced this issue Dec 4, 2024
matiasmolinas added a commit to matiasmolinas/bee-agent-framework that referenced this issue Dec 4, 2024
… and optimize testing

Refactored `humantool_agent.ts` based on PR feedback:
- Retained the original system prompt to prevent unnecessary changes.
- Adjusted testing to use LLaMA instead of GPT-4 as per optimization requirements.
- Addressed issues where the model occasionally skipped tool calling.

Ref: i-am-bee#121

Signed-off-by: Matias Molinas <[email protected]>
matiasmolinas added a commit to matiasmolinas/bee-agent-framework that referenced this issue Dec 5, 2024
Removed the extended prompt for the human tool, enhanced the tool description, and updated the example to use the default system prompt.

Ref: i-am-bee#121
Signed-off-by: Matias Molinas <[email protected]>
matiasmolinas added a commit to matiasmolinas/bee-agent-framework that referenced this issue Dec 7, 2024
matiasmolinas added a commit to matiasmolinas/bee-agent-framework that referenced this issue Dec 7, 2024
matiasmolinas added a commit to matiasmolinas/bee-agent-framework that referenced this issue Dec 7, 2024
matiasmolinas added a commit to matiasmolinas/bee-agent-framework that referenced this issue Dec 7, 2024
matiasmolinas added a commit to matiasmolinas/bee-agent-framework that referenced this issue Dec 7, 2024
matiasmolinas added a commit to matiasmolinas/bee-agent-framework that referenced this issue Dec 7, 2024
… and optimize testing

Refactored `humantool_agent.ts` based on PR feedback:
- Retained the original system prompt to prevent unnecessary changes.
- Adjusted testing to use LLaMA instead of GPT-4 as per optimization requirements.
- Addressed issues where the model occasionally skipped tool calling.

Ref: i-am-bee#121

Signed-off-by: Matias Molinas <[email protected]>
matiasmolinas added a commit to matiasmolinas/bee-agent-framework that referenced this issue Dec 7, 2024
Removed the extended prompt for the human tool, enhanced the tool description, and updated the example to use the default system prompt.

Ref: i-am-bee#121
Signed-off-by: Matias Molinas <[email protected]>
matiasmolinas added a commit to matiasmolinas/bee-agent-framework that referenced this issue Dec 7, 2024
@matiasmolinas

@mmurad2 @prattyushmangal I created a PR with a basic implementation of what @prattyushmangal defined as Step 1:

Step 1: Define the 'HumanTool' as a tool which can be equipped to any Bee and invoked by it when it deems that some information required for the following steps is missing and it must call out to the user to collect it. So I agree with you on the following:

Tool Name: 'HumanTool'
Tool Input: For a Bee to use this tool, it should generate an NL message asking the user for some specific information.
Tool Output: NL response from the end user
BeeAgentRunner 'HumanTool' Execution: The Runner should then execute the HumanTool by sending the NL message to the end user and waiting for a response. Once the user has responded, the tool output (the message from the user) is appended to the Agent memory and the Action/ActionInput loop can continue.

In the meantime, while the PR awaits approval or change requests, I am working on Step 2:

Step 2: Repurpose this HumanTool pattern for other, developer-defined interventions. I propose that we defer this to a secondary issue once we have implemented a HumanTool pattern which the Bee Agents can use for info gathering.

Taking into account the original idea:

The human-as-a-tool approach is valid for cases where the agent determines it needs to call out to a human for more information, but a complementary paradigm for framework developers might be building agents with predefined points of validation and correction by the end user.

So, similar to the emitter for observing agent behaviour, it might be useful to have a concept that allows developers to pause and resume agent behaviour with a human interaction in the middle.

Eg:

Validate Step/Tool to call
Validate Tool Input to use
In a more complex case, the end user may also be able to rerun the previous prompt with adapted tool inputs that they observed were used by the agent last time.

Here is my question: Should I implement step 2 using the same issue ID and create a new PR, or do I need to create a new issue to implement it?

@matiasmolinas

@mmurad2 @prattyushmangal I’ve created a PR for Step 1 implementing the HumanTool as defined:

  • Tool Name: HumanTool
  • Tool Input: Generates an NL message requesting specific information from the user.
  • Tool Output: NL response from the end user.
  • Execution: The BeeAgentRunner sends the message, waits for the response, appends it to the Agent memory, and continues the action loop.

The PR also includes fixes based on @Tomas2D's observations.

In parallel, I’m working on Step 2 to repurpose the HumanTool for additional developer-defined interventions, integrating it with the new RePlanAgent. This involves creating an InterventionTool and an InterventionManager to handle validations and corrections, leveraging the RePlanAgent's planning and event-driven capabilities.
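As a rough illustration of the direction for Step 2, an InterventionTool might look something like the sketch below; all names and shapes here are assumptions, not the final implementation.

```ts
// Illustrative sketch of an InterventionTool; the class in the work-in-progress branch
// may look different. The tool names the kind of intervention it needs and returns the
// user's answer so the agent can adjust its plan.
import { z } from "zod";

const interventionInput = z.object({
  type: z.enum(["validation", "correction", "clarification"]),
  message: z.string().min(1),
});

type InterventionInput = z.infer<typeof interventionInput>;

class InterventionTool {
  readonly name = "InterventionTool";
  readonly description =
    "Requests human validation, correction, or clarification before the agent continues.";

  // How the user is reached (console, UI, ...) is injected so the tool stays channel-agnostic.
  constructor(private readonly askUser: (question: string) => Promise<string>) {}

  async run(rawInput: unknown): Promise<{ type: InterventionInput["type"]; answer: string }> {
    const input = interventionInput.parse(rawInput);
    // Prefix the question with the intervention type so the user knows why the agent
    // is pausing (validate a step, correct a value, or clarify intent).
    const answer = await this.askUser(`[${input.type}] ${input.message}`);
    return { type: input.type, answer };
  }
}
```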

Should I continue using this issue ID for Step 2 and submit a new PR, or would it be better to create a separate issue for implementing the human intervention abstraction?

@matiasmolinas

Difference Between HumanTool and Human Intervention

I'm currently working on distinguishing between HumanTool and Human Intervention to establish a clear and structured approach for human-in-the-loop interactions within the Bee framework. This is a work in progress, and I welcome any feedback or suggestions to refine these concepts further.

HumanTool

Definition:

  • HumanTool is a specific tool designed to facilitate information gathering from the user. It allows the agent to request missing or additional information necessary to proceed with its tasks.

Key Characteristics:

  • Purpose: Solely focused on collecting specific pieces of information from the user.
  • Functionality:
    • Tool Input: Generates a natural language (NL) message asking the user for specific information.
    • Tool Output: Receives an NL response from the user.
    • Execution Flow: The BeeAgentRunner sends the NL message, waits for the user's response, appends the response to the agent's memory, and continues the action loop based on the new information.
  • Usage Scenario: When the agent encounters an ambiguity or lacks necessary data to complete a task, it invokes the HumanTool to clarify or obtain the required details.

Example:

  • Input: "What is the weather?"
  • HumanTool Action: Sends a message like, "Could you provide the location for which you would like to know the weather?"
  • User Response: "Santa Fe, Argentina."
  • Outcome: The agent uses the provided location to fetch and deliver the weather information.

Human Intervention

Definition:

  • Human Intervention is a broader abstraction that encompasses various types of interactions between the agent and the user beyond mere information gathering. It includes mechanisms for validation, correction, clarification, and more complex interactions that may influence the agent's decision-making process.

Key Characteristics:

  • Purpose: Facilitates diverse interactions to enhance the agent's performance, reliability, and accuracy by involving the user in different stages of the workflow.
  • Functionality:
    • Validation: Seeks confirmation from the user to ensure the correctness of data or decisions made by the agent.
    • Correction: Allows the user to provide corrections to any mistakes or inaccuracies identified by the agent.
    • Clarification: Requests additional details or explanations to resolve ambiguities or uncertainties in user input or agent actions.
    • Flexibility: Can handle multiple types of interactions, making it adaptable to various scenarios where human input is beneficial.
  • Execution Flow: Managed through an abstraction layer (e.g., InterventionManager and InterventionTool) that listens for specific events (like intervention_requested) and orchestrates the appropriate human interactions before allowing the agent to proceed.
  • Integration with RePlanAgent: Utilizes the planning and event-driven capabilities of the RePlanAgent to determine when and how to invoke different types of interventions, ensuring a seamless workflow that incorporates human feedback effectively.
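A minimal sketch of this event-driven shape, using Node's EventEmitter as a stand-in for the framework's emitter; the InterventionManager name and the "intervention_requested" event follow the description above, while the API details are assumptions.

```ts
// Sketch of the event-driven flow described above, using Node's EventEmitter as a
// stand-in for the framework's emitter. The "intervention_requested" event name follows
// this comment; the InterventionManager API details are assumptions.
import { EventEmitter } from "node:events";

interface InterventionRequest {
  type: "validation" | "correction" | "clarification";
  message: string;
  resolve: (answer: string) => void; // called once the human has responded
}

class InterventionManager {
  constructor(
    private readonly emitter: EventEmitter,
    private readonly askUser: (question: string) => Promise<string>,
  ) {
    // The agent pauses by emitting "intervention_requested" and waiting for resolve().
    this.emitter.on("intervention_requested", (request: InterventionRequest) => {
      void this.handle(request);
    });
  }

  private async handle(request: InterventionRequest): Promise<void> {
    const answer = await this.askUser(`[${request.type}] ${request.message}`);
    request.resolve(answer); // lets the paused agent continue with the user's input
  }
}

// Agent side: emit the event and await the answer before moving on.
function requestIntervention(
  emitter: EventEmitter,
  type: InterventionRequest["type"],
  message: string,
): Promise<string> {
  return new Promise((resolve) => {
    emitter.emit("intervention_requested", { type, message, resolve });
  });
}
```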

Usage Scenario:

  • Validation Example: After compiling a report, the agent asks, "Is the data presented in the report accurate?" based on which it either proceeds or revises the report.
  • Correction Example: If the agent makes an incorrect assumption, it prompts, "I believe the project deadline is next Friday. Is this correct?" allowing the user to correct the deadline if necessary.
  • Clarification Example: When faced with vague user instructions, the agent asks, "Could you please specify what you mean by 'optimize the system'?"

Summary of Differences

| Aspect | HumanTool | Human Intervention |
| --- | --- | --- |
| Scope | Specific to information gathering | Broad, encompassing validation, correction, clarification, and more |
| Purpose | Obtain missing or additional information | Enhance agent reliability and accuracy through diverse user interactions |
| Functionality | Generates NL messages for specific info requests | Manages various interaction types, orchestrates when and how to engage the user |
| Implementation | Single tool (HumanTool) | Composite system (InterventionTool, InterventionManager, integrated with RePlanAgent) |
| Use Cases | Resolving ambiguities, filling data gaps | Confirming decisions, correcting errors, clarifying instructions, adapting workflows based on user input |
| Integration | Invoked directly by the agent when needed | Triggered through event-driven mechanisms, allowing for pausing and resuming agent workflows based on interventions |

How They Work Together

  • HumanTool serves as a foundational component for Human Intervention by handling the basic task of information collection.
  • Human Intervention builds upon this by introducing additional layers and types of interactions, enabling more sophisticated and context-aware human-agent collaborations.
  • By integrating Human Intervention with the RePlanAgent, I aim to create a flexible and robust system where the agent can dynamically decide when to seek user input for various reasons, not limited to just gathering information.

This is an ongoing effort, and I plan to further develop these concepts to enhance the agent's interaction capabilities. Any feedback or insights would be greatly appreciated!

matiasmolinas added a commit to matiasmolinas/bee-agent-framework that referenced this issue Dec 13, 2024
Added human.ts to experimental tools and human.ts to agents. Updated io.ts in helpers to support the new tool.
Ref: i-am-bee#121

Signed-off-by: Matias Molinas <[email protected]>
matiasmolinas added a commit to matiasmolinas/bee-agent-framework that referenced this issue Dec 15, 2024
- Add InterventionTool for handling human interactions
- Implement InterventionManager for managing intervention events
- Extend RePlan agent to support intervention events
- Add intervention-aware system prompt
- Add intervention event types to RePlanEvents

Ref: i-am-bee#121
Signed-off-by: Matias Molinas <[email protected]>
@prattyushmangal

Should I continue using this issue ID for Step 2 and submit a new PR, or would it be better to create a separate issue for implementing the human intervention abstraction?

Hey Matias, thank you for your work on this issue.

With regard to your question, I think the human intervention should be a separate ticket and a separate PR, most likely to make things easier for you and for reviewers.

For the human intervention piece, I imagine an initial PR that demonstrates how an agent can refer back to the end user to validate the selected step and parameters would be a good first pass. In follow-up work, that could be abstracted into the intervention class you mention, which each Agent can utilise, switching the Human Intervention features on via a flag at runtime.

Hope that helps.

@matiasmolinas

Thank you for the feedback, @prattyushmangal! Here’s how I plan to proceed based on your suggestions:

Separate Issue and PR

I’ll create a new issue and PR specifically for the human intervention abstraction. This will keep the scope focused and easier for reviewers to follow, as suggested.

Initial PR Scope: Full Support for Validation, Basic Support for Correction and Clarification, and the Intervention Class

To ensure a robust design and implementation, the initial PR will include:

  1. Full Support for Validation:
    • The agent will validate both the step selected and the parameters passed to tools.
  2. Basic Support for Correction and Clarification:
    • The agent will handle simple interactions where corrections or additional clarifications are needed.
  3. Intervention Class:
    • A reusable Intervention Class will be introduced to manage validation, correction, and clarification workflows in a consistent manner. This class will:
      • Provide a unified interface for handling different types of human interventions.
      • Allow agents to dynamically enable or disable intervention features using runtime flags.
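A rough sketch of what such a reusable Intervention class with a runtime flag could look like (illustrative names only, not final APIs); examples of the interaction types follow below.

```ts
// Sketch of the proposed reusable Intervention class with a runtime flag; all names here
// are illustrative, not final framework APIs.
type InterventionType = "validation" | "correction" | "clarification";

interface InterventionOptions {
  enabled: boolean; // runtime flag to switch human intervention on or off
  askUser: (question: string) => Promise<string>; // how the user is reached (console, UI, ...)
}

class Intervention {
  constructor(private readonly options: InterventionOptions) {}

  // Returns the user's answer, or null when interventions are disabled so the agent
  // simply proceeds with its own plan.
  async request(type: InterventionType, question: string): Promise<string | null> {
    if (!this.options.enabled) return null;
    return await this.options.askUser(`[${type}] ${question}`);
  }

  // Convenience wrapper for the validation case: treat a "yes"-like answer as approval.
  async confirm(question: string): Promise<boolean> {
    const answer = await this.request("validation", question);
    return answer === null || /^y(es)?$/i.test(answer.trim());
  }
}

// Example:
//   const intervention = new Intervention({ enabled: true, askUser });
//   if (await intervention.confirm("Call WeatherTool with location 'Santa Fe, Argentina'?")) { ... }
```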

Examples of Human Intervention Scenarios

To align with your request, I’ll include examples for validation, correction, and clarification:

  1. Validation (Full Support):

    • Example 1: The agent validates the step selected:
      • Agent Prompt: "I am about to perform the step 'Optimize Database Queries'. Is this the correct next step?"
      • User Response: "Yes, that's correct."
    • Example 2: The agent validates tool parameters:
      • Agent Prompt: "I will call the WeatherTool with the location 'Santa Fe, Argentina'. Please confirm if this is correct."
      • User Response: "No, the location should be 'Rosario, Argentina'."
  2. Correction (Basic Support):

    • Example 1: The agent identifies an error:
      • Agent Prompt: "There seems to be an error in the project deadline. Please provide the correct date."
      • User Response: "The correct deadline is March 15th."
    • Example 2: The agent validates and updates a parameter:
      • Agent Prompt: "The expected value for the parameter is 42, but it was set to 24. Should I correct it?"
      • User Response: "Yes, change it to 42."
  3. Clarification (Basic Support):

    • Example 1: The agent resolves ambiguous input:
      • Agent Prompt: "Could you clarify what you mean by 'optimize the system'?"
      • User Response: "I mean improving the database performance."
    • Example 2: The agent clarifies tool input:
      • Agent Prompt: "You requested a translation but didn’t specify the target language. Could you clarify?"
      • User Response: "Please translate it to French."

Planned Implementation

  1. Intervention Class:

    • The Intervention Class will serve as a reusable component for managing all types of human interventions (validation, correction, clarification).
    • It will standardize the interaction workflow and provide runtime flag support to dynamically enable or disable intervention features.
  2. Validation:

    • The agent will use the Intervention Class to validate both steps and tool parameters before proceeding.
    • Validation interactions will ensure workflows are accurate and aligned with user intent.
  3. Correction and Clarification:

    • Basic support for these interactions will allow the agent to request corrections for identified errors or clarify ambiguous inputs.
    • User responses will be incorporated into the agent’s memory to adjust workflows as needed.
  4. Runtime Flag for Human Intervention:

    • The runtime flag will allow dynamic toggling of human intervention features, ensuring flexibility in how and when they are triggered.
  5. Documentation and Diagrams:

    • Detailed documentation and sequence diagrams will be included to illustrate:
      • How validation, correction, and clarification scenarios are handled using the Intervention Class.
      • How workflows are paused and resumed after receiving user input.

@matiasmolinas

I forgot to mention earlier that human intervention will be integrated with RePlan to make the framework more adaptive and capable of handling nuanced, real-world scenarios. This integration introduces the InterventionTool, allowing the assistant to emit an "intervention_requested" event when it needs user input for validation, clarification, or correction.

By seamlessly blending automation with human oversight, RePlan can support a broader range of complex scenarios while maintaining accuracy, adaptability, and trustworthiness. Here’s how this integration expands RePlan’s capabilities:

  1. Error Resolution and Plan Refinement: When tools provide incomplete or ambiguous data, human intervention enables the assistant to pause and refine its steps with user guidance.

  2. High-Stakes Decision Support: For critical scenarios such as regulatory compliance or creative decision-making, human validation ensures the plan aligns with legal requirements, creative standards, or specific constraints.

  3. Enhanced Collaboration: By involving users at key moments, the assistant not only ensures better outcomes but also creates a more interactive and user-centric experience.


Examples of Scenarios Leveraging Human Intervention

  1. Scenario: Event Planning with Conflicting Requirements

    • Context: A user requests the assistant to plan a corporate event but provides inconsistent details about the number of attendees and venue preferences.
    • Execution:
      • The assistant gathers venue options using tools but detects conflicting requirements (e.g., an attendee count that exceeds venue capacities).
      • It emits an "intervention_requested" event with the type clarification, prompting the user to confirm the attendee count and prioritize venue attributes (e.g., location, size, cost).
      • With the clarified inputs, the assistant updates the plan and completes the workflow seamlessly.
  2. Scenario: Regulatory Compliance in Financial Planning

    • Context: A user asks the assistant to draft a financial plan for a multinational company, requiring compliance with tax laws in multiple regions.
    • Execution:
      • The assistant uses tools to retrieve tax regulations but identifies potential conflicts or gaps in the data.
      • It emits an "intervention_requested" event with the type validation, asking the user to confirm specific legal interpretations or provide additional details from a legal expert.
      • After receiving the input, the assistant adjusts the financial plan to ensure compliance and alignment with the user’s objectives.

Integrating human intervention with RePlan not only enhances the assistant's flexibility but also allows it to confidently tackle scenarios that demand both automation and human insight. This addition ensures that RePlan remains reliable, adaptive, and user-focused while supporting a wider variety of complex use cases.

matiasmolinas added a commit to matiasmolinas/bee-agent-framework that referenced this issue Dec 18, 2024
matiasmolinas added a commit to matiasmolinas/bee-agent-framework that referenced this issue Dec 19, 2024
…ustment

- Added basic structure for handling user input during plan adjustments.

Ref: i-am-bee#121
Signed-off-by: Matias Molinas <[email protected]>
matiasmolinas added a commit to matiasmolinas/bee-agent-framework that referenced this issue Dec 19, 2024
- Improved the description for the human tool to better reflect its capabilities and usage.
- Ensures clarity and consistency with other tool descriptions.

Ref: i-am-bee#121
Signed-off-by: Matias Molinas <[email protected]>