
Framework Global architecture #226

Open
ismaelfaro opened this issue Dec 5, 2024 · 9 comments
Labels
enhancement New feature or request


@ismaelfaro
Member

ismaelfaro commented Dec 5, 2024

Goal

  • Clarify the key elements and relationships in the Bee Framework.
  • Simplify the experience of creating new BeeAgents.

Out of scope for the moment: changing the communication structure of an agent

Proposed Functional Architecture

classDiagram
    class Agent {
        +LLM llm?
        +Memory memory?
        +Tool[] tools?
        +DevTools[] devTools?
        +run(prompt: string, options?: ExecutionOptions)
    }

    Agent *-- LLM
    Agent *-- Memory
    Agent *-- Tool
    Agent *-- DevTools

    class LLM {
        +inference()
        +templates: TemplatePrompt
    }

    class Memory {
        +store()
        +retrieve()
        +cache: Cache
    }

    class Tool {
        +execute()
        +validate()
    }

    class DevTools {
        +emitter: Emitter
        +logger: Logger
        +adapter: Adapter
        +serializer: Serializer
        +errorHandler: ErrorHandler
    }
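The proposed architecture above could be sketched in TypeScript roughly as follows. The component names come from the diagram; the method signatures and bodies are illustrative placeholders, not the actual Bee Framework API:

```typescript
// Illustrative sketch only: names follow the class diagram above,
// behavior is a placeholder, not the real Bee implementation.

interface ExecutionOptions { maxIterations?: number }

interface LLM { inference(prompt: string): string }
interface Memory { store(item: string): void; retrieve(query: string): string[] }
interface Tool {
  name: string;
  execute(input: unknown): unknown;
  validate(input: unknown): boolean;
}

class Agent {
  constructor(
    public llm?: LLM,
    public memory?: Memory,
    public tools: Tool[] = [],
  ) {}

  run(prompt: string, options?: ExecutionOptions): string {
    // Record the prompt, then delegate to the (optional) LLM.
    this.memory?.store(prompt);
    return this.llm ? this.llm.inference(prompt) : "";
  }
}
```

Every collaborator is optional here, which matches the `?` markers in the diagram: an Agent can be constructed with only the pieces a use case needs.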
@ismaelfaro
Member Author

One of the potential changes is about whether, at some point, LLM and Memory could be considered a Tool, and about defining a Memory architecture that covers "short", "middle", and "long_term" memory and the functionality behind them.

@ismaelfaro ismaelfaro added the enhancement New feature or request label Dec 5, 2024
@mmurad2
Member

mmurad2 commented Dec 5, 2024

Discussion notes:

  • Consider renaming DevTools -> DevUtilities to avoid confusion with Tools

@aleskalfas
Contributor

aleskalfas commented Dec 5, 2024

I would say that, from a very basic perspective, an Agent is a unit that can be run with an input and return an output; everything else is "extra". Actually, it should be an interface.

classDiagram
    class IAgent~TInput, TOutput~ {
        <<interface>>
        +boolean isRunning
        +run(input TInput) TOutput
    }
interface IAgent<TInput,TOutput> {
  isRunning: boolean;
  run(input: TInput): TOutput;
}

class ReproAgent implements IAgent<string, string> {
    get isRunning() {
       return false; // This will be useful for long-running tasks, not here
    }
    run(input: string) {
       return input;
    }
}

const myReproAgent = new ReproAgent();
console.log(myReproAgent.run("Hello world!"));

@aleskalfas
Contributor

If we move up from this very basic interface, I suggest we can define a concept (this is not a class diagram ☝️) of an agent like this:

[Screenshot 2024-12-06 at 19 57 53: conceptual diagram of the agent]

where:

  • Agent represents an agent class.
  • run represents a method that triggers an agent's actions.
  • Runner implements the run method, or is a class that implements the agent's behavior.
    • It manages references to all components required for executing a run.
    • It performs interactions.
  • Loop represents the control loop governing the agent's internal behavior. It is directed by the Decision Maker.
  • Decision maker controls how long the agent spends in the loop and selects which interactions to process.
  • Memory stores the history of inputs and outputs from previous runs.
  • Interactions provide mechanisms for the agent to interact with the external environment.

For the case of an LLM agent, it could look like this:
[Screenshot 2024-12-06 at 19 57 21: LLM agent variant of the diagram]

where:

  • Memory consists of two types of memory: one (the unconstrained one) for persisting inputs and outputs, and a second that controls the size of the input to the LLM because of the LLM context window limit.
  • Decision maker consists of everything needed to prepare the message for the LLM, call it, and parse the response.
  • Interactions consist of tools provided to the LLM, which are executed by the Runner based on the LLM's decisions.
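The two-memory idea in the first bullet can be sketched briefly. This is a minimal illustration, assuming a naive word count stands in for real token counting:

```typescript
// Sketch of the dual-memory concept: an unconstrained history that
// persists everything, plus a windowed view sized to fit the LLM
// context limit. Token counting here is a naive word count, purely
// for demonstration.

class DualMemory {
  private history: string[] = []; // unconstrained: persists all inputs/outputs

  constructor(private maxTokens: number) {}

  store(message: string): void {
    this.history.push(message);
  }

  // Return the most recent messages that fit within the token budget,
  // oldest first, so they can be sent to the LLM in order.
  window(): string[] {
    const out: string[] = [];
    let used = 0;
    for (let i = this.history.length - 1; i >= 0; i--) {
      const cost = this.history[i].split(/\s+/).length;
      if (used + cost > this.maxTokens) break;
      used += cost;
      out.unshift(this.history[i]);
    }
    return out;
  }
}
```

The full history stays available for retrieval, while `window()` is what the Runner would actually feed to the model on each loop iteration.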

@mmurad2
Member

mmurad2 commented Dec 9, 2024

Next steps:

  • get feedback on proposal
  • plan work to execute on this proposal in subsequent releases

@matiasmolinas

> One of the potential changes is about whether, at some point, LLM and Memory could be considered a Tool, and about defining a Memory architecture that covers "short", "middle", and "long_term" memory and the functionality behind them.

By considering both the LLM and Memory as tools within the agent architecture, we can maintain their original functions while achieving a more modular and flexible design. The LLM-as-a-tool approach encapsulates its capabilities behind a simple “function-like” interface, making it easy to swap models or adjust prompts. Similarly, treating Memory as a tool allows layering short-, medium-, and long-term storage for efficiently retrieving previously seen data. For example, when categorizing incoming issues that may arrive in various languages, the LLM tool can standardize them into a consistent category ID, while the Memory tool caches these mappings for repeated queries, reducing unnecessary LLM calls. This scenario illustrates how the LLM’s classification ability can be utilized without requiring full agentic behavior, and how both LLM and Memory tools can seamlessly fit into the original agent architecture. I’m happy to provide two prototype implementations: one using only the LLM tool, and another integrating the Memory tool for caching repeated patterns. Additionally, by selecting smaller, more specialized models for these tasks, we can improve overall efficiency and reduce the system’s carbon footprint.
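A rough prototype of the caching scenario described above, where a Memory tool short-circuits repeated classifications. `classifyWithLLM` is a hypothetical stand-in for a real model call:

```typescript
// Sketch of LLM-as-a-tool plus Memory-as-a-tool for issue categorization:
// the classifier is wrapped behind a function-like interface, and a cache
// (the "memory tool") stores issue -> category mappings so repeated
// queries skip the LLM call entirely.

type Classifier = (text: string) => string;

function makeCachedClassifier(classifyWithLLM: Classifier) {
  const cache = new Map<string, string>(); // Memory-as-a-tool: cached mappings
  let llmCalls = 0;

  return {
    classify(issue: string): string {
      const hit = cache.get(issue);
      if (hit !== undefined) return hit;   // cache hit: no LLM call needed
      llmCalls++;
      const category = classifyWithLLM(issue);
      cache.set(issue, category);
      return category;
    },
    get llmCalls() { return llmCalls; },   // exposed to show the savings
  };
}
```

Swapping the model or prompt only changes the `Classifier` passed in, which is the modularity argument made above.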

@matiasmolinas

Proposal:

We could consider introducing dependency injection as a way to make the agent’s architecture more modular and dynamically configurable. With dependency injection, an LLM could be leveraged not only to define prompts and tools, but also to generate a comprehensive configuration for an entire agent—covering which models to use, how Memory layers are structured, and what tools are available. By allowing a capable LLM (e.g., Claude 3.5 Sonnet) to produce a dependency injection configuration file based on a given goal, we can instantiate the complete agent directly from this configuration. This approach streamlines the setup process, enables rapid experimentation, and paves the way for easily swapping models or adjusting prompts, all while retaining flexibility to incorporate advanced patterns such as Memory-as-a-tool or LLM-as-a-tool.
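A sketch of what config-driven assembly might look like. The config shape, the registry, and the tool names are invented for illustration; the point is only that the whole agent is instantiated from declarative data that an LLM could in principle generate:

```typescript
// Hypothetical dependency-injection sketch: an agent is built entirely
// from a config object rather than hard-wired construction. Registry
// keys and the AgentConfig shape are illustrative, not a real Bee API.

interface AgentConfig {
  model: string;          // which model to use
  memoryLayers: string[]; // e.g. ["short", "long_term"]; unused in this sketch
  tools: string[];        // names resolved against the registry below
}

const toolRegistry: Record<string, (input: string) => string> = {
  upper: (s) => s.toUpperCase(),
  reverse: (s) => [...s].reverse().join(""),
};

function buildAgent(config: AgentConfig) {
  const tools = config.tools.map((name) => {
    const t = toolRegistry[name];
    if (!t) throw new Error(`unknown tool: ${name}`);
    return t;
  });
  // Toy "run": pipe the input through the configured tools in order.
  return {
    model: config.model,
    run: (input: string) => tools.reduce((acc, t) => t(acc), input),
  };
}
```

Swapping models or rearranging tools then becomes a one-line change to the config file rather than a code change.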

@matiasmolinas

Another potentially valuable enhancement to consider—one that could be analyzed and validated during this architecture review—is enabling the system to generate new tools on-the-fly through code generation. Imagine a scenario where, during execution, the agent identifies a requirement for a tool that doesn’t yet exist. Instead of failing or halting, it could invoke code generation capabilities to create that missing tool dynamically. Such a generated tool would ideally have at least three core operations: a primary function to perform the desired task, a validation mechanism (akin to unit tests) to ensure that the generated code meets the intended specifications, and an input validation method to prevent runtime errors or unexpected behavior.

While this feature would extend the agent’s flexibility and adaptability, it might be beyond the minimal “core” definition of the agent framework. Instead, it could be treated as an advanced module or extension, layered on top of the core agent architecture. This way, the foundational design remains simple and robust, while more sophisticated functionality—like on-demand tool generation—is available as an optional enhancement that teams can integrate when it makes strategic sense.
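The three core operations named above could be expressed as an interface like this. The concrete `slugifyTool` is hand-written here purely for illustration, standing in for what codegen would produce:

```typescript
// Sketch of the contract a dynamically generated tool could satisfy:
// a primary function, an input guard, and a unit-test-like self-check
// that validates the generated code before the agent trusts it.

interface GeneratedTool<I, O> {
  execute(input: I): O;
  validateInput(input: unknown): input is I; // guard against bad runtime input
  selfTest(): boolean;                       // verifies the generated code meets spec
}

// Hypothetical example instance (hand-written, not actually generated):
const slugifyTool: GeneratedTool<string, string> = {
  execute: (s) => s.trim().toLowerCase().replace(/\s+/g, "-"),
  validateInput: (x): x is string => typeof x === "string",
  selfTest() {
    return this.execute("Hello World") === "hello-world";
  },
};
```

An agent could then refuse to register any freshly generated tool whose `selfTest()` fails, keeping the on-demand generation path safe while leaving the core framework untouched.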

@mmurad2
Member

mmurad2 commented Dec 16, 2024

Related ticket #254
