
Investigate dynamic tool selection/filtering #184

Open
michael-desmond opened this issue Nov 19, 2024 · 12 comments
Labels: enhancement (New feature or request), help wanted (Extra attention is needed)

Comments

@michael-desmond
Contributor

Is your feature request related to a problem? Please describe.
For a particular user message not all of the available tools may be necessary to produce a response. The inclusion of the entire tool set introduces superfluous tokens in the LLM input.

Describe the solution you'd like
A way for the agent to select a subset of tools in response to a given user message during agent execution.

Describe alternatives you've considered
This touches on decomposition of agent functions. An alternative would be a way to compose an agent from a set of more primitive elements with predefined connections.

@mhdawson
Contributor

mhdawson commented Dec 2, 2024

@michael-desmond, not related to building the capability into the agent itself, but I did something similar for a different reason in https://developers.redhat.com/blog/2024/10/25/building-agents-large-language-modelsllms-and-nodejs

@Tomas2D Tomas2D added the enhancement New feature or request label Dec 4, 2024
@matoushavlena matoushavlena added the help wanted Extra attention is needed label Dec 4, 2024
@matoushavlena
Contributor

@michael-desmond Could you comment a bit more on the alternative solution and the decomposition of agents? Is this motivated by the open-canvas?

Tomas2D added a commit that referenced this issue Dec 4, 2024
@Tomas2D
Contributor

Tomas2D commented Dec 4, 2024

We could extend the emitted data in the start event, as shown in #224.

This would let anybody do something like this:

const agent = new BeeAgent(...)

agent.emitter.on("start", async ({ memory, tools, meta }) => {
  const lastMessage = memory.messages.at(-1); // better check should be used

  if (lastMessage?.text.includes("weather")) {
    const newTools = tools.filter((tool) => tool.name.includes("weather"))
    tools.splice(0, Infinity, ...newTools);
  }
});

@michael-desmond
Contributor Author

@matoushavlena Yes, it’s somewhat inspired by openCanvas, and generally by the LangGraph approach to handling complexity. Right now a single agent (system prompt) is responsible for selecting the tool, calling the tool, and producing a final answer. Conceivably this process could be handled by a set of nodes/components, each with a much narrower scope, that work together to (potentially) produce a more robust overall agent: tool selection (look at the dialog history and choose a tool), tool calling (call the given tool), response generation, etc.
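To make the decomposition idea concrete, here is a minimal sketch of such a pipeline. All names (`selectTool`, `callTool`, `generateResponse`, `composedAgent`) and the keyword-routing stub are hypothetical; in a real system each stage would be its own prompted LLM call or graph node.

```typescript
// Hypothetical decomposition of a single ReAct-style agent into three
// narrow components. Every stage here is a stub standing in for an LLM call.
interface Tool {
  name: string;
  run(input: string): Promise<string>;
}

// Stage 1: tool selection — look at the question, choose a tool.
async function selectTool(question: string, tools: Tool[]): Promise<Tool> {
  // stub: naive keyword routing stands in for an LLM-driven choice
  return tools.find((t) => question.toLowerCase().includes(t.name.toLowerCase())) ?? tools[0];
}

// Stage 2: tool calling — invoke the chosen tool.
async function callTool(tool: Tool, question: string): Promise<string> {
  return tool.run(question);
}

// Stage 3: response generation — turn the observation into an answer.
async function generateResponse(question: string, observation: string): Promise<string> {
  return `Based on the tool output: ${observation}`;
}

// The composed agent wires the narrow stages together.
async function composedAgent(question: string, tools: Tool[]): Promise<string> {
  const tool = await selectTool(question, tools);
  const observation = await callTool(tool, question);
  return generateResponse(question, observation);
}

const weatherTool: Tool = { name: "weather", run: async () => "12°C and sunny in San Francisco" };
const searchTool: Tool = { name: "search", run: async () => "search results" };

composedAgent("What is the weather in San Francisco?", [searchTool, weatherTool]).then(console.info);
```

Because each stage has a single responsibility, any one of them can be swapped out (e.g. replacing the keyword stub with a prompted selection step) without touching the others.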

@geneknit

geneknit commented Dec 4, 2024

The Accelerated Discovery team uses a normal RAG pattern on tool names and descriptions to filter the list of "equipped tools", selecting only the 10 most relevant based on the MMR algorithm.

Here is a publicly available document from LangChain that matches the pattern they used: How to handle large numbers of tools. Courtesy of @prattyushmangal.
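For illustration, the MMR-over-tool-descriptions idea can be sketched without any retrieval framework. This is a toy version: the bag-of-words similarity is a crude stand-in for real embeddings, and the `Tool` shape and tool list are made up for the example.

```typescript
interface Tool {
  name: string;
  description: string;
}

// Crude stand-in for real embeddings: bag-of-words cosine similarity.
// A production version would embed the texts with an embedding model.
function tokenize(text: string): Set<string> {
  return new Set(text.toLowerCase().match(/[a-z]+/g) ?? []);
}

function similarity(a: string, b: string): number {
  const ta = tokenize(a);
  const tb = tokenize(b);
  let overlap = 0;
  for (const token of ta) if (tb.has(token)) overlap++;
  return overlap / (Math.sqrt(ta.size * tb.size) || 1);
}

// Maximal Marginal Relevance: repeatedly pick the tool most relevant to
// the query while penalizing similarity to the tools already picked.
function mmrFilter(query: string, tools: Tool[], k: number, lambda = 0.7): Tool[] {
  const selected: Tool[] = [];
  const candidates = [...tools];
  while (selected.length < k && candidates.length > 0) {
    let bestIdx = 0;
    let bestScore = -Infinity;
    candidates.forEach((candidate, i) => {
      const relevance = similarity(query, candidate.description);
      const redundancy = Math.max(0, ...selected.map((s) => similarity(candidate.description, s.description)));
      const score = lambda * relevance - (1 - lambda) * redundancy;
      if (score > bestScore) {
        bestScore = score;
        bestIdx = i;
      }
    });
    selected.push(...candidates.splice(bestIdx, 1));
  }
  return selected;
}

const tools: Tool[] = [
  { name: "OpenMeteo", description: "Retrieve the current weather forecast for a location" },
  { name: "Wikipedia", description: "Search encyclopedia articles for general knowledge" },
  { name: "GoogleSearch", description: "Search the web for up-to-date information" },
];

console.info(mmrFilter("What is the current weather in San Francisco?", tools, 1).map((t) => t.name));
```

The `lambda` knob trades off relevance against diversity: at 1.0 it degenerates into plain top-k similarity; lowering it spreads the picks across dissimilar tools.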

@Tomas2D
Contributor

Tomas2D commented Dec 5, 2024

RAG with MMR can work well, but from my point of view, it is useful only if you have many tools and want to do pre-filtering to speed up the decision time.

Here is an example of how this can be done with the help of structured generation within the framework:

const prompt = "What is the current weather in San Francisco?";
const maxTools = 1;

const llm = new OllamaChatLLM();
const driver = new JsonDriver(llm);
const tools = [new GoogleSearchTool(), new WikipediaTool(), new OpenMeteoTool()] as const;

const response = await driver.generate(
  z.object({
    tools: z.array(z.enum(tools.map((tool) => tool.name) as [string, ...string[]])).max(maxTools),
  }),
  [
    BaseMessage.of({
      role: Role.USER,
      text: `# Tools
${tools.map((tool) => `Tool Name: ${tool.name}\nTool Description: ${tool.description}`).join("\n")}

# Objective
Give me a list of the most relevant tools to answer the following prompt.
Prompt: ${prompt}`,
    }),
  ],
);

console.info(response.tools); // ["OpenMeteo"]

@prattyushmangal

Hi @Tomas2D, your remark on speeding up decision time is right, but our main reason for tool filtering was accuracy.

Instead of showing the LLM all available tools for decision making, showing it a filtered list may make it less likely to hallucinate and more accurate in tool selection.

So benefits of pre-filtering = accuracy ⬆️ and inference time ⬇️

@Tomas2D
Contributor

Tomas2D commented Dec 5, 2024

I see. What if you pick the wrong subset of tools, and when is the tool selection done?

Because Bee Agent is an extended ReAct agent, I see two approaches.

  1. We pre-filter tools at the very beginning (faster, but may leave the agent unable to respond).
  2. We pre-filter tools before every iteration (slower, leads to better results).

In addition, the Agent's system prompt, which contains all the tool-related information, sits at the beginning of the conversation history, followed by the conversation messages. I am worried that this tool filtering could confuse the agent, because it might see old messages with calls to tools that are not available in the current interaction.
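The difference between the two approaches can be sketched as follows. The `selectTools` helper and its keyword-overlap scoring are hypothetical stand-ins for the RAG/MMR or structured-generation selectors discussed above; the point is only where the call sits relative to the iteration loop.

```typescript
interface Tool {
  name: string;
  description: string;
}
type Message = { role: "user" | "assistant"; text: string };

// Crude keyword-overlap relevance; stands in for a real selector
// (embeddings + MMR, or a structured-generation call).
function selectTools(history: Message[], tools: Tool[], k: number): Tool[] {
  const last = new Set(history.at(-1)?.text.toLowerCase().match(/[a-z]+/g) ?? []);
  return tools
    .map((tool) => ({
      tool,
      score: (tool.description.toLowerCase().match(/[a-z]+/g) ?? []).filter((w) => last.has(w)).length,
    }))
    .sort((a, b) => b.score - a.score)
    .slice(0, k)
    .map(({ tool }) => tool);
}

const allTools: Tool[] = [
  { name: "OpenMeteo", description: "Retrieve the current weather forecast for a location" },
  { name: "Wikipedia", description: "Search encyclopedia articles for general knowledge" },
  { name: "GoogleSearch", description: "Search the web for up-to-date information" },
];

// Approach 1 would call selectTools once, before the loop starts.
// Approach 2 recomputes the subset before every iteration, so tools can
// enter or leave scope as new messages are appended to the history.
const history: Message[] = [{ role: "user", text: "What is the current weather in San Francisco?" }];
console.info(selectTools(history, allTools, 1).map((t) => t.name)); // weather tool is most relevant

history.push({ role: "user", text: "Now search the web for the latest news" });
console.info(selectTools(history, allTools, 1).map((t) => t.name)); // search tool takes over
```

With approach 1 the second turn would still be stuck with the weather tool; approach 2 pays one extra selection call per iteration to adapt.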

@prattyushmangal

I see. What if you pick the wrong subset of tools, and when is the tool selection done?

Yep, there is no way to mitigate this completely, but you can reduce the likelihood by choosing a sufficiently high number of filtered tools to return.

In addition, the Agent's system prompt, which contains all the tool-related information, sits at the beginning of the conversation history, followed by conversation messages. I am worried that this tool's filtering could confuse the agent because the agent might see old messages requiring tool calls to tools that are not available in the current interaction.

On this concern, I think we may have to assess different system prompt configurations and the behaviours exhibited by the LLMs. In my LangChain-based implementation, the tools are only present in the prompts to the LLMs when the agent is undertaking a "tool selection" activity; all LLM prompts contain only "need to know" information. I know Bee is currently a single prompt-based agent, but in the future what I suggest might be suited to a more multi-agent approach.

@Tomas2D
Contributor

Tomas2D commented Dec 5, 2024

In other words, you don't preserve intermediate steps between iterations, which has its own pros and cons.

@dakshiagrawal
Contributor

This is a fluid area, and we should not harden our stance yet. I think changing tool selection at different steps is a fine approach to try.

Tomas2D added a commit that referenced this issue Dec 10, 2024
@Tomas2D
Contributor

Tomas2D commented Dec 11, 2024

Would extending the events emitted by the Agent be helpful for further exploration? As depicted in #224.

Tomas2D added a commit that referenced this issue Dec 13, 2024
7 participants