
Langgraph agent is too slow #2920

Open
4 tasks done
maryshgh opened this issue Jan 2, 2025 · 9 comments

Comments

@maryshgh

maryshgh commented Jan 2, 2025

Checked other resources

  • This is a bug, not a usage question. For questions, please use GitHub Discussions.
  • I added a clear and detailed title that summarizes the issue.
  • I read what a minimal reproducible example is (https://stackoverflow.com/help/minimal-reproducible-example).
  • I included a self-contained, minimal example that demonstrates the issue INCLUDING all the relevant imports. The code runs AS IS to reproduce the issue.

Example Code

from typing import Annotated

from typing_extensions import TypedDict

from langchain_core.messages import AnyMessage
from langchain_core.runnables import Runnable, RunnableConfig
from langgraph.checkpoint.memory import MemorySaver
from langgraph.graph import StateGraph, START
from langgraph.graph.message import add_messages
from langgraph.prebuilt import tools_condition


class State(TypedDict):
    messages: Annotated[list[AnyMessage], add_messages]

class Assistant:
    def __init__(self, runnable: Runnable):
        self.runnable = runnable

    def __call__(self, state: State, config: RunnableConfig):
        while True:
            configuration = config.get("configurable", {})
            passenger_id = configuration.get("passenger_id", None)
            state = {**state, "user_info": passenger_id}
            result = self.runnable.invoke(state)
            # If the LLM happens to return an empty response, we will re-prompt it
            # for an actual response.
            if not result.tool_calls and (
                not result.content
                or isinstance(result.content, list)
                and not result.content[0].get("text")
            ):
                messages = state["messages"] + [("user", "Respond with a real output.")]
                state = {**state, "messages": messages}
            else:
                break
        return {"messages": result}

builder = StateGraph(State)


# Define nodes: these do the work
builder.add_node("assistant", Assistant(part_1_assistant_runnable))
builder.add_node("tools", create_tool_node_with_fallback(part_1_tools))
# Define edges: these determine how the control flow moves
builder.add_edge(START, "assistant")
builder.add_conditional_edges(
    "assistant",
    tools_condition,
)
builder.add_edge("tools", "assistant")

# The checkpointer lets the graph persist its state
# this is a complete memory for the entire graph.
memory = MemorySaver()
part_1_graph = builder.compile(checkpointer=memory)

Error Message and Stack Trace (if applicable)

No response

Description

I am using a very basic graph structure to call a tool; the code is essentially the same as in the LangGraph documentation (https://langchain-ai.github.io/langgraph/tutorials/customer-support/customer-support/#define-graph), but I am using Gemini as the agent model. The problem is that when I use the tool separately it generates a response pretty quickly (I am using Gemini inside the tool as well), but when I call the agent it takes a lot of time. I actually noticed this with a LangChain runnable as well. Does anyone know why this is happening and how it can be resolved?

System Info

langchain-core version: 0.2.40

@bigsela

bigsela commented Jan 6, 2025

I face the same thing: when the results of the tools come back, the final response from the agent LLM is very, very slow, sometimes taking 1 min+ to return a result, depending on my tool :(

(screenshot attached)

@hinthornw
Contributor

Thanks for sharing! At first glance, this seems like LLM provider latency and not LangGraph code latency. Why do you think it's caused by LangGraph? Would be happy to dig in further if you have more information.

In OP's description, it seems like the agent has at least three LLM calls: the agent call requesting the tool, the llm call within the tool, and then the llm call in response.

To speed things up, you typically want to either decrease the number of LLM calls where possible, speed up each LLM call (by reducing the amount of context passed to the LLM, or by using a faster/cheaper LLM if the task is simple enough), or, if the situation permits, parallelize work.
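One of those levers, shrinking the context, can be applied before each LLM call. A minimal sketch, assuming the history is a plain list of messages; the `trim_history` helper and the cap of 10 are illustrative assumptions, not LangGraph APIs:

```python
# Sketch: cap the conversation history passed to the LLM so prompt size
# (and therefore per-call latency) stays roughly constant.
# `trim_history` and max_messages=10 are illustrative assumptions.

def trim_history(messages: list, max_messages: int = 10) -> list:
    """Keep the first (system) message plus the most recent turns."""
    if len(messages) <= max_messages:
        return messages
    return [messages[0]] + messages[-(max_messages - 1):]

history = [("system", "You are a helpful assistant.")] + [
    ("user", f"question {i}") for i in range(20)
]
trimmed = trim_history(history)
print(len(trimmed))  # 10
```

In the original example, something like this could be applied to `state["messages"]` before `self.runnable.invoke(state)`.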

@hinthornw
Contributor

Will close in a few days if no one else provides information indicating the latency is indeed related to langgraph.

@bigsela

bigsela commented Jan 10, 2025

In long conversations, where the history is getting longer, I can see that only the final step takes by far the most time:

(screenshot: IMG_2654)

It's not a normal response time. And it's always in the final step: something is happening there in the last step of the agent, when it needs to come up with an answer and exit the loop.

@vbarda
Collaborator

vbarda commented Jan 10, 2025

Based on the trace, the latency is clearly coming from the LLM and the tool call. Why do you think this is a LangGraph latency issue?

@maryshgh
Author

I have tested the LLM by calling the API directly, and it responds pretty quickly. I have had similar observations to @bigsela; specifically, I have noticed that it becomes slow when thread history is included, i.e., when we pass a thread_id inside the invoke method.
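For context, a rough sketch of why reusing a thread_id can slow each turn down: the checkpointer replays the full accumulated history into every LLM call, so prompt size grows linearly with the number of turns. The `FakeCheckpointer` class below is a made-up stand-in, not LangGraph's MemorySaver:

```python
# Sketch: each invoke with the same thread_id prepends the stored history,
# so the "prompt" the LLM sees grows by one message per turn.
# FakeCheckpointer is an illustrative stand-in, not a LangGraph class.

class FakeCheckpointer:
    def __init__(self):
        self.threads = {}

    def load(self, thread_id):
        return self.threads.setdefault(thread_id, [])

    def save(self, thread_id, messages):
        self.threads[thread_id] = messages

def invoke(checkpointer, thread_id, new_message):
    history = checkpointer.load(thread_id) + [new_message]
    # The LLM sees the whole history; cost grows with len(history).
    checkpointer.save(thread_id, history)
    return len(history)

cp = FakeCheckpointer()
sizes = [invoke(cp, "thread-1", f"turn {i}") for i in range(5)]
print(sizes)  # [1, 2, 3, 4, 5]
```

If this growth is what dominates, trimming or summarizing the stored history bounds the prompt size per turn.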

@vbarda
Collaborator

vbarda commented Jan 10, 2025

But is the latency actually coming from langgraph, or from the LLM having to think longer because of the long message history and tool calls involved? Could you provide some specific examples where the latency is caused by the library and not by the LLMs?
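One way to answer that is to time each node directly and compare framework overhead against LLM/tool time. A minimal sketch; the `timed` wrapper is a hypothetical helper, not a LangGraph API:

```python
# Sketch: wrap a graph node (or any callable) in a timer so framework
# overhead can be separated from LLM/tool latency.
# `timed` is a hypothetical helper, not part of LangGraph.
import time
from functools import wraps

def timed(name, fn):
    @wraps(fn)
    def wrapper(*args, **kwargs):
        start = time.perf_counter()
        result = fn(*args, **kwargs)
        print(f"{name}: {time.perf_counter() - start:.3f}s")
        return result
    return wrapper

# Usage with the original example's (assumed) names:
#   builder.add_node("assistant", timed("assistant", Assistant(part_1_assistant_runnable)))
demo = timed("demo", lambda: sum(range(1000)))
print(demo())  # 499500
```

If the per-node times add up to roughly the wall-clock total, the latency is in the LLM/tool calls rather than in the graph machinery.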

@bigsela

bigsela commented Jan 10, 2025

I'll have it on Sunday and will share. In longer sequences, where the agent decides to call another tool before finalizing the flow, it doesn't take this long to decide, even with history. It chooses quickly to use another tool, and then in the final step, when giving a final answer, it takes a long time to do so. This leads me to think it's related to the last step and langgraph/langchain, and not connected to the LLM.

@vbarda
Collaborator

vbarda commented Jan 10, 2025

Sure, if you can provide a minimal reproducible example that demonstrates an issue with langgraph/langchain, we'll investigate: https://stackoverflow.com/help/minimal-reproducible-example
