Make agent reasoning smarter by providing a space to organize thoughts #873
MaxAnfilofyev started this conversation in Ideas
- Q* recipe: instead of
- For the reasoning tasks, give the model the ability to reflect on and organize its thoughts before offering a solution.
You can prompt the model in a single call to output all of the following; with GPT-4's 8k-token output limit, there may be no need for separate calls for reasoning, reflection, and code. A sketch of such a prompt follows the outline below.
-- programming requested output
reasoning
reflection on last reasoning
... as many reasoning-reflection iterations as needed
Once the agent is happy with the reasoning, ask the agent to write pseudocode
Then ask to rewrite the pseudocode as code
- could be done in the same or a separate call
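A minimal sketch in Python of what that single-call prompt could look like; the section tags, template, and `build_programming_prompt` helper are all illustrative assumptions, not an existing API:

```python
# Illustrative single-call prompt for the "programming requested output" recipe.
# The section tags are hypothetical; any consistent delimiters would work.
PROGRAMMING_PROMPT_TEMPLATE = """\
Solve the following programming task:
{task}

Structure your entire answer as:
<reasoning>your reasoning about the approach</reasoning>
<reflection>a critique of the reasoning above</reflection>
... repeat <reasoning>/<reflection> pairs until you are satisfied ...
<pseudocode>pseudocode for the final approach</pseudocode>
<code>the final implementation</code>
"""

def build_programming_prompt(task: str) -> str:
    """One call returns reasoning, reflections, pseudocode, and code,
    which should fit within GPT-4's 8k-token output budget."""
    return PROGRAMMING_PROMPT_TEMPLATE.format(task=task)
```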
-- issue resolution requested output
prior solution attempts
reflection on last reasoning - this is missing currently
new solution concept - this is missing currently
- optional
Then ask to rewrite the pseudocode as code (see the sketch below)
- could be done in the same or a separate call
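A similar sketch for the issue-resolution output, including the two sections flagged above as missing; again the tag names and helper are hypothetical:

```python
# Illustrative single-call prompt for the "issue resolution requested output"
# recipe, including the currently missing reflection and new-solution sections.
ISSUE_RESOLUTION_PROMPT_TEMPLATE = """\
Resolve the following issue:
{issue}

<prior solution attempts>
{prior_attempts}
</prior solution attempts>

Respond with:
<reflection>why the prior attempts failed</reflection>
<new solution concept>an approach that addresses those failure reasons</new solution concept>
Optionally, continue with:
<pseudocode>pseudocode for the new approach</pseudocode>
<code>the rewritten code</code>
"""

def build_issue_resolution_prompt(issue: str, prior_attempts: list[str]) -> str:
    return ISSUE_RESOLUTION_PROMPT_TEMPLATE.format(
        issue=issue, prior_attempts="\n---\n".join(prior_attempts)
    )
```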
Asking the model to critique its solution until an Evaluator agent passes it has shown meaningful improvements on programming tasks (a loop like the one sketched below).
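A minimal sketch of that critique-until-pass loop, with the LLM call, Evaluator agent, and self-critique abstracted as hypothetical callables:

```python
from typing import Callable

def critique_until_pass(
    task: str,
    generate: Callable[[str], str],        # LLM call: prompt -> solution
    evaluate: Callable[[str, str], bool],  # Evaluator agent: (task, solution) -> pass?
    critique: Callable[[str, str], str],   # self-critique: (task, solution) -> feedback
    max_rounds: int = 5,
) -> str:
    """Regenerate the solution until the Evaluator agent passes it."""
    solution = generate(f"Solve this task:\n{task}")
    for _ in range(max_rounds):
        if evaluate(task, solution):
            break
        feedback = critique(task, solution)
        solution = generate(
            f"Solve this task:\n{task}\n"
            f"Your previous solution was rejected for these reasons:\n{feedback}\n"
            f"Write an improved solution."
        )
    return solution
```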
Reflexion significantly outperforms all baseline approaches over several learning steps. Both for reasoning alone and when adding an episodic memory consisting of the most recent trajectory, Reflexion + ReAct outperforms the ReAct-only approach that our system currently uses.
The key steps of the Reflexion process are a) define a task, b) generate a trajectory, c) evaluate, d) perform reflection, and e) generate the next trajectory.
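Steps a) through e) map directly onto a loop. A minimal sketch, again with the model calls as hypothetical callables; the accumulated reflections serve as the episodic memory mentioned above:

```python
from typing import Callable

def reflexion(
    task: str,                                   # a) define a task
    act: Callable[[str, list[str]], str],        # (task, reflections) -> trajectory
    evaluate: Callable[[str, str], bool],        # (task, trajectory) -> success?
    reflect: Callable[[str, str], str],          # (task, trajectory) -> reflection text
    max_trials: int = 4,
) -> str:
    reflections: list[str] = []                  # episodic memory across trials
    trajectory = act(task, reflections)          # b) generate a trajectory
    for _ in range(max_trials):
        if evaluate(task, trajectory):           # c) evaluate
            break
        reflections.append(reflect(task, trajectory))  # d) perform reflection
        trajectory = act(task, reflections)      # e) generate the next trajectory
    return trajectory
```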
[Diagram: a generic approach that incorporates reflection in a task chain]
Reflexion is designed to help agents improve their performance by reflecting on past mistakes and incorporating that knowledge into future decisions. This makes it well-suited for tasks where the agent needs to learn through trial and error, such as decision-making, reasoning, and programming.
The reflection could enhance any task that involves reasoning: 'project_description', 'user_stories', 'user_tasks', 'architecture', 'environment_setup', 'development_planning', 'coding', and 'debug'.
While this would cause agents to output more content, the reduction of tokens consumed in rework should more than offset the increase.
For example, we could enhance 'debug' by asking it to articulate the reasons why the prior attempts failed, come up with a new approach that addresses those reasons, and then break that approach down into steps. We would then pass the reasons forward, so that future attempts benefit from the reasoning already performed.
The result could look something like the following.
[Image: Sample Debug Reflection Prompt]
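A minimal sketch of such a prompt; the wording is illustrative, not the project's actual prompt:

```python
# Hypothetical debug reflection prompt: the three numbered steps mirror the
# proposal above, and the stated reasons are carried into future attempts.
DEBUG_REFLECTION_PROMPT = """\
The following attempts to fix this bug have failed:
{prior_attempts}

1. Articulate the reasons why each prior attempt failed.
2. Propose a new approach that addresses those reasons.
3. Break the new approach down into concrete, verifiable steps.

State the failure reasons as specific, self-contained bullet points;
they will be passed forward to future debugging attempts.
"""
```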