Make agent reasoning smarter by providing a space to organize thoughts #873
MaxAnfilofyev started this conversation in Ideas
- Q* recipe: instead of
- For the reasoning tasks, give the model the ability to reflect on and organize its thoughts before offering a solution.
You can prompt the model in a single call to output all of the following; with GPT-4's 8k-token output limit, there may be no need for separate calls for reasoning, reflection, and code. A sketch of such a prompt follows the outline below.
-- programming requested output
reasoning
reflection on last reasoning
... as many reasoning-reflection iterations as needed
Once the agent is happy with the reasoning, ask the agent to write pseudocode
Then ask to rewrite the pseudocode as code
- could be done in the same or a separate call
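A minimal sketch in Python of what that single-call prompt could look like; the section tags, template, and `build_programming_prompt` helper are all illustrative assumptions, not an existing API:

```python
# Illustrative single-call prompt for the "programming requested output" recipe.
# The section tags are hypothetical; any consistent delimiters would work.
PROGRAMMING_PROMPT_TEMPLATE = """\
Solve the following programming task:
{task}

Structure your entire answer as:
<reasoning>your reasoning about the approach</reasoning>
<reflection>a critique of the reasoning above</reflection>
... repeat <reasoning>/<reflection> pairs until you are satisfied ...
<pseudocode>pseudocode for the final approach</pseudocode>
<code>the final implementation</code>
"""

def build_programming_prompt(task: str) -> str:
    """One call returns reasoning, reflections, pseudocode, and code,
    which should fit within GPT-4's 8k-token output budget."""
    return PROGRAMMING_PROMPT_TEMPLATE.format(task=task)
```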
-- issue resolution requested output
prior solution attempts
reflection on last reasoning - this is missing currently
new solution concept - this is missing currently
- optional
Then ask to rewrite the pseudocode as code (see the sketch below)
- could be done in the same or a separate call
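A similar sketch for the issue-resolution output, including the two sections flagged above as missing; again the tag names and helper are hypothetical:

```python
# Illustrative single-call prompt for the "issue resolution requested output"
# recipe, including the currently missing reflection and new-solution sections.
ISSUE_RESOLUTION_PROMPT_TEMPLATE = """\
Resolve the following issue:
{issue}

<prior solution attempts>
{prior_attempts}
</prior solution attempts>

Respond with:
<reflection>why the prior attempts failed</reflection>
<new solution concept>an approach that addresses those failure reasons</new solution concept>
Optionally, continue with:
<pseudocode>pseudocode for the new approach</pseudocode>
<code>the rewritten code</code>
"""

def build_issue_resolution_prompt(issue: str, prior_attempts: list[str]) -> str:
    return ISSUE_RESOLUTION_PROMPT_TEMPLATE.format(
        issue=issue, prior_attempts="\n---\n".join(prior_attempts)
    )
```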
Asking the model to critique its solution until an Evaluator agent passes it has shown meaningful improvements on programming tasks (a loop like the one sketched below).
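A minimal sketch of that critique-until-pass loop, with the LLM call, Evaluator agent, and self-critique abstracted as hypothetical callables:

```python
from typing import Callable

def critique_until_pass(
    task: str,
    generate: Callable[[str], str],        # LLM call: prompt -> solution
    evaluate: Callable[[str, str], bool],  # Evaluator agent: (task, solution) -> pass?
    critique: Callable[[str, str], str],   # self-critique: (task, solution) -> feedback
    max_rounds: int = 5,
) -> str:
    """Regenerate the solution until the Evaluator agent passes it."""
    solution = generate(f"Solve this task:\n{task}")
    for _ in range(max_rounds):
        if evaluate(task, solution):
            break
        feedback = critique(task, solution)
        solution = generate(
            f"Solve this task:\n{task}\n"
            f"Your previous solution was rejected for these reasons:\n{feedback}\n"
            f"Write an improved solution."
        )
    return solution
```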
Reflexion significantly outperforms all baseline approaches over several learning steps. Both for reasoning alone and when adding an episodic memory consisting of the most recent trajectory, Reflexion + ReAct outperforms the ReAct-only approach that our system currently uses.
The key steps of the Reflexion process are a) define a task, b) generate a trajectory, c) evaluate, d) perform reflection, and e) generate the next trajectory.
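Steps a) through e) map directly onto a loop. A minimal sketch, again with the model calls as hypothetical callables; the accumulated reflections serve as the episodic memory mentioned above:

```python
from typing import Callable

def reflexion(
    task: str,                                   # a) define a task
    act: Callable[[str, list[str]], str],        # (task, reflections) -> trajectory
    evaluate: Callable[[str, str], bool],        # (task, trajectory) -> success?
    reflect: Callable[[str, str], str],          # (task, trajectory) -> reflection text
    max_trials: int = 4,
) -> str:
    reflections: list[str] = []                  # episodic memory across trials
    trajectory = act(task, reflections)          # b) generate a trajectory
    for _ in range(max_trials):
        if evaluate(task, trajectory):           # c) evaluate
            break
        reflections.append(reflect(task, trajectory))  # d) perform reflection
        trajectory = act(task, reflections)      # e) generate the next trajectory
    return trajectory
```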
[Diagram: a generic approach that incorporates reflection in a task chain]
Reflexion is designed to help agents improve their performance by reflecting on past mistakes and incorporating that knowledge into future decisions. This makes it well-suited for tasks where the agent needs to learn through trial and error, such as decision-making, reasoning, and programming.
The reflection could enhance any task that involves reasoning: 'project_description', 'user_stories', 'user_tasks', 'architecture', 'environment_setup', 'development_planning', 'coding', and 'debug'.
While this would cause agents to output more content, the reduction of tokens consumed in rework should more than offset the increase.
For example, we could enhance 'debug' by asking it to articulate the reasons why the prior attempts failed, come up with a new approach that addresses those reasons, and then break that approach down into steps. We would then pass the reasons forward, so that future attempts benefit from the reasoning already performed.
The result could look something like the following.
[Image: Sample Debug Reflection Prompt]
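A minimal sketch of such a prompt; the wording is illustrative, not the project's actual prompt:

```python
# Hypothetical debug reflection prompt: the three numbered steps mirror the
# proposal above, and the stated reasons are carried into future attempts.
DEBUG_REFLECTION_PROMPT = """\
The following attempts to fix this bug have failed:
{prior_attempts}

1. Articulate the reasons why each prior attempt failed.
2. Propose a new approach that addresses those reasons.
3. Break the new approach down into concrete, verifiable steps.

State the failure reasons as specific, self-contained bullet points;
they will be passed forward to future debugging attempts.
"""
```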