Use keyword matching for CodeAct microagents #4568
base: main
Conversation
user_turns_processed = 0
for message in reversed(messages):
    if message.role == 'user' and user_turns_processed < 2:
        message.content[-1].cache_prompt = True
It looks like the re-ordering here would make the reminder TextContent get the "cache_prompt"? That would break prompt caching, since the reminder text changes with each request so the cache is never hit.
I'm pretty sure the logic is unchanged here, but could be wrong. messages should be exactly the same here (system, user example, rest) as it was previously when we did this check. I just had to move the logic around, because now the example_user_message function needs the most recent user message in order to do the keyword-matching.
LMK if there's a good way to test this
Let's say the last user message is simple, no images, just a string. Previously (on main):
- it will be in the Message, represented as a list with a single TextContent object
- we add the cache marker on it
- then we add the reminder as a second TextContent in the list
Result: we ask the Anthropic API to cache the actual content only, not the reminder.
Currently on this branch, if the GitHub diff doesn't fool me 😅:
- the last Message would have in its list the TextContent with the actual content
- we add the reminder in a second TextContent
- we add the cache marker to the last TextContent => that's the reminder now.
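A minimal sketch of the ordering difference, with stand-in Message/TextContent classes and a made-up reminder string (not the exact OpenHands types or text):

from dataclasses import dataclass, field

@dataclass
class TextContent:
    text: str
    cache_prompt: bool = False

@dataclass
class Message:
    role: str
    content: list = field(default_factory=list)

# Order on main: mark the cache point first, then append the reminder.
msg = Message('user', [TextContent('actual user request')])
msg.content[-1].cache_prompt = True                      # marks the stable content
msg.content.append(TextContent('ENVIRONMENT REMINDER: 14 turns left'))

# Order on this branch: append the reminder first, then mark content[-1].
msg2 = Message('user', [TextContent('actual user request')])
msg2.content.append(TextContent('ENVIRONMENT REMINDER: 14 turns left'))
msg2.content[-1].cache_prompt = True                      # marks the reminder, which
                                                          # changes every turn, so the
                                                          # cache never hits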
Ohhhh, I see what you're saying :/
Let me bring this up in Slack.
Re: testing. It's Anthropic-only, and we have to send a few more messages than the first two. The console log should show both cache writes and cache hits.
We might also end up seeing whether, beyond the first ~4k tokens (system message, user message), it does or doesn't hit any more tokens.
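For reference, a standalone way to see those counters with the Anthropic SDK directly (this is just a sketch, not how OpenHands invokes the API; the model name and prompt are placeholders):

import anthropic

client = anthropic.Anthropic()  # reads ANTHROPIC_API_KEY from the environment

# Prompts below the minimum cacheable length (~1024 tokens for Sonnet) are not cached.
big_system_prompt = "very long, stable system instructions... " * 200

# On older SDK versions this lives under client.beta.prompt_caching.messages.create.
response = client.messages.create(
    model="claude-3-5-sonnet-20241022",
    max_tokens=256,
    system=[{
        "type": "text",
        "text": big_system_prompt,
        "cache_control": {"type": "ephemeral"},
    }],
    messages=[{"role": "user", "content": "hello"}],
)

# A cache write shows up on the first call, a cache hit on repeated calls.
print(response.usage.cache_creation_input_tokens)
print(response.usage.cache_read_input_tokens)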
openhands/utils/prompt.py
Outdated
)
return rendered.strip()
if len(micro_agent_prompts) > 0:
    micro_text = "EXTRA INFO: the following information has been included based on a keyword match. It may or may not be relevant to the user's request.\n\n"
This may sound like magical thinking, but since we're at it on other branches: the Anthropic prompts everywhere seem to show heavy use of XML tags. While this isn't exactly news, the latest batch of reveals seems worth trying more, and hey, they probably don't hurt anyway. We could wrap this like <EXTRA_INFO> </EXTRA_INFO>?
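A sketch of what that could look like in the snippet above (assuming micro_agent_prompts is a list of matched microagent texts; the exact wording is illustrative):

if len(micro_agent_prompts) > 0:
    micro_text = (
        '<EXTRA_INFO>\n'
        'The following information has been included based on a keyword match. '
        "It may or may not be relevant to the user's request.\n\n"
    )
    micro_text += '\n\n'.join(micro_agent_prompts)
    micro_text += '\n</EXTRA_INFO>'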
On a side note, I feel like it's becoming more important to figure out what to do with other LLMs... when we optimize for one, it perhaps shouldn't come as a surprise that it seems to win "big". I'm not talking about tool use; that one, I think, may yet prove very good for all LLMs that support it. But apart from tool use itself, we also do stuff like this. ^ 🤷
End-user friendly description of the problem this fixes or functionality that this introduces
Better support for pushing changes to GitHub
Give a summary of what the PR does, explaining any non-trivial design decisions
This redesigns the way that microagents plug into CodeAct: microagent prompts are now selected by keyword matching against the most recent user message.
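As a rough illustration of the selection step (names like MicroAgent and select_micro_agents are placeholders, not the actual OpenHands API):

from dataclasses import dataclass

@dataclass
class MicroAgent:
    name: str
    triggers: list[str]   # keywords that activate this microagent
    prompt: str           # text added to the EXTRA INFO block when triggered

def select_micro_agents(user_message: str, agents: list[MicroAgent]) -> list[str]:
    """Return the prompts of all microagents whose trigger keywords
    appear in the most recent user message (case-insensitive)."""
    text = user_message.lower()
    return [a.prompt for a in agents
            if any(t.lower() in text for t in a.triggers)]

# Example: a hypothetical GitHub microagent triggered by push-related keywords.
github_agent = MicroAgent('github', ['github', 'push', 'pull request'],
                          'Use the GITHUB_TOKEN environment variable to push...')
print(select_micro_agents('Please push my changes to GitHub', [github_agent]))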
Link of any specific issues this addresses