It would be very useful to be able to access the chat history for the multimodal agent, as is already possible with the pipeline agent.

To achieve this, it would make sense for the `messages` array in the `chatCtx` object already present in the `MultimodalAgent` class to be built up as user/agent transcripts and tool calls come in. Similar functionality is already available in OpenAI's beta TypeScript client for the Realtime API.
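As a rough illustration of the shape this could take (the types below are local stand-ins for illustration, not the actual agents-js API):

```typescript
// Minimal sketch, assuming a chatCtx that exposes a plain messages array.
// These types are hypothetical stand-ins for the library's own.
type ChatRole = 'user' | 'assistant' | 'tool';

interface ChatMessage {
  role: ChatRole;
  content: string;
}

interface ChatContext {
  messages: ChatMessage[];
}

// As each transcript or tool call is finalized, append it to the shared
// context so callers can read the full history at any point.
function recordMessage(
  chatCtx: ChatContext,
  role: ChatRole,
  content: string,
): void {
  chatCtx.messages.push({ role, content });
}
```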
MultimodalAgent and VoicePipelineAgent are undergoing a major change on the Python side that will hopefully be ported over in the next few weeks. Since it's a rebuild from the ground up, I don't see much reason to refactor the existing MultimodalAgent to sync chat history with the LLM, since that will be part of the new agent structure anyway. Thank you for your understanding.
Thanks for replying with the additional context.
I have a serviceable workaround where I listen to the emitted transcript events and build the chat history myself, roughly along the lines of the sketch below.
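Something like this (the event names and payload shapes here are assumptions for illustration, not the actual agents-js API):

```typescript
import { EventEmitter } from 'node:events';

interface HistoryEntry {
  role: 'user' | 'assistant';
  text: string;
}

// `agent` stands in for the MultimodalAgent instance, assumed to emit
// finalized transcripts as events; the event names are hypothetical.
function trackHistory(agent: EventEmitter): HistoryEntry[] {
  const history: HistoryEntry[] = [];
  agent.on('userTranscriptCommitted', (text: string) => {
    history.push({ role: 'user', text });
  });
  agent.on('agentTranscriptCommitted', (text: string) => {
    history.push({ role: 'assistant', text });
  });
  return history;
}
```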
Looking forward to what the rebuild will bring, though.
(Off-topic: will the new end-of-utterance (EOU) detector also be ported to the TS release along with these changes?)