Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Some issues related to support for an Agent declarative format #10237

Open
markwallace-microsoft opened this issue Jan 20, 2025 · 3 comments
Open
Assignees
Labels
agents Build Features planned for next Build conference experimental Associated with an experimental feature .NET Issue or Pull requests regarding .NET code question Further information is requested

Comments

@markwallace-microsoft
Copy link
Member

markwallace-microsoft commented Jan 20, 2025

1. Use KernelFunction return value

The default KernelFunctionTerminationStrategy.ResultParser is:
public Func<FunctionResult, bool> ResultParser { get; init; } = (_) => true;

If the KernelFunction returns false then this will override this return value and cause termination.

Why doesn't the default implementation change the function result?

2. Pass Agent and ChatHistory in KernelArguments

Consider this code

    protected sealed override async Task<bool> ShouldAgentTerminateAsync(Agent agent, IReadOnlyList<ChatMessageContent> history, CancellationToken cancellationToken = default)
    {
        history = await history.ReduceAsync(this.HistoryReducer, cancellationToken).ConfigureAwait(false);

        KernelArguments originalArguments = this.Arguments ?? [];
        KernelArguments arguments =
            new(originalArguments, originalArguments.ExecutionSettings?.ToDictionary(kvp => kvp.Key, kvp => kvp.Value))
            {
                { this.AgentVariableName, agent.Name ?? agent.Id },
                { this.HistoryVariableName, ChatMessageForPrompt.Format(history, this.EvaluateNameOnly) },
            };

        this.Logger.LogKernelFunctionTerminationStrategyInvokingFunction(nameof(ShouldAgentTerminateAsync), this.Function.PluginName, this.Function.Name);

        FunctionResult result = await this.Function.InvokeAsync(this.Kernel, arguments, cancellationToken).ConfigureAwait(false);

        this.Logger.LogKernelFunctionTerminationStrategyInvokedFunction(nameof(ShouldAgentTerminateAsync), this.Function.PluginName, this.Function.Name, result.ValueType);

        return this.ResultParser.Invoke(result);
    }

It would be more convenient to:

  1. Pass Agent instance rather than the name or id
  2. Pass the ChatHistory rather than a string (which cannot be parsed back into a ChatHistory
  3. Allow for KernelFunctions which has a signature which takes typed Agent and ChatHistory parameters

3. Should we have a standard way to Invoke an Agent

When we support declarative agents the follow will look something like this

  1. Load the Agent instance from a file which contains a declarative definition of an Agent
  2. Invoke the Agent instance optionally using a provided input and other arguments
  3. Capture the result which should contain everything needed to invoke the Agent instance again and maintain context

Here's some psuedo code:

Kernel kernel = ...
string agentYaml = EmbeddedResource.Read("MyAgent.yaml");
AgentFactory agentFactory = new AggregatorAgentFactory(
    new ChatCompletionFactory(),
    new OpenAIAssistantAgentFactory(),
    new XXXAgentFactory());
Agent agent = kernel.LoadAgentFromYaml(agentYaml); // What return type should we use to standardise?

// Should we have a unified pattern here?
ChatHistory chatHistory = new();
chatHistory.AddUserMessage(input);
await foreach (ChatMessageContent content in agent.InvokeAsync(chatHistory))
{
    chatHistory.Add(content);
}

At the moment the code to "Invoke" a ChatCompletionAgent versus a OpenAIAssistantAgent is different. Should we have an abstraction in the Agent framework that allows us to standardise?

@markwallace-microsoft markwallace-microsoft added triage .NET Issue or Pull requests regarding .NET code and removed triage labels Jan 20, 2025
@github-actions github-actions bot changed the title Why does the default KernelFunctionTerminationStrategy.ResultParser causes termination? .Net: Why does the default KernelFunctionTerminationStrategy.ResultParser causes termination? Jan 20, 2025
@markwallace-microsoft markwallace-microsoft added agents Build Features planned for next Build conference labels Jan 20, 2025
@markwallace-microsoft markwallace-microsoft changed the title .Net: Why does the default KernelFunctionTerminationStrategy.ResultParser causes termination? Changes to KernelFunctionTerminationStrategy.ResultParser Jan 20, 2025
@crickman crickman added experimental Associated with an experimental feature question Further information is requested labels Jan 21, 2025
@crickman
Copy link
Contributor

For the initial question (#1), I believe the original thinking was to avoid infinite chat when a developer failed to provide parser. One option is that the default implementation could be null and then raise a KernelException if not defined. If I remember correctly, at the time you were keen to avoid null defaults.

In response to your question: Why doesn't the default implementation change the function result?

The termination function could provide any result and isn't bound to forcing the model to respond in any particular fashion. (For example, I personally like using structured output and then parsing the JSON result.) Since there is insufficient visibility for the base implementation to evaluate the result, the default posture is currently to avoid a non-terminating chat (with the most conservative approach - always terminate).

As we're having this discussion, I do kind've like the approach of not having any default implemenation and raising an actionable error when no-result parser is defined. Thoughts?

@markwallace-microsoft markwallace-microsoft changed the title Changes to KernelFunctionTerminationStrategy.ResultParser Some issues related to support for an Agent declarative format Jan 22, 2025
@markwallace-microsoft
Copy link
Member Author

Going back a little. TerminationStrategy has the following abstract method protected abstract Task<bool> ShouldAgentTerminateAsync(Agent agent, IReadOnlyList<ChatMessageContent> history, CancellationToken cancellationToken);.

So if I am creating a KernelFunction to implement a TerminationStrategy I think we want to support two cases:

  1. A semantic function i.e. the developer provide a prompt which get's executed and then the result is parsed to extract a bool value.
  2. A native function i.e. the developer provides a native function (which could be Open API or gRPC) which get's executed and is expected to return a bool value directly. In this case the result parser can just convert the function result to a bool.

It is option #2 which I am looking at.

Note: We should consider changing the TerminationStrategy abstract method to protected abstract Task<bool> ShouldAgentTerminateAsync(Agent agent, ChatHistory chatHistory, CancellationToken cancellationToken);. ChatHistory is already a list of messages but has the added advantage of providing extensibility e.g. we could add a Metadata property in the future if needed.

@crickman
Copy link
Contributor

crickman commented Jan 22, 2025

KernelFunctionTerminationStrategy was originally designed for use with a prompt function only. The Function parameter is declared as KernelFunction since that is the only public type available.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
agents Build Features planned for next Build conference experimental Associated with an experimental feature .NET Issue or Pull requests regarding .NET code question Further information is requested
Projects
Status: No status
Development

No branches or pull requests

2 participants