LLM Meta - token limit definition #150
Labels: enhancement (New feature or request), help wanted (Extra attention is needed), question (Further information is requested)
Right now, the [`BaseLLM` class](/src/llms/base.ts) defines an abstract method called `meta` that provides meta information about a given model. The response interface (`LLMMeta`) defines a single property called `tokenLimit`.
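To make the current shape concrete, here is a minimal sketch of the interface and abstract method described above. The names come from the issue itself; the exact definitions in `/src/llms/base.ts` may differ, and `DummyLLM` is a hypothetical provider used only for illustration.

```typescript
// Shape described in the issue: one interface, one abstract accessor.
interface LLMMeta {
  tokenLimit: number; // the single limit the issue says is insufficient
}

abstract class BaseLLM {
  // Provides meta information about a given model.
  abstract meta(): Promise<LLMMeta>;
}

// Hypothetical provider, for illustration only.
class DummyLLM extends BaseLLM {
  async meta(): Promise<LLMMeta> {
    return { tokenLimit: 4096 };
  }
}
```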
The problem is that `tokenLimit` alone is typically not enough, as providers further subdivide limits into the following:

- `input` (max input tokens) - for WatsonX, this field is called `max_sequence_length`.
- `output` (max generated tokens) - for WatsonX, this field is called `max_output_tokens`.
Because `TokenMemory` behavior heavily depends on the `tokenLimit` value, we must be sure that we are not throwing messages out because we have retrieved the wrong value from an LLM provider.

The solution to this issue is to develop (figure out) a better approach that would play nicely with `TokenMemory` and other practical usages.

Relates to #159 (Granite context window limit)
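One possible direction, sketched below as an assumption rather than the adopted design: split the single `tokenLimit` into explicit input/output limits and let `TokenMemory` budget against the input limit when a provider reports one. The field names `inputTokenLimit`/`outputTokenLimit` and the helper `effectiveInputLimit` are hypothetical.

```typescript
// Hypothetical extension of LLMMeta (not the final design).
interface LLMMeta {
  tokenLimit: number;        // kept for backward compatibility
  inputTokenLimit?: number;  // e.g. WatsonX max_sequence_length
  outputTokenLimit?: number; // e.g. WatsonX max_output_tokens
}

// TokenMemory could prefer the explicit input limit and fall back to the
// combined tokenLimit for providers that report only a single number.
function effectiveInputLimit(meta: LLMMeta): number {
  return meta.inputTokenLimit ?? meta.tokenLimit;
}
```

Keeping `tokenLimit` required while the finer-grained fields stay optional would let existing providers keep working unchanged while WatsonX-style providers supply the more precise values.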