Replies: 4 comments
-
Think about what you are asking. If the output is text, splitting it into two responses (because of a content limit or whatever reason) makes sense. If the output is an object, what would splitting it into two responses even mean? Neither Instructor nor Pydantic expects an object to arrive in "pieces". I think that's the case, but I could be wrong. See #566
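To make the "pieces" point concrete, here's a minimal sketch (the `Report` schema is illustrative, assuming Pydantic v2): a structured response that gets cut off mid-generation is not valid JSON, so there is no partial object to validate.

```python
from pydantic import BaseModel, ValidationError

class Report(BaseModel):
    title: str
    findings: list[str]

# A response cut off mid-generation is not valid JSON, so there is no
# well-defined "first piece" of the object to validate.
truncated = '{"title": "Q3 summary", "findings": ["revenue up", "churn do'

try:
    Report.model_validate_json(truncated)
except ValidationError as exc:
    print(exc)  # Pydantic rejects the incomplete JSON outright
```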
-
Maybe I was unclear in my description: the issue in this case happens when producing a list or an iterable of smaller objects. If the number of those objects is high, you can still reach the output context window limit.
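For concreteness, here's a hedged sketch of that failure mode (the model name and `LineItem` schema are illustrative; `instructor.from_openai` follows recent instructor versions):

```python
import instructor
from openai import OpenAI
from pydantic import BaseModel

class LineItem(BaseModel):
    sku: str
    description: str

client = instructor.from_openai(OpenAI())

# Each LineItem is tiny, but asking for hundreds of them in one shot can
# exceed the output window; instructor then raises IncompleteOutputException.
items = client.chat.completions.create(
    model="gpt-4-turbo",  # illustrative model name
    response_model=list[LineItem],
    messages=[
        {"role": "user", "content": "Extract every line item from this 40-page invoice: ..."},
    ],
)
```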
-
this is tough, not sure what the best path forward is. can you provide a sketch? just API-level, what the DX looks like?
-
@jxnl the thought was to use a similar design to the retry logic, but check the streamed completion chunks for a "length" finish reason and re-prompt with the partial output so far (rough sketch below). Testing this manually, it works well for a few iterations, and then all the models start to repeat themselves regardless of the complexity of the requested objects. I'm a little puzzled by that result, and it has me questioning whether this would even be a useful feature.
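Something like this, as a rough sketch against the raw OpenAI SDK (the `max_continues` knob and the continuation prompt wording are hypothetical, not existing instructor API):

```python
from openai import OpenAI

client = OpenAI()

def stream_until_complete(messages, max_continues=3, **kwargs):
    """Accumulate streamed chunks; when the stream ends with
    finish_reason == "length", re-prompt the model to continue."""
    collected = []
    for _ in range(max_continues + 1):
        partial = []
        finish_reason = None
        stream = client.chat.completions.create(messages=messages, stream=True, **kwargs)
        for chunk in stream:
            choice = chunk.choices[0]
            if choice.delta.content:
                partial.append(choice.delta.content)
            if choice.finish_reason is not None:
                finish_reason = choice.finish_reason
        collected.extend(partial)
        if finish_reason != "length":
            # Complete output: join and hand off to JSON parsing / validation.
            return "".join(collected)
        # Cut off by the token limit: feed the partial output back and ask
        # the model to pick up exactly where it stopped.
        messages = messages + [
            {"role": "assistant", "content": "".join(partial)},
            {"role": "user", "content": "Continue exactly where you left off. Do not repeat earlier output."},
        ]
    raise RuntimeError("output still incomplete after max_continues continuations")

# e.g. stream_until_complete(messages, model="gpt-4-turbo")  # illustrative model name
```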
-
Hi, I've been running into an issue for a while now, and it looks like a good opportunity for an instructor feature.
Most of the supported models return up to 4096 output tokens, and if the model would return more than that and indicates it stopped due to "length" (token length), an `IncompleteOutputException` is rightfully thrown. For OpenAI and Anthropic (I haven't tested other models), it's possible to prompt the model to continue providing structured output where it left off, as long as the user includes the progress so far in the prompt. It may take one or more passes to retrieve the entire output. This happens when the context window is on the longer side, which a lot of newer models support, but the list of items to return is too long to fit into the output window.
I'd be glad to take a stab at a contribution here, but want to know whether it would be useful to provide an option to prompt the LLM API to "continue" up to x number of times, similar to the retry logic (sketch of the DX below).
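The option could mirror the existing `max_retries` knob; `max_continues` here is hypothetical, not an existing instructor parameter, and the model name and schema are illustrative:

```python
# Hypothetical DX, mirroring instructor's existing max_retries parameter.
items = client.chat.completions.create(
    model="gpt-4-turbo",            # illustrative model name
    response_model=list[LineItem],  # illustrative schema
    max_retries=3,                  # existing: re-ask after validation errors
    max_continues=2,                # hypothetical: re-prompt "continue" on length stops
    messages=[{"role": "user", "content": "Extract every line item from the invoice: ..."}],
)
```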