Best practices for Prompt + response_model #497

Mr-Ruben · 2024-03-08T20:50:44Z

Mr-Ruben
Mar 8, 2024

There are lots of literature about Prompts, but this is (in my opinion) a slightly different game.

With Instructor, you have the prompt and you have the class definition that you pass to the LLM/AI/openai call.

Both need to be aligned and one needs to consider the extra cost (in tokens) of processing the JSON schema together with generating the output that fulfills that class definition.

I am sure each one has found some tricks/caveats/quirks and sharing them would/may provide some light to others.

Mr-Ruben · 2024-03-08T21:09:51Z

Mr-Ruben
Mar 8, 2024
Author

Some of my findings are:

Less is more.

The more energy the AI has to spend processing the prompt / writing the response, the less it will have to do the job.

Once I get a working prompt, I run experiments trimming it (just like with neural networks) to find how much I can remove without affecting the results.

A class can be descriptive without consuming a lot of tokens

Dummy example ahead

# Instead of this
class ObjectDetails(BaseModel):
    parameter_blah: str
    type_of_item: Literal["A", "B", "None"]
    grand_sum: int

class Evidence(BaseModel):
    list_objects: List[ObjectDetails]

which can get very costly because it may have many instances

# You can achieve the same level of 'description' with

class OD(BaseModel):
    b: str = Field(description="parameter blah")
    t: Literal["A", "B", "None"] = Field(description="type of item")
    g: int = Field(description="grand sum")

class Evidence(BaseModel):
    lo: List[OD] = Field(description="list of objects")

The second example may have longer text on the input (json schema), but it will get N times shorter on the output.

As of 2024-03-08, openai output tokens are 3 times the cost of input tokens. But more important than that is the 'employ your energy on the task more than on writing the results' .

I'd like to see if someone has found the opposite to be true (maybe it does in some cases).

2 replies

Mr-Ruben Apr 2, 2024
Author

I asked to Perplexity:
openai api output is less quality with function call than with plain text

and this was the answer.

Based on the search results, it appears that there are some issues and challenges with using the OpenAI API's function call feature, which can result in lower quality outputs compared to using the API for plain text generation. Here are the key points:

Function calls can make the API responses repetitive and of lower quality compared to regular text generation. The search results suggest that the function call feature is still relatively new and may have some reliability issues. [1][3][4]
There seem to be challenges with properly formatting the function call input and handling the function call response in a way that avoids repetition of the output. The search results indicate that the recommended way to use function calls may not always work as expected. [3][4]
The order and placement of the function call information in the API request can impact the quality of the response. Putting the function call details in the system message versus the user message may produce different results. [3]
Using the ChatCompletion API endpoint (gpt-3.5-turbo) instead of the Completion API (text-davinci-003) may provide better results for function calls, but the former has its own set of formatting requirements. [2]
Overall, the search results suggest that the function call feature is still a work in progress, and users may need to experiment with different approaches to get reliable and high-quality outputs. Relying solely on plain text generation may be a more stable option for now. [1][3][4]

In summary, the OpenAI API's function call feature appears to have some limitations that can lead to lower quality outputs compared to plain text generation. Users should be aware of these challenges and be prepared to experiment to find the best approach for their use case. [1][2][3][4]

Citations:
[1] https://community.openai.com/t/bad-results-when-using-fine-tuned-model-with-function-calling/428832
[2] https://community.openai.com/t/text-davinci-003-api-is-way-weaker-than-normal-chatgpt/290393
[3] https://community.openai.com/t/function-calling-very-unreliable/268439
[4] https://community.openai.com/t/function-calling-should-not-repeat-outputted-content/351356
[5] https://community.openai.com/t/low-quality-text-production-in-api/401210

I asked that because I have a prompt that works very well with text output, but with function call (a very simple one) has some chance of doing mistakes (in the content, not the format).

And I thought, If I combine
Call1: Text output to produce the desired result
Call2: Function call to capture the output from Call1 with the desired format/specs.
I may be able to reduce the chances of mistakes.

Mr-Ruben Apr 6, 2024
Author

The last proposed approach has worked very well for me.

I use it to make the LLM chew the Input, chew it again, and then output the result in plain text, and on the last call, I capture the output with Instructor.

# Pseudocode
response1= call(prompt=p1)
response2= call(prompt= response1 + p2 )
...
responseN= call(prompt= responseN-1  + pN  , response_model=Output )

Only responseN is an Object, the previous ones are text.

This is a format that works very well for the output, that will be sent to the next call.

Output:
var name1: str #  Describe here what it is
var name2: bool #  Describe here what it is

This usually produces something like

var name1: blah blah .....
var name2: False

As you may have noticed, it is very similar to Pydantic.
The LLM understands it perfectly and doesn't output what is behind the #

Mr-Ruben · 2024-05-15T14:52:55Z

Mr-Ruben
May 15, 2024
Author

One rule I learnt interacting with the LLM is this:

"Don't ask for what cannot be done OR does not exist."

You may think, what an obvious observation ...

But it is easy to forget (for me) when creating prompt+output_templates.

What happens if you forget the rule?

The LLM will respond, with rubbish / overthinking / hallucination

As far as I know you cannot include conditions while creating the class.

### not possible ###
class Output(BaseModel):
    Answer_to_Q1: str
    if  condition:
         Score_of_Q1: conint(ge=0, le=5)

However you can build the class progressively.

Example:

class Output(BaseModel):
    Answer_to_Q1: str
    Score_of_Q1: conint(ge=0, le=5)

Which produces

print(Output.model_json_schema())

{'properties': {'Answer_to_Q1': {'title': 'Answer To Q1', 'type': 'string'},
                'Score_of_Q1': {'maximum': 5,
                                'minimum': 0,
                                'title': 'Score Of Q1',
                                'type': 'integer'}},
 'required': ['Answer_to_Q1', 'Score_of_Q1'],
 'title': 'Output',
 'type': 'object'}

And then, conditionally increase it:

if my_condition:
    class Output(Output):
        Answer_to_Q2: str
        Score_of_Q2: conint(ge=0, le=5)

Which produces:

print(Output.model_json_schema())

{'properties': {'Answer_to_Q1': {'title': 'Answer To Q1', 'type': 'string'},
                'Answer_to_Q2': {'title': 'Answer To Q2', 'type': 'string'},
                'Score_of_Q1': {'maximum': 5,
                                'minimum': 0,
                                'title': 'Score Of Q1',
                                'type': 'integer'},
                'Score_of_Q2': {'maximum': 5,
                                'minimum': 0,
                                'title': 'Score Of Q2',
                                'type': 'integer'}},
 'required': ['Answer_to_Q1', 'Score_of_Q1', 'Answer_to_Q2', 'Score_of_Q2'],
 'title': 'Output',
 'type': 'object'}

0 replies

Mr-Ruben · 2024-10-24T09:58:56Z

Mr-Ruben
Oct 24, 2024
Author

Did you know .... ?

That given a Pydantic model (example)

from pydantic import BaseModel
from typing import List, Literal

class UserProfile(BaseModel):
    username: str
    email: str
    is_active: bool
    roles: List[Literal['admin', 'editor', 'viewer']]
    tags: List[str]

you can ask the LLM to convert it to YAML

UserProfile:
  type: object
  properties:
    username: { type: string }
    email: { type: string }
    is_active: { type: boolean }
    roles: 
      type: array
      items: { type: string, enum: ['admin', 'editor', 'viewer'] }
    tags: 
      type: array
      items: { type: string }
  required: [username, email, is_active, roles, tags]

and then, at the end of your prompt you can use that (adding comments where required/beneficial)

within your prompt

Output and Format:

UserProfile:
  type: object
  properties:
    username: { type: string }  # User's unique identifier
    email: { type: string }
    is_active: { type: boolean }
    roles: # < SOME TEXT EXPLAINING YOUR REQUIREMENTS/CONDITIONS/ETC. >
      type: array
      items: { type: string, enum: ['admin', 'editor', 'viewer'] } # Allowed roles
    tags:  # < SOME TEXT EXPLAINING YOUR REQUIREMENTS/CONDITIONS/ETC. >
      type: array  
      items: { type: string }
  required: [username, email, is_active, roles, tags]

and the LLM will follow that format without having to use output_model | response_format | instructor.

Example output.

username: johndoe
email: [email protected]
is_active: true
roles: [admin, editor]
tags: [python, pydantic]

Why would this be useful?:

Quick prototyping
Less constrained / more free output generation
Lower amount of output tokens (less important)

Once your prompt design is solid ...
You can throw the initial_output back to the LLM (on a new call) using response_format + Instructor , and then you would obtain the Python object.

I know this would throw away the retry benefits of Instructor. But my experience has taught me that if the LLM/Model finds it hard to produce the answer, is because the prompt has problems (lack of examples, ambiguous instructions, etc.).

While troubleshooting, you can add a

additional_comments: { type: string } # Write here anything you want to say if you find problems completing the task, or you find it unclear, ambiguous, .... blah blah blah ....

to the output format so that the LLM can have a place to vent out its frustration with your prompt :o|

0 replies

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Best practices for Prompt + response_model #497

{{title}}

Replies: 3 comments 2 replies

{{title}}

{{title}}

{{editor}}'s edit

{{editor}}'s edit

{{title}}

{{title}}

{{title}}

Select a reply

Best practices for Prompt + response_model #497

Mr-Ruben Mar 8, 2024

Replies: 3 comments · 2 replies

Mr-Ruben Mar 8, 2024 Author

Mr-Ruben Apr 2, 2024 Author

Mr-Ruben Apr 6, 2024 Author

Mr-Ruben May 15, 2024 Author

Mr-Ruben Oct 24, 2024 Author

Output and Format:

Mr-Ruben
Mar 8, 2024

Replies: 3 comments 2 replies

Mr-Ruben
Mar 8, 2024
Author

Mr-Ruben Apr 2, 2024
Author

Mr-Ruben Apr 6, 2024
Author

Mr-Ruben
May 15, 2024
Author

Mr-Ruben
Oct 24, 2024
Author