Skip to content

Commit

Permalink
[BFCL] Relocate Formatting Instructions and Function Documentation to…
Browse files Browse the repository at this point in the history
… System Prompt (#593)

Previously, formatting instructions and function documentation were
included in the user prompt when interacting with models in prompting
mode. However, these details are better suited for the system prompt,
where they can more effectively guide the model's behaviour. This PR
updates the model prompting process by moving the formatting
instructions and function documentation to the system prompt, ensuring
they are appropriately positioned for optimal model performance.

This **will affect** the leaderboard score.

----

Also in this PR:

1. Update the model handlers to record the processed prompt/message and
tools (if FC mode) in the result file when inference. This helps to
identify if there are any issues with the pre-processing phase.
2. Fix 6 dataset issues: `irrelevance_49, live_irrelevance_157-18-1,
live_simple_79-40-0, live_parallel_0-0-0, live_parallel_4-1-0,
live_parallel_5-2-0`
  • Loading branch information
HuanzhiMao authored Aug 25, 2024
1 parent 984883d commit 3850c2b
Show file tree
Hide file tree
Showing 28 changed files with 169 additions and 186 deletions.
7 changes: 7 additions & 0 deletions berkeley-function-call-leaderboard/README.md
Original file line number Diff line number Diff line change
Expand Up @@ -232,6 +232,13 @@ Some companies have proposed some optimization strategies in their models' handl

## Changelog

* [August 22, 2024] [#593](https://github.com/ShishirPatil/gorilla/pull/593):
* Move formatting instructions and function documentation to system prompt instead of user prompt in the message section. All prompting models are affected.
* Bug fix in the dataset and possible answers.
* irrelevance: 1 affected
* live_irrelevance: 1 affected
* live_simple: 1 affected
* live_parallel: 3 affected
* [August 19, 2024] [#580](https://github.com/ShishirPatil/gorilla/pull/580): Introduce BFCL V2 Live dataset, featuring user-contributed live prompts and function docs. To read more about the composition and construction of this dataset, please refer to our [blog](https://gorilla.cs.berkeley.edu/blogs/12_bfcl_v2_live.html). All CLI commands have been updated to support the new dataset.
* [August 8, 2024] [#574](https://github.com/ShishirPatil/gorilla/pull/574): Set temperature to 0.001 for all models for consistency and reproducibility.
* [August 7, 2024] [#571](https://github.com/ShishirPatil/gorilla/pull/571): Support parallel inference for hosted models. User can specify the number of threads to use for parallel inference by setting the `--num-threads` flag. The default is 1, which means no parallel inference.
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -197,4 +197,4 @@
{"question": "What is the air quality index in London 2022/08/16?", "function": ""}
{"question": "Find the air quality index in San Diego at 12pm.", "function": ""}
{"question": "Calculate the required water daily intake for a person with weight 70 kg.", "function": ""}
{"question": "Find air quality index in San Jose for next three days.", "function": ""}
{"question": "Find air quality index in San Jose for next three days.", "function": ""}
Original file line number Diff line number Diff line change
Expand Up @@ -47,7 +47,7 @@
{"id": "irrelevance_46", "question": [{"role": "user", "content": "Give me the price of a Tesla model S in India."}], "function": [{"name": "get_exchange_rate", "description": "Retrieve the current exchange rate between two currencies.", "parameters": {"type": "dict", "properties": {"base_currency": {"type": "string", "description": "The base currency."}, "target_currency": {"type": "string", "description": "The target currency."}}, "required": ["base_currency", "target_currency"]}}]}
{"id": "irrelevance_47", "question": [{"role": "user", "content": "What are the ingredients for lasagna?"}], "function": [{"name": "flight_schedule.get_timings", "description": "Get the departure and arrival times for flights between two airports.", "parameters": {"type": "dict", "properties": {"from_airport": {"type": "string", "description": "The code for the departure airport."}, "to_airport": {"type": "string", "description": "The code for the destination airport."}, "date": {"type": "string", "description": "The departure date.", "default": "2000-12-3"}}, "required": ["from_airport", "to_airport"]}}]}
{"id": "irrelevance_48", "question": [{"role": "user", "content": "What is the current Gini Coefficient of USA?"}], "function": [{"name": "finance.fetchGDP", "description": "Fetch the GDP of the given country in the given year.", "parameters": {"type": "dict", "properties": {"country": {"type": "string", "description": "The name of the country to get the GDP of."}, "year": {"type": "integer", "description": "The year to get the GDP of."}, "format": {"type": "string", "description": "The format to return the data in. Default is 'USD'.", "enum": ["USD", "EUR", "GBP"]}}, "required": ["country", "year"]}}]}
{"id": "irrelevance_49", "question": [{"role": "user", "content": "What is the time difference between Los Angeles and Berlin?"}], "function": [{"name": "get_co-ordinate", "description": "Fetch geographical coordinates of a particular location.", "parameters": {"type": "dict", "properties": {"location": {"type": "string", "description": "The city name you want coordinates for."}}, "required": ["location"]}}]}
{"id": "irrelevance_49", "question": [{"role": "user", "content": "What is the time difference between Los Angeles and Berlin?"}], "function": [{"name": "get_co_ordinate", "description": "Fetch geographical coordinates of a particular location.", "parameters": {"type": "dict", "properties": {"location": {"type": "string", "description": "The city name you want coordinates for."}}, "required": ["location"]}}]}
{"id": "irrelevance_50", "question": [{"role": "user", "content": "Give me a selection of horror movies to watch on a Friday night."}], "function": [{"name": "convert_celsius_to_fahrenheit", "description": "Convert a temperature from Celsius to Fahrenheit.", "parameters": {"type": "dict", "properties": {"celsius": {"type": "float", "description": "The temperature in Celsius to be converted."}, "precision": {"type": "integer", "description": "The decimal precision for the conversion result.", "default": 2}}, "required": ["celsius"]}}]}
{"id": "irrelevance_51", "question": [{"role": "user", "content": "Calculate the fibonacci of number 20."}], "function": [{"name": "cryptocurrency_price", "description": "Get the current price of a specific cryptocurrency.", "parameters": {"type": "dict", "properties": {"currency": {"type": "string", "description": "The symbol of the cryptocurrency."}, "vs_currency": {"type": "string", "description": "The target currency to represent the price."}, "include_market_cap": {"type": "boolean", "default": "false", "description": "Optional field to include market capitalization."}}, "required": ["currency", "vs_currency"]}}]}
{"id": "irrelevance_52", "question": [{"role": "user", "content": "Convert the sentence 'Hello, how are you?' from English to French."}], "function": [{"name": "compress_file", "description": "Compresses a given file into a zip archive.", "parameters": {"type": "dict", "properties": {"file_path": {"type": "string", "description": "The path of the file to compress."}, "archive_name": {"type": "string", "description": "The name of the resulting archive."}, "compression_level": {"type": "integer", "description": "The level of compression to apply (from 0 to 9). Default is 5."}}, "required": ["file_path", "archive_name"]}}]}
Expand Down
Loading

0 comments on commit 3850c2b

Please sign in to comment.