v0.1.0
What's Changed
- Bump anthropic from 0.28.1 to 0.30.0 by @dependabot in #127
- Add a test case for a single dimension evaluation by @bugsz in #123
- Change the input type of the ReachGoalLLMEvaluator by @bugsz in #129
- Support Azure API for agent and env models; Fix updates on langchain V0.2 by @ruiyiw in #132
- Improve the benchmark by evaluating multiple models and display the results by @bugsz in #126
- Bump litellm from 1.23.16 to 1.40.13 by @dependabot in #125
- background as variable by @XuhuiZhou in #137
- [Automated] Merge release into main by @ProKil in #146
- Bump anthropic from 0.30.1 to 0.31.2 by @dependabot in #139
- finished custom url call by @Jasonqi146 in #142
- Autofix precommit for PRs by @ProKil in #143
- Enabled custom API key for custom OpenAI API by @Jasonqi146 in #154
- Bump types-setuptools from 70.3.0.20240710 to 71.1.0.20240724 by @dependabot in #153
- Bump pytest from 8.2.2 to 8.3.2 by @dependabot in #157
- fix callback handler by @XuhuiZhou in #156
- Fix CI on main by @ProKil in #167
- Bump pydantic from 1.10.12 to 1.10.17 by @dependabot in #168
- Improve the docs and sync with dev by @XuhuiZhou in #166
- Bump mypy from 1.10.1 to 1.11.1 by @dependabot in #162
- Remove all_pks from Sampler.sample by @ProKil in #160
- Bump types-setuptools from 71.1.0.20240806 to 72.2.0.20240821 by @dependabot in #173
- structured output by @XuhuiZhou in #175
- Improvements on benchmark display and usage by @bugsz in #135
- [Automated] Merge release into main by @ProKil in #177
- fix LLM_Name by @XuhuiZhou in #179
- upgrade default model for handling badly formatted outputs to gpt-4o-mini by @yangalan123 in #183
- [Automated] Merge release into main by @ProKil in #181
- Bump pre-commit from 3.7.1 to 3.8.0 by @dependabot in #161
- Add bad_output_process_model option and use_fixed_model_version option for all generation methods, to keep future OpenAI API changes from breaking Sotopia runs, by @yangalan123 in #196
New Contributors
- @Jasonqi146 made their first contribution in #142
- @yangalan123 made their first contribution in #183
Full Changelog: v0.0.11...v0.1.0