v5.6.0

@jamesbraza released this 02 Dec 21:53
· 47 commits to main since this release
0130233

Highlights

This release consists mainly of bug fixes:

  • Accommodates breaking changes in upstream dependencies (e.g. Pydantic 2.10, aviary 0.10.1)
  • Makes GradablePaperQAEnvironment's evaluations robust to an empty answer or multiple answers

Because #726 introduces Complete.NO_ANSWER_PHRASE, which will impact system performance, we were asked to treat this release as a minor version bump.

What's Changed

  • Fixed settings session into EnvironmentState, and suppressing PyMuPDF derived DeprecationWarning by @jamesbraza in #713
  • Adding assertion gather_evidence doesn't populate session.answer by @jamesbraza in #716
  • Lock file maintenance by @renovate in #715
  • Fixes gather_with_concurrency typing by @maykcaldas in #714
  • Latest tooling dependencies by @jamesbraza in #719
  • Lock file maintenance by @renovate in #718
  • Fixed EVAL_PROMPT_TEMPLATE to handle empty string or multiple match answers by @jamesbraza in #724
  • Address missing GenerateAnswer in trajectories, no answers after Complete tools, and better history by @mskarlin in #726
  • Pulling in latest aviary for concurrency rename by @jamesbraza in #728
  • Pulling in latest aviary for dependencies fix, and retrying flaky test_propagate_options more by @jamesbraza in #729
  • Pulling in latest ldp for Callback.before_rollout by @jamesbraza in #734
  • Documenting why we don't handle evaluation failures in GradablePaperQAEnvironment.step by @jamesbraza in #738
  • Created LitQAEvaluation.calculate_accuracy_precision utility by @jamesbraza in #733
  • Refreshed test cassettes, fixed flaky test test_search, and fixed test type ignores by @jamesbraza in #739
  • Unpins pydantic >2.10.2 requirement, removes TYPE_CHECKING by @nadolskit in #725
  • Lock file maintenance by @renovate in #737
  • Alternative maybe is text by @loesinghaus in #717
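
For readers unfamiliar with the accuracy and precision metrics mentioned in #733, here is a minimal sketch of how such a pair could be computed from graded answers. It is not the library's implementation: the Evaluation enum and the accuracy_precision function are hypothetical stand-ins, and the definitions used (accuracy as correct over all questions, precision as correct over questions where an answer was attempted) are assumptions.

```python
from collections.abc import Sequence
from enum import Enum


class Evaluation(Enum):
    """Hypothetical stand-in for the grade assigned to one answer."""

    CORRECT = "correct"
    INCORRECT = "incorrect"
    UNSURE = "unsure"  # The agent declined to commit to an answer


def accuracy_precision(evaluations: Sequence[Evaluation]) -> tuple[float, float]:
    """Return (accuracy, precision) under the assumed definitions above."""
    if not evaluations:
        raise ValueError("Need at least one evaluation.")
    num_correct = sum(e is Evaluation.CORRECT for e in evaluations)
    # Only answers that were actually attempted count toward precision
    num_answered = sum(e is not Evaluation.UNSURE for e in evaluations)
    accuracy = num_correct / len(evaluations)
    # Assumption: report 0.0 precision when nothing was attempted
    precision = num_correct / num_answered if num_answered else 0.0
    return accuracy, precision


grades = [
    Evaluation.CORRECT,
    Evaluation.CORRECT,
    Evaluation.INCORRECT,
    Evaluation.UNSURE,
]
accuracy, precision = accuracy_precision(grades)
print(f"accuracy={accuracy:.3f}, precision={precision:.3f}")
```

Under these definitions, declining to answer lowers accuracy but leaves precision untouched, which is why the two numbers are usually reported together.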

Full Changelog: v5.5.0...v5.6.0