Release v5.6.0 · Future-House/paper-qa

Highlights

This release consists mainly of bug fixes:

Adapts to breaking changes in upstream dependencies (e.g. Pydantic 2.10, aviary 0.10.1)
Makes GradablePaperQAEnvironment's evaluations robust to an empty answer or multiple answers

Because #726 introduces Complete.NO_ANSWER_PHRASE, which will impact system performance, it was requested that we treat this release as a minor version bump.

What's Changed

Fixed settings session into EnvironmentState, and suppressing PyMuPDF derived DeprecationWarning by @jamesbraza in #713
Adding assertion gather_evidence doesn't populate session.answer by @jamesbraza in #716
Lock file maintenance by @renovate in #715
Fixes gather_with_concurrency typing by @maykcaldas in #714
Latest tooling dependencies by @jamesbraza in #719
Lock file maintenance by @renovate in #718
Fixed EVAL_PROMPT_TEMPLATE to handle empty string or multiple match answers by @jamesbraza in #724
Address missing GenerateAnswer in trajectories, no answers after Complete tools, and better history by @mskarlin in #726
Pulling in latest aviary for concurrency rename by @jamesbraza in #728
Pulling in latest aviary for dependencies fix, and retrying flaky test_propagate_options more by @jamesbraza in #729
Pulling in latest ldp for Callback.before_rollout by @jamesbraza in #734
Documenting why we don't handle evaluation failures in GradablePaperQAEnvironment.step by @jamesbraza in #738
Created LitQAEvaluation.calculate_accuracy_precision utility by @jamesbraza in #733
Refreshed test cassettes, fixed flaky test test_search, and fixed test type ignores by @jamesbraza in #739
Unpins pydantic >2.10.2 requirement, removes TYPE_CHECKING by @nadolskit in #725
Lock file maintenance by @renovate in #737
Alternative maybe is text by @loesinghaus in #717
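
For readers curious what an accuracy/precision utility like the one added in #733 might compute, here is a minimal standalone sketch. It does not import paper-qa: the `Evaluation` enum and `accuracy_precision` function are hypothetical stand-ins. The definitions used (accuracy as correct over all questions, precision as correct over questions that received a definite answer) are assumptions, not taken from the release notes.

```python
from collections.abc import Sequence
from enum import Enum


class Evaluation(Enum):
    """Hypothetical three-way grade for a multiple-choice answer."""

    CORRECT = "correct"
    INCORRECT = "incorrect"
    UNSURE = "unsure"


def accuracy_precision(evaluations: Sequence[Evaluation]) -> tuple[float, float]:
    """Return (accuracy, precision) for a batch of graded answers.

    Assumed definitions:
      accuracy  = correct / total
      precision = correct / (correct + incorrect), i.e. unsure answers are excluded
    """
    if not evaluations:
        raise ValueError("Need at least one evaluation.")
    num_correct = sum(e is Evaluation.CORRECT for e in evaluations)
    num_answered = sum(e is not Evaluation.UNSURE for e in evaluations)
    accuracy = num_correct / len(evaluations)
    # Guard against the case where every answer was unsure.
    precision = num_correct / num_answered if num_answered else 0.0
    return accuracy, precision
```

Under these assumed definitions, declining to answer lowers accuracy but leaves precision untouched, so the two numbers together separate wrong answers from abstentions.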

New Contributors

@maykcaldas made their first contribution in #714
@loesinghaus made their first contribution in #717

Full Changelog: v5.5.0...v5.6.0
