2.5.0 (2024-01-16)
2.4.1 (2024-01-11)
- traces: prevent missing key exception when extracting invocation parameters in llama-index (#2076) (5cc9560)
2.4.0 (2024-01-10)
- add persistence for span evaluations (#2021) (589d482)
- ui: add filter condition snippets (#2049) (567fa54)
- Handle missing vertex candidates (#2055) (1d0475a)
- OpenAI clients are not cleaned up after calls to `llm_classify` (#2068) (3233d56)
- traces: remove NaN from `log_evaluations` (#2056) (df9ed5c) (see the sketch below)
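
The span-evaluation persistence (#2021) and NaN fix (#2056) above both touch the evaluation-logging path. A minimal sketch of that API, assuming Phoenix is running locally; the span IDs and eval name are illustrative:

```python
import pandas as pd
import phoenix as px
from phoenix.trace import SpanEvaluations

# One row per span: the index holds span IDs (Phoenix expects it to be
# named "context.span_id"), and label/score columns hold the results.
evals_df = pd.DataFrame(
    {"label": ["relevant", "irrelevant"], "score": [1.0, 0.0]},
    index=pd.Index(["<span-id-1>", "<span-id-2>"], name="context.span_id"),
)

# Attach the evaluations to their spans in the running Phoenix app.
px.Client().log_evaluations(SpanEvaluations(eval_name="relevance", dataframe=evals_df))
```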
2.3.0 (2024-01-08)
- Add demo link and getting-started examples (GITBOOK-396) (e987315)
- Add Evaluating Traces Section (GITBOOK-386) (7d72029)
- Add evaluations section for results (GITBOOK-387) (2e74be0)
- Add final thoughts to evaluation (GITBOOK-405) (20eab16)
- add import statement (GITBOOK-408) (23247d7)
- add link (GITBOOK-403) (0be280a)
- eval concepts typo (GITBOOK-394) (7c80d4b)
- eval concepts typos (GITBOOK-393) (62bc99f)
- evaluation concepts typo fix (GITBOOK-390) (2cbc1dc)
- Extract Data from Spans (GITBOOK-383) (440f530)
- fix broken section link (GITBOOK-409) (fee537b)
- fix typos (GITBOOK-391) (c8f5a55)
- fix typos (GITBOOK-402) (3cd973d)
- fix typos (GITBOOK-406) (eaa9bea)
- fix typos (GITBOOK-407) (cad4820)
- Initial draft of evaluation core concept (GITBOOK-385) (67369cf)
- Log Evaluations (GITBOOK-389) (369d79d)
- No subject (GITBOOK-399) (94df884)
- Re-arrange nav (GITBOOK-398) (54a87eb)
- Remove the word golden, simplify title (GITBOOK-395) (a2233b2)
- simplify concepts (GITBOOK-384) (c38f6c2)
- Simplify examples page (GITBOOK-400) (6144158)
- Trace Evaluations Section (GITBOOK-388) (2ffa800)
- Update SECURITY.md (#2029) (363e891)
2.2.1 (2023-12-28)
- Do not retry if eval was successful when using SyncExecutor (#2016) (a869190)
- ensure float values are properly encoded by otel tracer (#2024) (b12a894)
- ensure llamaindex spans are correctly encoded (#2023) (3ca6262)
- Use separate versioning file (#2020) (f38eedf)
2.2.0 (2023-12-22)
- Add support for Google's Gemini models via the Vertex Python SDK (#2008) (caf826c)
- Support the first-party Anthropic Python SDK (#2004) (a323283) (sketch below)
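
A hedged sketch of how these two additions surface in the evals API. The class names mirror the entries above, but the constructor argument names are assumptions and may differ between releases:

```python
from phoenix.experimental.evals import AnthropicModel, GeminiModel

# First-party Anthropic SDK (#2004); the model string is illustrative.
anthropic_model = AnthropicModel(model="claude-2.1")

# Gemini through the Vertex AI Python SDK (#2008); assumes Vertex
# credentials are already configured in the environment.
gemini_model = GeminiModel(model="gemini-pro")
```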
2.1.0 (2023-12-21)
- instantiate evaluators by criteria (#1983) (9c72616)
- support function calling for `run_evals` (#1978) (8be325c) (example below)
- traces: add `v1/traces` HTTP endpoint to handle `ExportTraceServiceRequest` (#1968) (3c94dea)
- traces: add retrieval summary to header (#2006) (8af0582)
- traces: evaluation summary on the header (#2000) (965beb0)
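
A minimal sketch of `run_evals`, the entry point the function-calling support above targets. The dataframe columns follow the input/reference/output convention the built-in evaluator templates expect:

```python
import pandas as pd
from phoenix.experimental.evals import (
    HallucinationEvaluator,
    OpenAIModel,
    QAEvaluator,
    run_evals,
)

queries_df = pd.DataFrame(
    {
        "input": ["What is Phoenix?"],
        "reference": ["Phoenix is an open-source LLM observability library."],
        "output": ["Phoenix is an observability library for LLM apps."],
    }
)

model = OpenAIModel(model_name="gpt-4")  # uses function calling when available
hallucination_df, qa_df = run_evals(
    dataframe=queries_df,
    evaluators=[HallucinationEvaluator(model), QAEvaluator(model)],
    provide_explanation=True,
)
```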
2.0.0 (2023-12-20)
- Add async submission to `llm_generate` (#1965) (5999133)
- add support for explanations to `run_evals` (#1975) (5143529)
- evaluation column selectors (#1932) (ed07809)
- openai streaming tool calls (#1936) (6dd14cf)
- support running multiple evals at once (#1742) (79d4473)
- Update `llm_classify` and `llm_generate` interfaces (#1974) (9fd35a1) (see the sketch at the end of this section)
- Add lock failsafe (#1956) (9ddbd9c)
- llama-index extra (#1958) (d9b68eb)
- LlamaIndex compatibility fix (#1940) (052349d)
- Model stability enhancements (#1939) (dca42e0)
- traces: span summary root span filter (#1981) (d286f07)
- Add anyscale tutorial (#1941) (e47c8d0)
- autogen link (#1946) (c3fb4ce)
- Clear anyscale tutorial outputs (#1942) (63580a6)
- RAG Evaluation (GITBOOK-378) (429f537)
- sync (#1947) (c72bbac)
- traces: autogen tracing tutorial (#1945) (0fd02ff)
- update rag eval notebook (#1950) (d06b8b7)
- update rag evals docs (#1954) (aa6f36a)
- Using Phoenix with HuggingFace LLMs: getting started (#1916) (b446972)
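
As a sketch of the reworked 2.0 interface (the breaking change above): `llm_classify` takes a dataframe, a model, a prompt template, and output rails, and `provide_explanation` (#1975) adds an explanation column. The template here is a stand-in, not one of Phoenix's built-in prompts:

```python
import pandas as pd
from phoenix.experimental.evals import OpenAIModel, llm_classify

df = pd.DataFrame(
    {
        "input": ["What is RAG?"],
        "reference": ["RAG stands for retrieval-augmented generation."],
    }
)

result = llm_classify(
    dataframe=df,
    model=OpenAIModel(model_name="gpt-4"),
    template=(
        "Is the reference relevant to the input?\n"
        "input: {input}\nreference: {reference}\n"
        "Answer with exactly one word: relevant or irrelevant."
    ),
    rails=["relevant", "irrelevant"],  # outputs are snapped to these labels
    provide_explanation=True,
)
```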
1.9.0 (2023-12-11)
1.8.0 (2023-12-10)
- embeddings: audio support (#1920) (61cc550) (sketch below)
- openai streaming function call message support (#1914) (25279ca)
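
A hedged sketch of what the audio support looks like through Phoenix's embedding schema, where `link_to_data_column_name` points each vector at its media file; the column names and URL are illustrative:

```python
import pandas as pd
import phoenix as px

df = pd.DataFrame(
    {
        "audio_url": ["https://example.com/clip.wav"],
        "embedding": [[0.1, 0.2, 0.3]],
    }
)

schema = px.Schema(
    embedding_feature_column_names={
        "audio_embedding": px.EmbeddingColumnNames(
            vector_column_name="embedding",
            link_to_data_column_name="audio_url",  # linked media renders in the UI
        )
    }
)

px.launch_app(px.Dataset(df, schema))
```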
1.7.0 (2023-12-09)
- Instrument LlamaIndex streaming responses (#1901) (f46396e)
- openai async streaming instrumentation (#1900) (06d643b)
- traces: query spans into dataframes (#1910) (6b51435) (sketch below)
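
A minimal sketch of the span-query DSL behind the dataframe entry above; the filter and selected field are illustrative:

```python
import phoenix as px
from phoenix.trace.dsl import SpanQuery

# Pull retriever spans into a dataframe, keeping just their input value.
query = SpanQuery().where("span_kind == 'RETRIEVER'").select(input="input.value")
retriever_df = px.Client().query_spans(query)
```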
1.6.0 (2023-12-08)
- openai streaming spans show up in the ui (#1888) (ffa1d41)
- support instrumentation for openai synchronous streaming (#1879) (b6e8c73) (see the sketch after this list)
- traces: display document retrieval metrics on trace details (#1902) (0c35229)
- traces: filterable span and document evaluation summaries (#1880) (f90919c)
- traces: graphql query for document evaluation summary (#1874) (8a6a063)
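
A minimal sketch of turning the streaming instrumentation on, assuming a locally launched app and `OPENAI_API_KEY` in the environment:

```python
import phoenix as px
from openai import OpenAI
from phoenix.trace.openai import OpenAIInstrumentor

px.launch_app()
OpenAIInstrumentor().instrument()  # patches the OpenAI client to emit spans

client = OpenAI()
stream = client.chat.completions.create(
    model="gpt-4",
    messages=[{"role": "user", "content": "Hello"}],
    stream=True,  # streamed calls now produce spans in the UI (#1879, #1888)
)
for chunk in stream:
    pass  # consume the stream; the span completes when iteration finishes
```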
1.5.1 (2023-12-06)
1.5.0 (2023-12-06)
- evals: Human vs AI Evals (#1850) (e96bd27)
- semantic conventions for `tool_calls` array in OpenAI ChatCompletion messages (#1837) (c079f00)
- support asynchronous chat completions for openai instrumentation (#1849) (f066e10)
- traces: document retrieval metrics based on document evaluation scores (#1826) (3dfb7bd)
- traces: document retrieval metrics on trace / span tables (#1873) (733d233)
- traces: evaluation annotations on traces for associating spans with eval metrics (#1693) (a218a65)
- traces: server-side span filter by evaluation result values (#1858) (6b05f96)
- traces: span evaluation summary (aggregation metrics of scores and labels) (#1846) (5c5c3d6)
- allow streaming response to be iterated by user (#1862) (76a2443)
- save trace dataset to disc (#1798) (278d344) (sketch below)
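
A hedged sketch of persisting traces (#1798). The `save`/`load` round trip reflects my reading of the `TraceDataset` API; treat the method names and the client call as assumptions for this release:

```python
import phoenix as px
from phoenix.trace.trace_dataset import TraceDataset

# Snapshot the current spans and write them to Phoenix's working directory.
ds = TraceDataset(px.Client().get_spans_dataframe())
dataset_id = ds.save()

# Later (e.g. in a fresh session), reload and relaunch from disc.
px.launch_app(trace=TraceDataset.load(dataset_id))
```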
1.4.0 (2023-11-30)
- propagate error status codes to parent spans for improved visibility into trace exceptions (#1824) (1a234e9)
1.3.0 (2023-11-30)
- Add OpenAI Rate limiting (#1805) (115e044)
- evals: show span evaluations in trace details slideout (#1810) (4f0e4dc)
- evaluation ingestion (no user-facing feature is added) (#1764) (7c4039b)
- feature flags context (#1802) (a2732cd)
- Implement asynchronous submission for OpenAI evals (#1754) (30c011d)
- reference link correctness evaluation prompt template (#1771) (bf731df)
- traces: configurable endpoint for the exporter (#1795) (8515763) (see the sketch at the end of this section)
- traces: display document evaluations alongside the document (#1823) (2ca3613)
- traces: server-side sort of spans by evaluation result (score or label) (#1812) (d139693)
- traces: show all evaluations in the table (#1819) (2b27333)
- traces: Trace page header with latency, status, and evaluations (#1831) (1d88efd)
- enhance llama-index callback support for exception events (#1814) (8db01df)
- pin llama-index temporarily (#1806) (d6aa76e)
- remove sklearn metrics not available in sagemaker (#1791) (20ab6e5)
- traces: convert (non-list) iterables to lists during protobuf construction due to potential presence of ndarray when reading from parquet files (#1801) (ca72747)
- traces: make column selector sync'd between tabs (#1816) (125431a)
- Environment documentation (GITBOOK-370) (dbbb0a7)
- Explanations (GITBOOK-371) (5f33da3)
- No subject (GITBOOK-369) (656b5c0)
- sync for 1.3 (#1833) (4d01e83)
- update default value of variable in `run_relevance_eval` (GITBOOK-368) (d5bcaf8)
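
A sketch of the configurable exporter endpoint (#1795, referenced above), using the `PHOENIX_COLLECTOR_ENDPOINT` environment variable Phoenix reads; the URL is illustrative:

```python
import os

# Must be set before instrumentation is initialized.
os.environ["PHOENIX_COLLECTOR_ENDPOINT"] = "http://phoenix.internal:6006"

from phoenix.trace.openai import OpenAIInstrumentor

OpenAIInstrumentor().instrument()  # spans now export to the endpoint above
```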
1.2.1 (2023-11-18)
- make the app launchable when `nest_asyncio` is applied (#1783) (f9d5085)
- restore process session (#1781) (34a32c3)
1.2.0 (2023-11-17)
- Add dockerfile (#1761) (4fa8929)
- evals: return partial results when llm function is interrupted (#1755) (1fb0849)
- LiteLLM model support for evals (#1675) (5f2a999) (sketch below)
- sagemaker notebook support (#1772) (2c0ffbc)
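
A hedged sketch of routing evals through LiteLLM (#1675). The class name comes from the entry; the constructor argument and the `ollama/...` model string follow LiteLLM's conventions and are assumptions here:

```python
import pandas as pd
from phoenix.experimental.evals import LiteLLMModel, llm_classify

df = pd.DataFrame({"text": ["The sky is green."]})

result = llm_classify(
    dataframe=df,
    model=LiteLLMModel(model_name="ollama/llama2"),  # argument name is an assumption
    template="Is the following statement plausible? {text} Answer plausible or implausible.",
    rails=["plausible", "implausible"],
)
```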
1.1.1 (2023-11-16)
1.1.0 (2023-11-14)
- Evals with explanations (#1699) (2db8141)
- evals: add an `output_parser` to `llm_generate` (#1736) (6408dda) (sketch below)
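
A sketch of the `output_parser` hook: a callable receiving each raw response and its row index and returning a dict whose keys become columns of the generated dataframe. The JSON-oriented template and column names are illustrative:

```python
import json

import pandas as pd
from phoenix.experimental.evals import OpenAIModel, llm_generate


def output_parser(response: str, row_index: int) -> dict:
    # Parse the model's JSON reply; surface parse failures as a column.
    try:
        return json.loads(response)
    except json.JSONDecodeError as e:
        return {"__error__": str(e)}


df = pd.DataFrame({"query": ["What is Phoenix?"]})

generated = llm_generate(
    dataframe=df,
    model=OpenAIModel(model_name="gpt-4"),
    template='Reply as JSON with keys "answer" and "confidence". Question: {query}',
    output_parser=output_parser,
)
```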
1.0.0 (2023-11-10)
- models: OpenAI 1.0 (#1716)
0.1.1 (2023-11-09)
0.1.0 (2023-11-08)
- add long-context evaluators, including map reduce and refine patterns (#1710) (0c3b105)
- traces: span table column visibility controls (#1687) (559852f)