Important Change: Unitxt is Faster!
To improve Unitxt’s performance, we've made several optimizations:
- Operator Acceleration: Many operators have been sped up by removing unnecessary deep copying from their code, improving runtime efficiency (see the sketch below).
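As a minimal illustrative sketch (not Unitxt's actual operator code) of why this helps: deep-copying a nested instance costs far more than a shallow copy, so dropping unneeded deep copies pays off on every instance in a stream:

```python
# Illustrative only: compare deep vs. shallow copy cost for a dict-shaped
# instance, similar in spirit to what operators pass along a data stream.
import copy
import timeit

instance = {"inputs": {"question": "q" * 100}, "outputs": {"answer": "a" * 100}}

deep = timeit.timeit(lambda: copy.deepcopy(instance), number=100_000)
shallow = timeit.timeit(lambda: dict(instance), number=100_000)
print(f"deepcopy: {deep:.2f}s, shallow copy: {shallow:.2f}s")
```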
- Caching Hugging Face Datasets: We added the option to cache Hugging Face datasets in loaders, which can prevent redundant loading operations. To enable this, do one of the following (a usage sketch follows these options):
- Set it globally in code:

```python
import unitxt

unitxt.settings.disable_hf_datasets_cache = False
```
- Use the settings context:

```python
from unitxt import settings

with settings.context(disable_hf_datasets_cache=False):
    ...  # your code
```
- Or set the environment variable:

```bash
export UNITXT_DISABLE_HF_DATASETS_CACHE=False
```
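As a usage sketch (reusing the card and template from the examples below; exact caching behavior depends on the loader), a repeated load of the same dataset can then skip the redundant loading step:

```python
from unitxt import load_dataset, settings

with settings.context(disable_hf_datasets_cache=False):
    # First load fetches and caches the underlying Hugging Face dataset.
    dataset = load_dataset(
        card="cards.doc_vqa.lmms_eval",
        template="templates.qa.with_context.title",
        format="formats.models.llava_interleave",
        loader_limit=300,
    )
    # An identical second load can be served from the cache.
    dataset = load_dataset(
        card="cards.doc_vqa.lmms_eval",
        template="templates.qa.with_context.title",
        format="formats.models.llava_interleave",
        loader_limit=300,
    )
```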
- Eager Execution Mode: Run Unitxt without streaming, which can be faster in certain scenarios. Enable eager execution using the environment variable or directly in code:

```python
import unitxt

unitxt.settings.use_eager_execution = True
```

or with the settings context:

```python
from unitxt import settings

with settings.context(use_eager_execution=True):
    ...  # your code
```
- Partial Stream Loading: This feature lets you load only the necessary data instances, avoiding full dataset loads when they are not required. Here's an example:

```python
from unitxt import load_dataset

dataset = load_dataset(
    card="cards.doc_vqa.lmms_eval",
    template="templates.qa.with_context.title",
    format="formats.models.llava_interleave",
    loader_limit=300,
    streaming=True,
)

print(next(iter(dataset["test"])))  # Loads only the first instance
```
Complete Example: Combining the optimizations above can lead to nearly 1000x faster dataset loading:

```python
from unitxt import load_dataset, settings

with settings.context(
    disable_hf_datasets_cache=False,
    use_eager_execution=True,
):
    dataset = load_dataset(
        card="cards.doc_vqa.lmms_eval",
        template="templates.qa.with_context.title",
        format="formats.models.llava_interleave",
        loader_limit=300,
        streaming=True,
    )
    print(next(iter(dataset["test"])))  # Loads only the first instance
```
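To verify the speedup on your own setup (an illustrative sketch; absolute timings vary by hardware and network), you can wrap the optimized load in a simple timer:

```python
import time

from unitxt import load_dataset, settings

start = time.perf_counter()
with settings.context(
    disable_hf_datasets_cache=False,
    use_eager_execution=True,
):
    dataset = load_dataset(
        card="cards.doc_vqa.lmms_eval",
        template="templates.qa.with_context.title",
        format="formats.models.llava_interleave",
        loader_limit=300,
        streaming=True,
    )
    first_instance = next(iter(dataset["test"]))
print(f"Loaded the first instance in {time.perf_counter() - start:.2f}s")
```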
- Execution Speed Tracking: A GitHub action has been added to monitor Unitxt’s execution speed in new pull requests, helping ensure that optimizations are maintained.
Summary
This release is focused on accelerating performance in Unitxt by introducing several key optimizations. Operator efficiency has been enhanced by removing deep copies, making operations faster. Users can now enable dataset caching for Hugging Face datasets to prevent redundant loading, configured directly in code or through environment variables. An optional eager execution mode has been added, bypassing streaming to increase speed in certain scenarios. Additionally, partial stream loading allows selective instance loading, reducing memory usage and improving response times. To maintain these improvements, a new GitHub action now monitors Unitxt’s execution speed in pull requests, ensuring consistent performance across updates.
All Changes
- Enhancements to inference engines by @lilacheden in #1243
- add post processor to convert log probs dictionary to probabilities of a specific class by @lilacheden in #1247
- CI for metrics other than main + Bugfix in RetrievalAtK by @lilacheden in #1246
- Add huggingface cache disabling option to unitxt settings by @elronbandel in #1250
- Make F1Strings faster by @elronbandel in #1248
- Fix duplicate column deletion bug in pandas serializer by @elronbandel in #1249
- revived no_deep just to compare performance by @dafnapension in #1254
- fixed scigen post-processor by @csrajmohan in #1253
- Add prediction length metric by @perlitz in #1252
- Fix faithfulness confidence intervals by @matanor in #1257
- Allow role names to be capitalized in SerializeOpenAiFormatDialog by @yoavkatz in #1259
- Accelerate image example 1000X by @elronbandel in #1258
- Fix the empty few-shot target issue when using produce() by @marukaz in #1266
- fix postprocessors in turl_col_type taskcard by @csrajmohan in #1261
- Fix answer correctness confidence intervals by @matanor in #1256
- add BlueBench as a benchmark to the catalog by @shachardon in #1262
- Fix MultipleSourceLoader documentation by @marukaz in #1270
- Ignore unitxt-venv by @marukaz in #1269
- Add mmmu by @elronbandel in #1271
- A fix for a bug in metric pipeline by @elronbandel in #1268
- Added Tablebench taskcard by @csrajmohan in #1273
- Fix missing deep copy in MapInstanceValues by @yoavkatz in #1267
- Add stream name to generation of dataset by @elronbandel in #1276
- Fix demos pool inference by @elronbandel in #1278
- Fix quality github action by @elronbandel in #1281
- add operators for robustness check on tables by @csrajmohan in #1279
- Instruction in SystemFormat demo support. by @piotrhelm in #1274
- change the max_test_instances of bluebench.recipe.attaq_500 to 100 by @shachardon in #1285
- Add documentation for types and serializers by @elronbandel in #1286
- Add example for image processing with different templates by @elronbandel in #1280
- Integrate metrics team LLMaJ with current unitxt implementation by @lilacheden in #1205
- performance profiler with visualization by @dafnapension in #1255
- Remove split arg to support old hf datasets versions by @elronbandel in #1288
- add post-processors for tablebench taskcard by @csrajmohan in #1289
- recursive copy seems safer here by @dafnapension in #1295
- Fix performance tracking action by @elronbandel in #1296
- try num of instances in nested global scores by @dafnapension in #1282
- Update version to 1.14.0 by @elronbandel in #1298
- expand performance table by @dafnapension in #1299
- Fix doc_vqa lmms_eval by @elronbandel in #1300
- prepare for int-ish group names and type names and add the exposing card by @dafnapension in #1303
- remove groups breakdowns from global score of grouped instance metrics by @dafnapension in #1306
- Update the safety metric batch size to 10 by @perlitz in #1305
New Contributors
- @piotrhelm made their first contribution in #1274
Full Changelog: 1.13.1...1.14.1