This repository has been archived by the owner on Oct 25, 2024. It is now read-only.
Intel® Extension for Transformers v1.3.1 Release
Highlights
- Support experimental INT4 inference on Intel GPUs (Arc and PVC) with Intel® Extension for PyTorch as the backend
- Enhance LangChain integration to support new vectorstores (e.g., Qdrant)
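To illustrate what INT4 weight-only quantization does, here is a minimal pure-Python sketch of symmetric per-group 4-bit quantization. This is an illustration of the general technique only, not the actual Intel® Extension for PyTorch kernels; the function names and group size are hypothetical.

```python
# Hypothetical sketch of symmetric per-group INT4 weight quantization:
# each group of weights stores one floating-point scale plus signed
# 4-bit integers in [-8, 7]. Not the real IPEX implementation.

def quantize_int4(weights, group_size=32):
    """Return (scales, int4_values) for a flat list of weights."""
    scales, qvals = [], []
    for i in range(0, len(weights), group_size):
        group = weights[i:i + group_size]
        amax = max(abs(w) for w in group) or 1.0
        scale = amax / 7.0  # map the largest magnitude near the INT4 range
        scales.append(scale)
        qvals.append([max(-8, min(7, round(w / scale))) for w in group])
    return scales, qvals

def dequantize_int4(scales, qvals):
    """Reconstruct approximate weights from scales and INT4 values."""
    out = []
    for scale, group in zip(scales, qvals):
        out.extend(q * scale for q in group)
    return out

weights = [0.05 * i for i in range(-16, 16)]  # toy weight tensor
scales, qvals = quantize_int4(weights)
restored = dequantize_int4(scales, qvals)
max_err = max(abs(a - b) for a, b in zip(weights, restored))
```

The round-trip error stays within half a quantization step per group, which is why 4-bit weight-only schemes preserve accuracy well at a 4x memory reduction versus FP16.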
Improvements
- Improve error-code handling coverage (dd6dcb4)
- Refine NeuralChat documentation (aabb2fc)
- Improve the text-generation API (a4aba8)
- Refactor the transformers-like API to adapt to the latest transformers version (4e6834a)
- Integrate GGML INT4 into NeuralChat (29bbd8)
- Enable the Qdrant vectorstore (f6b9e32)
- Support LLaMA-series models for LLaVA fine-tuning (d753cb)
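For readers unfamiliar with what a vectorstore integration such as Qdrant provides, the core retrieval pattern can be sketched with a toy in-memory store. This is illustrative only; the real integration goes through LangChain's Qdrant wrapper, and the class and method names below are hypothetical.

```python
# Toy in-memory vector store showing the retrieval pattern behind
# vectorstore backends like Qdrant: store (embedding, document) pairs,
# then rank documents by cosine similarity to a query embedding.
import math

class ToyVectorStore:
    def __init__(self):
        self.entries = []  # list of (vector, document) pairs

    def add(self, vector, document):
        self.entries.append((vector, document))

    def search(self, query, k=1):
        """Return the k documents whose vectors are most similar to query."""
        def cosine(a, b):
            dot = sum(x * y for x, y in zip(a, b))
            norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
            return dot / norm if norm else 0.0
        ranked = sorted(self.entries, key=lambda e: cosine(e[0], query), reverse=True)
        return [doc for _, doc in ranked[:k]]

store = ToyVectorStore()
store.add([1.0, 0.0], "doc about GPUs")
store.add([0.0, 1.0], "doc about tokenizers")
top = store.search([0.9, 0.1], k=1)  # query embedding closest to the GPU doc
```

A production store like Qdrant replaces the linear scan with an approximate-nearest-neighbor index, but the interface (add vectors, search by similarity) is the same.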
Examples
- Support GGUF Q4_0, Q5_0 and Q8_0 models from Hugging Face (1383c7)
- Support GPTQ model inference on CPU (f4c58d0)
- Support the SOLAR-10.7B-Instruct-v1.0 model (77fb81)
- Support the Magicoder model and refine model loading (f29c1e)
- Support the Mixtral-8x7B model (9729b6)
- Support the Phi-2 model (04f5ef6c)
- Evaluate perplexity of Neural Speed (b0b381)
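The perplexity metric referenced above is the exponential of the mean negative log-likelihood the model assigns to each token. A minimal sketch of the computation (illustrative of the metric itself, not the project's actual evaluation script):

```python
# Perplexity from per-token log-likelihoods: ppl = exp(mean NLL).
# Lower is better; a model assigning probability 1.0 to every token
# would score a perplexity of exactly 1.
import math

def perplexity(token_log_probs):
    """token_log_probs: natural-log probability of each observed token."""
    nll = -sum(token_log_probs) / len(token_log_probs)
    return math.exp(nll)

# A model that assigns probability 0.25 to every token has perplexity 4,
# i.e., it is as uncertain as a uniform choice among 4 tokens.
uniform = [math.log(0.25)] * 10
ppl = perplexity(uniform)
```

Comparing perplexity between a full-precision baseline and a quantized run is a standard way to check that INT4/GGUF quantization has not degraded model quality.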
Bug Fixing
- Fix GPTQ model loading issue (226e08)
- Fix TTS crash with messy retrieval input and enhance the normalizer (4d8d9a)
- Support compatible stats format (c0a89c5a)
- Fix RAG example for retrieval plugin parameter change (c35d2b)
- Fix Magicoder tokenizer issue and redundant streaming end format (2758d4)
Validated Configurations
- Python 3.10
- CentOS 8.4 & Ubuntu 22.04
- Intel® Extension for TensorFlow 2.13.0
- PyTorch 2.1.0+cpu
- Intel® Extension for PyTorch 2.1.0+cpu