diff --git a/README.md b/README.md
index 6503c8f63..93c59321d 100644
--- a/README.md
+++ b/README.md
@@ -13,14 +13,14 @@ It implements the generative AI loop for ONNX models, including pre and post pro
 
 See documentation at https://onnxruntime.ai/docs/genai.
 
-| Support matrix | Supported now | Under development | On the roadmap |
+|Support matrix|Supported now|Under development|On the roadmap|
 | -------------- | ------------- | ----------------- | -------------- |
 | Model architectures | Gemma <br/> Llama * <br/> Mistral + <br/> Phi (language + vision) <br/> Qwen <br/> Nemotron <br/> Granite <br/> AMD OLMo | Whisper | Stable diffusion |
-| API | Python <br/> C# <br/> C/C++ <br/> Java ^ | Objective-C | |
-| Platform | Linux <br/> Windows <br/> Mac ^ <br/> Android ^ | | iOS |
-| Architecture | x86 <br/> x64 <br/> Arm64 ~ | | |
-| Hardware Acceleration | CUDA <br/> DirectML | QNN <br/> OpenVINO <br/> ROCm | |
-| Features | | Interactive decoding <br/> Customization (fine-tuning) | Speculative decoding |
+|API| Python <br/> C# <br/> C/C++ <br/> Java ^ |Objective-C||
+|Platform| Linux <br/> Windows <br/> Mac ^ <br/> Android ^ ||iOS|
+|Architecture| x86 <br/> x64 <br/> Arm64 ~ |||
+|Hardware Acceleration| CUDA <br/> DirectML |QNN <br/> OpenVINO <br/> ROCm ||
+|Features| MultiLoRA <br/> Continuous decoding (session continuation) ^ |Constrained decoding|Speculative decoding|
 
 \* The Llama model architecture supports similar model families such as CodeLlama, Vicuna, Yi, and more.