Intel is the most traditional CPU manufacturer with many users. With the rise of machine learning and deep learning, Intel has also joined the competition for AI acceleration. For model inference, Intel not only uses GPUs and CPUs, but also uses NPUs.
We hope to deploy Phi-3.x Family on the end side, hoping to become the most important part of AI PC and Copilot PC. The loading of the model on the end side depends on the cooperation of different hardware manufacturers. This chapter mainly focuses on the application scenario of Intel OpenVINO as a quantitative model.
OpenVINO is an open-source toolkit for optimizing and deploying deep learning models from cloud to edge. It accelerates deep learning inference across various use cases, such as generative AI, video, audio, and language with models from popular frameworks like PyTorch, TensorFlow, ONNX, and more. Convert and optimize models, and deploy across a mix of Intel® hardware and environments, on-premises and on-device, in the browser or in the cloud.
Now with OpenVINO, you can quickly quantize the GenAI model in Intel hardware and accelerate the model reference.
Now OpenVINO supports quantization conversion of Phi-3.5-Vision and Phi-3.5 Instruct
Please ensure the following environment dependencies are installed, this is requirement.txt
--extra-index-url https://download.pytorch.org/whl/cpu
optimum-intel>=1.18.2
nncf>=2.11.0
openvino>=2024.3.0
transformers>=4.40
openvino-genai>=2024.3.0.0
In Terminal, please run this script
export llm_model_id = "microsoft/Phi-3.5-mini-instruct"
export llm_model_path = "your save quantizing Phi-3.5-instruct location"
optimum-cli export openvino --model {llm_model_id} --task text-generation-with-past --weight-format int4 --group-size 128 --ratio 0.6 --sym --trust-remote-code {llm_model_path}
Please run this script in Python or Jupyter lab
import requests
from pathlib import Path
from ov_phi3_vision import convert_phi3_model
import nncf
if not Path("ov_phi3_vision.py").exists():
r = requests.get(url="https://raw.githubusercontent.com/openvinotoolkit/openvino_notebooks/latest/notebooks/phi-3-vision/ov_phi3_vision.py")
open("ov_phi3_vision.py", "w").write(r.text)
if not Path("gradio_helper.py").exists():
r = requests.get(url="https://raw.githubusercontent.com/openvinotoolkit/openvino_notebooks/latest/notebooks/phi-3-vision/gradio_helper.py")
open("gradio_helper.py", "w").write(r.text)
if not Path("notebook_utils.py").exists():
r = requests.get(url="https://raw.githubusercontent.com/openvinotoolkit/openvino_notebooks/latest/utils/notebook_utils.py")
open("notebook_utils.py", "w").write(r.text)
model_id = "microsoft/Phi-3.5-vision-instruct"
out_dir = Path("../model/phi-3.5-vision-128k-instruct-ov")
compression_configuration = {
"mode": nncf.CompressWeightsMode.INT4_SYM,
"group_size": 64,
"ratio": 0.6,
}
if not out_dir.exists():
convert_phi3_model(model_id, out_dir, compression_configuration)
Labs | Introduce | Go |
---|---|---|
🚀 Lab-Introduce Phi-3.5 Instruct | Learn how to use Phi-3.5 Instruct in your AI PC | Go |
🚀 Lab-Introduce Phi-3.5 Vision (image) | Learn how to use Phi-3.5 Vision to analyze image in your AI PC | Go |
🚀 Lab-Introduce Phi-3.5 Vision (video) | Learn how to use Phi-3.5 Vision to analyze image in your AI PC | Go |
-
Learn more about Intel OpenVINO https://www.intel.com/content/www/us/en/developer/tools/openvino-toolkit/overview.html
-
Intel OpenVINO GitHub Repo https://github.com/openvinotoolkit/openvino.genai