Based on the Paddle-Lite backend, FastDeploy supports model inference on Huawei's Ascend NPU. For more detailed information, please refer to: Paddle Lite Deployment Example.
This document describes how to compile the FastDeploy C++ and Python source code in an ARM/X86_64 Linux environment to generate prediction libraries that target the Huawei Ascend NPU.
For more compilation options, please refer to the FastDeploy compilation options description.
- Atlas 300I Pro, see details in the Spec Sheet
- Install the driver and firmware package (Driver and Firmware) for the Atlas 300I Pro
- Download the matching driver and firmware package at:
- https://www.hiascend.com/hardware/firmware-drivers?tag=community (Community Edition)
- https://www.hiascend.com/hardware/firmware-drivers?tag=commercial (Commercial Edition)
- driver: Atlas-300i-pro-npu-driver_5.1.rc2_linux-aarch64.run (aarch64 as an example)
- firmware: Atlas-300i-pro-npu-firmware_5.1.rc2.run
- Install the driver and firmware packages:
$ chmod +x *.run
$ ./Atlas-300i-pro-npu-driver_5.1.rc2_linux-aarch64.run --full
$ ./Atlas-300i-pro-npu-firmware_5.1.rc2.run --full
$ reboot
# Check the driver information to confirm successful installation
$ npu-smi info
- More system details are available in the Ascend Hardware Product Documentation
- os: ARM-Linux, X86_64-Linux
- gcc, g++, git, make, wget, python, pip, python-dev, patchelf
- cmake (version 3.10 or above recommended)
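The prerequisites above can be checked before starting a build. The following is a minimal sketch and not part of FastDeploy: the helper names `missing_tools` and `version_ge` are hypothetical, and `version_ge` assumes a `sort` that supports `-V` (GNU coreutils). Note that python-dev is a package rather than a command, so this check does not cover it.

```shell
# Hypothetical helpers (not provided by FastDeploy) for checking prerequisites.

# Succeed if version $1 is greater than or equal to version $2.
# Assumes `sort -V` (version sort) is available, as in GNU coreutils.
version_ge() {
  [ "$(printf '%s\n%s\n' "$2" "$1" | sort -V | head -n 1)" = "$2" ]
}

# Print the name of every listed command that is not found on PATH.
missing_tools() {
  for tool in "$@"; do
    command -v "$tool" >/dev/null 2>&1 || echo "$tool"
  done
}

# Example usage (commented out so that sourcing this file has no side effects):
# missing_tools gcc g++ git make wget python pip patchelf cmake
# version_ge "$(cmake --version | head -n 1 | awk '{print $3}')" 3.10 \
#   || echo "cmake is older than 3.10"
```

An empty result from `missing_tools` means every listed command was found.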
In order to ensure consistency with the FastDeploy verified build environment, it is recommended to use the Docker development environment for configuration.
On the aarch64 platform:
# Download Dockerfile
$ wget https://bj.bcebos.com/fastdeploy/test/Ascend_ubuntu18.04_aarch64_5.1.rc2.Dockerfile
# Create docker images
$ docker build --network=host -f Ascend_ubuntu18.04_aarch64_5.1.rc2.Dockerfile -t paddlelite/ascend_aarch64:cann_5.1.rc2 .
# Create container
$ docker run -itd --privileged --name=ascend-aarch64 --net=host -v $PWD:/Work -w /Work --device=/dev/davinci0 --device=/dev/davinci_manager --device=/dev/hisi_hdc --device /dev/devmm_svm -v /usr/local/bin/npu-smi:/usr/local/bin/npu-smi -v /usr/local/Ascend/driver/:/usr/local/Ascend/driver/ paddlelite/ascend_aarch64:cann_5.1.rc2 /bin/bash
# Enter the container
$ docker exec -it ascend-aarch64 /bin/bash
# Verify that the Ascend environment for the container is created successfully
$ npu-smi info
On the X86_64 platform:
# Download Dockerfile
$ wget https://paddlelite-demo.bj.bcebos.com/devices/huawei/ascend/intel_x86/Ascend_ubuntu18.04_x86_5.1.rc1.alpha001.Dockerfile
# Create docker images
$ docker build --network=host -f Ascend_ubuntu18.04_x86_5.1.rc1.alpha001.Dockerfile -t paddlelite/ascend_x86:cann_5.1.rc1.alpha001 .
# Create container
$ docker run -itd --privileged --name=ascend-x86 --net=host -v $PWD:/Work -w /Work --device=/dev/davinci0 --device=/dev/davinci_manager --device=/dev/hisi_hdc --device /dev/devmm_svm -v /usr/local/bin/npu-smi:/usr/local/bin/npu-smi -v /usr/local/Ascend/driver/:/usr/local/Ascend/driver/ paddlelite/ascend_x86:cann_5.1.rc1.alpha001 /bin/bash
# Enter the container
$ docker exec -it ascend-x86 /bin/bash
# Verify that the Ascend environment for the container is created successfully
$ npu-smi info
Once the above steps are successful, the user can start compiling FastDeploy directly from within docker.
Note:
- If you want to use another CANN version in Docker, update the CANN download path in the Dockerfile and update the driver and firmware to match. The aarch64 Dockerfile currently defaults to CANN 5.1.RC2, and the x86_64 Dockerfile to CANN 5.1.RC1.
- If you do not want to use Docker, refer to Compile Environment Preparation for ARM Linux Environments or Compile Environment Preparation for X86 Linux Environments provided by Paddle Lite to configure your own compilation environment, then download and install the appropriate CANN packages to complete the configuration.
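When switching CANN versions, the image tag passed to docker build must match the one passed to docker run. One way to keep them in sync is to compute the tag once and reuse it. This is only a sketch: the function names `ascend_arch_label` and `ascend_image_tag` are hypothetical, and the tag pattern is inferred from the two image names used in the commands above.

```shell
# Hypothetical helpers (not provided by FastDeploy or Paddle Lite).

# Map `uname -m` output to the architecture label used in the image names above.
ascend_arch_label() {
  case "$1" in
    aarch64) echo "aarch64" ;;
    x86_64)  echo "x86" ;;
    *) echo "unsupported architecture: $1" >&2; return 1 ;;
  esac
}

# Build the image tag from an architecture label ($1) and a CANN version ($2).
# The pattern follows the tags used in this document; it is an assumption
# that other CANN versions would follow the same naming.
ascend_image_tag() {
  echo "paddlelite/ascend_$1:cann_$2"
}

# Example usage (commented out so that sourcing this file has no side effects):
# ARCH="$(ascend_arch_label "$(uname -m)")"
# IMAGE="$(ascend_image_tag "$ARCH" 5.1.rc2)"
# docker build --network=host -f <Dockerfile> -t "$IMAGE" .
# docker run -itd --privileged ... "$IMAGE" /bin/bash
```

Reusing a single `IMAGE` variable means a version change is made in one place only.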
After setting up the compilation environment, compile FastDeploy with the following commands.
# Download the latest source code
git clone https://github.com/PaddlePaddle/FastDeploy.git
cd FastDeploy
mkdir build && cd build
# CMake configuration with Ascend
cmake -DWITH_ASCEND=ON \
-DCMAKE_INSTALL_PREFIX=fastdeploy-ascend \
-DENABLE_VISION=ON \
..
# Build FastDeploy Ascend C++ SDK
make -j8
make install
When the compilation is complete, the fastdeploy-ascend directory is created in the current build directory, indicating that the FastDeploy library has been compiled.
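A build script can confirm that the install step produced output before moving on. The sketch below only checks that the directory exists and is not empty; it makes no assumption about the layout inside fastdeploy-ascend, and the function name `sdk_dir_ready` is hypothetical.

```shell
# Hypothetical helper (not provided by FastDeploy).

# Succeed only if $1 is an existing directory that contains at least one entry.
sdk_dir_ready() {
  [ -d "$1" ] && [ -n "$(ls -A "$1" 2>/dev/null)" ]
}

# Example usage from inside the build directory (commented out so that
# sourcing this file has no side effects):
# sdk_dir_ready fastdeploy-ascend \
#   && echo "FastDeploy Ascend SDK installed" \
#   || echo "fastdeploy-ascend is missing or empty" >&2
```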
# Download the latest source code
git clone https://github.com/PaddlePaddle/FastDeploy.git
cd FastDeploy/python
export WITH_ASCEND=ON
export ENABLE_VISION=ON
python setup.py build
python setup.py bdist_wheel
# After the compilation completes, install the whl package from the dist folder in the current directory.
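The wheel file name depends on the FastDeploy version, Python version, and platform, so it is not spelled out here. As a sketch, the newest wheel in dist can be located and then passed to pip. The function name `latest_wheel` is hypothetical, and the approach assumes wheel file names contain no whitespace.

```shell
# Hypothetical helper (not provided by FastDeploy).

# Print the most recently modified .whl file in directory $1 (default: dist).
# Prints nothing if the directory contains no wheel.
latest_wheel() {
  ls -t "${1:-dist}"/*.whl 2>/dev/null | head -n 1
}

# Example usage from FastDeploy/python (commented out so that sourcing this
# file has no side effects):
# WHEEL="$(latest_wheel dist)"
# [ -n "$WHEEL" ] && pip install "$WHEEL"
```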
FlyCV is a high-performance image processing library that outperforms other image processing libraries, especially on the ARM architecture. FastDeploy now integrates FlyCV, so users can enable it on supported hardware platforms to accelerate end-to-end model inference. In end-to-end inference, the pre-processing and post-processing phases run on the CPU, so we recommend using FlyCV when you are running on an ARM CPU + Ascend hardware platform. See the Enable FlyCV documentation for details.
- To deploy a PaddleClas classification model on Huawei Ascend NPU using C++, please refer to: PaddleClas Huawei Ascend NPU C++ Deployment Example
- To deploy a PaddleClas classification model on Huawei Ascend NPU using Python, please refer to: PaddleClas Huawei Ascend NPU Python Deployment Example