We showcase an application that strategically distributes and offloads models to the NPU and the integrated GPU (iGPU), based on their computational intensity and the available hardware support. The application consists of two Convolutional Neural Network (CNN) based models that run on the NPU, and one generative model that is offloaded to the iGPU. ONNX Runtime provides the Vitis AI EP and the DirectML EP, which are used to run inference on the NPU and iGPU respectively. Some of these models operate concurrently, utilizing the accelerator resources to their full potential.
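For orientation, the sketch below shows how the two execution providers are typically selected through the ONNX Runtime Python API. The model paths are placeholders, not the files shipped with this example; the Vitis AI EP is configured via the vaip_config.json used later in these instructions.

```python
import onnxruntime as ort

# NPU: CNN models are routed through the Vitis AI EP,
# configured by the vaip_config.json from the VOE package
npu_session = ort.InferenceSession(
    "yolov8.onnx",  # placeholder path
    providers=["VitisAIExecutionProvider"],
    provider_options=[{"config_file": "vaip_config.json"}],
)

# iGPU: the generative model is routed through the DirectML EP
igpu_session = ort.InferenceSession(
    "sd_unet.onnx",  # placeholder path
    providers=["DmlExecutionProvider"],
)
```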
These instructions assume an Anaconda prompt.
- Install the Ryzen AI Software using the automatic installer. This should create a conda environment that can be used for this example. Check the installation path variable:
# Default location of RyzenAI software installation
echo %RYZEN_AI_INSTALLATION_PATH%
- Install dependencies for Yolov8 and RCAN:
python -m pip install -r requirements.txt
- Download the pre-quantized ONNX models for Yolov8 and RCAN from Hugging Face into the same directory (the sketch below shows one way to script this).
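The download can also be scripted with the huggingface_hub client. The repo_id and filename values below are hypothetical placeholders; substitute the actual Hugging Face repositories hosting the pre-quantized Yolov8 and RCAN models.

```python
from huggingface_hub import hf_hub_download

# Hypothetical repo/file names -- replace with the actual pre-quantized model repos
for repo_id, filename in [
    ("example-org/yolov8-quantized", "yolov8.onnx"),
    ("example-org/rcan-quantized", "rcan.onnx"),
]:
    hf_hub_download(repo_id=repo_id, filename=filename, local_dir=".")
```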
- Install dependencies for Stable Diffusion:
python -m pip install -r stable_diffusion\requirements-common.txt
- Make sure XLNX_VART_FIRMWARE is set to point to the correct xclbin from the VOE package:
echo %XLNX_VART_FIRMWARE%
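As an optional sanity check, you can verify from Python that the variable points at an existing .xclbin file:

```python
import os

# Fails early if the firmware variable is unset or points at a missing file
fw = os.environ.get("XLNX_VART_FIRMWARE", "")
if not (fw.endswith(".xclbin") and os.path.isfile(fw)):
    raise RuntimeError(f"XLNX_VART_FIRMWARE does not point to a valid xclbin: {fw!r}")
```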
- Copy vaip_config.json from the installed VOE package to this directory:
copy %RYZEN_AI_INSTALLATION_PATH%\voe-4.0-win_amd64\vaip_config.json .
- Generate the optimized Stable Diffusion models using Olive:
cd stable_diffusion
python stable_diffusion.py --provider dml --optimize
The optimized FP16 models should be generated in models\optimized-dml.
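To confirm the optimized models load on the iGPU, you can open one of them with the DirectML EP. The sub-path below (unet\model.onnx) is an assumption and may differ across Olive versions; adjust it to match the generated layout.

```python
import onnxruntime as ort

# Assumed output layout -- adjust if your Olive version names the folders differently
sess = ort.InferenceSession(
    r"models\optimized-dml\unet\model.onnx",
    providers=["DmlExecutionProvider"],
)
print("Loaded on:", sess.get_providers())
```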
- Run the example (the following command offloads Stable Diffusion to the iGPU and Yolov8+RCAN to the NPU):
cd ..
python pipeline.py -i test/test_img2img.mp4 --npu --provider_config vaip_config.json --igpu
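How pipeline.py overlaps the workloads is an implementation detail, but the underlying pattern is straightforward: each InferenceSession targets a different device and is driven from its own thread, so NPU and iGPU inference proceed in parallel. The sketch below is schematic, not the example's actual code; model paths are placeholders, and the zero-filled float32 inputs are a stand-in for real preprocessed data.

```python
import threading
import numpy as np
import onnxruntime as ort

def zero_feed(session):
    # Build an all-zeros input dict from the model's declared input shapes.
    # Dynamic dimensions are treated as 1; real code should use actual data
    # and check each input's dtype instead of assuming float32.
    feed = {}
    for inp in session.get_inputs():
        shape = [d if isinstance(d, int) else 1 for d in inp.shape]
        feed[inp.name] = np.zeros(shape, dtype=np.float32)
    return feed

def run(session, out, key):
    out[key] = session.run(None, zero_feed(session))

# One session per device; the two run() calls below overlap in time
npu = ort.InferenceSession(
    "yolov8.onnx",  # placeholder path
    providers=["VitisAIExecutionProvider"],
    provider_options=[{"config_file": "vaip_config.json"}],
)
igpu = ort.InferenceSession(
    r"models\optimized-dml\unet\model.onnx",  # assumed Olive output path
    providers=["DmlExecutionProvider"],
)

out = {}
threads = [threading.Thread(target=run, args=(s, out, k))
           for k, s in [("npu", npu), ("igpu", igpu)]]
for t in threads:
    t.start()
for t in threads:
    t.join()
```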