From 07baa8f92210d93cb2a8d784afa99f15672ba6db Mon Sep 17 00:00:00 2001
From: lvliang-intel
Date: Tue, 3 Sep 2024 14:59:50 +0800
Subject: [PATCH] Add default model for VisualQnA README (#709)

* Add default model for VisualQnA README

Signed-off-by: lvliang-intel
---
 VisualQnA/README.md | 12 +++++++++++-
 1 file changed, 11 insertions(+), 1 deletion(-)

diff --git a/VisualQnA/README.md b/VisualQnA/README.md
index 757a2657c..910deda2a 100644
--- a/VisualQnA/README.md
+++ b/VisualQnA/README.md
@@ -13,11 +13,21 @@ General architecture of VQA shows below:
 
 ![VQA](./assets/img/vqa.png)
 
-This example guides you through how to deploy a [LLaVA](https://llava-vl.github.io/) (Large Language and Vision Assistant) model on Intel Gaudi2 to do visual question and answering task. The Intel Gaudi2 accelerator supports both training and inference for deep learning models in particular for LLMs. Please visit [Habana AI products](https://habana.ai/products/) for more details.
+This example guides you through deploying a [LLaVA-NeXT](https://github.com/LLaVA-VL/LLaVA-NeXT) (Open Large Multimodal Models) model on Intel Gaudi2 for visual question answering tasks. The Intel Gaudi2 accelerator supports both training and inference for deep learning models, LLMs in particular. Please visit [Habana AI products](https://habana.ai/products/) for more details.
 
 ![llava screenshot](./assets/img/llava_screenshot1.png)
 ![llava-screenshot](./assets/img/llava_screenshot2.png)
 
+# Required Models
+
+By default, the model is set to `llava-hf/llava-v1.6-mistral-7b-hf`. To use a different model, update the `LVM_MODEL_ID` variable in the [`set_env.sh`](./docker/gaudi/set_env.sh) file.
+
+```
+export LVM_MODEL_ID="llava-hf/llava-v1.6-mistral-7b-hf"
+```
+
+You can choose other LLaVA-NeXT models, such as `llava-hf/llava-v1.6-vicuna-13b-hf`, as needed.
+
 # Deploy VisualQnA Service
 
 The VisualQnA service can be effortlessly deployed on either Intel Gaudi2 or Intel XEON Scalable Processors.
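
For reference, below is a minimal sketch of how the model override documented by this patch might be exercised end to end. The `cd` path, the `sed` one-liner, and the assumption that `set_env.sh` can simply be sourced into the current shell are illustrative only, inferred from the paths the patch references; they are not part of the patch itself.

```
# Hypothetical walkthrough (not part of the patch): switch the default
# LVM model to the 13B vicuna variant named in the README, then load the
# environment before deploying on Gaudi2.
cd VisualQnA/docker/gaudi

# Point LVM_MODEL_ID at the alternative LLaVA-NeXT model in set_env.sh.
sed -i 's|^export LVM_MODEL_ID=.*|export LVM_MODEL_ID="llava-hf/llava-v1.6-vicuna-13b-hf"|' set_env.sh

# Export the updated variables into the current shell.
source set_env.sh
echo "VisualQnA will deploy with model: ${LVM_MODEL_ID}"
```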
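
Once the service is up, a smoke test might look like the sketch below. The host, port `8888`, the `/v1/visualqna` route, and the OpenAI-style chat payload are assumptions based on conventions used elsewhere in GenAIExamples; confirm the actual endpoint and request schema against the deployment guide for your target (Gaudi2 or Xeon) before relying on this.

```
# Hypothetical query against a deployed VisualQnA service; the endpoint
# and payload shape are assumptions, not confirmed by the patch.
curl http://localhost:8888/v1/visualqna \
  -H "Content-Type: application/json" \
  -d '{
        "messages": [
          {
            "role": "user",
            "content": [
              {"type": "text", "text": "What is in this image?"},
              {"type": "image_url", "image_url": {"url": "https://www.ilankelman.org/stopsigns/australia.jpg"}}
            ]
          }
        ],
        "max_tokens": 128
      }'
```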