From 6b9dfa022e97133483f2a8d18705e1823c6ca0b3 Mon Sep 17 00:00:00 2001
From: "Teo (Timothy) Wu Haoning" <38696372+teowu@users.noreply.github.com>
Date: Tue, 7 Nov 2023 13:01:11 +0800
Subject: [PATCH] Update README.md

---
 scripts/llava_v1.5/README.md | 10 +++++++++-
 1 file changed, 9 insertions(+), 1 deletion(-)

diff --git a/scripts/llava_v1.5/README.md b/scripts/llava_v1.5/README.md
index b6006fa..a50b67e 100644
--- a/scripts/llava_v1.5/README.md
+++ b/scripts/llava_v1.5/README.md
@@ -1,6 +1,8 @@
 ## Training with Q-Instruct
 
-This document provides instruction on how to train with **Q-Instruct** dataset on LLaVA-v1.5 (7B/13B).
+This document provides instructions on how to train with the **Q-Instruct** dataset on LLaVA-v1.5 (7B/13B), under the two proposed strategies (***mix*** and ***after***), as shown below.
+
+![](strategies.png)
 
 ### Step 0: Pre-requisites
 
@@ -89,6 +91,12 @@ Please make sure you have enough computational resources before training.
 
 #### Strategy (a): Mix with High-level Datasets
 
+- *(Pre-requisite)* Get the pre-trained multi-modal projectors from the original LLaVA-v1.5:
+
+```shell
+sh scripts/llava_v1.5/get_mm_projectors.sh
+```
+
 - 13B *(requires 8x A100 80G), 21h*
 
 ```shell
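
The pre-requisite step added in this patch fetches the pre-trained multi-modal (MLP) projector weights that LLaVA-v1.5 fine-tuning starts from. A minimal sketch of what `scripts/llava_v1.5/get_mm_projectors.sh` might do is given below; the Hugging Face repository names, file names, and the `checkpoints/` target directory are assumptions for illustration, not confirmed contents of the actual script.

```shell
#!/bin/bash
# Hypothetical sketch of get_mm_projectors.sh (paths and repo names are assumptions).
# Downloads the LLaVA-v1.5 pre-trained mm_projector weights for the 7B and 13B models.
set -e

mkdir -p checkpoints

# 7B projector (skip if already downloaded)
wget -nc -P checkpoints/llava-v1.5-mlp2x-336px-pretrain-vicuna-7b-v1.5 \
  https://huggingface.co/liuhaotian/llava-v1.5-mlp2x-336px-pretrain-vicuna-7b-v1.5/resolve/main/mm_projector.bin

# 13B projector (skip if already downloaded)
wget -nc -P checkpoints/llava-v1.5-mlp2x-336px-pretrain-vicuna-13b-v1.5 \
  https://huggingface.co/liuhaotian/llava-v1.5-mlp2x-336px-pretrain-vicuna-13b-v1.5/resolve/main/mm_projector.bin
```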