Visualizing Dialogues: Enhancing Image Selection through Dialogue Understanding with Large Language Models

Setup

conda create -n {your env name} python=3.12.2
conda activate {your env name}
pip install -r requirements.txt

# Set up OpenAI CLIP
pip install ftfy regex tqdm
pip install git+https://github.com/openai/CLIP.git
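
After installation, a quick import check can confirm that the environment is usable. The following is a minimal sketch, assuming the packages installed above are importable as `clip`, `ftfy`, `regex`, and `tqdm`; the helper name `missing_modules` is hypothetical and not part of this repository.

```python
import importlib.util


def missing_modules(names):
    """Return the top-level module names that cannot be found in this environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    # Assumed import names for the packages installed in the Setup step.
    required = ["clip", "ftfy", "regex", "tqdm"]
    missing = missing_modules(required)
    if missing:
        print("Missing modules:", ", ".join(missing))
    else:
        print("All required modules are importable.")
```

If anything is reported as missing, re-running the corresponding `pip install` command inside the activated conda environment is the first thing to try.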

Data Pre-processing

PhotoChat

# Place the raw PhotoChat dataset in a local directory, then run:
python utils/data/raw_data_processor.py --raw_path {raw photochat dataset directory} --saved_path {processed data directory}

Run

Zero-shot

# Generate descriptors
python3 utils/data/llm_inference.py --src_path {data directory} --saved_path {descriptor file saved directory} --model_name {LLM model name} --task {descriptor type: query, guess, sum}

# Run zero-shot
python3 CLDiagDescriptor.py --task {choose your target task} --src_path {descriptor file saved directory} --clip_model_name {CLIP model name} --zero_shot
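
Conceptually, zero-shot selection with a CLIP-style model ranks candidate images by the similarity between a text embedding and each image embedding. The sketch below illustrates that idea in plain Python with toy vectors. It is an assumption about the general technique, not the code in `CLDiagDescriptor.py`, and the function names and example embeddings are hypothetical.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # Treat a zero vector as having no similarity.
    return dot / (norm_a * norm_b)


def rank_images(text_embedding, image_embeddings):
    """Return image ids sorted from most to least similar to the text embedding."""
    scores = {
        image_id: cosine_similarity(text_embedding, emb)
        for image_id, emb in image_embeddings.items()
    }
    return sorted(scores, key=scores.get, reverse=True)


if __name__ == "__main__":
    # Toy 3-dimensional embeddings standing in for real CLIP features.
    descriptor = [1.0, 0.0, 1.0]
    candidates = {
        "img_a": [0.9, 0.1, 0.8],
        "img_b": [0.0, 1.0, 0.0],
        "img_c": [0.5, 0.5, 0.5],
    }
    print(rank_images(descriptor, candidates))
```

In a real run, the text embedding would come from the generated descriptor and the image embeddings from the CLIP image encoder; the toy vectors here only show the ranking step.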

Fully-trained

# Generate descriptors
python3 utils/data/llm_inference.py --src_path {data directory} --saved_path {descriptor file saved directory} --model_name {LLM model name} --task {descriptor type: query, guess, sum}

# Run fully-trained
python3 CLDiagDescriptor.py --task {choose your target task} --src_path {descriptor file saved directory} --clip_model_name {CLIP model name} --n_epochs {number of epochs}
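
Because descriptor generation accepts three descriptor types (`query`, `guess`, `sum`), it can be convenient to assemble the commands programmatically. The following is a minimal sketch that only builds argument lists from the flags shown above. The helper names are hypothetical, the placeholder values are copied from this README rather than real paths or model names, and nothing is executed unless you pass a list to `subprocess.run` yourself.

```python
import shlex

DESCRIPTOR_TYPES = ("query", "guess", "sum")  # descriptor types listed for --task


def descriptor_command(src_path, saved_path, model_name, task):
    """Build the argv list for utils/data/llm_inference.py."""
    if task not in DESCRIPTOR_TYPES:
        raise ValueError(f"unknown descriptor type: {task!r}")
    return [
        "python3", "utils/data/llm_inference.py",
        "--src_path", src_path,
        "--saved_path", saved_path,
        "--model_name", model_name,
        "--task", task,
    ]


def run_command(task, src_path, clip_model_name, n_epochs=None):
    """Build the argv list for CLDiagDescriptor.py.

    With n_epochs=None the --zero_shot flag is used; otherwise --n_epochs is passed.
    """
    cmd = [
        "python3", "CLDiagDescriptor.py",
        "--task", task,
        "--src_path", src_path,
        "--clip_model_name", clip_model_name,
    ]
    if n_epochs is None:
        cmd.append("--zero_shot")
    else:
        cmd += ["--n_epochs", str(n_epochs)]
    return cmd


if __name__ == "__main__":
    # Placeholders are kept as-is; replace them with your own values.
    for descriptor_type in DESCRIPTOR_TYPES:
        argv = descriptor_command(
            "{data directory}",
            "{descriptor file saved directory}",
            "{LLM model name}",
            descriptor_type,
        )
        print(shlex.join(argv))
```

Building argument lists instead of formatted strings avoids quoting problems when directory names contain spaces.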
