This repository provides the official implementation of XrayPULSE.
Key features:
- An attempt to extend PULSE to a biomedical multimodal conversational assistant.
- XrayPULSE is fine-tuned on Xray-Report paired datasets in Chinese.
Our model is based on PULSE. We use MedCLIP as the medical visual encoder, and a Q-Former (BLIP2) followed by a simple linear transformation as the adapter that injects image features into PULSE. To align the frozen visual encoder with the LLM through this adapter, we generated Chinese-version Xray-Report paired data from the free-text radiology reports of two datasets (MIMIC-CXR and OpenI) with the help of ChatGPT. To facilitate research in biomedical multimodal learning, we will release the data to the public.
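As a rough illustration of the adapter idea, the toy sketch below projects a few Q-Former output tokens to the LLM embedding width with a single linear map and places them in front of the text embeddings. It is not the repository's code: it assumes the linear layer sits after the Q-Former and that image tokens are prepended to the text tokens, and all sizes, names (`linear`, `inject_image`) and random values are made up for illustration. Plain Python lists stand in for tensors.

```python
import random

def linear(tokens, weight, bias):
    """Apply y = W x + b to each token vector (weight has shape out_dim x in_dim)."""
    return [
        [sum(w * x for w, x in zip(row, tok)) + b for row, b in zip(weight, bias)]
        for tok in tokens
    ]

def inject_image(query_tokens, text_embeds, weight, bias):
    """Project Q-Former outputs to the LLM width, then prepend them to the text embeddings."""
    image_embeds = linear(query_tokens, weight, bias)
    return image_embeds + text_embeds

# Toy sizes chosen only for illustration (assumptions, not the real model's sizes).
NUM_QUERIES, QFORMER_DIM, LLM_DIM, NUM_TEXT = 4, 3, 5, 2
rng = random.Random(0)
query_tokens = [[rng.random() for _ in range(QFORMER_DIM)] for _ in range(NUM_QUERIES)]
text_embeds = [[rng.random() for _ in range(LLM_DIM)] for _ in range(NUM_TEXT)]
weight = [[rng.random() for _ in range(QFORMER_DIM)] for _ in range(LLM_DIM)]
bias = [0.0] * LLM_DIM

llm_inputs = inject_image(query_tokens, text_embeds, weight, bias)
print(len(llm_inputs), len(llm_inputs[0]))  # sequence length, embedding width
```

In this sketch only `weight` and `bias` would be trainable; the stand-ins for the visual encoder output and the LLM embeddings are treated as fixed inputs.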
Installation
git clone https://github.com/openmedlab/XrayPULSE.git
cd XrayPULSE
Environment
conda env create -f env.yml
conda activate xraypulse
Prepare the pretrained weights
You can find the pretrained model weights.
The PULSE weights should sit in a single folder with a structure similar to the following:
pulse_weights
├── config.json
├── generation_config.json
├── tokenizer.json
├── tokenizer_config.json
├── special_tokens_map.json
├── pytorch_model.bin.index.json
├── pytorch_model-00001-of-00002.bin
└── pytorch_model-00002-of-00002.bin
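Before editing the configs, it can help to confirm that the folder is complete. The following is a minimal sketch, not part of the repository: the helper name `missing_weight_files` is hypothetical, the file list is copied from the listing above, and shard names are read from the `weight_map` entry of `pytorch_model.bin.index.json`, which is how sharded Hugging Face checkpoints record them.

```python
import json
from pathlib import Path

# File names taken from the folder listing above.
REQUIRED_FILES = [
    "config.json",
    "generation_config.json",
    "tokenizer.json",
    "tokenizer_config.json",
    "special_tokens_map.json",
    "pytorch_model.bin.index.json",
]

def missing_weight_files(weights_dir):
    """Return the sorted names of expected files that are absent from weights_dir."""
    root = Path(weights_dir)
    expected = set(REQUIRED_FILES)
    index_path = root / "pytorch_model.bin.index.json"
    if index_path.is_file():
        # The index maps each parameter name to the shard file that stores it.
        index = json.loads(index_path.read_text(encoding="utf-8"))
        expected.update(index.get("weight_map", {}).values())
    return sorted(name for name in expected if not (root / name).is_file())

# Example call; "pulse_weights" is the folder name used in the listing above.
print(missing_weight_files("pulse_weights"))
```

An empty list means every file named in the listing and every shard named in the index was found.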
Then set "bloom_model" in the model config file "xraypulse/configs/models/xraypulse.yaml" to the path of pulse_weights.
Also add the path of the pretrained checkpoint in "demo_configs/xraypulse_demo.yaml".
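These edits can be made by hand in any text editor. If you prefer to script them, the sketch below shows one possible way to rewrite a single `key: value` line in YAML text using only the standard library. The helper `set_yaml_value` is hypothetical, and the sample config text is invented for illustration; only the key name `bloom_model` comes from the instructions above, and the real file will contain other entries.

```python
import re

def set_yaml_value(yaml_text, key, value):
    """Replace the value on the first `key: ...` line, keeping its indentation."""
    pattern = re.compile(rf"^([ \t]*{re.escape(key)}[ \t]*:).*$", re.MULTILINE)
    # A function replacement avoids backslash handling issues in paths.
    new_text, count = pattern.subn(
        lambda m: f'{m.group(1)} "{value}"', yaml_text, count=1
    )
    if count == 0:
        raise KeyError(f"{key!r} not found in config text")
    return new_text

# Invented sample text; the real xraypulse.yaml has more entries than this.
sample = 'model:\n  bloom_model: "/path/to/pulse_weights"\n'
print(set_yaml_value(sample, "bloom_model", "/data/pulse_weights"))
```

The same helper could be applied to the checkpoint entry in the demo config once you have looked up its key name in that file. Note that it replaces the whole rest of the line, so any trailing comment on that line is dropped.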
Run Demo
bash run_demo.sh
This project is built upon the giant shoulders of XrayGPT. Many thanks to its authors!
We use the medical-aware image encoder from MedCLIP.
The model architecture of XrayGPT follows BLIP2.
This project is under the CC-BY-NC 4.0 license. See LICENSE for details.