Stable Diffusion XL fine-tuning with Dreambooth & Lora: how to structure local dataset for fine-tuning with ROI #6890
Hi! I am using https://github.com/huggingface/diffusers/blob/main/examples/dreambooth/train_dreambooth_lora_sdxl.py with a local dataset that includes data/metadata.jsonl. The fine-tuning works.

As a next step, I would like to take the fine-tuned model and train it on images that show abnormalities. Could you tell me how I should structure the dataset for this task? Should I add the ROI coordinates to the prompt, in a similar way to the "object detection" section of https://huggingface.co/docs/datasets/image_dataset? Or should I add the ROIs as a segmentation map in the form of an image? (How should I then structure the folder and the metadata.jsonl?)

I would appreciate your help.
Cc: @lhoestq from the
I think this script only trains a DreamBooth model from images and text; it doesn't seem to support bounding boxes or categories.
Maybe you can simply filter your images to keep only the ones with abnormalities and use this filtered dataset instead? You could even mention the abnormality name in the captions.
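For what it's worth, here is a rough sketch of that filtering step in plain Python. It assumes each line of metadata.jsonl is a JSON object with a `file_name` key (the usual image-folder convention), a caption under `prompt`, and a hypothetical `abnormality` field that is empty for normal images. The `abnormality` and `prompt` names are my assumptions, not something the script requires, so adjust them to whatever your file actually contains.

```python
import json


def read_jsonl(path):
    """Load one JSON object per non-empty line."""
    with open(path, encoding="utf-8") as f:
        return [json.loads(line) for line in f if line.strip()]


def write_jsonl(records, path):
    """Write one JSON object per line."""
    with open(path, "w", encoding="utf-8") as f:
        for rec in records:
            f.write(json.dumps(rec) + "\n")


def filter_abnormal(records, label_key="abnormality", caption_key="prompt"):
    """Keep only records with a non-empty label and mention it in the caption.

    `label_key` and `caption_key` are assumed field names, not fixed by diffusers.
    """
    kept = []
    for rec in records:
        label = rec.get(label_key)
        if not label:
            continue  # no abnormality recorded: drop this image
        caption = (rec.get(caption_key) or "").strip()
        if caption:
            new_caption = f"{caption}, showing {label}"
        else:
            new_caption = f"an image showing {label}"
        kept.append({"file_name": rec["file_name"], caption_key: new_caption})
    return kept
```

You would then call something like `write_jsonl(filter_abnormal(read_jsonl("data/metadata.jsonl")), "data_abnormal/metadata.jsonl")`, where `data_abnormal` is a hypothetical new folder, and point the training script at it. It is probably safest to copy only the kept images into that folder next to the new metadata.jsonl, and to check that the caption key matches the caption column you pass to the script.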