
Work plan and enhancement / 工作计划和用户诉求 #194

Open
zRzRzRzRzRzRzR opened this issue Aug 28, 2024 · 16 comments
Labels: enhancement (New feature or request)

@zRzRzRzRzRzRzR
Member

Tasks that have been identified and scheduled:

  • Fine-tuning support for Diffusers version models
  • Adaptation for CPU / NPU inference frameworks (e.g., Huawei, Intel devices)
  • ComfyUI adaptation work and plugin support

If you have additional requests, feel free to raise them here.

zRzRzRzRzRzRzR pinned this issue Aug 28, 2024
zRzRzRzRzRzRzR added the "enhancement" label Aug 28, 2024
@zRzRzRzRzRzRzR
Member Author

zRzRzRzRzRzRzR commented Aug 28, 2024

#182, #191, #47, and #84 raise similar requests: all are looking forward to an open-source CogVideoX I2V model. We are conducting research and evaluation.

#111 and #186 are similar; both ask for VAE fine-tuning support. We will try to include it in the fine-tuning release, and it may also be possible to adapt it for Diffusers fine-tuning, but it will consume relatively large amounts of resources.

zRzRzRzRzRzRzR changed the title from "Our Plan / 我们的工作计划" to "Work plan and enhancement / 工作计划和用户诉求" Aug 28, 2024
@rookiemann

5b image to video please! I2V would be lovely!

zRzRzRzRzRzRzR self-assigned this Aug 29, 2024
@PR-Ryan

PR-Ryan commented Aug 30, 2024

The 3D VAE model consumes significantly more memory compared to diffusion models, which is severely limiting the batch size for fine-tuning. Any suggestions or optimizations to reduce memory usage would be greatly appreciated.

@zRzRzRzRzRzRzR
Member Author

The 3D VAE model consumes significantly more memory compared to diffusion models, which is severely limiting the batch size for fine-tuning. Any suggestions or optimizations to reduce memory usage would be greatly appreciated.

You make a very good point. We will work with the Diffusers team to rework the fakecp (fake context parallel) handling in the VAE so that it uses less memory. Please give us some time; we are collaborating with the Diffusers team on a fine-tuning version of the model tailored to Diffusers, which is expected to save a significant amount of memory.
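In the meantime, a few memory-saving switches already exist in Diffusers. This is only a minimal sketch, assuming a diffusers release that ships AutoencoderKLCogVideoX; slicing and tiling trade speed for lower peak memory, and gradient checkpointing helps when the fine-tuning batch size is bound by activation memory:

```python
import torch
from diffusers import AutoencoderKLCogVideoX

# Assumption: a diffusers version that includes AutoencoderKLCogVideoX.
vae = AutoencoderKLCogVideoX.from_pretrained(
    "THUDM/CogVideoX-2b", subfolder="vae", torch_dtype=torch.float16
).to("cuda")

# Process frames in slices and spatial tiles instead of all at once;
# this lowers peak memory at the cost of some speed.
vae.enable_slicing()
vae.enable_tiling()

# Recompute activations during the backward pass instead of storing them,
# usually the first lever when batch size is memory-bound during fine-tuning.
vae.enable_gradient_checkpointing()
```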

@KihongK

KihongK commented Sep 5, 2024

First of all, thank you for your excellent work!

Will the dataset format used by SAT for fine-tuning and full training be the same as the format that will be used for fine-tuning the Diffusers version models?

Also, the Discord invitation link appears to be broken.

@zRzRzRzRzRzRzR
Member Author

We are currently completing several tasks:

  1. Adaptation work for the I2V model, expected to be open-sourced in September
  2. A detailed tutorial on model fine-tuning, expected to be completed in September

Work that has been completed:

  1. A model fine-tuned with SAT can be converted to a Diffusers model and mounted directly; for specific usage, see here (a loading sketch follows this list)
  2. The new Discord invitation link will be merged into the main branch today
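As a rough illustration of the first item, loading a converted checkpoint into the Diffusers pipeline could look like this (a sketch only; "path/to/converted_transformer" is a placeholder for the output directory of the conversion script):

```python
import torch
from diffusers import CogVideoXPipeline, CogVideoXTransformer3DModel

# Placeholder path: wherever the SAT-to-HF conversion wrote the transformer
# weights in Diffusers format.
transformer = CogVideoXTransformer3DModel.from_pretrained(
    "path/to/converted_transformer", torch_dtype=torch.float16
)

# Mount the converted transformer on top of the stock pipeline components.
pipe = CogVideoXPipeline.from_pretrained(
    "THUDM/CogVideoX-2b", transformer=transformer, torch_dtype=torch.float16
).to("cuda")

video = pipe("a panda playing guitar by a lake", num_inference_steps=50).frames[0]
```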

@sincerity711

When will vertical video generation be supported?

@zRzRzRzRzRzRzR
Member Author

When will vertical video generation be supported?

The current model cannot generate vertical videos, such as 480x720 resolution. We are working on fine-tuning to reach this capability, but it’s still in progress. Once we have any updates, we will share them as soon as possible.

@zRzRzRzRzRzRzR
Member Author

Two related pieces of work are in progress:

  1. Diffusers support for CogVideoX-I2V. This PR has been merged, but the patch release is not yet out.

  2. Fine-tuning the Diffusers version of CogVideoX-2B T2V without the SAT model, running directly under the Diffusers framework; it fits on a single A100 GPU. This PR is still being debugged. I am working with members of the Diffusers team to attempt fine-tuning the CogVideoX-5B and I2V models. We will provide a small dataset (a few dozen samples) for this PR, which is sufficient for LoRA fine-tuning CogVideoX.

Many thanks to @a-r-r-o-w for the help with these two tasks!
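Once that patch is released, using the I2V model from Diffusers is expected to look roughly like the following; this is only a sketch based on the merged PR, and the pipeline class and checkpoint names are assumptions until the release is out:

```python
import torch
from diffusers import CogVideoXImageToVideoPipeline
from diffusers.utils import export_to_video, load_image

# Assumption: class and checkpoint names follow the merged (not yet released) PR.
pipe = CogVideoXImageToVideoPipeline.from_pretrained(
    "THUDM/CogVideoX-5b-I2V", torch_dtype=torch.bfloat16
).to("cuda")

image = load_image("input.png")
video = pipe(
    prompt="a boat drifting across a calm lake at sunset",
    image=image,
    num_inference_steps=50,
).frames[0]
export_to_video(video, "output.mp4", fps=8)
```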

@JH-Xie

JH-Xie commented Sep 22, 2024

When will CogVideoX-2B-I2V be released?

@SanGilbert

  1. Support for various resolutions. Maybe RoPE plus resizing the data to random resolutions would achieve this?
  2. More controllability for the model, beyond just the text prompt.

@Florenyci

@zRzRzRzRzRzRzR Many thanks to you and the team! I know fine-tuning the VAE is not very useful, but I'm curious: is there any way I can fine-tune just the decoder part?

@zRzRzRzRzRzRzR
Member Author

Our publicly available fine-tuning code covers the transformer part, not the VAE. We have indeed not released the VAE training and fine-tuning code (I have not received the corresponding permissions either).
Additionally, fine-tuning the VAE alone seems to have little impact on the overall model quality. If you want to try fine-tuning the Diffusers model, fine-tuning of the transformer module is already on the dev branch; currently it is LoRA only, and SFT support is expected by early October.
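For reference, loading an adapter produced by that dev-branch LoRA work should look roughly like this; a sketch only, assuming a diffusers build with CogVideoX LoRA support, with "path/to/lora" as a placeholder:

```python
import torch
from diffusers import CogVideoXPipeline

# Assumption: a diffusers build that already supports LoRA loading for CogVideoX.
pipe = CogVideoXPipeline.from_pretrained(
    "THUDM/CogVideoX-5b", torch_dtype=torch.bfloat16
).to("cuda")

# "path/to/lora" is a placeholder for the adapter directory produced by the
# in-development fine-tuning script.
pipe.load_lora_weights("path/to/lora", adapter_name="cogvideox-lora")

video = pipe("a prompt matching the fine-tuned data", num_inference_steps=50).frames[0]
```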

@Florenyci

Our publicly available fine-tuning code covers the transformer part, not the VAE. We have indeed not released the VAE training and fine-tuning code (I have not received the corresponding permissions either). Additionally, fine-tuning the VAE alone seems to have little impact on the overall model quality. If you want to try fine-tuning the Diffusers model, fine-tuning of the transformer module is already on the dev branch; currently it is LoRA only, and SFT support is expected by early October.

@zRzRzRzRzRzRzR Thank you, but I'm actually asking how to fine-tune the VAE decoder. Any advice?
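One possible way to do this with the Diffusers VAE class is to freeze the encoder and optimize only the decoder against a reconstruction loss. The sketch below is not an official recipe; it assumes AutoencoderKLCogVideoX and a user-supplied get_video_batches() loader (a placeholder), with arbitrary hyperparameters:

```python
import torch
import torch.nn.functional as F
from diffusers import AutoencoderKLCogVideoX

vae = AutoencoderKLCogVideoX.from_pretrained(
    "THUDM/CogVideoX-2b", subfolder="vae"
).to("cuda")

# Freeze everything, then unfreeze only the decoder.
vae.requires_grad_(False)
vae.decoder.requires_grad_(True)
vae.train()

optimizer = torch.optim.AdamW(vae.decoder.parameters(), lr=1e-5)

for videos in get_video_batches():  # placeholder: yields (B, C, T, H, W) tensors in [-1, 1]
    videos = videos.to("cuda")
    with torch.no_grad():
        # The frozen encoder produces latents; only the decoder receives gradients.
        latents = vae.encode(videos).latent_dist.sample()
    recon = vae.decode(latents).sample
    loss = F.mse_loss(recon, videos)
    loss.backward()
    optimizer.step()
    optimizer.zero_grad()
```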

@chenshuo20

Hi @zRzRzRzRzRzRzR, what's your plan for the Diffusers I2V LoRA fine-tuning code? Thanks!

@alfredplpl

alfredplpl commented Oct 8, 2024

@zRzRzRzRzRzRzR

Thank you for your great work!

I would like to convert a fully fine-tuned 2B model checkpoint in SAT format into a Diffusers checkpoint.
My fully fine-tuned SAT checkpoint is approximately 22 GB, so I could not convert it directly with your conversion script: python ../tools/convert_weight_sat2hf.py.
The checkpoint probably includes the transformer weights, optimizer state, and so on, so I tried to extract only the transformer weights from it, but something went wrong.

How can I do it?
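One possible way to slim the checkpoint before conversion is to keep only the model state dict. This sketch assumes the usual SAT layout where the weights sit under a "module" key next to optimizer state; the filename is a placeholder:

```python
import torch

# Placeholder filename; SAT full-training checkpoints are typically saved as
# <iteration>/mp_rank_00_model_states.pt, with weights under the "module" key.
ckpt = torch.load("mp_rank_00_model_states.pt", map_location="cpu")
state_dict = ckpt["module"] if "module" in ckpt else ckpt

# Re-save only the model weights, dropping optimizer state and other bookkeeping;
# verify that the resulting key layout matches what convert_weight_sat2hf.py expects.
torch.save({"module": state_dict}, "model_only.pt")
```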
