
SD3 missing support from-single-file #8546

Closed · vladmandic opened this issue Jun 13, 2024 · 9 comments · Fixed by #8631
Labels
bug Something isn't working

Comments

@vladmandic (Contributor) commented Jun 13, 2024

Describe the bug

StableDiffusion3Pipeline does implement from_single_file, which correctly loads the DiT and VAE.
However, it fails to handle any of the text encoders: TE1, TE2 and TE3.

  • When loading sd3_medium.safetensors this is understandable, as that model does not have any TEs baked in.
  • When loading sd3_medium_incl_clips.safetensors, the expectation is that TE1 and TE2 would be loaded correctly and TE3 would be skipped. The load indeed does not fail, but nothing actually works; see the reproduction below.
  • When loading sd3_medium_incl_clips_t5xxlfp8.safetensors, the expectation is the same as above, plus that the FP8 version of TE3 would be loaded correctly. Right now that does not happen and TE3 must be loaded separately.

Reproduction

import warnings
import torch
import diffusers
import transformers
import rich.traceback

rich.traceback.install()
warnings.filterwarnings(action="ignore", category=FutureWarning)
repo_id = 'stabilityai/stable-diffusion-3-medium-diffusers'
cache_dir = '/mnt/models/Diffusers'

pipe = diffusers.StableDiffusion3Pipeline.from_single_file(
    '/mnt/models/stable-diffusion/sd3/sd3_medium_incl_clips.safetensors',
    torch_dtype = torch.float16,
    text_encoder_3 = None,  # intentionally skip TE3; this checkpoint variant has no T5 baked in
    tokenizer_3 = None,
    cache_dir = cache_dir,
)

# workaround: uncommenting these two lines, which load TE1/TE2 via from_pretrained, makes the pipeline work
# pipe.text_encoder = transformers.CLIPTextModelWithProjection.from_pretrained(repo_id, subfolder='text_encoder', cache_dir=cache_dir, torch_dtype=torch.float16)
# pipe.text_encoder_2 = transformers.CLIPTextModelWithProjection.from_pretrained(repo_id, subfolder='text_encoder_2', cache_dir=cache_dir, torch_dtype=torch.float16)

pipe.to('cuda')

result = pipe(
    prompt='A photo of a cat',
    width=1024,
    height=1024,
)
image = result.images[0]
image.save('test.png')

This results in a runtime error on pipe.to('cuda'):

Cannot copy out of meta tensor;

or, if the model is not moved:

Tensor on device cpu is not on the expected device meta

Uncommenting the two lines that load TE1 and TE2 makes the model work without issues.

For TE3, attempting to load sd3_medium_incl_clips_t5xxlfp8 results in the same error, so loading it manually is the only way:

pipe.text_encoder_3 = transformers.T5EncoderModel.from_pretrained(
    repo_id,
    subfolder='text_encoder_3',
    quantization_config=transformers.BitsAndBytesConfig(load_in_8bit=True),  # 8-bit quantization via bitsandbytes, since the checkpoint's FP8 TE3 cannot be loaded directly
    cache_dir=cache_dir,
)
pipe.tokenizer_3 = transformers.T5TokenizerFast.from_pretrained(
    repo_id,
    subfolder='tokenizer_3',
    cache_dir=cache_dir,
)
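
As a quick sanity check (a sketch added for illustration, not part of the original report), one can confirm the manual attach took effect:

# TE3 should now be attached to the pipeline instead of being None
assert pipe.text_encoder_3 is not None
print(type(pipe.text_encoder_3).__name__)  # -> T5EncoderModel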

All in all, this defeats the point of using from_single_file, since TE1, TE2 and TE3 all have to be added to the pipeline manually via from_pretrained.

Logs

No response

System Info

diffusers==0.29.0
torch==2.3.1
cuda==12.1
ubuntu==24.04

Who can help?

@yiyixuxu @sayakpaul @DN6

@vladmandic added the bug (Something isn't working) label on Jun 13, 2024
@yiyixuxu (Collaborator)

Hi @vladmandic,
we merged PR #8517 today, which expanded support for from_single_file.
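
Concretely (a sketch for illustration, not from the thread; the local path is a placeholder), with that PR the CLIP encoders should come straight from the checkpoint:

import torch
import diffusers

# with the expanded single-file support, TE1/TE2 load from the checkpoint itself;
# TE3 is still skipped explicitly here
pipe = diffusers.StableDiffusion3Pipeline.from_single_file(
    'sd3_medium_incl_clips.safetensors',
    torch_dtype=torch.float16,
    text_encoder_3=None,
    tokenizer_3=None,
)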

@vladmandic (Contributor, Author)

Ahh, sorry, I was looking at the commit log before opening an issue and somehow missed that.
Tested with TE1 and TE2 loaded from sd3_medium_incl_clips.safetensors; that now works.

Loading TE3 from sd3_medium_incl_clips_t5xxlfp8.safetensors still fails:

/home/vlado/dev/sdnext/venv/lib/python3.12/site-packages/diffusers/loaders/single_file_utils.py:892 in convert_ldm_unet_checkpoint
892 new_checkpoint[diffusers_key] = unet_state_dict[ldm_key]
KeyError: 'label_emb.0.0.weight'

Also, do you have a plan to release a diffusers==0.29.1 patch with all the extra work that has gone in since the release?

@yiyixuxu (Collaborator)

Yep, we will do a patch soon, once our single-file support is "vlad-approved" 😁, and this one is in #8506.

@yiyixuxu (Collaborator)

cc @DN6 for the fp8 failure

@DN6 (Collaborator) commented Jun 17, 2024

@vladmandic Could you try installing from main? I'm able to load the FP8 checkpoint on my end.

@vladmandic (Contributor, Author)

@DN6 I can load it now, but it does not load TE3 at all.

pipe = diffusers.StableDiffusion3Pipeline.from_single_file('sd3_medium_incl_clips_t5xxlfp8.safetensors')
print('TE1', pipe.text_encoder)
print('TE2', pipe.text_encoder_2)
print('TE3', pipe.text_encoder_3)

You can see that from_single_file does not complain about a missing TE3; it simply does not load it, so text_encoder_3 is None.

@yiyixuxu (Collaborator)

@DN6 I can reproduce this too

@yiyixuxu (Collaborator)

@vladmandic can you check if this works now? #8631

@vladmandic (Contributor, Author)

Confirmed as working with that fix.
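
For reference, a minimal verification sketch (added for illustration; it assumes a diffusers build that includes the fix from #8631, and the path is a placeholder):

import diffusers

# with the fix, TE3 is populated from the checkpoint instead of being left as None
pipe = diffusers.StableDiffusion3Pipeline.from_single_file('sd3_medium_incl_clips_t5xxlfp8.safetensors')
assert pipe.text_encoder_3 is not None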
