[Core] fix variant-identification. #9253
Conversation
pipe.save_pretrained(tmpdir)

with self.assertRaises(ValueError) as error:
    _ = self.pipeline_class.from_pretrained(tmpdir, variant=variant)
This would have failed with the fixes from this PR, rightfully complaining:

ValueError: The deprecation tuple ('no variant default', '0.24.0', "You are trying to load the model files of the `variant=fp16`, but no such modeling files are available. The default model files: {'model.safetensors', 'diffusion_pytorch_model.safetensors'} will be loaded instead. Make sure to not load from `variant=fp16` if such variant modeling files are not available. Doing so will lead to an error in v0.24.0 as defaulting to non-variant modeling files is deprecated.") should be removed since diffusers' version 0.31.0.dev0 is >= 0.24.0
We didn't have it because we never tested it. But we should be all good now.
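For context, here is a minimal sketch of the version-gating behaviour that produces that error. `deprecate_sketch` is a hypothetical stand-in for diffusers' internal `deprecate` helper, not its actual implementation, and assumes only the `packaging` library:

# Hypothetical sketch of a version-gated deprecation helper: it warns until the
# installed version reaches the removal version, then raises to remind
# maintainers to delete the stale deprecation tuple.
from packaging import version


def deprecate_sketch(name: str, removed_in: str, message: str, current_version: str):
    if version.parse(current_version) >= version.parse(removed_in):
        # Past the removal version: the deprecation itself is the bug now.
        raise ValueError(
            f"The deprecation tuple {(name, removed_in, message)} should be removed "
            f"since the current version {current_version} is >= {removed_in}"
        )
    print(f"Deprecation warning: {message}")


# With diffusers at 0.31.0.dev0, a tuple scheduled for removal in 0.24.0 raises:
deprecate_sketch("no variant default", "0.24.0", "…", current_version="0.31.0.dev0")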
The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.
@@ -655,7 +655,7 @@ def test_local_save_load_index(self):
        out = pipe(prompt, num_inference_steps=2, generator=generator, output_type="np").images

        with tempfile.TemporaryDirectory() as tmpdirname:
-           pipe.save_pretrained(tmpdirname)
+           pipe.save_pretrained(tmpdirname, variant=variant, safe_serialization=use_safe)
This should have been serialized with `variant` and `safe_serialization`; otherwise the test seems wrong to me.
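To illustrate why, a hedged sketch (the repo id is one of the tiny testing checkpoints used elsewhere in the suite and is incidental here): saving with `variant="fp16"` is what puts the `.fp16.` infix into the serialized filenames, so a later `from_pretrained(..., variant="fp16")` can find them.

import os
import tempfile

from diffusers import DiffusionPipeline

# Assumed tiny test checkpoint; any small pipeline would do.
pipe = DiffusionPipeline.from_pretrained("hf-internal-testing/tiny-stable-diffusion-torch")

with tempfile.TemporaryDirectory() as tmpdir:
    pipe.save_pretrained(tmpdir, variant="fp16", safe_serialization=True)
    # Expect e.g. unet/diffusion_pytorch_model.fp16.safetensors on disk.
    print(sorted(os.listdir(os.path.join(tmpdir, "unet"))))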
@@ -722,6 +721,18 @@ def from_pretrained(cls, pretrained_model_name_or_path: Optional[Union[str, os.P
            )
        else:
            cached_folder = pretrained_model_name_or_path
+       filenames = []
I think maybe we should just update the `_identify_model_variants` function using `variant_compatible_siblings`. It is still not able to load variants with sharded checkpoints from the pipeline level, i.e. we should be able to load the fp16 variant in the transformer folder too, but currently it is not loaded:
import torch
from diffusers import AutoPipelineForText2Image

repo = "fal/AuraFlow"  # sharded checkpoint with variant
pipe = AutoPipelineForText2Image.from_pretrained(
    repo,
    variant="fp16",
    torch_dtype=torch.float16,
)
print(pipe.dtype)
you get:

A mixture of fp16 and non-fp16 filenames will be loaded.
Loaded fp16 filenames:
[vae/diffusion_pytorch_model.fp16.safetensors, text_encoder/model.fp16.safetensors]
Loaded non-fp16 filenames:
[transformer/diffusion_pytorch_model-00002-of-00003.safetensors, transformer/diffusion_pytorch_model-00003-of-00003.safetensors, transformer/diffusion_pytorch_model-00001-of-00003.safetensors]
If this behavior is not expected, please check your folder structure.
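A rough sketch of the splitting problem being described (this is not the real `variant_compatible_siblings`, just a loose heuristic over filenames): sharded variant names need to be recognized alongside plain `name.fp16.safetensors` ones.

def is_variant_file(filename: str, variant: str = "fp16") -> bool:
    # Loose heuristic: the variant appears as its own dot-separated component,
    # e.g. "model.fp16.safetensors" or "model.fp16-00001-of-00004.safetensors".
    parts = filename.rsplit("/", 1)[-1].split(".")
    return any(p == variant or p.startswith(f"{variant}-") for p in parts)


filenames = {
    "vae/diffusion_pytorch_model.fp16.safetensors",
    "text_encoder/model.fp16.safetensors",
    "transformer/diffusion_pytorch_model-00001-of-00003.safetensors",
}
variant_files = {f for f in filenames if is_variant_file(f)}
print(variant_files)              # the two .fp16. files
print(filenames - variant_files)  # the transformer shard: no fp16 sibling, hence the mixture warning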
cc @DN6 @a-r-r-o-w here too
filenames = {sibling.rfilename for sibling in info.siblings}
model_filenames, variant_filenames = variant_compatible_siblings(filenames, variant=variant)
This was moved up so the error is raised earlier in the code.
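In other words (an illustrative fragment with assumed names, not the exact diffusers code): once `filenames` and the variant split are known up front, a missing variant can be rejected before any file is fetched.

from typing import Optional


def raise_if_variant_missing(variant_filenames: set, variant: Optional[str]) -> None:
    # Fail fast: the requested variant has no matching files in the repo.
    if variant is not None and len(variant_filenames) == 0:
        raise ValueError(
            f"You are trying to load model files of the variant={variant}, "
            "but no such modeling files are available."
        )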
@@ -551,6 +551,29 @@ def test_download_variant_partly(self):
        assert sum(f.endswith(this_format) and not f.endswith(f"{variant}{this_format}") for f in files) == 3
        assert not any(f.endswith(other_format) for f in files)

+   def test_download_variants_with_sharded_checkpoints(self):
LMK if someone has a better idea to test it out.
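One possible shape for such a test (a sketch only: the repo id is a placeholder and the assertion is deliberately coarse):

import os

from diffusers import DiffusionPipeline


def test_download_variants_with_sharded_checkpoints_sketch(tmp_path):
    # Placeholder repo id: a pipeline whose sharded component ships fp16 shards.
    cached_folder = DiffusionPipeline.download(
        "hf-internal-testing/some-variant-sharded-pipeline",
        variant="fp16",
        cache_dir=tmp_path,
    )
    files = [f for _, _, fs in os.walk(cached_folder) for f in fs]
    weight_files = [f for f in files if f.endswith(".safetensors")]
    # Every downloaded weight file should carry the fp16 variant marker.
    assert weight_files and all("fp16" in f for f in weight_files)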
@yiyixuxu I have done a couple of changes. LMK what you think.
Tests run: will run LoRA and other important tests too.
@yiyixuxu I think this is ready for another review.
There's another issue we need to settle on before #9253 (comment). Since we decided to not touch it, as we never supported the legacy variant sharding checkpoint format on the pipeline level, what is happening as a consequence is that when "hf-internal-testing/tiny-stable-diffusion-pipe-variants-all-kinds" is loaded we now have:

safetensors_variant_filenames={'text_encoder/model.fp16-00003-of-00004.safetensors', 'safety_checker/model.fp16.safetensors', 'text_encoder/model.fp16-00004-of-00004.safetensors', 'text_encoder/model.fp16-00002-of-00004.safetensors', 'text_encoder/model.fp16-00001-of-00004.safetensors', 'vae/diffusion_pytorch_model.fp16.safetensors'}

safetensors_model_filenames={'text_encoder/model.fp16-00003-of-00004.safetensors', 'safety_checker/model.fp16.safetensors', 'text_encoder/model.fp16-00002-of-00004.safetensors', 'text_encoder/model.fp16-00004-of-00004.safetensors', 'unet/diffusion_pytorch_model-00002-of-00002.safetensors', 'text_encoder/model.fp16-00001-of-00004.safetensors', 'unet/diffusion_pytorch_model-00001-of-00002.safetensors', 'vae/diffusion_pytorch_model.fp16.safetensors'}

when we hit this check:

if (
    len(safetensors_variant_filenames) > 0
    and safetensors_model_filenames != safetensors_variant_filenames
-   and not is_sharded
+   and not _check_legacy_sharding_variant_format(filenames)
):

But note that this will still not allow us to parse just the legacy variant shard checkpoints from the pipeline level. In order for us to only parse the (legacy) variant files we need to adjust it. It's not a problem for the model level because of how it's handled here:

diffusers/src/diffusers/utils/hub_utils.py, line 413 in 28f9d84

LMK if I am missing something here.
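For reference, a hedged guess at the kind of check `_check_legacy_sharding_variant_format` performs (the real implementation lives in diffusers and may differ): the legacy format puts the variant before the shard index.

import re


def check_legacy_sharding_variant_format_sketch(filenames: set, variant: str) -> bool:
    # Legacy sharded variant: "model.fp16-00001-of-00004.safetensors"
    # (variant first, then the shard index), unlike the current format.
    legacy_re = re.compile(rf"\.{re.escape(variant)}-\d+-of-\d+\.")
    return any(legacy_re.search(f) for f in filenames)


print(check_legacy_sharding_variant_format_sketch(
    {"text_encoder/model.fp16-00001-of-00004.safetensors"}, "fp16"))      # True
print(check_legacy_sharding_variant_format_sketch(
    {"unet/diffusion_pytorch_model-00001-of-00002.safetensors"}, "fp16"))  # False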
@sayakpaul
@yiyixuxu sorry about the back and forth, but I think it's necessary we get this right. Our pipeline-level warning is here (as decided earlier):

It will warn as long as we're hitting:

But not for:

Why?

We could maybe check if the

I am bringing this up because we'd want to make sure the behaviour is unified for both local and remote loading.
this warning here will cover both cases; it is not specific to `if not os.path.isdir(pretrained_model_name_or_path)`:

if not os.path.isdir(pretrained_model_name_or_path):
    ...
    cached_folder = cls.download(
        pretrained_model_name_or_path,
    )
else:
    cached_folder = pretrained_model_name_or_path

# The variant filenames can have the legacy sharding checkpoint format that we check and throw
# a warning if detected.
if variant is not None and _check_legacy_sharding_variant_format(cached_folder, variant):
    warn_msg = f"This serialization format is now deprecated to standardize the serialization format between `transformers` and `diffusers`. We recommend you to remove the existing files associated with the current variant ({variant}) and re-obtain them by running a `save_pretrained()`."
    logger.warning(warn_msg)
I don't think so.

model_filenames={'vae/diffusion_pytorch_model.fp16.safetensors', 'text_encoder/pytorch_model.fp16-00003-of-00004.bin', 'unet/diffusion_pytorch_model-00001-of-00002.safetensors', 'text_encoder/pytorch_model.fp16-00001-of-00004.bin', 'text_encoder/pytorch_model.fp16-00002-of-00004.bin', 'text_encoder/pytorch_model.fp16-00004-of-00004.bin', 'unet/diffusion_pytorch_model.safetensors.index.json', 'text_encoder/model.fp16-00004-of-00004.safetensors', 'vae/diffusion_pytorch_model.fp16.bin', 'text_encoder/model.fp16-00003-of-00004.safetensors', 'text_encoder/model.safetensors.index.fp16.json', 'unet/diffusion_pytorch_model-00002-of-00002.safetensors', 'text_encoder/pytorch_model.bin.index.fp16.json', 'text_encoder/model.fp16-00001-of-00004.safetensors', 'vae/diffusion_flax_model.msgpack', 'text_encoder/model.fp16-00002-of-00004.safetensors', 'safety_checker/model.fp16.safetensors', 'safety_checker/pytorch_model.fp16.bin'}

variant_filenames={'vae/diffusion_pytorch_model.fp16.safetensors', 'text_encoder/pytorch_model.fp16-00003-of-00004.bin', 'text_encoder/pytorch_model.fp16-00001-of-00004.bin', 'text_encoder/pytorch_model.fp16-00002-of-00004.bin', 'text_encoder/pytorch_model.fp16-00004-of-00004.bin', 'text_encoder/model.fp16-00004-of-00004.safetensors', 'vae/diffusion_pytorch_model.fp16.bin', 'text_encoder/model.fp16-00003-of-00004.safetensors', 'text_encoder/model.safetensors.index.fp16.json', 'text_encoder/pytorch_model.bin.index.fp16.json', 'text_encoder/model.fp16-00001-of-00004.safetensors', 'text_encoder/model.fp16-00002-of-00004.safetensors', 'safety_checker/model.fp16.safetensors', 'safety_checker/pytorch_model.fp16.bin'}

Notice that in `model_filenames`, the `unet` entries are the non-variant shards. We use `model_filenames` to craft our `allow_patterns`:

We can further confirm this by printing the contents of each subfolder from here:

unet = ['diffusion_pytorch_model-00001-of-00002.safetensors', 'config.json', 'diffusion_pytorch_model-00002-of-00002.safetensors', 'diffusion_pytorch_model.safetensors.index.json']
text_encoder = ['model.fp16-00003-of-00004.safetensors', 'model.safetensors.index.fp16.json', 'config.json', 'model.fp16-00002-of-00004.safetensors', 'model.fp16-00001-of-00004.safetensors', 'model.fp16-00004-of-00004.safetensors']
vae = ['config.json', 'diffusion_pytorch_model.fp16.safetensors']

Hopefully, my concern is clear now.
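To make the concern concrete, a simplified illustration (the real `allow_patterns` construction in diffusers is more involved): deriving the patterns from `model_filenames` as printed above means the non-variant `unet` shards are what gets downloaded.

# Trimmed-down version of the model_filenames set printed above.
model_filenames = {
    "unet/diffusion_pytorch_model-00001-of-00002.safetensors",
    "unet/diffusion_pytorch_model-00002-of-00002.safetensors",
    "text_encoder/model.fp16-00001-of-00004.safetensors",
    "vae/diffusion_pytorch_model.fp16.safetensors",
}

# If allow_patterns is crafted straight from model_filenames...
allow_patterns = sorted(model_filenames)
unet_patterns = [p for p in allow_patterns if p.startswith("unet/")]
# ...the unet patterns are non-variant shards, so remote downloads fetch files
# that a purely local fp16 load would never see.
print(unet_patterns)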
i see!! let's add a warning inside
@yiyixuxu done.
thanks!
tests/models/test_modeling_common.py
("hf-internal-testing/tiny-sd-unet-sharded-latest-format-subfolde", "unet"), | ||
] | ||
) | ||
def test_variant_sharded_ckpt_loads_from_hub(self, repo_id, subfolder): |
this is a nice test! let's also add to the `@parameterized` to test non-variant (if not already tested), and `device_map`
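A sketch of what the extended parametrization could look like (the first repo id is copied from the diff above, the second is hypothetical, and `device_map="auto"` assumes accelerate is installed):

from parameterized import parameterized

from diffusers import UNet2DConditionModel


class VariantShardedLoadingSketch:
    @parameterized.expand(
        [
            # (repo_id, subfolder, variant)
            ("hf-internal-testing/tiny-sd-unet-sharded-latest-format-subfolde", "unet", "fp16"),
            ("hf-internal-testing/some-non-variant-sharded-unet", None, None),  # hypothetical
        ]
    )
    def test_variant_sharded_ckpt_loads_from_hub(self, repo_id, subfolder, variant):
        model = UNet2DConditionModel.from_pretrained(
            repo_id, subfolder=subfolder, variant=variant, device_map="auto"
        )
        assert model is not None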
- Added `parameterized` to have subfolder and variant testing in all the sharding tests here: https://github.com/huggingface/diffusers/blob/main/tests/models/unets/test_models_unet_2d_condition.py
- Modified this test to have non-variant checkpoints as well.

Ran everything with `pytest tests/models/ -k "sharded"` and it was green. Commit: 1190f7d
Shipping this since I have an approval.
* fix variant-idenitification.
* fix variant
* fix sharded variant checkpoint loading.
* Apply suggestions from code review
* fixes.
* more fixes.
* remove print.
* fixes
* fixes
* comments
* fixes
* apply suggestions.
* hub_utils.py
* fix test
* updates
* fixes
* fixes
* Apply suggestions from code review (Co-authored-by: YiYi Xu <[email protected]>)
* updates.
* removep patch file.

Co-authored-by: YiYi Xu <[email protected]>
What does this PR do?

See: https://huggingface.slack.com/archives/C065E480NN9/p1724387504059169

diffusers/src/diffusers/pipelines/pipeline_utils.py, line 1282 in 960c149

`cached_folder` is local here:

diffusers/src/diffusers/pipelines/pipeline_utils.py, line 724 in 960c149

Updates `variant_compatible_siblings()` to cater to sharded checkpoints in the diffusers format.

accelerate installation #9429

Some in-line comments.