
[Core] fix variant-identification. #9253

Merged
merged 37 commits into from
Sep 28, 2024

Conversation

@sayakpaul (Member) commented Aug 23, 2024

What does this PR do?

See: https://huggingface.slack.com/archives/C065E480NN9/p1724387504059169

Some in-line comments.

@sayakpaul sayakpaul changed the title [Core] fix variant-idenitification. [Core] fix variant-identification. Aug 23, 2024
pipe.save_pretrained(tmpdir)

with self.assertRaises(ValueError) as error:
_ = self.pipeline_class.from_pretrained(tmpdir, variant=variant)
sayakpaul (Member Author):

With the fixes from this PR, this would have failed, rightfully complaining:

ValueError: The deprecation tuple ('no variant default', '0.24.0', "You are trying to load the model files of the `variant=fp16`, but no such modeling files are available.The default model files: {'model.safetensors', 'diffusion_pytorch_model.safetensors'} will be loaded instead. Make sure to not load from `variant=fp16`if such variant modeling files are not available. Doing so will lead to an error in v0.24.0 as defaulting to non-variantmodeling files is deprecated.") should be removed since diffusers' version 0.31.0.dev0 is >= 0.24.0

We didn't have it because we never tested it. But we should be all good now.

@HuggingFaceDocBuilderDev

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.

@@ -655,7 +655,7 @@ def test_local_save_load_index(self):
out = pipe(prompt, num_inference_steps=2, generator=generator, output_type="np").images

with tempfile.TemporaryDirectory() as tmpdirname:
pipe.save_pretrained(tmpdirname)
pipe.save_pretrained(tmpdirname, variant=variant, safe_serialization=use_safe)
sayakpaul (Member Author):

This should have been serialized with variant and safe_serialization; otherwise the test seems wrong to me.

@@ -722,6 +721,18 @@ def from_pretrained(cls, pretrained_model_name_or_path: Optional[Union[str, os.P
)
else:
cached_folder = pretrained_model_name_or_path
filenames = []
@yiyixuxu (Collaborator) commented Aug 23, 2024:

I think maybe we should just update the _identify_model_variants function using variant_compatible_siblings.

Also, it is still not able to load variants with sharded checkpoints from the pipeline level,

i.e. we should be able to load the fp16 variant in the transformer folder too, but currently it is not loaded:

import torch
from diffusers import AutoPipelineForText2Image

repo = "fal/AuraFlow"  # sharded checkpoint with variant
pipe = AutoPipelineForText2Image.from_pretrained(
    repo,
    variant="fp16",
    torch_dtype=torch.float16,
)
print(pipe.dtype)

you get


A mixture of fp16 and non-fp16 filenames will be loaded.
Loaded fp16 filenames:
[vae/diffusion_pytorch_model.fp16.safetensors, text_encoder/model.fp16.safetensors]
Loaded non-fp16 filenames:
[transformer/diffusion_pytorch_model-00002-of-00003.safetensors, transformer/diffusion_pytorch_model-00003-of-00003.safetensors, transformer/diffusion_pytorch_model-00001-of-00003.safetensors]
If this behavior is not expected, please check your folder structure.
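The gap can be sketched with filename patterns. This is a hypothetical illustration, not the actual variant_compatible_siblings implementation; the pattern constants and the matches_variant helper below are made up for this sketch. A matcher that only knows the non-sharded naming scheme misses sharded variant files entirely:

```python
import re

# Illustrative patterns (assumptions, not the real diffusers regexes).
# Non-sharded variants look like `model.fp16.safetensors`; sharded variants
# in the layout discussed here look like `model.fp16-00001-of-00003.safetensors`.
NON_SHARDED = r"^(.+?)\.{v}\.(safetensors|bin)$"
SHARDED = r"^(.+?)\.{v}-\d{{5}}-of-\d{{5}}\.(safetensors|bin)$"

def matches_variant(filename: str, variant: str, include_sharded: bool = True) -> bool:
    base = filename.rsplit("/", 1)[-1]
    patterns = [NON_SHARDED.format(v=re.escape(variant))]
    if include_sharded:
        patterns.append(SHARDED.format(v=re.escape(variant)))
    return any(re.match(p, base) for p in patterns)
```

Without sharded support (include_sharded=False), the transformer's fp16 shards are not recognised, so the non-fp16 shards get loaded instead, which is exactly the mixture the warning above reports.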

yiyixuxu (Collaborator):

cc @DN6 @a-r-r-o-w here too

Comment on lines -1258 to -1272
filenames = {sibling.rfilename for sibling in info.siblings}
model_filenames, variant_filenames = variant_compatible_siblings(filenames, variant=variant)

sayakpaul (Member Author):

This was moved up to raise the error earlier in the code.

@@ -551,6 +551,29 @@ def test_download_variant_partly(self):
assert sum(f.endswith(this_format) and not f.endswith(f"{variant}{this_format}") for f in files) == 3
assert not any(f.endswith(other_format) for f in files)

def test_download_variants_with_sharded_checkpoints(self):
sayakpaul (Member Author):

LMK if someone has a better idea to test it out.

@sayakpaul (Member Author):

@yiyixuxu I have made a couple of changes. LMK what you think.

@sayakpaul (Member Author) commented Sep 10, 2024:

Tests run: pytest tests/ -k "sharded"

Will run LoRA and other important tests too.

@sayakpaul (Member Author):

@yiyixuxu I think this is ready for another review.

src/diffusers/pipelines/pipeline_utils.py (outdated, resolved)
src/diffusers/models/modeling_utils.py (outdated, resolved)
@sayakpaul (Member Author):

@yiyixuxu

There's another issue we need to settle on before #9253 (comment).

Since we decided to not touch

def variant_compatible_siblings(filenames, variant=None) -> Union[List[os.PathLike], str]:

as we never supported the legacy variant sharding checkpoint format at the pipeline level, here is what is happening as a consequence:

safetensors_variant_filenames={'text_encoder/model.fp16-00003-of-00004.safetensors', 'safety_checker/model.fp16.safetensors', 'text_encoder/model.fp16-00004-of-00004.safetensors', 'text_encoder/model.fp16-00002-of-00004.safetensors', 'text_encoder/model.fp16-00001-of-00004.safetensors', 'vae/diffusion_pytorch_model.fp16.safetensors'}

safetensors_model_filenames={'text_encoder/model.fp16-00003-of-00004.safetensors', 'safety_checker/model.fp16.safetensors', 'text_encoder/model.fp16-00002-of-00004.safetensors', 'text_encoder/model.fp16-00004-of-00004.safetensors', 'unet/diffusion_pytorch_model-00002-of-00002.safetensors', 'text_encoder/model.fp16-00001-of-00004.safetensors', 'unet/diffusion_pytorch_model-00001-of-00002.safetensors', 'vae/diffusion_pytorch_model.fp16.safetensors'}

This is what we get when "hf-internal-testing/tiny-stable-diffusion-pipe-variants-all-kinds" is loaded via DiffusionPipeline.from_pretrained() with variant="fp16", and it is why the safetensors_model_filenames != safetensors_variant_filenames check fails. What we could do is this (with slight modifications made to _check_legacy_sharding_variant_format()):

if (
    len(safetensors_variant_filenames) > 0
    and safetensors_model_filenames != safetensors_variant_filenames
-    and not is_sharded
+    and not _check_legacy_sharding_variant_format(filenames)
):
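A minimal sketch of what such a check could look like, assuming the legacy layout places the variant tag before the shard counter (as in the model.fp16-00003-of-00004.safetensors filenames quoted above). The function name and signature here are illustrative, not the actual _check_legacy_sharding_variant_format:

```python
import re

def has_legacy_sharding_variant_format(filenames, variant):
    # Legacy sharded variants embed the variant tag before the shard counter,
    # e.g. `text_encoder/model.fp16-00003-of-00004.safetensors` (assumption
    # based on the filenames quoted in this thread).
    legacy = re.compile(rf"\.{re.escape(variant)}-\d{{5}}-of-\d{{5}}\.")
    return any(legacy.search(f) for f in filenames)
```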

But note that this will still not allow us to parse just the legacy variant shard checkpoints from the pipeline-level. In order for us to only parse the variant (legacy) files we need to adjust variant_compatible_siblings(), IMO.

It's not a problem for the model-level because of how it's handled here:

def _get_checkpoint_shard_files(

LMK if I am missing something here.

@yiyixuxu (Collaborator) commented Sep 24, 2024:

@sayakpaul
Since we have never actually supported loading that checkpoint from the pipeline level and are already deprecating it (for loading from the model level), we do not need to start supporting it now. A check and a warning asking users to save the checkpoint in the correct format is sufficient. The check fails as expected:

But note that this will still not allow us to parse just the legacy variant shard checkpoints from the pipeline-level.

@sayakpaul (Member Author):

@yiyixuxu sorry about the back and forth but I think it's necessary we get this right.

Our pipeline-level warning is here (as decided earlier):

if variant is not None and _check_legacy_sharding_variant_format(cached_folder, variant):

It will warn as long as we're hitting:

cached_folder = pretrained_model_name_or_path

But not for

cached_folder = cls.download(

Why?
Because we're not downloading the (legacy) variant sharded checkpoint files as variant_compatible_siblings() is not returning them:

model_filenames, variant_filenames = variant_compatible_siblings(filenames, variant=variant)

We could maybe check if the filenames here

model_filenames, variant_filenames = variant_compatible_siblings(filenames, variant=variant)
correspond to the legacy format and throw a warning from there? Would that be okay?

I am bringing this up because we'd want to ensure unified behaviour for both local and remote loading.

@yiyixuxu (Collaborator):

This warning here will cover both cases; it is not specific to cached_folder = pretrained_model_name_or_path:

        if not os.path.isdir(pretrained_model_name_or_path):
            ...
            cached_folder = cls.download(
                pretrained_model_name_or_path,
            )
        else:
            cached_folder = pretrained_model_name_or_path

        # The variant filenames can have the legacy sharding checkpoint format that we check and throw
        # a warning if detected.
        if variant is not None and _check_legacy_sharding_variant_format(cached_folder, variant):
            warn_msg = f"This serialization format is now deprecated to standardize the serialization format between `transformers` and `diffusers`. We recommend you to remove the existing files associated with the current variant ({variant}) and re-obtain them by running a `save_pretrained()`."
            logger.warning(warn_msg)

@sayakpaul (Member Author):

I don't think so.

cls.download() will NOT download the (legacy) variant sharded checkpoint files because of what I mentioned in #9253 (comment). To confirm that I printed the files that are getting downloaded and here's the log:

model_filenames={'vae/diffusion_pytorch_model.fp16.safetensors', 'text_encoder/pytorch_model.fp16-00003-of-00004.bin', 'unet/diffusion_pytorch_model-00001-of-00002.safetensors', 'text_encoder/pytorch_model.fp16-00001-of-00004.bin', 'text_encoder/pytorch_model.fp16-00002-of-00004.bin', 'text_encoder/pytorch_model.fp16-00004-of-00004.bin', 'unet/diffusion_pytorch_model.safetensors.index.json', 'text_encoder/model.fp16-00004-of-00004.safetensors', 'vae/diffusion_pytorch_model.fp16.bin', 'text_encoder/model.fp16-00003-of-00004.safetensors', 'text_encoder/model.safetensors.index.fp16.json', 'unet/diffusion_pytorch_model-00002-of-00002.safetensors', 'text_encoder/pytorch_model.bin.index.fp16.json', 'text_encoder/model.fp16-00001-of-00004.safetensors', 'vae/diffusion_flax_model.msgpack', 'text_encoder/model.fp16-00002-of-00004.safetensors', 'safety_checker/model.fp16.safetensors', 'safety_checker/pytorch_model.fp16.bin'}
variant_filenames={'vae/diffusion_pytorch_model.fp16.safetensors', 'text_encoder/pytorch_model.fp16-00003-of-00004.bin', 'text_encoder/pytorch_model.fp16-00001-of-00004.bin', 'text_encoder/pytorch_model.fp16-00002-of-00004.bin', 'text_encoder/pytorch_model.fp16-00004-of-00004.bin', 'text_encoder/model.fp16-00004-of-00004.safetensors', 'vae/diffusion_pytorch_model.fp16.bin', 'text_encoder/model.fp16-00003-of-00004.safetensors', 'text_encoder/model.safetensors.index.fp16.json', 'text_encoder/pytorch_model.bin.index.fp16.json', 'text_encoder/model.fp16-00001-of-00004.safetensors', 'text_encoder/model.fp16-00002-of-00004.safetensors', 'safety_checker/model.fp16.safetensors', 'safety_checker/pytorch_model.fp16.bin'}

Notice that in model_filenames we're not picking up the (legacy) variants associated with the UNet, because

def variant_compatible_siblings(filenames, variant=None) -> Union[List[os.PathLike], str]:
is unable to match them with regex. We use model_filenames to craft our allow_patterns:
allow_patterns = list(model_filenames)

We can further confirm this by printing the contents of each subfolder from here:

cached_folder = snapshot_download(

unet = ['diffusion_pytorch_model-00001-of-00002.safetensors', 'config.json', 'diffusion_pytorch_model-00002-of-00002.safetensors', 'diffusion_pytorch_model.safetensors.index.json']
text_encoder = ['model.fp16-00003-of-00004.safetensors', 'model.safetensors.index.fp16.json', 'config.json', 'model.fp16-00002-of-00004.safetensors', 'model.fp16-00001-of-00004.safetensors', 'model.fp16-00004-of-00004.safetensors']
vae = ['config.json', 'diffusion_pytorch_model.fp16.safetensors']

Hopefully, my concern is clear now.
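The download gap described here can be reproduced locally with plain pattern filtering. The file names below are taken from the logs in this thread, but the fnmatch-based filtering itself is only an illustrative stand-in for how allow_patterns restricts what snapshot_download fetches:

```python
import fnmatch

# allow_patterns is crafted from model_filenames, so any file that the
# variant matcher failed to return is simply never requested from the Hub.
repo_files = [
    "unet/diffusion_pytorch_model.fp16-00001-of-00002.safetensors",  # legacy variant shard (missed by the matcher)
    "unet/diffusion_pytorch_model-00001-of-00002.safetensors",       # non-variant shard (picked)
    "vae/diffusion_pytorch_model.fp16.safetensors",                  # plain variant file (picked)
]
allow_patterns = [
    "unet/diffusion_pytorch_model-00001-of-00002.safetensors",
    "vae/diffusion_pytorch_model.fp16.safetensors",
]
downloaded = [f for f in repo_files if any(fnmatch.fnmatch(f, p) for p in allow_patterns)]
# `downloaded` lacks the legacy fp16 UNet shard, consistent with the
# subfolder listings printed earlier in this comment.
```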

@yiyixuxu (Collaborator):

I see!! Let's add a warning inside download too, then, and make sure _check_legacy_sharding_variant_format accepts either files or a folder.

@sayakpaul (Member Author):

@yiyixuxu done.

@yiyixuxu (Collaborator) left a comment:

thanks!

tests/models/test_modeling_common.py (outdated, resolved)
("hf-internal-testing/tiny-sd-unet-sharded-latest-format-subfolde", "unet"),
]
)
def test_variant_sharded_ckpt_loads_from_hub(self, repo_id, subfolder):
yiyixuxu (Collaborator):

This is a nice test! Let's also add to the @parameterized to test non-variant (if not already tested) and device_map.

@sayakpaul (Member Author) commented Sep 28, 2024:

1. Added parameterized to have subfolder and variant testing in all the sharding tests here: https://github.com/huggingface/diffusers/blob/main/tests/models/unets/test_models_unet_2d_condition.py
2. Modified this test to have non-variant checkpoints as well.

Ran everything with pytest tests/models/ -k "sharded" and it was green.

Commit: 1190f7d

@sayakpaul
Copy link
Member Author

Shipping this since I have an approval.

@sayakpaul sayakpaul merged commit 1154243 into main Sep 28, 2024
18 checks passed
@sayakpaul sayakpaul deleted the variant-tests branch September 28, 2024 04:27
leisuzz pushed a commit to leisuzz/diffusers that referenced this pull request Oct 11, 2024
* fix variant-idenitification.

* fix variant

* fix sharded variant checkpoint loading.

* Apply suggestions from code review

* fixes.

* more fixes.

* remove print.

* fixes

* fixes

* comments

* fixes

* apply suggestions.

* hub_utils.py

* fix test

* updates

* fixes

* fixes

* Apply suggestions from code review

Co-authored-by: YiYi Xu <[email protected]>

* updates.

* removep patch file.

---------

Co-authored-by: YiYi Xu <[email protected]>
@DN6 mentioned this pull request Nov 5, 2024