
Use huggingface_hub helper function to split state dict #31091

Merged
7 commits merged into main on Jun 12, 2024

Conversation

@SunMarc (Member) commented May 28, 2024

What does this PR do?

This PR uses the helper function from huggingface_hub to split a state dict into shards instead of `shard_checkpoint`. This will make maintenance easier for all HF libraries. Similar PRs have been opened in accelerate and diffusers.
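For readers unfamiliar with the hub helper, here is a minimal sketch of how `split_torch_state_dict_into_shards` from huggingface_hub can be used to save a sharded checkpoint. The wrapper function and file layout below are illustrative, not the exact code added in this PR:

```python
import os

from huggingface_hub import split_torch_state_dict_into_shards
from safetensors.torch import save_file


def save_sharded_checkpoint(state_dict, save_directory, max_shard_size="5GB"):
    # Let the hub helper decide how tensors are grouped into shard files.
    split = split_torch_state_dict_into_shards(state_dict, max_shard_size=max_shard_size)

    # Write each shard to its own safetensors file.
    for filename, tensor_names in split.filename_to_tensors.items():
        shard = {name: state_dict[name] for name in tensor_names}
        save_file(shard, os.path.join(save_directory, filename))

    # For sharded checkpoints, also return the index mapping each tensor to its shard file.
    if split.is_sharded:
        return {"metadata": split.metadata, "weight_map": split.tensor_to_filename}
    return None
```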

@HuggingFaceDocBuilderDev

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.

@LysandreJik (Member) left a comment

This looks good in theory to me; @Wauplin, could you give this a quick look as well?

@SunMarc (Member, Author) commented May 31, 2024

Could you make `get_storage_id` public/documented since it is used in multiple places (not only when cleaning the state dict) @Wauplin? Thanks!

@Wauplin (Contributor) left a comment

Looks good to me as well! Thanks for taking care of this @SunMarc :)

If I'm not mistaken there are similar methods for tf / flax in transformers that could benefit from it. Can be done in separate PRs though.

@SunMarc (Member, Author) commented May 31, 2024

> If I'm not mistaken there are similar methods for tf / flax in transformers that could benefit from it. Can be done in separate PRs though.

I'll have a look and do it in another PR after this PR gets merged and tested!

Just a quick question @Wauplin: the current helper function in huggingface-hub introduced a regression in transformers (not able to handle kiB, MiB). Is this fine? If so, I can modify the failing test: `test_checkpoint_sharding_local_bin`.

@Wauplin (Contributor) commented May 31, 2024

> Could you make `get_storage_id` public/documented since it is used in multiple places (not only when cleaning the state dict) @Wauplin?

@SunMarc Done in huggingface/huggingface_hub#2304

> The current helper function in huggingface-hub introduced a regression in transformers (not able to handle kiB, MiB). Is this fine?

I'm not a transformers maintainer but I would assume it's fine. I'm quite convinced those parameters are never used (except for the default value), and even less so with the kiB, MiB, etc. notation. If others feel strongly about it, we could somehow mitigate it directly in transformers (typically with a `max_size.lower().replace("mib", "mb").replace("gib", "gb")` + a deprecation warning). But I'd prefer not to update huggingface_hub for those.
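A rough sketch of the mitigation Wauplin describes, in case transformers wanted to keep accepting the old binary-unit strings; the helper name here is hypothetical, not code from this PR:

```python
import warnings


def _normalize_max_shard_size(max_shard_size):
    # Hypothetical compatibility shim: the hub's size parser does not accept
    # binary units such as "5MiB" or "2GiB", so rewrite them to decimal units
    # and warn that the old notation is deprecated.
    if isinstance(max_shard_size, str) and "ib" in max_shard_size.lower():
        warnings.warn(
            "Binary units (KiB/MiB/GiB) for `max_shard_size` are deprecated; use KB/MB/GB instead.",
            FutureWarning,
        )
        max_shard_size = (
            max_shard_size.lower().replace("kib", "kb").replace("mib", "mb").replace("gib", "gb")
        )
    return max_shard_size
```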

@SunMarc requested a review from amyeroberts June 10, 2024 15:38

@SunMarc (Member, Author) commented Jun 10, 2024

> I'm not a transformers maintainer but I would assume it's fine. I'm quite convinced those parameters are never used (except for the default value), and even less so with the kiB, MiB, etc. notation. If others feel strongly about it, we could somehow mitigate it directly in transformers (typically with a `max_size.lower().replace("mib", "mb").replace("gib", "gb")` + a deprecation warning). But I'd prefer not to update huggingface_hub for those.

WDYT @amyeroberts?

@amyeroberts (Collaborator)

> but I would assume it's fine. I'm quite convinced those parameters are never used (except for the default value), and even less so with the kiB, MiB, etc. notation. If others feel strongly about it, we could somehow mitigate it directly in transformers (typically with a `max_size.lower().replace("mib", "mb").replace("gib", "gb")` + a deprecation warning). But I'd prefer not to update huggingface_hub for those.

@Wauplin @SunMarc Sounds good to me! Agreed, I don't think we need to handle a deprecation cycle here.

@SunMarc merged commit 254b25a into main Jun 12, 2024
21 checks passed
@SunMarc deleted the shard_saving_from_hf_hub branch June 12, 2024 12:10
zucchini-nlp pushed a commit to zucchini-nlp/transformers that referenced this pull request Jun 14, 2024
Use huggingface_hub helper function to split state dict (#31091)

* shard saving from hf hub

* index = None

* fix tests

* indent
itazap pushed several commits that referenced this pull request between Jun 17 and Jun 20, 2024 (same commit message as above).