Fix serialization for offloaded model #31727
Could you implement a test that ensures that the previous faulty behavior has been fixed? :)
The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.
Done! Do you know where I can see the results of the tests in the test_modeling_utils.py file, @ydshieh? I didn't see the following test
Hi @SunMarc, let me check, but we might have a problem with some test files not being included 😭
Sounds good! Maybe we can put all the utils Python files into a separate folder so they are not mixed with the common test files.
I confirmed that we have some tests not running on daily CI.
Sounds like a quick solution. We just have to make sure there is no subclass of
Great!
Here it is: #31730, but I have to see how CI is going.
Looks good, thanks for the changes @SunMarc
What does this PR do?
This PR fixes the serialization of offloaded models. With this PR, the logic for saving an offloaded model is triggered only when the model uses more than one device and those devices include either the "cpu" or the "disk" device. Moreover, we fix a small bug that was introduced with this PR. We rename the `state_dict` variable to `shard_state_dict`. I also fixed the test `test_save_offloaded_model`.
Example:
Fixes #31685
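The trigger condition described above can be illustrated with a minimal sketch. It is not the code from this PR: the helper name `needs_offload_aware_save` is hypothetical, and the sketch assumes the device map is a plain dict from module names to devices (GPU indices such as `0`, or the strings `"cpu"` and `"disk"`).

```python
from typing import Dict, Optional, Union

Device = Union[int, str]


def needs_offload_aware_save(device_map: Optional[Dict[str, Device]]) -> bool:
    """Hypothetical helper: decide whether the offload-aware save path applies.

    Returns True only when the map spans more than one distinct device
    and at least one of those devices is "cpu" or "disk".
    """
    if not device_map:
        # No device map at all: nothing is offloaded.
        return False
    devices = set(device_map.values())
    spans_several_devices = len(devices) > 1
    has_offload_target = bool(devices & {"cpu", "disk"})
    return spans_several_devices and has_offload_target


# Illustrative device maps (module names are made up for the example).
print(needs_offload_aware_save({"encoder": 0, "decoder": "cpu"}))  # True
print(needs_offload_aware_save({"encoder": 0, "decoder": 1}))      # False
print(needs_offload_aware_save({"encoder": "cpu"}))                # False
```

Under these assumptions, a model placed entirely on one device, or split across GPUs only, would take the regular save path.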