Phi3 conversion OOM on A100 #44
Comments
Hi @a8nova, thanks for reporting the issue! High memory usage during the conversion process is a known issue, and it can get the conversion script killed. Which Phi-3 version are you converting, and how large is the checkpoint you are using? A free Colab instance may have only 12GB of free RAM, which isn't enough. Do you happen to have:
We are still actively working on fixing the memory issue. Sorry for the inconvenience!
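To put the 12GB figure in context, here is a rough back-of-the-envelope sketch in Python. It only compares raw weight size against available RAM. The 3.8B parameter count is the published size of Phi-3-mini and is an assumption here, since the thread does not say which Phi-3 version is being converted. The function names and `overhead_factor` are hypothetical, not part of ai-edge-torch, and the converter's real peak memory use is not known from this thread.

```python
import os
from typing import Optional


def estimate_weight_bytes(num_params: int, bytes_per_param: int = 4) -> int:
    """Raw size of the weights alone (4 bytes per parameter for float32)."""
    return num_params * bytes_per_param


def total_ram_bytes() -> Optional[int]:
    """Total physical RAM via sysconf (Linux); None where unsupported."""
    try:
        return os.sysconf("SC_PAGE_SIZE") * os.sysconf("SC_PHYS_PAGES")
    except (ValueError, OSError, AttributeError):
        return None


def fits_in_ram(num_params: int, available_bytes: int,
                bytes_per_param: int = 4, overhead_factor: float = 2.0) -> bool:
    """True if weights times a guessed overhead factor fit in available RAM.

    overhead_factor is a hypothetical multiplier for temporary copies made
    during conversion; it is an assumption, not a measured value.
    """
    needed = estimate_weight_bytes(num_params, bytes_per_param) * overhead_factor
    return needed <= available_bytes


if __name__ == "__main__":
    phi3_mini_params = 3_800_000_000  # assumption: Phi-3-mini
    print("float32 weights:", estimate_weight_bytes(phi3_mini_params) / 1e9, "GB")
    ram = total_ram_bytes()
    if ram is not None:
        print("fits with 2x overhead:", fits_in_ram(phi3_mini_params, ram))
```

Under these assumptions, the float32 weights alone come to about 15.2 GB, which already exceeds 12GB before any conversion overhead is counted.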
Also, from the conversion log, it seems the memory consumption is coming from CUDA. Are you able to try
Hi @haozha111 - Thank you for the quick response.
Let me try setting
I am also getting OOM when running with
Got it. Would you mind updating your branch with phi-3 so that we can fork it and try the conversion? Thanks!
Changes in the phi3 branch are up to date, so you should be able to check it out and run the conversion script. Note that I also had to make changes to loader.py and feed_forward.py. Please let me know if you run into any issues. Thank you!
Hi @haozha111 @vamsimanchala - Any updates on this? Thanks!
Hi @a8nova, we are making good progress on this issue. It requires some fixes in our converter stack, and we plan to post an update in the coming weeks. Thanks for your patience!
Hi, I am also encountering the same issue. Although I cannot share the model details, it appears to be getting killed at the same point as seen in the logs above. I am looking forward to a fix for this issue. Thanks!
Hi @mitsunami, are you converting on a Colab Pro instance or a local Linux workstation, and how much memory do you have? We are making great progress on reducing the converter's memory usage and will post an update on this issue soon. Thanks for your patience!
Hi @haozha111,
Hi @mitsunami, we recently landed some changes. Could you please try the conversion to TFLite again and let us know if things look good? Thank you for your patience.
Description of the bug:
I wanted to convert phi3, so I made the necessary changes in my own fork (main...a8nova:ai-edge-torch:phi3), but the OOM killer is nuking my process.
Full error attached:
phi3_conversion_error.txt
Actual vs expected behavior:
The OOM killer is nuking the conversion script on a Colab A100.
Any other information you'd like to share?