For very large models, multiple GPUs may be needed for quantization, but the max_memory arg appears to be broken. Everything should be handled by accelerate, and there should be no need for this arg. Investigate.
We will remove this. Advanced users should just use and pass device_map to accelerate. We should not be an arg-modifying proxy for accelerate. Passing accelerate config args to accelerate is cleaner and more powerful, and we don't have to maintain compat.
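As a rough illustration of what an explicit device_map looks like, here is a minimal sketch that spreads decoder layers evenly across GPUs. The helper name `build_device_map` and the Llama-style module names (`model.embed_tokens`, `model.layers.N`, `model.norm`, `lm_head`) are assumptions for illustration only; real module names depend on the model architecture.

```python
from typing import Dict


def build_device_map(num_layers: int, num_gpus: int) -> Dict[str, int]:
    """Assign decoder layers to GPUs in contiguous, roughly equal blocks.

    Module names below are hypothetical (Llama-style); adjust them to
    match the actual model before using the result.
    """
    if num_layers <= 0 or num_gpus <= 0:
        raise ValueError("num_layers and num_gpus must be positive")

    # Ceiling division: how many layers each GPU holds at most.
    per_gpu = -(-num_layers // num_gpus)

    # Embeddings live on the first GPU, next to the first layers.
    device_map: Dict[str, int] = {"model.embed_tokens": 0}
    for i in range(num_layers):
        device_map[f"model.layers.{i}"] = i // per_gpu

    # Final norm and output head go with the last decoder layer.
    last_device = device_map[f"model.layers.{num_layers - 1}"]
    device_map["model.norm"] = last_device
    device_map["lm_head"] = last_device
    return device_map


if __name__ == "__main__":
    for name, device in build_device_map(num_layers=4, num_gpus=2).items():
        print(name, "->", device)
```

A dict of this shape (module name mapped to device index) is the format accelerate works with for device_map. Instead of writing it by hand, accelerate's `infer_auto_device_map` can compute one from a model and a per-device memory budget, which covers the use case the max_memory arg was meant for.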
Originally posted by @Xu-Chen in #48 (comment)