Add support for GPTNeoX models #32
Commits on Sep 27, 2023
- 7baa4f7
- 41977df
Commits on Sep 28, 2023
- 9c9d0a2: [fix] some of the bugs preventing the fine-tune run (bugs remain in the attention dimension mismatch)
- a5111ef: [fix] dimension discrepancy between the attention mask and the query length (grouped-batch attention is skipped for now to avoid this problem)
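The mask/query-length discrepancy noted in the commit above typically appears during incremental decoding, where the query holds only the newest tokens while the mask spans the full key/value history. The PR sidesteps it by skipping grouped-batch attention; as a hypothetical illustration of the underlying shape problem (the helper name and approach are not from the PR), one way to reconcile the two is to trim the mask to the query length:

```python
import torch

def align_attention_mask(attention_mask: torch.Tensor, query_len: int) -> torch.Tensor:
    """Trim the attention mask so its sequence dimension matches the current
    query length. Assumes the query corresponds to the most recent positions
    of the cached key/value sequence (hypothetical sketch, not the PR's fix)."""
    # attention_mask: (batch, kv_len) -> keep the last `query_len` positions
    return attention_mask[:, -query_len:]

mask = torch.ones(2, 10)            # batch of 2, 10 cached positions
q_mask = align_attention_mask(mask, 3)
print(q_mask.shape)                 # torch.Size([2, 3])
```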
- 5862050
- 0cf0dfd
Commits on Sep 30, 2023
- 1532c4b
- 6fdffbb
- fe97f86
Commits on Oct 2, 2023
- 9e30a15: [add] torch autocast for flash attention safety (flash attention only supports fp16/bf16)
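Since flash-attention kernels accept only fp16/bf16 inputs, fp32 activations must be downcast before the kernel runs. A minimal sketch of the autocast wrapping described in the commit above, using `torch.nn.functional.scaled_dot_product_attention` as a stand-in for the actual flash-attention call (the function name and shapes are illustrative, not from the PR):

```python
import torch
import torch.nn.functional as F

def flash_safe_attention(q, k, v):
    # Flash-attention kernels only support fp16/bf16, so run the attention
    # call inside torch.autocast to downcast fp32 activations on the fly.
    # F.scaled_dot_product_attention stands in for a flash-attention kernel.
    device_type = "cuda" if q.is_cuda else "cpu"
    with torch.autocast(device_type=device_type, dtype=torch.bfloat16):
        return F.scaled_dot_product_attention(q, k, v)

q = k = v = torch.randn(1, 2, 4, 8)  # (batch, heads, seq_len, head_dim)
out = flash_safe_attention(q, k, v)
```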
- 3f9c47c: [fix] HF built-in rotary embedding is not compatible with flash-attention; the cos/sin cache tensor is not a trained parameter, so it is not cast along with the other model parameters via `torch_dtype`.
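The commit above points at a common pitfall: rotary cos/sin caches are buffers, not trained parameters, so loading a model with `torch_dtype=torch.bfloat16` does not convert them, and they can end up fp32 next to bf16 activations inside flash attention. A hedged sketch of building the cache explicitly in the target dtype (constants and helper name are illustrative, not the PR's code):

```python
import torch

def build_rotary_cache(dim: int, max_pos: int, dtype: torch.dtype):
    """Build rotary cos/sin caches directly in the target dtype so they
    match fp16/bf16 activations (illustrative sketch; 10000 is the usual
    rotary base, not necessarily the value used in the PR)."""
    inv_freq = 1.0 / (10000 ** (torch.arange(0, dim, 2).float() / dim))
    t = torch.arange(max_pos).float()
    freqs = torch.outer(t, inv_freq)          # (max_pos, dim // 2)
    emb = torch.cat((freqs, freqs), dim=-1)   # (max_pos, dim)
    # Cast at the end: the buffers never go through torch_dtype conversion.
    return emb.cos().to(dtype), emb.sin().to(dtype)

cos, sin = build_rotary_cache(64, 128, torch.bfloat16)
```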
- b21e949
- b224273: [rollback] torch.cuda autocast causes a half-precision error (it works fine without the torch.cuda autocast context, so the change was rolled back)
Commits on Oct 3, 2023
- 9123e42
- 7203de2
- 8a11ef8
- 02e4c1c