forked from NVIDIA/NeMo
Commit
* make nemo recognize sequence_parallel_size
  Signed-off-by: xren <[email protected]>
* add helper functions to set up SP running in TE
  Signed-off-by: xren <[email protected]>
* slice seq length for a specific rank
  Signed-off-by: Xiaowei Ren <[email protected]>
* fix data_parallel_size calculation
  Signed-off-by: Xiaowei Ren <[email protected]>
* minor change
  Signed-off-by: Xiaowei Ren <[email protected]>
* add missing argument of self
  Signed-off-by: Xiaowei Ren <[email protected]>
* pass sp_global_ranks to TE transformer layer
  Signed-off-by: Xiaowei Ren <[email protected]>
* fix nsys setting
  Signed-off-by: Xiaowei Ren <[email protected]>
* fix seq_len calculation
  Signed-off-by: xren <[email protected]>
* fix attn_mask split across seq-length dim
  Signed-off-by: xren <[email protected]>
* code update of input split
  Signed-off-by: xren <[email protected]>
* fix loss calculation
  Signed-off-by: xren <[email protected]>
* fix loss_mask_sum calculation
  Signed-off-by: xren <[email protected]>
* fix loss calculation
  Signed-off-by: xren <[email protected]>
* rename sequence_parallelism to context_parallelism
  Signed-off-by: xren <[email protected]>
* minor change
  Signed-off-by: xren <[email protected]>
* fix loss_mask_sum calculation
  Signed-off-by: xren <[email protected]>
* make sure not to call megatron-core parallel_state while cp_size is 1
  Signed-off-by: xren <[email protected]>
* slice position embedding for different CP rank
  Signed-off-by: xren <[email protected]>
* fix missing property decorator
  Signed-off-by: xren <[email protected]>
* typo fix
  Signed-off-by: xren <[email protected]>
* fix rpe_bias CP slicing
  Signed-off-by: xren <[email protected]>
* code style fix
  Signed-off-by: xren <[email protected]>
* fix loss_mask_sum calculation
  Signed-off-by: xren <[email protected]>
* do not load attention mask if it's not needed
  Signed-off-by: Xiaowei Ren <[email protected]>
* bug fix
  Signed-off-by: xren <[email protected]>
* fix ubuf size with CP > 1
  Signed-off-by: Xiaowei Ren <[email protected]>
* address naming confusion of mixed dp and cp
  Signed-off-by: xren <[email protected]>
* rewrite cp code by assuming with_context_parallel=False
  Signed-off-by: xren <[email protected]>
* pop context_parallel from dist opt kwargs
  Signed-off-by: xren <[email protected]>
* make sure amax reduction group is aware of context parallelism
  Signed-off-by: xren <[email protected]>
* remove use_fp8 from initialize_model_parallel
  Signed-off-by: xren <[email protected]>
* make implementations of setup_transformer_engine_tp_groups and setup_transformer_engine_cp_running consistent
  Signed-off-by: xren <[email protected]>
* cp function renaming
  Signed-off-by: xren <[email protected]>
* make loss logging broadcast aware of cp
  Signed-off-by: xren <[email protected]>
* fix a typo
  Signed-off-by: Xiaowei Ren <[email protected]>
* var name fix
  Signed-off-by: Xiaowei Ren <[email protected]>
* import transformer layer specs from MCore
  Signed-off-by: Xiaowei Ren <[email protected]>
* upgrade MCore version
  Signed-off-by: Xiaowei Ren <[email protected]>
* add context_parallel into the kwargs of dist opt
  Signed-off-by: Xiaowei Ren <[email protected]>
* remove redundant cp check
  Signed-off-by: Xiaowei Ren <[email protected]>
* [pre-commit.ci] auto fixes from pre-commit.com hooks
  for more information, see https://pre-commit.ci
* code style fix
  Signed-off-by: Xiaowei Ren <[email protected]>
* recover docker file
  Signed-off-by: Xiaowei Ren <[email protected]>
* fix seq_length of CP
  Signed-off-by: Xiaowei Ren <[email protected]>
* recover seq-length which has been fixed in mcore
  Signed-off-by: Xiaowei Ren <[email protected]>
* function name fix
  Signed-off-by: Xiaowei Ren <[email protected]>

---------

Signed-off-by: xren <[email protected]>
Signed-off-by: Xiaowei Ren <[email protected]>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
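Several of the entries above (slicing the sequence for a specific rank, the data_parallel_size fix, and the repeated loss_mask_sum fixes) concern the same arithmetic. The diff itself is not shown on this page, so the following is only a minimal pure-Python sketch of that arithmetic, not the code from this commit. All function names are hypothetical, and the contiguous split is an assumption; the actual implementation may partition the sequence differently.

```python
def data_parallel_size(world_size, tp_size, pp_size, cp_size=1):
    """Ranks left for data parallelism once TP, PP and CP are accounted for."""
    model_parallel = tp_size * pp_size * cp_size
    if world_size % model_parallel != 0:
        raise ValueError("world_size must be divisible by tp_size * pp_size * cp_size")
    return world_size // model_parallel


def slice_for_cp_rank(seq, cp_rank, cp_size):
    """Return the contiguous chunk of `seq` owned by `cp_rank` (assumed layout)."""
    if len(seq) % cp_size != 0:
        raise ValueError("sequence length must be divisible by cp_size")
    chunk = len(seq) // cp_size
    return seq[cp_rank * chunk:(cp_rank + 1) * chunk]


def masked_mean_loss_across_cp(token_losses, loss_mask, cp_size):
    """Simulate CP ranks: each one sums its own slice, then the sums are combined.

    Dividing the combined loss sum by the combined mask sum reproduces the
    masked mean over the full sequence. Averaging per-rank means instead
    would be wrong whenever ranks hold different numbers of unmasked tokens.
    """
    loss_sum = 0.0
    mask_sum = 0.0
    for rank in range(cp_size):  # stands in for a reduction over the CP group
        losses = slice_for_cp_rank(token_losses, rank, cp_size)
        mask = slice_for_cp_rank(loss_mask, rank, cp_size)
        loss_sum += sum(l * m for l, m in zip(losses, mask))
        mask_sum += sum(mask)
    if mask_sum == 0:
        raise ValueError("loss mask has no unmasked tokens")
    return loss_sum / mask_sum
```

For instance, under these assumptions 16 ranks with tp_size=2, pp_size=2 and cp_size=2 leave a data-parallel size of 2, and each CP rank would hold half of every sequence.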
1 parent 76a712a, commit 58d6bce
Showing 12 changed files with 226 additions and 56 deletions.