Change the megatron config lr scheduler default and fix to change partitions script #8094

Merged
titu1994 merged 3 commits into main from config-fix
Dec 29, 2023

Conversation

shan18
Member

@shan18 shan18 commented Dec 28, 2023

Two fixes:

  • Change the default to optim.sched.constant_steps=0.
  • Fix the change partitions script, which threw an error when sequence_parallel=true was set in the config and the model was being converted to TP=1.

Signed-off-by: Shantanu Acharya <[email protected]>
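
For readers unfamiliar with the second fix, here is a minimal sketch of the idea, not the code from this PR. It assumes the model config can be treated as a plain dict; the helper name `prepare_config_for_target_tp` and the `tensor_model_parallel_size` key are illustrative assumptions, and only `sequence_parallel` is taken from the description above.

```python
import copy


def prepare_config_for_target_tp(model_cfg: dict, target_tp: int) -> dict:
    """Return a copy of model_cfg adjusted for the target tensor-parallel size.

    Hypothetical helper for illustration only; not part of NeMo.
    """
    cfg = copy.deepcopy(model_cfg)
    # Assumed key name for the tensor-parallel size in the config.
    cfg["tensor_model_parallel_size"] = target_tp
    # Sequence parallelism splits activations across tensor-parallel ranks,
    # so with a single rank there is nothing to split; turn it off.
    if target_tp == 1 and cfg.get("sequence_parallel", False):
        cfg["sequence_parallel"] = False
    return cfg


original = {"tensor_model_parallel_size": 4, "sequence_parallel": True}
converted = prepare_config_for_target_tp(original, target_tp=1)
print(converted)
```

The input config is copied rather than mutated, so the original settings remain available if the conversion has to be retried with a different target size.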
@shan18 shan18 requested a review from titu1994 December 28, 2023 14:28
@github-actions github-actions bot added the NLP label Dec 28, 2023
@shan18 shan18 requested a review from bmwshop December 28, 2023 14:30
bmwshop
bmwshop previously approved these changes Dec 28, 2023
Collaborator

@bmwshop bmwshop left a comment

LGTM.

titu1994
titu1994 previously approved these changes Dec 28, 2023
@titu1994
Collaborator

jenkins

Signed-off-by: Shantanu Acharya <[email protected]>
@shan18 shan18 dismissed stale reviews from titu1994 and bmwshop via 339ed4e December 28, 2023 20:38
@shan18 shan18 requested a review from bmwshop December 28, 2023 20:38
Collaborator

@bmwshop bmwshop left a comment

+1 for the convert_nemo_gpt_to_mcore.py fix!

@titu1994 titu1994 merged commit 7faeee8 into main Dec 29, 2023
14 checks passed
@titu1994 titu1994 deleted the config-fix branch December 29, 2023 17:38
pzelasko pushed a commit to pzelasko/NeMo that referenced this pull request Jan 3, 2024
…titions script (NVIDIA#8094)

* fix the config file issues

Signed-off-by: Shantanu Acharya <[email protected]>

* disable sequence parallel for mcore

Signed-off-by: Shantanu Acharya <[email protected]>

---------

Signed-off-by: Shantanu Acharya <[email protected]>
Signed-off-by: Piotr Żelasko <[email protected]>
ssh-meister pushed a commit to ssh-meister/NeMo that referenced this pull request Feb 15, 2024
Signed-off-by: Sasha Meister <[email protected]>
rohitrango pushed a commit to rohitrango/NeMo that referenced this pull request Jun 25, 2024