Minor quantization pipeline updates #8924
Merged
jenkins |
janekl force-pushed the jlasek/quantization_updates branch from 09e2692 to 0132eeb on April 16, 2024 13:48
oyilmaz-nvidia previously approved these changes on Apr 16, 2024
LGTM.
jenkins |
janekl force-pushed the jlasek/quantization_updates branch 3 times, most recently from e1de5d2 to a9254ed on April 17, 2024 13:55
oyilmaz-nvidia approved these changes on Apr 17, 2024
LGTM. We can merge once the CI passes.
Signed-off-by: Jan Lasek <[email protected]>
Signed-off-by: Jan Lasek <[email protected]>
Signed-off-by: Jan Lasek <[email protected]>
janekl force-pushed the jlasek/quantization_updates branch from a9254ed to 9e5a418 on April 18, 2024 08:12
marcromeyn pushed a commit that referenced this pull request on Apr 22, 2024
* Detect 'arcname' prefix in utils when handling .nemo tarball Signed-off-by: Jan Lasek <[email protected]>
* Address megatron_amp_O2 = True case in quantization Signed-off-by: Jan Lasek <[email protected]>
* Add Megatron-LM to PYTHONPATH correctly in Jenkinsfile Signed-off-by: Jan Lasek <[email protected]>
--------- Signed-off-by: Jan Lasek <[email protected]> Signed-off-by: Marc Romeyn <[email protected]>
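The first commit in the message above mentions detecting an 'arcname' prefix when reading a .nemo tarball. A minimal sketch of that idea follows, assuming the prefix in question is a leading `./` on archive member names (the commit message does not spell this out). It uses only Python's standard `tarfile` module; `find_member`, `read_text`, and `make_tar` are hypothetical helper names for illustration, not NeMo functions.

```python
import io
import tarfile


def find_member(tar: tarfile.TarFile, name: str) -> tarfile.TarInfo:
    """Return the member for `name`, stored with or without a leading './'."""
    # Assumption: archives differ only in whether members carry a './' prefix.
    for candidate in (f"./{name}", name):
        try:
            return tar.getmember(candidate)
        except KeyError:
            continue
    raise KeyError(f"{name!r} not found in archive")


def read_text(tar_bytes: bytes, name: str) -> str:
    """Read a text file from an in-memory, uncompressed tarball."""
    with tarfile.open(fileobj=io.BytesIO(tar_bytes), mode="r:") as tar:
        member = find_member(tar, name)
        return tar.extractfile(member).read().decode("utf-8")


def make_tar(arcname: str, payload: bytes) -> bytes:
    """Build a one-file tarball in memory (demo helper only)."""
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w") as tar:
        info = tarfile.TarInfo(name=arcname)
        info.size = len(payload)
        tar.addfile(info, io.BytesIO(payload))
    return buf.getvalue()


if __name__ == "__main__":
    cfg = b"precision: bf16\n"  # made-up config content for the demo
    for arcname in ("./model_config.yaml", "model_config.yaml"):
        print(arcname, "->", read_text(make_tar(arcname, cfg), "model_config.yaml"))
```

Trying both spellings of the member name keeps the lookup independent of how the archive was packed, which is presumably the point of the change; the actual implementation in the PR may differ.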
marcromeyn added a commit that referenced this pull request on Apr 22, 2024
* Adding MegatronParallel Signed-off-by: Marc Romeyn <[email protected]>
* Minor quantization pipeline updates (#8924)
  * Detect 'arcname' prefix in utils when handling .nemo tarball Signed-off-by: Jan Lasek <[email protected]>
  * Address megatron_amp_O2 = True case in quantization Signed-off-by: Jan Lasek <[email protected]>
  * Add Megatron-LM to PYTHONPATH correctly in Jenkinsfile Signed-off-by: Jan Lasek <[email protected]>
  --------- Signed-off-by: Jan Lasek <[email protected]> Signed-off-by: Marc Romeyn <[email protected]>
* Fix converter (#8960) Signed-off-by: yaoyu-33 <[email protected]> Signed-off-by: Marc Romeyn <[email protected]>
* Fix memory leak at loss func (#8868)
  * PR #8803: Update embedding init prototype to match mc Signed-off-by: Jaemin Choi <[email protected]>
  * PR #8810: Fix import of get_gpt_layer_ammo_spec Signed-off-by: Jaemin Choi <[email protected]>
  * PR #8853: Fix memory leak at loss func Signed-off-by: Jaemin Choi <[email protected]>
  --------- Signed-off-by: Jaemin Choi <[email protected]> Signed-off-by: Shriya Palsamudram <[email protected]> Co-authored-by: Jaemin Choi <[email protected]> Co-authored-by: Eric Harper <[email protected]> Co-authored-by: Shriya Palsamudram <[email protected]> Co-authored-by: Pablo Garay <[email protected]> Signed-off-by: Marc Romeyn <[email protected]>
* PP support in LoRA merge script (#8934)
  * initial commit Signed-off-by: Chen Cui <[email protected]>
  * enable pp support for merge script and fix output precision Signed-off-by: Chen Cui <[email protected]>
  * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci
  * remove incomplete script for next release Signed-off-by: Chen Cui <[email protected]>
  --------- Signed-off-by: Chen Cui <[email protected]> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Co-authored-by: Adi Renduchintala <[email protected]> Co-authored-by: Eric Harper <[email protected]> Signed-off-by: Marc Romeyn <[email protected]>
* Mingyuanm/sdxl export (#8926)
  * Move cached embedding devices and dtype for onnx export consistency Signed-off-by: Mingyuan Ma <[email protected]>
  * Add old trt export/inference script, currently not working in latest container. Signed-off-by: Mingyuan Ma <[email protected]>
  * Add NeMo TRT inference pipeline and quatization workflow Signed-off-by: Mingyuan Ma <[email protected]>
  * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci
  * Add guards to avoid undefined variables Signed-off-by: Mingyuan Ma <[email protected]>
  * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci
  * minor fix Signed-off-by: Mingyuan Ma <[email protected]>
  * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci
  * Add conversion script from hf sdxl to nemo sdxl Signed-off-by: Mingyuan Ma <[email protected]>
  * Update quantize pipeline to adapt to variable image dimension Signed-off-by: Mingyuan Ma <[email protected]>
  * update sdxl pipeline to be aware of additional emb channels Signed-off-by: Mingyuan Ma <[email protected]>
  * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci
  * add guards for potential local var Signed-off-by: Mingyuan Ma <[email protected]>
  * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci
  * copyright header Signed-off-by: Mingyuan Ma <[email protected]>
  * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci
  * Update calib prompt file path Signed-off-by: Mingyuan Ma <[email protected]>
  * Update file paths Signed-off-by: Mingyuan Ma <[email protected]>
  * minor update Signed-off-by: Mingyuan Ma <[email protected]>
  * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci
  * Update default quantization config Signed-off-by: Mingyuan Ma <[email protected]>
  * remove unused imports/vars Signed-off-by: Mingyuan Ma <[email protected]>
  * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci
  * remove unused imports Signed-off-by: Mingyuan Ma <[email protected]>
  --------- Signed-off-by: Mingyuan Ma <[email protected]> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Signed-off-by: Marc Romeyn <[email protected]>
* Avoid unpacking NeMo checkpoints before exporting to TRT-LLM (#8866)
  * Replaced unpacking of nemo checkpoints on export with a VFS-like TarPath object. Signed-off-by: Alexey Panteleev <[email protected]>
  * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci
  * Fixed the signature of ZarrPathStore.__delitem__ Signed-off-by: Alexey Panteleev <[email protected]>
  --------- Signed-off-by: Alexey Panteleev <[email protected]> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Co-authored-by: Onur Yilmaz <[email protected]> Signed-off-by: Marc Romeyn <[email protected]>
* update (#8978) Signed-off-by: eharper <[email protected]> Signed-off-by: Marc Romeyn <[email protected]>
* change the condition for get qkv tensor from linear_qkv output (#8965) Signed-off-by: HuiyingLi <[email protected]> Co-authored-by: Adi Renduchintala <[email protected]> Signed-off-by: Marc Romeyn <[email protected]>
* Update Latest News (#8837)
  * Update Latest News Adds links to articles on
    * NeMo framework on GKE
    * Responsible Gen AI using NeMo and Picasso
    * NeMo powering Amazon Titan foundation models
    Signed-off-by: Shashank Verma <[email protected]>
  * Minor updates to latest news in README
    * Remove bullets
    * Editing text for clarity
    Signed-off-by: Shashank Verma <[email protected]>
  * Format latest news as a dropdown list
    * Uses embedded html to format news to dropdown, hiding lengthy details
    * Fixes formatting of the title
    Signed-off-by: Shashank Verma <[email protected]>
  * Add break to improve readability of latest news image Signed-off-by: Shashank Verma <[email protected]>
  * Add LLM and MM section in latest news Signed-off-by: Shashank Verma <[email protected]>
  * Add margin in latest news expandable lists Signed-off-by: Shashank Verma <[email protected]>
  * Remove styling of expandable list
    * Github appears to not render styled elements when embedded as raw html in rst
    Signed-off-by: Shashank Verma <[email protected]>
  * Fold the first news item by default Signed-off-by: Shashank Verma <[email protected]>
  --------- Signed-off-by: Shashank Verma <[email protected]> Signed-off-by: Shashank Verma <[email protected]> Signed-off-by: Marc Romeyn <[email protected]>
* Fix incorrect link to latest news in README (#8985) Signed-off-by: Shashank Verma <[email protected]> Signed-off-by: Marc Romeyn <[email protected]>
* [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci Signed-off-by: Marc Romeyn <[email protected]>
* make unit tests works Signed-off-by: Chen Cui <[email protected]> Signed-off-by: Marc Romeyn <[email protected]>
* [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci Signed-off-by: Marc Romeyn <[email protected]>
* add pytest-mock to unit test reqs Signed-off-by: Chen Cui <[email protected]> Signed-off-by: Marc Romeyn <[email protected]>
* Enable using hybrid asr models in CTC Segmentation tool (#8828)
  * enable using hybrid asr models in ctc segmentation tool Signed-off-by: Elena Rastorgueva <[email protected]>
  * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci
  --------- Signed-off-by: Elena Rastorgueva <[email protected]> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Signed-off-by: Marc Romeyn <[email protected]>
* Add safety checks for 'data' key in MegatronGPTModel cfg (#8991) Signed-off-by: HuiyingLi <[email protected]> Signed-off-by: Marc Romeyn <[email protected]>
* address some comments Signed-off-by: Chen Cui <[email protected]> Signed-off-by: Marc Romeyn <[email protected]>
* TDT confidence fix (#8982)
  * tdt confidence fix
  --------- Signed-off-by: Aleksandr Laptev <[email protected]> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Signed-off-by: Marc Romeyn <[email protected]>
* Address PR comments Signed-off-by: Marc Romeyn <[email protected]>
---------
Signed-off-by: Marc Romeyn <[email protected]> Signed-off-by: Jan Lasek <[email protected]> Signed-off-by: yaoyu-33 <[email protected]> Signed-off-by: Jaemin Choi <[email protected]> Signed-off-by: Shriya Palsamudram <[email protected]> Signed-off-by: Chen Cui <[email protected]> Signed-off-by: Mingyuan Ma <[email protected]> Signed-off-by: Alexey Panteleev <[email protected]> Signed-off-by: eharper <[email protected]> Signed-off-by: HuiyingLi <[email protected]> Signed-off-by: Shashank Verma <[email protected]> Signed-off-by: Shashank Verma <[email protected]> Signed-off-by: Elena Rastorgueva <[email protected]> Signed-off-by: Aleksandr Laptev <[email protected]>
Co-authored-by: Marc Romeyn <[email protected]> Co-authored-by: Jan Lasek <[email protected]> Co-authored-by: yaoyu-33 <[email protected]> Co-authored-by: Jaemin Choi <[email protected]> Co-authored-by: Jaemin Choi <[email protected]> Co-authored-by: Eric Harper <[email protected]> Co-authored-by: Shriya Palsamudram <[email protected]> Co-authored-by: Pablo Garay <[email protected]> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Co-authored-by: Adi Renduchintala <[email protected]> Co-authored-by: Ming <[email protected]> Co-authored-by: Alexey Panteleev <[email protected]> Co-authored-by: Onur Yilmaz <[email protected]> Co-authored-by: Huiying <[email protected]> Co-authored-by: Shashank Verma <[email protected]> Co-authored-by: Shashank Verma <[email protected]> Co-authored-by: Elena Rastorgueva <[email protected]> Co-authored-by: Aleksandr Laptev <[email protected]>
xingyaoww pushed a commit to xingyaoww/NeMo that referenced this pull request on Apr 23, 2024
xingyaoww pushed a commit to xingyaoww/NeMo that referenced this pull request on Apr 23, 2024
alxzhang-amazon pushed a commit to alxzhang-amazon/NeMo that referenced this pull request on Apr 26, 2024
alxzhang-amazon pushed a commit to alxzhang-amazon/NeMo that referenced this pull request on Apr 26, 2024
galv pushed a commit to galv/NeMo that referenced this pull request on Apr 29, 2024
* Detect 'arcname' prefix in utils when handling .nemo tarball
  Signed-off-by: Jan Lasek <[email protected]>
* Address megatron_amp_O2 = True case in quantization
  Signed-off-by: Jan Lasek <[email protected]>
* Add Megatron-LM to PYTHONPATH correctly in Jenkinsfile
  Signed-off-by: Jan Lasek <[email protected]>
---------
Signed-off-by: Jan Lasek <[email protected]>
galv pushed a commit to galv/NeMo that referenced this pull request on Apr 29, 2024
suiyoubi pushed a commit that referenced this pull request on May 2, 2024
* Detect 'arcname' prefix in utils when handling .nemo tarball
  Signed-off-by: Jan Lasek <[email protected]>
* Address megatron_amp_O2 = True case in quantization
  Signed-off-by: Jan Lasek <[email protected]>
* Add Megatron-LM to PYTHONPATH correctly in Jenkinsfile
  Signed-off-by: Jan Lasek <[email protected]>
---------
Signed-off-by: Jan Lasek <[email protected]>
Signed-off-by: Ao Tang <[email protected]>
suiyoubi pushed a commit that referenced this pull request on May 2, 2024
rohitrango pushed a commit to rohitrango/NeMo that referenced this pull request on Jun 25, 2024
* Detect 'arcname' prefix in utils when handling .nemo tarball
  Signed-off-by: Jan Lasek <[email protected]>
* Address megatron_amp_O2 = True case in quantization
  Signed-off-by: Jan Lasek <[email protected]>
* Add Megatron-LM to PYTHONPATH correctly in Jenkinsfile
  Signed-off-by: Jan Lasek <[email protected]>
---------
Signed-off-by: Jan Lasek <[email protected]>
rohitrango pushed a commit to rohitrango/NeMo that referenced this pull request on Jun 25, 2024
What does this PR do?
Minor updates to quantization pipeline.
Collection: NLP
Changelog
Address the megatron_amp_O2 = True case
Jenkins CI
To run Jenkins, a NeMo User with write access must comment jenkins on the PR.
Before your PR is "Ready for review"
Pre checks:
PR Type:
If you haven't finished some of the above items, you can still open a "Draft" PR.
Who can review?
Anyone in the community is free to review the PR once the checks have passed.
The Contributor guidelines list specific people who can review PRs in various areas.
Additional Information
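For readers unfamiliar with the first commit in this PR ("Detect 'arcname' prefix in utils when handling .nemo tarball"), here is a minimal sketch of the general idea, not the code merged in this PR. It assumes that a tarball may store its members either as "./name" or as plain "name", depending on how it was created, and that the reader wants to handle both. The helper names `detect_member_prefix`, `build_archive`, and `read_config`, the probe file name `model_config.yaml`, and the payload are illustrative assumptions only.

```python
import io
import tarfile


def detect_member_prefix(tar: tarfile.TarFile, probe: str = "model_config.yaml") -> str:
    """Return "./" if the archive stores `probe` as "./probe", else "".

    Hypothetical helper: the probe file name is an assumption for this sketch.
    """
    names = set(tar.getnames())
    if f"./{probe}" in names:
        return "./"
    if probe in names:
        return ""
    raise FileNotFoundError(f"{probe} not found in archive")


def build_archive(prefix: str) -> bytes:
    """Create a tiny in-memory tarball whose single member uses the given prefix."""
    payload = b"precision: bf16\n"  # placeholder content, not a real config
    buffer = io.BytesIO()
    with tarfile.open(fileobj=buffer, mode="w") as tar:
        info = tarfile.TarInfo(name=f"{prefix}model_config.yaml")
        info.size = len(payload)
        tar.addfile(info, io.BytesIO(payload))
    return buffer.getvalue()


def read_config(archive: bytes) -> bytes:
    """Read the probe file regardless of which member-name layout the archive uses."""
    with tarfile.open(fileobj=io.BytesIO(archive), mode="r") as tar:
        prefix = detect_member_prefix(tar)
        member = tar.extractfile(f"{prefix}model_config.yaml")
        return member.read()
```

Calling `read_config(build_archive("./"))` and `read_config(build_archive(""))` should return the same bytes, which is the point of detecting the prefix once and reusing it for every member lookup.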