ImportError: cannot import name '_TORCH_GREATER_EQUAL_2_0' from 'lightning_fabric.utilities.imports' (/usr/local/lib/python3.10/dist-packages/lightning_fabric/utilities/imports.py)
#10478
Closed
Desperadoze opened this issue
Sep 13, 2024
· 2 comments
Describe the bug
After pulling the nvcr.io/nvidia/nemo:24.07 image, I tried to run pretraining with /opt/NeMo/examples/nlp/language_modeling/megatron_gpt_pretraining.py, but it fails with:
ImportError: cannot import name '_TORCH_GREATER_EQUAL_2_0' from 'lightning_fabric.utilities.imports' (/usr/local/lib/python3.10/dist-packages/lightning_fabric/utilities/imports.py)
The same traing.sh script does not produce this error with the nemo:24.03 image.
Steps/Code to reproduce bug
srun --output /xxxxxxxx.out --error xxxxxxxx.err --container-image dockerd://nvcr.io/nvidia/nemo:24.07 --container-mounts /gpfsprd/:/gpfsprd/,/apps/idx_files/:/apps/idx_files/ --no-container-mount-home --mpi=pmi2 --container-writable bash -c "
CUDA_DEVICE_MAX_CONNECTIONS=1 CUDA_VISIBLE_DEVICES=0,1,2,3,4,5,6,7 python3 -u /opt/NeMo/examples/nlp/language_modeling/megatron_gpt_pretraining.py \
--config-path=xxxxxxxx \
--config-name=******.yaml"
Expected behavior
Traceback (most recent call last):
  File "/opt/NeMo/examples/nlp/language_modeling/megatron_gpt_pretraining.py", line 23, in <module>
    from nemo.collections.nlp.models.language_modeling.megatron_gpt_model import MegatronGPTModel
  File "/opt/NeMo/nemo/collections/nlp/__init__.py", line 15, in <module>
    from nemo.collections.nlp import data, losses, models, modules
  File "/opt/NeMo/nemo/collections/nlp/data/__init__.py", line 15, in <module>
    from nemo.collections.nlp.data.data_utils import *
  File "/opt/NeMo/nemo/collections/nlp/data/data_utils/__init__.py", line 15, in <module>
    from nemo.collections.nlp.data.data_utils.data_preprocessing import *
  File "/opt/NeMo/nemo/collections/nlp/data/data_utils/data_preprocessing.py", line 28, in <module>
    from nemo.utils import logging
  File "/opt/NeMo/nemo/utils/__init__.py", line 32, in <module>
    from nemo.utils.lightning_logger_patch import add_memory_handlers_to_pl_logger
  File "/opt/NeMo/nemo/utils/lightning_logger_patch.py", line 18, in <module>
    import pytorch_lightning as pl
  File "/usr/local/lib/python3.10/dist-packages/pytorch_lightning/__init__.py", line 27, in <module>
    from pytorch_lightning.callbacks import Callback  # noqa: E402
  File "/usr/local/lib/python3.10/dist-packages/pytorch_lightning/callbacks/__init__.py", line 29, in <module>
    from pytorch_lightning.callbacks.pruning import ModelPruning
  File "/usr/local/lib/python3.10/dist-packages/pytorch_lightning/callbacks/pruning.py", line 32, in <module>
    from pytorch_lightning.core.module import LightningModule
  File "/usr/local/lib/python3.10/dist-packages/pytorch_lightning/core/__init__.py", line 16, in <module>
    from pytorch_lightning.core.module import LightningModule
  File "/usr/local/lib/python3.10/dist-packages/pytorch_lightning/core/module.py", line 63, in <module>
    from pytorch_lightning.trainer import call
  File "/usr/local/lib/python3.10/dist-packages/pytorch_lightning/trainer/__init__.py", line 17, in <module>
    from pytorch_lightning.trainer.trainer import Trainer
  File "/usr/local/lib/python3.10/dist-packages/pytorch_lightning/trainer/trainer.py", line 38, in <module>
    from lightning_fabric.utilities.imports import _TORCH_GREATER_EQUAL_2_0
ImportError: cannot import name '_TORCH_GREATER_EQUAL_2_0' from 'lightning_fabric.utilities.imports' (/usr/local/lib/python3.10/dist-packages/lightning_fabric/utilities/imports.py)
Environment overview (please complete the following information)
Docker version 24.0.5, build ced0996
docker pull nvcr.io/nvidia/nemo:24.07
Additional context
The same error occurs on H100 and A800 GPUs with CUDA 12.2, 12.4, and 12.6.
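The last frame of the traceback shows pytorch_lightning asking lightning_fabric for a name that the installed lightning_fabric does not export. One possible reading, which the report itself does not confirm, is that the two packages inside the container are at different versions. A minimal diagnostic sketch follows. The package list and the helper names installed_versions and has_symbol are illustrative assumptions, not part of NeMo or Lightning:

```python
import importlib
from importlib.metadata import PackageNotFoundError, version

# Distribution names assumed to be relevant to the traceback; adjust as needed.
PACKAGES = ("torch", "pytorch-lightning", "lightning-fabric", "lightning")


def installed_versions(names=PACKAGES):
    """Return {distribution name: version string, or None if not installed}."""
    found = {}
    for name in names:
        try:
            found[name] = version(name)
        except PackageNotFoundError:
            found[name] = None
    return found


def has_symbol(module_name, symbol):
    """Return True/False if the module imports, or None if it cannot be imported."""
    try:
        module = importlib.import_module(module_name)
    except ImportError:
        return None
    return hasattr(module, symbol)


if __name__ == "__main__":
    for name, ver in installed_versions().items():
        print(f"{name}: {ver or 'not installed'}")
    # The name that the failing import in the traceback asks for.
    present = has_symbol("lightning_fabric.utilities.imports", "_TORCH_GREATER_EQUAL_2_0")
    print(f"_TORCH_GREATER_EQUAL_2_0 exported by lightning_fabric: {present}")
```

Running this inside both the 24.07 and 24.03 containers and comparing the output would show whether pytorch-lightning and lightning-fabric report different versions in the failing image. If they do, that mismatch is a reasonable first thing to rule out.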