ImportError: cannot import name '_TORCH_GREATER_EQUAL_2_0' from 'lightning_fabric.utilities.imports' (/usr/local/lib/python3.10/dist-packages/lightning_fabric/utilities/imports.py) #10478

Desperadoze · 2024-09-13T03:06:37Z

Describe the bug

After pulling nvcr.io/nvidia/nemo:24.07 with docker pull, I tried to run pretraining with /opt/NeMo/examples/nlp/language_modeling/megatron_gpt_pretraining.py, but it fails with:

ImportError: cannot import name '_TORCH_GREATER_EQUAL_2_0' from 'lightning_fabric.utilities.imports' (/usr/local/lib/python3.10/dist-packages/lightning_fabric/utilities/imports.py)

The same training script, run with the nemo:24.03 image, does not produce this error.
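Since the failure appears in 24.07 but not in 24.03, one way to narrow it down is to capture `pip freeze` output inside each container and compare the Lightning-related pins. The sketch below is only an illustration: `parse_freeze` and `diff_pins` are hypothetical helper names, and no version numbers from either image are assumed.

```python
def parse_freeze(text: str) -> dict:
    """Parse `pip freeze` output into {normalized_package_name: version}."""
    pins = {}
    for line in text.splitlines():
        line = line.strip()
        # Skip blanks, comments, and entries without an exact pin
        # (for example editable or URL-based installs).
        if not line or line.startswith("#") or "==" not in line:
            continue
        name, version = line.split("==", 1)
        # Treat underscores and hyphens as equivalent in package names.
        pins[name.strip().lower().replace("_", "-")] = version.strip()
    return pins


def diff_pins(old: dict, new: dict, keyword: str = "lightning") -> dict:
    """Return {name: (old_version, new_version)} for packages whose
    name contains `keyword` and whose pinned version differs."""
    names = {n for n in old.keys() | new.keys() if keyword in n}
    return {
        n: (old.get(n), new.get(n))
        for n in sorted(names)
        if old.get(n) != new.get(n)
    }
```

Feeding the two captured outputs through `parse_freeze` and then `diff_pins` lists only the Lightning-related packages that changed between the images. A package missing from one image shows up with `None` on that side.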

Steps/Code to reproduce bug

srun --output /xxxxxxxx.out --error xxxxxxxx.err --container-image dockerd://nvcr.io/nvidia/nemo:24.07 --container-mounts /gpfsprd/:/gpfsprd/,/apps/idx_files/:/apps/idx_files/ --no-container-mount-home --mpi=pmi2 --container-writable bash -c "

CUDA_DEVICE_MAX_CONNECTIONS=1 CUDA_VISIBLE_DEVICES=0,1,2,3,4,5,6,7 python3 -u /opt/NeMo/examples/nlp/language_modeling/megatron_gpt_pretraining.py
--config-path=xxxxxxxx
--config-name=******.yaml

Expected behavior

Traceback (most recent call last):
  File "/opt/NeMo/examples/nlp/language_modeling/megatron_gpt_pretraining.py", line 23, in <module>
    from nemo.collections.nlp.models.language_modeling.megatron_gpt_model import MegatronGPTModel
  File "/opt/NeMo/nemo/collections/nlp/__init__.py", line 15, in <module>
    from nemo.collections.nlp import data, losses, models, modules
  File "/opt/NeMo/nemo/collections/nlp/data/__init__.py", line 15, in <module>
    from nemo.collections.nlp.data.data_utils import *
  File "/opt/NeMo/nemo/collections/nlp/data/data_utils/__init__.py", line 15, in <module>
    from nemo.collections.nlp.data.data_utils.data_preprocessing import *
  File "/opt/NeMo/nemo/collections/nlp/data/data_utils/data_preprocessing.py", line 28, in <module>
    from nemo.utils import logging
  File "/opt/NeMo/nemo/utils/__init__.py", line 32, in <module>
    from nemo.utils.lightning_logger_patch import add_memory_handlers_to_pl_logger
  File "/opt/NeMo/nemo/utils/lightning_logger_patch.py", line 18, in <module>
    import pytorch_lightning as pl
  File "/usr/local/lib/python3.10/dist-packages/pytorch_lightning/__init__.py", line 27, in <module>
    from pytorch_lightning.callbacks import Callback # noqa: E402
  File "/usr/local/lib/python3.10/dist-packages/pytorch_lightning/callbacks/__init__.py", line 29, in <module>
    from pytorch_lightning.callbacks.pruning import ModelPruning
  File "/usr/local/lib/python3.10/dist-packages/pytorch_lightning/callbacks/pruning.py", line 32, in <module>
    from pytorch_lightning.core.module import LightningModule
  File "/usr/local/lib/python3.10/dist-packages/pytorch_lightning/core/__init__.py", line 16, in <module>
    from pytorch_lightning.core.module import LightningModule
  File "/usr/local/lib/python3.10/dist-packages/pytorch_lightning/core/module.py", line 63, in <module>
    from pytorch_lightning.trainer import call
  File "/usr/local/lib/python3.10/dist-packages/pytorch_lightning/trainer/__init__.py", line 17, in <module>
    from pytorch_lightning.trainer.trainer import Trainer
  File "/usr/local/lib/python3.10/dist-packages/pytorch_lightning/trainer/trainer.py", line 38, in <module>
    from lightning_fabric.utilities.imports import _TORCH_GREATER_EQUAL_2_0
ImportError: cannot import name '_TORCH_GREATER_EQUAL_2_0' from 'lightning_fabric.utilities.imports' (/usr/local/lib/python3.10/dist-packages/lightning_fabric/utilities/imports.py)
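The traceback shows pytorch_lightning importing a name that the installed lightning_fabric does not provide. One plausible cause, not confirmed anywhere in this thread, is that the two packages in the image are at mismatched versions. A minimal diagnostic sketch to run inside the container follows; the helper names `installed_version`, `has_symbol`, and `report` are hypothetical, and the list of distributions checked is an assumption about what is relevant.

```python
import importlib
from importlib import metadata


def installed_version(dist_name):
    """Return the installed version of a distribution, or None if absent."""
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None


def has_symbol(module_name, symbol):
    """Return True if module_name imports cleanly and exposes symbol."""
    try:
        module = importlib.import_module(module_name)
    except ImportError:
        return False
    return hasattr(module, symbol)


def report():
    """Print Lightning-related versions and whether the missing name exists."""
    # Assumed set of relevant distributions; adjust as needed.
    for dist in ("torch", "lightning", "pytorch-lightning", "lightning-fabric"):
        print(f"{dist}: {installed_version(dist)}")
    present = has_symbol(
        "lightning_fabric.utilities.imports", "_TORCH_GREATER_EQUAL_2_0"
    )
    print(f"_TORCH_GREATER_EQUAL_2_0 present: {present}")
```

Calling `report()` in both the 24.03 and 24.07 containers and comparing the output would show whether the pytorch-lightning and lightning-fabric versions diverge in the failing image.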

Environment overview (please complete the following information)

Docker version 24.0.5, build ced0996
docker pull nvcr.io/nvidia/nemo:24.07

Additional context

The same error occurs on both H100 and A800 GPUs, with CUDA 12.2, 12.4, and 12.6.

github-actions · 2024-10-26T01:57:18Z

This issue is stale because it has been open for 30 days with no activity. Remove stale label or comment or this will be closed in 7 days.

github-actions · 2024-11-02T01:58:05Z

This issue was closed because it has been inactive for 7 days since being marked as stale.

Desperadoze added the bug label Sep 13, 2024

elliottnv assigned athitten Sep 25, 2024

github-actions bot added the stale label Oct 26, 2024

github-actions bot closed this as not planned Nov 2, 2024
