
RuntimeError: stack expects each tensor to be equal size (when using lhotse shar data sets) #10382

Closed
riqiang-dp opened this issue Sep 6, 2024 · 1 comment
Labels
bug Something isn't working

Comments

@riqiang-dp

Describe the bug

Hi @pzelasko, I got the following error while using lhotse shar dataset settings to train a hybrid CTC-Transducer model. What does this error suggest, and what are the entries being stacked in this case? I'm not sure whether this is a bug or a problem with my dataset.

Could this be caused by audio with two channels? I don't think my data contains any stereo audio, though.

Thank you!

  File "***lib/python3.11/site-packages/torch/utils/data/_utils/worker.py", line 308, in _wor
ker_loop                                                                                                                                                       
    data = fetcher.fetch(index)                                                                                                                                
           ^^^^^^^^^^^^^^^^^^^^                                                                                                                                
  File "/***lib/python3.11/site-packages/torch/utils/data/_utils/fetch.py", line 41, in fetch 
    data = next(self.dataset_iter)                                                                                                                             
           ^^^^^^^^^^^^^^^^^^^^^^^                                                                                                                             
  File "***lib/python3.11/site-packages/lhotse/dataset/iterable_dataset.py", line 100, in __n
ext__                                                                                                                                                          
    return self.dataset[sampled]                                                                                                                               
           ~~~~~~~~~~~~^^^^^^^^^                                                                                                                               
  File "***lib/python3.11/site-packages/nemo/collections/asr/data/audio_to_text_lhotse.py", l
ine 52, in __getitem__                                                                                                                                         
    audio, audio_lens, cuts = self.load_audio(cuts)                                                                                                            
                              ^^^^^^^^^^^^^^^^^^^^^                                                                                                            
  File "***lib/python3.11/site-packages/lhotse/dataset/input_strategies.py", line 224, in __c
all__                                                                                                                                                          
    return collate_audio(                                                                                                                                      
           ^^^^^^^^^^^^^^                                                                                                                                      
  File "***lib/python3.11/site-packages/lhotse/dataset/collation.py", line 202, in collate_au
dio                                                                                                                                                            
    audios = torch.stack(audios)                                                                                                                               
             ^^^^^^^^^^^^^^^^^^^                                                                                                                               
RuntimeError: stack expects each tensor to be equal size, but got [203680] at entry 0 and [2, 203680] at entry 28 
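
The failing call is torch.stack over the per-cut waveforms in the batch, and stack requires every tensor to have the same shape, so a single two-channel waveform breaks collation. A minimal illustration of the mismatch (shapes taken from the error message, the values are dummies):

import torch

mono = torch.zeros(203680)         # 1-D waveform, shape [203680]
stereo = torch.zeros(2, 203680)    # 2-D waveform, shape [2, 203680]
torch.stack([mono, stereo])        # RuntimeError: stack expects each tensor to be equal size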

Steps/Code to reproduce bug
As described above.

Expected behavior

All audio tensors in the batch should have a single dimension (mono waveforms), shouldn't they?

Environment overview (please complete the following information)

N/A

Environment details

N/A

Additional context

N/A

@riqiang-dp riqiang-dp added the bug Something isn't working label Sep 6, 2024
@riqiang-dp
Author

Fixed: found stereo audio that was marked as single-channel.
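
In case it helps anyone else, here is a minimal sanity-check sketch, assuming a Lhotse Shar export under a directory like data/shar (the path and the CutSet.from_shar(in_dir=...) usage are illustrative, not taken from this issue): it compares each cut's declared channel count with the shape of the audio actually loaded from disk and reports anything that is not mono.

from lhotse import CutSet

# Hypothetical shar directory; point this at your own export.
cuts = CutSet.from_shar(in_dir="data/shar")

for cut in cuts:
    audio = cut.load_audio()  # usually shaped (num_channels, num_samples)
    actual = audio.shape[0] if audio.ndim > 1 else 1
    declared = cut.recording.num_channels
    if actual != 1 or actual != declared:
        print(f"{cut.id}: manifest says {declared} channel(s), file has {actual}")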
