dataloader.dataset.cuts() #11

Closed
EmreOzkose opened this issue Aug 19, 2021 · 4 comments · Fixed by #12

Comments

@EmreOzkose
Contributor

Hello,

I trained a tdnn-lstm model successfully. However, when I try to continue with decoding, this error occurs:

$ python tdnn_lstm_ctc/decode.py
2021-08-19 09:24:07,705 INFO [decode.py:319] Decoding started
2021-08-19 09:24:07,705 INFO [decode.py:320] {'exp_dir': PosixPath('tdnn_lstm_ctc/exp2'), 'lang_dir': PosixPath('data/lang_phone'), 'lm_dir': PosixPath('data/lm'), 'feature_dim': 80, 'subsampling_factor': 3, 'search_beam': 20, 'output_beam': 5, 'min_active_states': 30, 'max_active_states': 10000, 'use_double_scores': True, 'method': '1best', 'num_paths': 30, 'epoch': 9, 'avg': 5, 'feature_dir': PosixPath('data/fbank'), 'max_duration': 500.0, 'bucketing_sampler': False, 'num_buckets': 30, 'concatenate_cuts': True, 'duration_factor': 1.0, 'gap': 1.0, 'on_the_fly_feats': False, 'full_libri': False}
2021-08-19 09:24:07,974 INFO [lexicon.py:96] Loading pre-compiled data/lang_phone/Linv.pt
2021-08-19 09:24:08,077 INFO [decode.py:329] device: cuda:0
2021-08-19 09:24:20,422 INFO [decode.py:387] averaging ['tdnn_lstm_ctc/exp2/epoch-5.pt', 'tdnn_lstm_ctc/exp2/epoch-6.pt', 'tdnn_lstm_ctc/exp2/epoch-7.pt', 'tdnn_lstm_ctc/exp2/epoch-8.pt', 'tdnn_lstm_ctc/exp2/epoch-9.pt']
Traceback (most recent call last):
  File "tdnn_lstm_ctc/decode.py", line 419, in <module>
    main()
  File "path/to/env/miniconda3/envs/k2/lib/python3.8/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context
    return func(*args, **kwargs)
  File "tdnn_lstm_ctc/decode.py", line 402, in main
    results_dict = decode_dataset(
  File "tdnn_lstm_ctc/decode.py", line 239, in decode_dataset
    tot_num_cuts = len(dl.dataset.cuts)
AttributeError: 'K2SpeechRecognitionDataset' object has no attribute 'cuts'

The failing line is: tot_num_cuts = len(dl.dataset.cuts)

I checked the source code and there is no .cuts attribute or method. I think the number of cuts should equal the number of samples when each segment is the audio itself. So, could we remove .cuts and use tot_num_cuts = len(dl.dataset) instead? Would that be okay for Librispeech?
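To make the idea concrete, here is a minimal sketch of a defensive way to get that total for progress logging. It is not the fix from #12, and total_cuts and count_items are hypothetical helper names, not part of lhotse or icefall. It assumes only what the traceback shows (dl.dataset may or may not have a .cuts attribute) and does not assume that K2SpeechRecognitionDataset defines __len__; if neither length is available it returns None so the caller can log without a total.

```python
from typing import Optional


def count_items(obj) -> Optional[int]:
    """Return len(obj), or None if obj does not support len()."""
    try:
        return len(obj)
    except TypeError:
        return None


def total_cuts(dl) -> Optional[int]:
    """Best-effort total number of cuts for progress logging.

    Hypothetical helper: tries dl.dataset.cuts first (the layout the
    failing line expects), then falls back to len(dl.dataset), and
    returns None if neither is available.
    """
    dataset = getattr(dl, "dataset", None)
    cuts = getattr(dataset, "cuts", None)
    if cuts is not None:
        n = count_items(cuts)
        if n is not None:
            return n
    # Assumption: the dataset may or may not define __len__.
    return count_items(dataset)
```

In decode_dataset this would be called as tot_num_cuts = total_cuts(dl), with the log message printing "?" when the result is None.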

EmreOzkose changed the title from dataloader.cut() to dataloader.dataset.cuts() on Aug 19, 2021
@csukuangfj
Collaborator

Sorry, will commit a fix. I was using an old version of lhotse.

@EmreOzkose
Contributor Author

Thank you for the quick reply :).

@csukuangfj
Collaborator

@EmreOzkose
Please pull the latest master. It's been fixed in #12

@EmreOzkose
Contributor Author

You are a really fast team :) Thank you.
