Problem during preparing G #23

Closed
EmreOzkose opened this issue Aug 23, 2021 · 3 comments

@EmreOzkose
Contributor

Hi, I am trying to train a TDNN-LSTM model on Turkish data. I ran one experiment successfully (including the decoding part) with a smaller language model. Then I tried a new corpus for language modeling. During the prepare-G step, I got this error:

2021-08-23 11:35:45 (prepare.sh:118:main) Stage 6: Compile HLG
2021-08-23 11:35:46,389 INFO [compile_hlg.py:126] Processing data/lang_phone
2021-08-23 11:35:46,859 INFO [lexicon.py:99] Converting L.pt to Linv.pt
2021-08-23 11:35:47,640 INFO [compile_hlg.py:48] Building ctc_topo. max_token_id: 52
2021-08-23 11:35:47,826 INFO [compile_hlg.py:57] Loading G_3_gram.fst.txt
2021-08-23 11:38:15,955 INFO [compile_hlg.py:68] Intersecting L and G
2021-08-23 11:51:37,889 INFO [compile_hlg.py:70] LG shape: (301909252, None)
2021-08-23 11:51:37,889 INFO [compile_hlg.py:72] Connecting LG
2021-08-23 11:51:37,889 INFO [compile_hlg.py:74] LG shape after k2.connect: (301909252, None)
2021-08-23 11:51:37,889 INFO [compile_hlg.py:76] <class 'torch.Tensor'>
2021-08-23 11:51:37,889 INFO [compile_hlg.py:77] Determinizing LG
2021-08-23 12:09:11,585 INFO [compile_hlg.py:80] <class '_k2.RaggedInt'>
2021-08-23 12:09:11,585 INFO [compile_hlg.py:82] Connecting LG after k2.determinize
2021-08-23 12:09:11,585 INFO [compile_hlg.py:85] Removing disambiguation symbols on LG
[F] /usr/share/miniconda/envs/k2/conda-bld/k2_1628135473078/work/k2/csrc/tensor.cu:159:k2::Tensor::Tensor(k2::Dtype, const k2::Shape&, k2::RegionPtr, int32_t) Check failed: int64_t(impl_->byte_offset) + begin_elem * element_size >= 0 (-1246502780 vs. 0) 


[ Stack-Trace: ]
/path/to/miniconda3/envs/k2/lib/python3.8/site-packages/libk2_log.so(k2::internal::GetStackTrace()+0x4c) [0x7fb12e7c76bc]
/path/to/miniconda3/envs/k2/lib/python3.8/site-packages/libk2context.so(k2::Tensor::Tensor(k2::Dtype, k2::Shape const&, std::shared_ptr<k2::Region>, int)+0x6da) [0x7fb12ed24aca]
/path/to/miniconda3/envs/k2/lib/python3.8/site-packages/libk2context.so(k2::Array2<int>::Col(int)+0x13a) [0x7fb12ecd03ba]
/path/to/miniconda3/envs/k2/lib/python3.8/site-packages/libk2context.so(+0x27e0d9) [0x7fb12ecc40d9]
/path/to/miniconda3/envs/k2/lib/python3.8/site-packages/libk2context.so(k2::Index(k2::RaggedShape&, int, k2::Array1<int> const&, k2::Array1<int>*)+0x1da) [0x7fb12ecc649a]
/path/to/miniconda3/envs/k2/lib/python3.8/site-packages/_k2.cpython-38-x86_64-linux-gnu.so(+0xc5395) [0x7fb134ced395]
/path/to/miniconda3/envs/k2/lib/python3.8/site-packages/_k2.cpython-38-x86_64-linux-gnu.so(+0xabc90) [0x7fb134cd3c90]
/path/to/miniconda3/envs/k2/lib/python3.8/site-packages/_k2.cpython-38-x86_64-linux-gnu.so(+0x1dfcf) [0x7fb134c45fcf]
python3(PyCFunction_Call+0x54) [0x555e1b3b3914]
python3(_PyObject_MakeTpCall+0x31e) [0x555e1b3b6ebe]
python3(_PyEval_EvalFrameDefault+0x52f6) [0x555e1b458986]
python3(_PyFunction_Vectorcall+0x1a6) [0x555e1b43b646]
python3(_PyEval_EvalFrameDefault+0x947) [0x555e1b453fd7]
python3(_PyFunction_Vectorcall+0x1a6) [0x555e1b43b646]
python3(_PyEval_EvalFrameDefault+0x947) [0x555e1b453fd7]
python3(_PyEval_EvalCodeWithName+0x2c3) [0x555e1b43a433]
python3(_PyFunction_Vectorcall+0x378) [0x555e1b43b818]
python3(_PyEval_EvalFrameDefault+0x1822) [0x555e1b454eb2]
python3(_PyFunction_Vectorcall+0x1a6) [0x555e1b43b646]
python3(_PyEval_EvalFrameDefault+0x4d33) [0x555e1b4583c3]
python3(_PyFunction_Vectorcall+0x1a6) [0x555e1b43b646]
python3(_PyEval_EvalFrameDefault+0x947) [0x555e1b453fd7]
python3(_PyFunction_Vectorcall+0x1a6) [0x555e1b43b646]
python3(_PyEval_EvalFrameDefault+0x947) [0x555e1b453fd7]
python3(_PyEval_EvalCodeWithName+0x2c3) [0x555e1b43a433]
python3(PyEval_EvalCodeEx+0x39) [0x555e1b43b499]
python3(PyEval_EvalCode+0x1b) [0x555e1b4d6ecb]
python3(+0x252f63) [0x555e1b4d6f63]
python3(+0x26f033) [0x555e1b4f3033]
python3(+0x274022) [0x555e1b4f8022]
python3(PyRun_SimpleFileExFlags+0x1b2) [0x555e1b4f8202]
python3(Py_RunMain+0x36d) [0x555e1b4f877d]
python3(Py_BytesMain+0x39) [0x555e1b4f8939]
/lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0xe7) [0x7fb25ee04b97]
python3(+0x1e8f39) [0x555e1b46cf39]

Traceback (most recent call last):
  File "./local/compile_hlg.py", line 140, in <module>
    main()
  File "./local/compile_hlg.py", line 128, in main
    HLG = compile_HLG(lang_dir)
  File "./local/compile_hlg.py", line 92, in compile_HLG
    LG = k2.remove_epsilon(LG)
  File "/path/to/miniconda3/envs/k2/lib/python3.8/site-packages/k2/fsa_algo.py", line 562, in remove_epsilon
    out_fsa = k2.utils.fsa_from_unary_function_ragged(fsa, ragged_arc, arc_map,
  File "/path/to/miniconda3/envs/k2/lib/python3.8/site-packages/k2/utils.py", line 515, in fsa_from_unary_function_ragged
    new_value = index(value, arc_map)
  File "/path/to/miniconda3/envs/k2/lib/python3.8/site-packages/k2/ops.py", line 335, in index
    return index_ragged(src, indexes)
  File "/path/to/miniconda3/envs/k2/lib/python3.8/site-packages/k2/ops.py", line 283, in index_ragged
    return _k2.index(src, indexes)
RuntimeError: Some bad things happed.

How can I solve this? My ARPA files are 2.1 GB and 6.2 GB for the 3-gram and 4-gram models respectively. Could it be a size issue? My language models are prepared with KenLM.

Is this relevant to icefall, or should I ask on the k2 repository?
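For anyone hitting the same question: one way to gauge whether the ARPA size alone is the problem is to load G by itself and look at its arc count before intersecting with L. The sketch below is only illustrative (the file path is an assumption; it mirrors how local/compile_hlg.py loads the text FST):

```python
# Hedged sketch, not from the thread: load G on its own and report its size.
import k2

with open("data/lm/G_3_gram.fst.txt") as f:  # path is an assumption
    G = k2.Fsa.from_openfst(f.read(), acceptor=False)

# Each k2 arc stores four 32-bit fields (roughly 16 bytes per arc), so an arc
# array approaching 2 GiB could run into int32 offset limits inside k2.
print("num states:", G.shape[0])
print("num arcs:  ", G.num_arcs)
print("approx arc storage (GB):", G.num_arcs * 16 / 1024**3)
```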

@csukuangfj
Collaborator

It seems that G is too large and there is an overflow in k2.
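The negative value in the failed check (`-1246502780 vs. 0`) is consistent with a 32-bit wrap-around. A purely illustrative back-of-the-envelope check (not from the thread):

```python
# If the offending byte offset had been computed in 64-bit arithmetic, the
# wrapped value -1246502780 would correspond to 2**32 - 1246502780 bytes,
# which exceeds INT32_MAX (2,147,483,647), i.e. the 32-bit offset overflowed.
true_offset = 2**32 - 1_246_502_780
print(true_offset)              # 3048464516
print(true_offset > 2**31 - 1)  # True: does not fit in int32
```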

@danpovey
Collaborator

Hm. I'd like to see some debug information about the shapes involved when this happens; I suspect the error may have occurred earlier and may have been fixable.
We should actually support loading KenLM LMs onto the GPU and using them via a deterministic-FST interface. But that will take some time.
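One minimal way to capture the shape information being asked for (a sketch only; the helper name is an assumption, and it just extends the logging compile_hlg.py already does) is to log the state and arc counts after each step that transforms LG:

```python
# Hypothetical logging helper for compile_hlg.py: call it after connect,
# determinize, and remove_epsilon so the step where LG blows up is visible.
import logging
import k2

def log_fsa_size(name: str, fsa: k2.Fsa) -> None:
    logging.info(f"{name}: num states = {fsa.shape[0]}, num arcs = {fsa.num_arcs}")
```

For example, `log_fsa_size("LG after determinize", LG)` right before the failing `k2.remove_epsilon(LG)` call.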

@EmreOzkose
Contributor Author

I tried to reproduce this case to get the shapes, but it takes too long (2-3 hours) and the script just gets killed (a memory-logging sketch follows the log below):

2021-08-24 09:29:22 (prepare.sh:44:main) pre_file_dir: /path/to/icefall/egs/sestek/ASR/pre_files
2021-08-24 09:29:22 (prepare.sh:119:main) Stage 6: Compile HLG
2021-08-24 09:29:23,508 INFO [compile_hlg.py:128] Processing data/lang_phone
2021-08-24 09:29:24,062 INFO [lexicon.py:96] Loading pre-compiled data/lang_phone/Linv.pt
2021-08-24 09:29:24,289 INFO [compile_hlg.py:48] Building ctc_topo. max_token_id: 52
2021-08-24 09:29:24,489 INFO [compile_hlg.py:53] Loading pre-compiled G_3_gram
2021-08-24 09:29:30,649 INFO [compile_hlg.py:68] Intersecting L and G
2021-08-24 09:42:25,292 INFO [compile_hlg.py:70] LG shape: (301909252, None)
2021-08-24 09:42:25,292 INFO [compile_hlg.py:72] Connecting LG
(301909252, None)
2021-08-24 09:42:25,292 INFO [compile_hlg.py:75] LG shape after k2.connect: (301909252, None)
(301909252, None)
2021-08-24 09:42:25,292 INFO [compile_hlg.py:77] <class 'torch.Tensor'>
2021-08-24 09:42:25,292 INFO [compile_hlg.py:78] Determinizing LG
2021-08-24 09:58:24,804 INFO [compile_hlg.py:81] <class '_k2.RaggedInt'>
(214860596, None)
2021-08-24 09:58:24,804 INFO [compile_hlg.py:83] Connecting LG after k2.determinize
2021-08-24 09:58:24,804 INFO [compile_hlg.py:86] Removing disambiguation symbols on LG
(214860596, None)
(214860596, None)
(214860596, None)
./prepare.sh: line 126: 23181 Killed                  ./local/compile_hlg.py --lang-dir data/lang_phone
2021-08-24 10:04:32,846 INFO [compile_hlg.py:128] Processing data/lang_bpe_5000
2021-08-24 10:04:33,354 INFO [lexicon.py:96] Loading pre-compiled data/lang_bpe_5000/Linv.pt
2021-08-24 10:04:33,526 INFO [compile_hlg.py:48] Building ctc_topo. max_token_id: 4999
2021-08-24 10:04:34,405 INFO [compile_hlg.py:53] Loading pre-compiled G_3_gram
2021-08-24 10:04:41,941 INFO [compile_hlg.py:68] Intersecting L and G
2021-08-24 10:07:20,234 INFO [compile_hlg.py:70] LG shape: (80262949, None)
2021-08-24 10:07:20,234 INFO [compile_hlg.py:72] Connecting LG
(80262949, None)
2021-08-24 10:07:20,234 INFO [compile_hlg.py:75] LG shape after k2.connect: (80262949, None)
(80262949, None)
2021-08-24 10:07:20,234 INFO [compile_hlg.py:77] <class 'torch.Tensor'>
2021-08-24 10:07:20,234 INFO [compile_hlg.py:78] Determinizing LG
./prepare.sh: line 122: 27422 Killed                  ./local/compile_hlg.py --lang-dir $lang_dir
'lang_bpe/lang_bpe_5000' -> 'lang_bpe_5000'
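The bare `Killed` lines above usually mean the kernel's OOM killer stopped the process rather than a crash inside k2. A small instrumentation sketch (an assumption, not part of the recipe) that could confirm this by logging peak memory after each stage:

```python
# Hypothetical peak-memory logging for compile_hlg.py: on Linux, ru_maxrss is
# reported in kilobytes, so divide by 1024 twice to get GB.
import logging
import resource

def log_peak_rss(tag: str) -> None:
    peak_kb = resource.getrusage(resource.RUSAGE_SELF).ru_maxrss
    logging.info(f"{tag}: peak RSS = {peak_kb / 1024 / 1024:.1f} GB")
```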

I tried counting with IRSTLM instead of KenLM. When I pruned the KenLM model, the 3-gram was reduced from 5 GB to 2.1 GB. However, after pruning the IRSTLM model (with threshold 3e-7), the sizes are now 68 MB and 85 MB for the 3-gram and 4-gram respectively.

In the end, preparing G completed successfully.
