Describe the bug
The wdl_8gpu.py script execution has halted and training cannot proceed.
To Reproduce
Steps to reproduce the behavior:
1. Run "bash preprocess.sh 0 ./criteo_data nvt 0 1 1"
2. Enter the HugeCTR Docker container and execute "HUGECTR_LOG_LEVEL=3 python samples/wdl/wdl_8gpu.py"
Expected behavior
Training should proceed, but it halts after printing this log:
"[HCTR][09:37:51.908][INFO][RK0][main]: Training source file: ./criteo_data/train/_file_list.txt
[HCTR][09:37:51.908][INFO][RK0][main]: Evaluation source file: ./criteo_data/val/_file_list.txt"
Screenshots
The log dumped when the issue occurs looks like this:
Environment (please complete the following information):
Hi @redzhang1990, I think this behavior is expected.
The solver's i64_input_key defaults to False. However, the data produced by preprocess.sh is int64_t, which requires i64_input_key=True in the Solver.
See the note:
i64_input_key: For the Parquet format dataset generated by NVTabular, only I64 is allowed.
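As a rough illustration of the change, here is a minimal sketch that assumes the sample builds its solver with `hugectr.CreateSolver`. The helper `make_solver_kwargs` is hypothetical and the GPU list is only an example; it is not copied from wdl_8gpu.py. The only setting that matters here is `i64_input_key=True`.

```python
# Sketch only: make_solver_kwargs is a hypothetical helper, and the GPU list
# is illustrative rather than taken from samples/wdl/wdl_8gpu.py.

def make_solver_kwargs(gpus, i64_input_key=True):
    """Collect keyword arguments intended for hugectr.CreateSolver."""
    return {
        "vvgpu": [list(gpus)],           # one node, listing its GPU ids
        "repeat_dataset": True,
        "i64_input_key": i64_input_key,  # NVTabular Parquet data is int64
    }

solver_kwargs = make_solver_kwargs(range(8))

try:
    import hugectr  # normally only importable inside the HugeCTR container
except ImportError:
    hugectr = None

if hugectr is not None:
    # Remaining solver arguments fall back to their defaults in this sketch.
    solver = hugectr.CreateSolver(**solver_kwargs)
```

Outside the container the import fails and the sketch only builds the argument dictionary, which is enough to check that the flag is set before launching training.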
Additional context
After adding "i64_input_key=True," to the solver in wdl_8gpu.py, the issue was fixed.