ES model: single-machine, multi-GPU training
1. Run xparl start --port 8837 --cpu_num 48
2. Run fleetrun train.py
The following errors are reported:
[07-13 15:41:57 Thread-12 @client.py:301] ERR [xparl] lost connection with a job, current actor num: 19
[07-13 15:41:57 Thread-52 @client.py:301] ERR [xparl] lost connection with a job, current actor num: 18
[07-13 15:41:57 Thread-50 @client.py:301] ERR [xparl] lost connection with a job, current actor num: 17
[07-13 15:41:58 Thread-60 @client.py:301] ERR [xparl] lost connection with a job, current actor num: 16