Cannot get psKey when training in fault-tolerant mode #2969

Closed
Yancey1989 opened this issue Jul 19, 2017 · 3 comments · Fixed by #2999

@Yancey1989
Contributor

Yancey1989 commented Jul 19, 2017

I started the master, pserver, and trainer in a Docker container, but the trainer cannot get the pserver address from etcd. The error log is below:

['/work/data/uci_housing_train-*-of-*']
ERRO[0000] Get task failed, sleep 3 seconds and continue, no more available task
I0719 08:12:24.602708   824 Util.cpp:166] commandline:
I0719 08:12:24.607319   824 GradientMachine.cpp:85] Initing parameters..
I0719 08:12:24.607365   824 GradientMachine.cpp:92] Init parameters done.
INFO[0000] Connected to etcd: localhost:2379

I0719 08:12:24.962303   824 NewRemoteParameterUpdater.cpp:68] paddle_begin_init_params start
I0719 08:12:24.962774   824 NewRemoteParameterUpdater.cpp:71] old param config: name: "___fc_layer_0__.w0"
size: 13
initial_mean: 0
initial_std: 0.27735009811261457
dims: 13
dims: 1
initial_strategy: 0
initial_smart: true
para_id: 0
INFO[0000] Get psKey= /ps/0 error, context canceled

ERRO[0003] Get task failed, sleep 3 seconds and continue, no more available task
ERRO[0006] Get task failed, sleep 3 seconds and continue, no more available task
ERRO[0009] Get task failed, sleep 3 seconds and continue, no more available task
INFO[0010] Get psKey= /ps/0 error, context canceled
@helinwang
Contributor

@Yancey1989
Contributor Author

Cool, I will close this issue.

@helinwang
Contributor

@Yancey1989 Thanks for closing the issue. Let's close it once it's fixed in the develop branch :p
