Add timeout for command execution #194
Signed-off-by: Madhu Rajanna <[email protected]>
What is this PR trying to achieve?
I am a bit nervous about adding a timeout to all ceph commands. There could be unforeseen regressions, especially when a command is slow yet still progressing and a global timeout fires. I understand there are various cases where rbd may hang if no proper timeout is given, and that causes trouble. If we can identify these situations and selectively apply rbd timeout options, we are in a better position to improve the overall quality. What do you think, @Madhu-1?
@gman0 if CLI commands get stuck for any reason, we may end up waiting indefinitely.
Yes, you are correct. To fix this, we can increase the timeout to some large value, maybe 2 min?
@rootfs can you give the list of commands that may hang, so that I can apply a timeout to those commands? My thinking was that, since we are executing CLI commands, we have an external dependency and cannot be sure which commands hang and which don't.
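For illustration, the kind of global timeout being discussed can be sketched as below. This is a minimal sketch, not the code in this PR: `execWithTimeout`, `errCommandTimeout`, `isTimeout`, and `shortTimeout` are hypothetical names, and the demo assumes a Unix-like host where `echo` and `sleep` are on the PATH. The 2-minute value is the one floated above.

```go
package main

import (
	"context"
	"errors"
	"fmt"
	"os/exec"
	"time"
)

// errCommandTimeout is a hypothetical sentinel error for this sketch.
var errCommandTimeout = errors.New("command timed out")

// shortTimeout is only used to force a timeout in the demo below.
const shortTimeout = 200 * time.Millisecond

// execWithTimeout runs program with args and kills the process if it is
// still running once timeout has elapsed.
func execWithTimeout(timeout time.Duration, program string, args ...string) ([]byte, error) {
	ctx, cancel := context.WithTimeout(context.Background(), timeout)
	defer cancel()

	out, err := exec.CommandContext(ctx, program, args...).CombinedOutput()
	if err != nil && errors.Is(ctx.Err(), context.DeadlineExceeded) {
		// The command failed because the deadline passed, not on its own.
		return out, fmt.Errorf("%s after %s: %w", program, timeout, errCommandTimeout)
	}
	return out, err
}

// isTimeout reports whether err was caused by the timeout above.
func isTimeout(err error) bool {
	return errors.Is(err, errCommandTimeout)
}

func main() {
	// A fast command finishes well within the 2-minute budget.
	out, err := execWithTimeout(2*time.Minute, "echo", "ok")
	fmt.Printf("%q %v\n", out, err)

	// A slow command is killed and reported as a timeout.
	_, err = execWithTimeout(shortTimeout, "sleep", "5")
	fmt.Println(isTimeout(err))
}
```

Note that this only bounds how long the caller waits. It says nothing about whether the killed command was still making progress, which is the concern raised above.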
rbd map has a mount_timeout option; the cephfs client mount timeout can only be set in the config file, though (per http://docs.ceph.com/docs/mimic/cephfs/client-config-ref/). @gman0 do you know of any other options?
@Madhu-1 @gman0 @rootfs Yeah, these timeouts are always tricky. What I have seen is that a timeout can happen at many layers in the chain and cause inconsistency between actions, or leave things in a state that is not rolled back properly, which causes issues later. So, if we apply timeouts, we should make sure a rollback is applied to get back to a consistent state.
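The rollback concern can be sketched roughly as follows. This is an illustrative sketch under stated assumptions, not ceph-csi code: the `step` type, `runWithRollback`, `demoSteps`, and the step names are all hypothetical, and undo actions are assumed to be idempotent because a timed-out step may have been only partially applied.

```go
package main

import (
	"context"
	"errors"
	"fmt"
	"time"
)

// step is a hypothetical unit of work paired with its undo action.
type step struct {
	name string
	do   func(ctx context.Context) error
	undo func() error // assumed idempotent
}

// demoTimeout is only used by the demo steps below.
const demoTimeout = 50 * time.Millisecond

// runWithRollback runs steps in order under a single deadline. On any
// failure (including a timeout) it undoes the failed step and all earlier
// steps in reverse order, and returns the names it rolled back.
func runWithRollback(timeout time.Duration, steps []step) (rolledBack []string, err error) {
	ctx, cancel := context.WithTimeout(context.Background(), timeout)
	defer cancel()

	var attempted []step
	for _, s := range steps {
		attempted = append(attempted, s)
		if err = s.do(ctx); err != nil {
			break
		}
	}
	if err == nil {
		return nil, nil
	}

	// Undo runs without the expired context so cleanup is not cut short.
	for i := len(attempted) - 1; i >= 0; i-- {
		if uerr := attempted[i].undo(); uerr != nil {
			err = fmt.Errorf("%w; undo of %s failed: %v", err, attempted[i].name, uerr)
			continue
		}
		rolledBack = append(rolledBack, attempted[i].name)
	}
	return rolledBack, err
}

// timedOut reports whether err stems from the deadline passing.
func timedOut(err error) bool {
	return errors.Is(err, context.DeadlineExceeded)
}

// demoSteps returns two hypothetical steps; the second blocks until the
// deadline passes, standing in for a hung command.
func demoSteps() []step {
	noop := func() error { return nil }
	return []step{
		{name: "create", do: func(ctx context.Context) error { return nil }, undo: noop},
		{name: "map", do: func(ctx context.Context) error { <-ctx.Done(); return ctx.Err() }, undo: noop},
	}
}

func main() {
	rolledBack, err := runWithRollback(demoTimeout, demoSteps())
	fmt.Println(rolledBack, timedOut(err))
}
```

The design choice worth noting is that cleanup does not reuse the expired context; otherwise the rollback itself would be cancelled immediately.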
@Madhu-1 the code suggests all it does is return an error if the command has run for more than @rootfs and for the kernel client, mount.ceph... I see the timeout for FUSE defaults to 300s, but the kernel client has only 60s to mount. At the very least they should be set to the same value, imho.
@rootfs @gman0 as you suggested, can we set the client timeout for ceph mount and the client mount timeout to let me know if I am missing anything
I think the gRPC timeouts will come into play for most of this anyway, no?
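One way to read this point: if a handler passes its request context down to the command, the caller's deadline already bounds the command. The sketch below simulates that with the standard library only, with no gRPC import. `execInRequest`, `simulateRequest`, and the demo constants are hypothetical, and it assumes `sleep` is available on the host.

```go
package main

import (
	"context"
	"errors"
	"fmt"
	"os/exec"
	"time"
)

// Demo values: a short caller deadline and a much larger local cap.
const (
	demoRequestDeadline = 100 * time.Millisecond
	demoLocalCap        = 2 * time.Minute
)

// execInRequest runs a command bounded by both the caller's context and a
// local cap; whichever deadline comes first kills the process.
func execInRequest(ctx context.Context, localCap time.Duration, program string, args ...string) error {
	ctx, cancel := context.WithTimeout(ctx, localCap)
	defer cancel()
	return exec.CommandContext(ctx, program, args...).Run()
}

// simulateRequest stands in for an RPC handler whose caller set a deadline.
// It returns how long the command ran, whether the request deadline
// expired, and the command error.
func simulateRequest(requestDeadline, localCap time.Duration) (time.Duration, bool, error) {
	ctx, cancel := context.WithTimeout(context.Background(), requestDeadline)
	defer cancel()

	start := time.Now()
	err := execInRequest(ctx, localCap, "sleep", "5")
	expired := errors.Is(ctx.Err(), context.DeadlineExceeded)
	return time.Since(start), expired, err
}

func main() {
	elapsed, expired, err := simulateRequest(demoRequestDeadline, demoLocalCap)
	fmt.Println(elapsed < 2*time.Second, expired, err != nil)
}
```

This only helps when the caller actually sets a deadline and the context is threaded through to the command; a command started with a background context is unaffected.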
We're working on moving all command executions to go-ceph library calls, so this can probably be closed?
I am closing this one, as I believe this will be handled in go-ceph.