Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add timeout for command execution #194

Closed
wants to merge 1 commit into from

Conversation

Madhu-1
Copy link
Collaborator

@Madhu-1 Madhu-1 commented Feb 13, 2019

Signed-off-by: Madhu Rajanna [email protected]

@Madhu-1
Copy link
Collaborator Author

Madhu-1 commented Feb 13, 2019

@rootfs @gman0 PTAL.

do I need to update readme about this new configuration?

@Madhu-1 Madhu-1 force-pushed the exec-timeout branch 2 times, most recently from 6709fd8 to 069d134 Compare February 13, 2019 05:34
@gman0
Copy link
Contributor

gman0 commented Feb 13, 2019

What is this PR trying to achieve?

@rootfs
Copy link
Member

rootfs commented Feb 13, 2019

I am a bit nervous on adding timeout on all ceph commands. There could be unforeseen regressions, especially when the command is slow yet still progressing and a global timeout occurs.

I understand there are various cases rbd may hang if no proper timeout is given, and that causes troubles. If we can identify these situations and selectively apply rbd timeout options, we are in a better position to improve the overall quality. What do you think @Madhu-1 ?

@Madhu-1
Copy link
Collaborator Author

Madhu-1 commented Feb 15, 2019

@gman0 if cli commands get struck due to any reason, we may end-up in having infinite waiting.
This PR adds the timeout to all CLI commands

@Madhu-1
Copy link
Collaborator Author

Madhu-1 commented Feb 15, 2019

I am a bit nervous on adding timeout on all ceph commands. There could be unforeseen regressions, especially when the command is slow yet still progressing and a global timeout occurs.

yes you are correct, to fix this we can increase the timeout to some large value may be 2 min?
but am not sure does CSI side car containers will wait for that long?

I understand there are various cases rbd may hang if no proper timeout is given, and that causes troubles. If we can identify these situations and selectively apply rbd timeout options, we are in a better position to improve the overall quality. What do you think @Madhu-1?

@rootfs can you give the list of commands which may cause hang, so that I can apply timeout for that commands,

I was thinking as we are executing the CLI commands we are kind of having an external dependency, not sure which command hangs which doesn't.

@rootfs
Copy link
Member

rootfs commented Feb 15, 2019

rbd map has a mount_timeout option, cephfs client mount timeout can only be set in config file though (per http://docs.ceph.com/docs/mimic/cephfs/client-config-ref/). @gman0 do you know any other options?

@humblec
Copy link
Collaborator

humblec commented Feb 15, 2019

@Madhu-1 @gman0 @rootfs Yeah, these timeouts are tricky always. What I have seen is that, the timeout could happen in many layers in the chain and cause some inconsistency between actions or it may land into a state that its not rolled back properly which cause issues later. So, if we apply timeouts we should make sure the rollback has been applied for a consistent state.

@gman0
Copy link
Contributor

gman0 commented Feb 15, 2019

@Madhu-1 the code suggests all it does is it returns an error if the command has run for more than Timeout seconds...how would that prevent the process to run indefinitely? If the idea was to actually kill the process - i'm not too fond of that either. As already mentioned, this approach would give us no guarantees of the state of the operation, things would get inconsistent.

@rootfs and for the kernel client, mount.ceph...i see the timeout for FUSE defaults to 300s, but kernel client has only 60s to mount. At the very least they should be set to the value same imho.

@Madhu-1
Copy link
Collaborator Author

Madhu-1 commented Feb 18, 2019

@rootfs @gman0 as you guys suggested can we set the client timeout for ceph mount and client mount timeout to 60 seconds.
for rbd mount use mount_timeout option set to 60 sec.

let me know if am missing anything

@j-griffith
Copy link
Contributor

I think the grpc timeouts will come in to play for most of this anyway no?

@nixpanic
Copy link
Member

nixpanic commented May 7, 2020

We're working on moving all command executions to go-ceph library calls, this can probably be closed?

@Madhu-1
Copy link
Collaborator Author

Madhu-1 commented May 14, 2020

am closing this one as this will be handled in go ceph i believe

@Madhu-1 Madhu-1 closed this May 14, 2020
nixpanic pushed a commit to nixpanic/ceph-csi that referenced this pull request Nov 22, 2023
Syncing latest changes from devel for ceph-csi
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants