Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Clarify diff between connection settings timeout and open_timeout #1470

Merged
merged 5 commits into from
Dec 22, 2022

Conversation

Yu-Chieh-Henry-Yang
Copy link
Contributor

Description

This PR aims to clarify the difference between the config settings timeout and open_timeout.

I am very confused about the difference between timeout and open_timeout.

Originally, the comment for these 2 config settings says:

  #   - `:timeout`  open/read timeout Integer in seconds
  #   - `:open_timeout` - read timeout Integer in seconds
  • open_timeout is described as read timeout Integer in seconds but the previous line has the word open in it, which suggests that read is something other than open which is confusing because I expect open_timeout to represent open.
  • timeout is described as open/read which is slightly ambiguous because I find it hard to tell which one of the following it means:
    • open plus read
    • open or read

Also when I was looking at this, I don't really know about TCP connection phase and reading phase and whether they are exclusive of each so searching online to clarify this is difficult as well. I found it difficult to clarify whether timeout actually includes the time in open_timeout and I have asked a question in here: #1469.

The reason that I think timeout includes open_timeout is because:

  1. I saw faraday gem using the typhoeus adapter in this file:
    dependency 'typhoeus/adapters/faraday'
  2. faraday's timeout is represented by timeout_ms in typhoeus and faraday's open_timeout is represented by connecttimeout_ms in typhoeus according to this file
  3. There is documentation regarding timeout in the typhoeus gem README.md. It says

timeout is the time limit for the entire request in seconds.
connecttimeout is the time limit for just the connection phase, again in seconds.

Copy link
Contributor

@yykamei yykamei left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think Env#request comes from this code, so it seems to be an instance of RequestOptions.

Env.new(http_method, body, connection.build_exclusive_url(path, params),
options, headers, connection.ssl, connection.parallel_manager)

In the first place, the description of Env#request doesn't tell the all attributes of RequestOptions, such as read_timeout or write_timeout. Ideally, I think RequestOptions should have its own attribute documents, and it might be good for this request to just tell return [RequestOptions].

However, it might require more work to add full explanations to RequestOptions. How about adding read_timeout and write_timeout as well?

Note open_timeout, read_timeout, and write_timeout will be referred by #request_timeout, and it depends on adapters how they will be used. For example, faraday-net_http sets open_timeout here, but #request_timeout will fall back to timeout when open_timeout is not set. I could be wrong, but "total timeout for both the connection-opening phase and the reading phase" might not be correct.

@Yu-Chieh-Henry-Yang
Copy link
Contributor Author

Yu-Chieh-Henry-Yang commented Dec 19, 2022

Thank you for looking into this and the comment.

I could be wrong, but "total timeout for both the connection-opening phase and the reading phase" might not be correct.

When I created this PR, I am not certain about what it does as well. If this might not be correct, do you know what would be correct instead? To be specific, let's say we have the following phase when we make a request using faraday:

  • Phase 1, the connection phase
  • After phase 1, phase 2, the reading phase

Does timeout only represent the phase 2 and not (phase 1 + phase 2)?

Does open_timeout only represent phase 1? (If yes, why does it says read timeout Integer in seconds instead of open timeout Integer in seconds in the original code comment?)

Please let me know if have a mistake in understanding anywhere.

@yykamei
Copy link
Contributor

yykamei commented Dec 19, 2022

Does timeout only represent the phase 2 and not (phase 1 + phase 2)?

timeout is just the default value for open_timeout, read_timeout, and write_timeout. If only timeout is set, open_timeout, read_timeout, and write_timeout will be set with the value of timeout.

This is where timeout is used in Faraday:

options[key] || options[:timeout]

In faraday-net_http, read_timeout, write_timeout, and open_timeout are set with #request_timeout.

https://github.com/lostisland/faraday-net_http/blob/0f9970b095ef7ab847883809ad500f453094c88f/lib/faraday/adapter/net_http.rb#L152-L170

Does open_timeout only represent phase 1? (If yes, why does it says read timeout Integer in seconds instead of open timeout Integer in seconds in the original code comment?)

Yes in faraday-net_http, but I'm not sure why the code comment says "read timeout Integer in seconds". I think "open timeout Integer in seconds" is correct 👍

@Yu-Chieh-Henry-Yang
Copy link
Contributor Author

For typhoeus adapter:

  1. faraday's timeout is represented by timeout_ms in typhoeus and faraday's open_timeout is represented by connecttimeout_ms in typhoeus according to this file
  2. There is documentation regarding timeout in the typhoeus gem README.md. It says

timeout is the time limit for the entire request in seconds.
connecttimeout is the time limit for just the connection phase, again in seconds.

so I think the understanding around open_timeout is pretty consistent whereas the use and meaning of timeout varies between different adapters.

I will update the code for my PR to reflect that.

@Yu-Chieh-Henry-Yang
Copy link
Contributor Author

In the future, it's probably better if we can expose open_timeout, read_timeout, write_timeout and total_timeout and default_timeout as configurable options so that adapters can use the correctly named options.

But for now I have expanded the explanation comment to point out that there is an uncertainty on how the timeout option is used.

Copy link
Contributor

@yykamei yykamei left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The explanation looks good 👍

@iMacTia
Copy link
Member

iMacTia commented Dec 20, 2022

OK so it looks like we have some outdated documentation in this file.
We actually support 4 types of timeouts 😄

  • timeout is the time limit for the entire request
  • open_timeout is the time limit for just the connection phase (e.g. handshake)
  • read_timeout is the time limit for the first response byte received from the server
  • write_timoutis the time limit for the client to send the request to the server

The revised documentation should note all 4 timeout types and then make clear that open_timeout, read_timeout and write_timeout are only supported in some adapters.

@Yu-Chieh-Henry-Yang
Copy link
Contributor Author

Thanks for the explanation @iMacTia

According to the comment in #1470 (comment), it seems like the timeout value may be used differently in different adapter like faraday-net_http. For faraday-net_http, it seems to use timeout as the default value of open_timeout, read_timeout and write_timeout.

Would you say that it is using the timeout incorrectly? (I'm asking this just trying to understand timeout correctly)

faraday's #request_timeout method falls back to timeout value as we can see in here:

options[key] || options[:timeout]

When both open_timeout and read_timeout are set via #request_timeout method, this cause the total timeout from open_timeout and read_timeout added together to exceed the value of timeout, should we perhaps remove the ability for this #request_timeout method to fall back to timeout? (Maybe in a future major release, since this is breaking change)

@Yu-Chieh-Henry-Yang
Copy link
Contributor Author

I have updated the comment again according to our discussion, if the formatting or explanation is incorrect, please let me know. I'm happy to change them.

@iMacTia
Copy link
Member

iMacTia commented Dec 21, 2022

I haven't touched that code in a while, so a couple of questions come up on the Net::HTTP internals:

  1. Does it support a global timeout config? If not, then what we're doing is the most sensible thing and closest to what the user might possibly want.
  2. When does the read timeout begin to count? If it counts from the connection opening, then yeah we could potentially double the timeout. But maybe it's counting from the request start?

@iMacTia
Copy link
Member

iMacTia commented Dec 21, 2022

If Net::HTTP does indeed support a global timeout, then I agree we should set that instead, but I'd be surprised we're not using that already 🤔

Copy link
Member

@iMacTia iMacTia left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Changes to the docs are good 🙌!
We can continue discussing adapters implementation and the need to fix them, but this is the expected behaviour

@Yu-Chieh-Henry-Yang
Copy link
Contributor Author

I see, thanks for the response, I don't know much about how the Net::HTTP works but this has answered my question.

I don't have the permission to merge this PR but I'm happy for this PR to be merged

@iMacTia iMacTia merged commit a34861a into lostisland:main Dec 22, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants