Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add configurable timeouts and curate default timeout settings #1873

Merged
merged 1 commit into from
Aug 18, 2017

Conversation

m3co-code
Copy link
Contributor

This PR adds some configurable timeouts to Traefik. Namely it introduces:

  • ForwardingTimeouts -> Timeouts for forwarding requests to the Backend Server's. In specific
    • DialTimeout
    • ResponseHeaderTimeout
  • RespondingTimeouts -> Timeouts for the http.Server instances that Traefik opens.
    • ReadTimeout
    • WriteTimeout
    • Move of IdleTimeout into this configuration section with proper deprecation for uniformity.

The motivation for this PR are the suboptimal default settings of go itself and it is important to make the load balancer robust against bad behaving clients / backends in a microservice environment.

In order to introduce this configuration options I had to first clean up the global modification of the http.DefaultTransport. I think a global modification of it is in general considered a bad practice and leads to unintended side-effects. To do this I added a createHTTPTransport method in the server that will produce a http.Transport with a very similar configuration as the 'http.DefaultTransport'. Apart of the removal form global modification this has another big benefit, given TLS is configured we create a new http.Transport for that and before this PR this Transport did not have any timeouts configured at all, not even those from the http.DefaultTransport.

I know that this PR is largish and timeouts in general are really difficult to test. Let me know when I can support reviewers in any way, usually I am also hanging around in the Traefik slack channel. I tried to split the commits logically so that at least those should give some guideline of the introduced changes and hopefully make reviewing easier.

For background information: There are very good articles regarding timeouts in golang. Here some links:

@emilevauge
Copy link
Member

@marco-jantke Thanks a lot for this one, awesome job !
Articles linked are really good.
Design LGTM

Copy link
Contributor

@ldez ldez left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

SGTM, few comments

@@ -54,11 +61,13 @@ type GlobalConfiguration struct {
DefaultEntryPoints DefaultEntryPoints `description:"Entrypoints to be used by frontends that do not specify any entrypoint"`
ProvidersThrottleDuration flaeg.Duration `description:"Backends throttle duration: minimum duration between 2 events from providers before applying a new configuration. It avoids unnecessary reloads if multiples events are sent in a short amount of time."`
MaxIdleConnsPerHost int `description:"If non-zero, controls the maximum idle (keep-alive) to keep per-host. If zero, DefaultMaxIdleConnsPerHost is used"`
IdleTimeout flaeg.Duration `description:"maximum amount of time an idle (keep-alive) connection will remain idle before closing itself."`
IdleTimeout flaeg.Duration `description:"(Deprecated) see RespondingTimeouts.IdleTimeout"`
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you add // Deprecated ?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure, done.

docs/toml.md Outdated
#
# IdleTimeout
#
# Deprecated - see [respondingTimeouts] section
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Shouldn't we keep the description of what the parameter means until we truly remove it?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, makes sense. Not sure why I removed it in the first place.

docs/toml.md Outdated
@@ -328,6 +326,65 @@ To write JSON format logs, specify `json` as the format:
# interval = "30s"
```

## Responding timeouts
```
# Timeout settings for the http servers Traefik starts
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

s/http/HTTP/

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

docs/toml.md Outdated
#
[respondingTimeouts]

# ReadTimeout is the maximum duration for reading the entire request, including the body.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we should say readTimeout instead of ReadTimeout. I know it's the beginning of a sentence, but we're referring to an option name (i.e., a proper name) here.

Similar for the other added options.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, done.

docs/toml.md Outdated
#
# readTimeout = "5s"

# WriteTimeout is the maximum duration before timing out writes of the response.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not sure if "writes of the response" is clear enough. Does that refer to the time until the response is returned? If so, is it the complete response or the first byte only?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I added another explanatory sentence. Please check again.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👌

docs/toml.md Outdated

## Forwarding timeouts
```
# Timeout settings for requests forwarded to the Backend Servers
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

s/Backend Servers/backend servers/

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks.

server/server.go Outdated
// When RespondingTimeouts.IdleTimout is configured and its not the default value, always use that setting
if server.globalConfiguration.RespondingTimeouts != nil && time.Duration(server.globalConfiguration.RespondingTimeouts.IdleTimeout) != time.Duration(DefaultIdleTimeout) {
idleTimeout = time.Duration(server.globalConfiguration.RespondingTimeouts.IdleTimeout)
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could we outsource the timeout setup into a dedicated function? WDYT?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good suggestion. Outsourced it.

server/server.go Outdated
}
return &http.Transport{
TLSClientConfig: config,
// getRoundTripper will either use server.defaultForwardingRoundTripperor create a new one
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Space missing at defaultForwardingRoundTripperor.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, done.

RespondingTimeouts: &RespondingTimeouts{
IdleTimeout: flaeg.Duration(10 * time.Second),
ReadTimeout: flaeg.Duration(10 * time.Second),
WriteTimeout: flaeg.Duration(10 * time.Second),
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd probably choose slightly different numbers for the three timeouts just to rule out the possibility that timeout values are inadvertently reused.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good point, adapted the test.

globalConfig: GlobalConfiguration{
IdleTimeout: flaeg.Duration(45 * time.Second),
RespondingTimeouts: &RespondingTimeouts{
IdleTimeout: flaeg.Duration(80 * time.Second),
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Did we document somewhere that the new option will override the old one?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nope we did not so far. I added a note in toml.md at the deprecated option.

err = try.GetRequest("http://127.0.0.1:8080/api/providers", 60*time.Second, try.BodyContains("Path:/dialTimeout"))
c.Assert(err, checker.IsNil)

// This simulates a DialTimeout to the BackendServer.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

s/BackendServer/backend server/?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, done.

@m3co-code
Copy link
Contributor Author

m3co-code commented Aug 16, 2017

@timoreimann thanks for review iteration nr1 :) I implemented your feedback. PTAL again.

@ldez ldez added the size/L label Aug 16, 2017
Copy link
Contributor

@timoreimann timoreimann left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, thanks Marco. 👏

Copy link
Contributor

@ldez ldez left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

Copy link
Contributor

@nmengin nmengin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Many thanks for this PR.
Only one suggestion.

server/server.go Outdated
if globalConfig.RespondingTimeouts != nil && time.Duration(globalConfig.RespondingTimeouts.IdleTimeout) != time.Duration(DefaultIdleTimeout) {
idleTimeout = time.Duration(globalConfig.RespondingTimeouts.IdleTimeout)
}

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@marco-jantke

Is there a specific reason to check parameters in this order or can we simplify the initialization like described below?

        var idleTimeout int64
	// When RespondingTimeouts.IdleTimout is configured, always use that setting
	if globalConfig.RespondingTimeouts != nil {
		idleTimeout = time.Duration(globalConfig.RespondingTimeouts.IdleTimeout)
	} else if globalConfig.IdleTimeout != nil {
                 // Backwards compatibility for deprecated IdleTimeout
                idleTimeout = time.Duration(globalConfig.IdleTimeout)
        } else {
                 // Default value if neither the deprecated IdleTimeout nor the new RespondingTimeouts.IdleTimout are configured
	        idleTimeout = time.Duration(DefaultIdleTimeout)
        }

The initialization priority stays as now :

  • New parameter
  • Deprecated parameter
  • Default value

When the deprecated parameter will be able to be deleted, we'll just have to delete the else if block.

What is your mind about this suggestion?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the suggestion, I think its much clearer than before. I just had to tweak the implementation a bit because globalConfig.IdleTimeout is a flaeg.Duration and can not be compared to nil.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we need to update flaeg to the last version.

Copy link
Contributor

@nmengin nmengin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM 👏

@m3co-code
Copy link
Contributor Author

I rebased the PR to one commit and made an update to the latest master version. Should be good to merge :)

The following are the commit messages / description of the original
commits before squashing.

remove modifications of global http.DefaultTransport

and create an http.Transport that is used for forwarding to the
backends. We get a couple benefits from this:
- it is in general bad practice to modify global state as before and
  could lead to unintended side-effects
- it is possible to set settings on the Transport that only for the
  communication to the backends.
- the same Transport settings (like timeouts) are used when using TLS to
  a backend

add ForwardingTimeouts and add docs

add server Timeouts and add docs

move IdleTimeout to Timeouts section and deprecated former option

rename Timeouts config to RespondingTimeouts

add unit test for RespondingTimeouts

add integration test for ForwardingTimeouts

add Deprecated comment

improve documentation of timeout settings

use different timeouts in tests to make it more expressive

extract building of server timeouts to own function

simplify idleTimeout loading
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants