Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add IdleConnTimeout to Traefik's http.server settings #1340

Merged
merged 10 commits into from
Apr 4, 2017

Conversation

bparli
Copy link
Contributor

@bparli bparli commented Mar 24, 2017

Without enforcing a timeout Traefik is susceptible to resource leakage, particularly when deployed as a public facing proxy exposed to the Internet.

Make sure IdleTimeout is configurable via the traefik.toml file

Set the default to be 180 seconds

Ben Parli and others added 4 commits March 24, 2017 10:14
Without such a timeout there is a risk of resource leakage from piling up connections, particularly when exposing Traefik to the Internet.

Set the default to be 180 seconds
Without enforcing a timeout Traefik is susceptible to resource leakage, particularly when deployed as a public facing proxy exposed to the Internet.

Set the default to be 180 seconds
@bparli
Copy link
Contributor Author

bparli commented Mar 24, 2017

Also, I think this should address #1322

@timoreimann timoreimann self-requested a review March 24, 2017 22:16
@timoreimann
Copy link
Contributor

@bparli thanks for the contribution -- a pretty important one I should say.

Do I see correctly that we need to wait for #1342 before we can merge this?

@timoreimann
Copy link
Contributor

I'm also pretty confident this PR will

fix #1322

@bparli
Copy link
Contributor Author

bparli commented Mar 24, 2017

@timoreimann IMO this isn't blocked by #1342; that was just an issue I noticed when testing this. This setting will be much easier for folks to configure after that issue is fixed though.

@timoreimann
Copy link
Contributor

timoreimann commented Mar 24, 2017

@bparli:

IMO this isn't blocked by #1342; that was just an issue I noticed when testing this. This setting will be much easier for folks to configure after that issue is fixed though.

IIUC, configuring is still possible because users may specify (nanoseconds) integers. One thing we have to take into consideration though is that if we use a time.ParseDuration-based approach in the future, specifications missing a unit are going to be rejected. What this means is that accepting unit-less specs now will force us to continue providing support for it unless we want to make a breaking change.

WDYT?

@bparli
Copy link
Contributor Author

bparli commented Mar 25, 2017

Thats a fair point, @timoreimann. My thinking was its already broke for the ProvidersThrottleDuration setting anyway, but you're right, better and cleaner to fix #1342 first. And it looks like you've already picked that one up.

@errm
Copy link
Contributor

errm commented Mar 25, 2017

LGTM, but waiting for #1342 does make sense ...

Copy link
Contributor

@timoreimann timoreimann left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The new duration was just merged into master. :-)

Could we also make a quick addition to the toml.md and samples.md files? Thanks!

@bparli
Copy link
Contributor Author

bparli commented Apr 3, 2017

Sure thing @timoreimann, added some content to toml.md and examples.md. I also put ProvidersThrottleDuration in the examples.md for good measure.

Copy link
Contributor

@timoreimann timoreimann left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One more tiny little change, then I'm totally happy. :-)

docs/toml.md Outdated
@@ -67,6 +67,14 @@
#
# ProvidersThrottleDuration = "2s"

# IdleTimeout: maximum amount of time an idle (keep-alive) connection will remain idle before closing itself.
# This is set to enforce closing of stale client connections.
#
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you please copy the description of the valid input values from here?

@bparli
Copy link
Contributor Author

bparli commented Apr 3, 2017

Np, good catch to match the ProvidersThrottleDuration language

Copy link
Contributor

@timoreimann timoreimann left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Excellent, this looks perfect now. 👍

I'll merge once CI goes green.

@timoreimann
Copy link
Contributor

Ah sorry, can't merge right away, we need one more pair of maintainer eyes.

@emilevauge could you have a look please?

Copy link
Member

@emilevauge emilevauge left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM!
Thanks a lot @bparli :)
Squashing your commits and merging

@emilevauge emilevauge merged commit c9d2349 into traefik:master Apr 4, 2017
emilevauge pushed a commit that referenced this pull request Apr 4, 2017
* Add IdleTimeout setting to http.server

Without such a timeout there is a risk of resource leakage from piling up connections, particularly when exposing Traefik to the Internet.

Set the default to be 180 seconds

* Add IdleConnTimeout to Traefik's http.server settings

Without enforcing a timeout Traefik is susceptible to resource leakage, particularly when deployed as a public facing proxy exposed to the Internet.

Set the default to be 180 seconds

* tweak

* Update configuration.go

* add some documentation for the idletimeout setting

* need to cast idletimeout

* update doc to refect format specifics
@Yggdrasil
Copy link
Contributor

🥇 Thanks for reporting and fixing this @bparli. This issue led to some production downtime for us until we understood what was happening. We've resorted to restarting Traefik twice a week as a workaround. I'll be soooo happy when this is released.

@bparli
Copy link
Contributor Author

bparli commented Apr 7, 2017

My pleasure, happy to contribute

@Yggdrasil
Copy link
Contributor

Would it be possible to mention this fix in the Changelog for 1.3.0? I couldn't find it in the RC's and I think it's worth a mention to inform those impacted by the leak.

Below's a sweet example of the difference this fix has made for us on one of our public facing webservers. On this server you can see memory usage climbing from the increasing number of open sockets, until Traefik's restarted. On May 16th around midnight we deployed v1.3.0-rc1, after which memory usage has been stable around 30MB. 😍

schermafbeelding 2017-05-19 om 16 43 09

@ldez ldez added this to the 1.3 milestone May 19, 2017
@ldez
Copy link
Contributor

ldez commented May 20, 2017

@Yggdrasil the changelog is fixed. 😃 See #1642.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
kind/enhancement a new or improved feature.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants