core: close provider/provisioner connections when not used anymore #2406
Conversation
Currently Terraform is leaking goroutines, and with that, memory. Strictly speaking this may not be a real concern for Terraform, as it's mostly used as a short-running command-line executable. But a few of us out there are using Terraform in long-running processes, and then this starts to become a problem. Besides that, it's of course good programming practice to clean up resources when they're no longer needed, so even for the standard command-line use case this seems an improvement in resource management. Personally I see no downsides, as the primary connection to the plugin is kept alive (the plugin is not killed) and only unused connections that will never be used again are closed, freeing up any related goroutines and memory.
@mitchellh @phinze I suppose you guys would want to have a look at this one 😉 If one of you could have a look/review and share your thoughts that would be great! Thanks! |
Ping @mitchellh and @phinze for a review on this one. Doesn't seem like something that has a big impact, yet it improves the general code by properly releasing any unused resources... Would be really great if this could be added before releasing 0.6.0. Thx! |
func (m *muxBroker) NextId() uint32 {
-	return atomic.AddUint32(&m.nextId, 1)
+	return rand.Uint32()
}
This change introduces a (very low) possibility of the ID repeating, no? Which would be problematic for an identifier?
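For context, the chance of a repeat among n random 32-bit IDs follows the birthday bound, roughly n(n-1)/2 divided by 2^32. A quick back-of-the-envelope calculation (the channel counts are illustrative, not measured from Terraform):

```go
package main

import "fmt"

func main() {
	// Approximate birthday-bound probability of at least one
	// collision among n IDs drawn uniformly from 2^32 values.
	for _, n := range []float64{100, 500, 10000} {
		p := n * (n - 1) / 2 / (1 << 32)
		fmt.Printf("n=%5.0f  P(collision) ≈ %.6f\n", n, p)
	}
}
```

Even at hundreds of live channels the probability is tiny per run, but it is not zero, which is the crux of this thread.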
These are pseudo-random numbers and there are 4,294,967,295 numbers to hand out. That means it would be extremely unlikely to get the same number within just a few draws of each other.
And an ID may very well be reused as soon as the previous command that was using it is closed by the new close logic. Having just a counter going up would mean there is a hard limit on the number of commands that can be executed. And even though 4,294,967,295 is quite a lot, if you use TF in an always-on setup it will hit that limit sooner or later.
But if we think this could be an issue, the only additional thing I can think of is to keep a map of generated numbers and a timestamp (something like a map[uint32]time.Time), and then whenever a new number is generated, first check whether it was already handed out. Additionally it should delete entries that are older than x time, where x would be something we agree on. To make it extremely safe I would say 60 minutes; there is no single apply/plan/refresh that takes 60 minutes, right? The cleanup could then be triggered only when NextId() is called, so it doesn't require a scheduled job of its own to periodically check and clean up.
Sounds like maybe a good add-on, let me know what you think and if you think it's needed...
Talked about this on IRC, but mirroring here - I think we're fine to punt on this for now.
Will have a look tomorrow at how much impact a solution as discussed would have. The more I think about it, the more I believe it wouldn't hurt and would make 99.99% a 100%, which is always better of course 😉
But if so, I'll open a new PR for that one...
I would feel better about reinstating the atomic increment. Packer/Terraform (both use this system, but not the same code) open a LOT of channels. Hundreds. While the chance is low, if the IDs did conflict it would be disastrous. Likely an immediate panic at some point.
The atomic increment isn't a hard limit; we just expect it to wrap around at some point (it's an unsigned 32-bit integer, so it overflows back to zero).
That's a good point and worth potentially opening in Yamux. But I think on our side we use it for other reasons.
I'll open an issue there to discuss this, but I can imagine use cases where this check would actually make sense. For Terraform it doesn't, as the channels aren't usually used for more than a couple of minutes...
So would something like this be a workaround until (if) the logic in yamux can be changed?
func (m *muxBroker) NextId() uint32 {
	// Reset the counter shortly before it reaches the maximum, so the
	// wrap-around is explicit instead of relying on uint32 overflow.
	if atomic.LoadUint32(&m.nextId) >= math.MaxUint32-10 {
		atomic.StoreUint32(&m.nextId, 0)
	}
	return atomic.AddUint32(&m.nextId, 1)
}
Maybe still a little dirty, but this would do the trick right?
Edit: With some comments explaining why that is done of course.
+1 This seems like a good compromise for now.
Sorry for not picking this up today. Had a horrible (non-productive) day with a broken MacBook.
Will do some tests and make a PR for this tomorrow so we can get it in (if all is well of course) before shipping 0.6.0...
Thanks, yeah, that would be ideal!
Thanks for taking this on @svanharmelen! Overall this LGTM - two inline questions / nits. |
Removed the `DotOrigin()` func after review of the PR. With this change the PR is up-to-spec to be merged (as just discussed with Paul in IRC).
…ners core: close provider/provisioner connections when not used anymore
core: reverting a few lines from PR #2406
I'm going to lock this issue because it has been closed for 30 days ⏳. This helps our maintainers find and focus on the active issues. If you have found a problem that seems similar to this, please open a new issue and complete the issue template so we can capture all the details necessary to investigate further. |