
proposal: channel buffer pools #20868

Closed
networkimprov opened this issue Jun 30, 2017 · 9 comments
Labels
FrozenDueToAge Proposal v2 An incompatible library change
Comments

@networkimprov

When creating a large number of buffered channels for similar kinds of messages, instead of giving each channel a buffer size, it would be far more memory efficient to pool the buffers:

pool := make([]T1, 1e6) // alternatively make([]byte, n)
ch1 := make(chan T1, pool)
ch2 := make(chan T2, pool) // perhaps channels for diff types can share the pool?
// both block on write when the pool is exhausted

The above makes buffered objects editable, which may be undesirable. An alternative is to use a pointer as the name of an internal buffer:

var pool int = 1e6
ch1 := make(chan T, &pool) // does make([]T, *pool); labels slice with &pool
ch2 := make(chan T, &pool) // buffers to slice created above; *pool not checked
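The proposed syntax is not valid Go today, but the intended semantics, sends across several channels block once a shared budget of buffered elements is exhausted, can be approximated in current Go (with generics) using a semaphore channel as the pool. All names below are hypothetical; this is only a sketch of the behavior the proposal asks for:

```go
package main

import "fmt"

// pooledChan pairs a regular buffered channel with a shared semaphore
// that caps the total number of elements buffered across every channel
// drawing on the same pool.
type pooledChan[T any] struct {
	c    chan T
	pool chan struct{} // shared token pool: one token per buffered element
}

func newPooledChan[T any](pool chan struct{}) pooledChan[T] {
	// Each channel's own buffer is as large as the whole pool, so the
	// pool, not the per-channel buffer, is the binding limit.
	return pooledChan[T]{c: make(chan T, cap(pool)), pool: pool}
}

// Send acquires a pool slot (blocking when the pool is exhausted), then buffers v.
func (p pooledChan[T]) Send(v T) {
	p.pool <- struct{}{}
	p.c <- v
}

// Recv takes a buffered value and returns its slot to the pool.
func (p pooledChan[T]) Recv() T {
	v := <-p.c
	<-p.pool
	return v
}

func main() {
	pool := make(chan struct{}, 3) // at most 3 elements buffered in total
	ch1 := newPooledChan[int](pool)
	ch2 := newPooledChan[int](pool)

	ch1.Send(1)
	ch1.Send(2)
	ch2.Send(3) // pool is now exhausted; a fourth Send would block
	fmt.Println(ch1.Recv(), ch1.Recv(), ch2.Recv()) // 1 2 3
}
```

Unlike the proposal, this wrapper cannot be used directly in a `select`, which is part of why a language-level mechanism was being requested.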

A channel buffer pool is particularly valuable where one can forecast the overall traffic for the set of channels (e.g. it's limited by network bandwidth), but not the load on any one channel. Any switchboard-style application where each incoming message is broadcast to multiple receivers fits that profile, for instance a chatroom server. This proposal might also solve the problem motivating #20352.

There is a workaround for the lack of buffer pools, but it's expensive: dedicate a goroutine plus two channels to each required channel, giving it an elastic buffer.
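The goroutine-plus-two-channels workaround can be sketched as follows: a bridging goroutine holds a growable queue between an input and an output channel, so senders rarely block. The function name is illustrative, and a production version would want a cap or some other backpressure policy:

```go
package main

import "fmt"

// elastic bridges an input and an output channel with a goroutine that
// holds an unbounded queue, giving the pair an elastic buffer.
func elastic[T any]() (chan<- T, <-chan T) {
	in := make(chan T)
	out := make(chan T)
	go func() {
		var queue []T
		src := in // local alias so we can nil it out after close
		for src != nil || len(queue) > 0 {
			// Only enable the send case while the queue is non-empty;
			// a send on a nil channel is never selected.
			var sendCh chan T
			var head T
			if len(queue) > 0 {
				sendCh = out
				head = queue[0]
			}
			select {
			case v, ok := <-src:
				if !ok {
					src = nil // input closed: drain the queue, then exit
					continue
				}
				queue = append(queue, v)
			case sendCh <- head:
				queue = queue[1:]
			}
		}
		close(out)
	}()
	return in, out
}

func main() {
	in, out := elastic[int]()
	for i := 1; i <= 4; i++ {
		in <- i // held in the goroutine's queue, not a fixed-size buffer
	}
	close(in)
	for v := range out {
		fmt.Println(v)
	}
}
```

The cost the comment refers to is visible here: every logical channel now carries a goroutine, two real channels, and a heap-allocated queue.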

@rsc
Contributor

rsc commented Jul 6, 2017

It seems like this would be a very productive source of bugs and deadlocks.

What is the problem you are trying to solve?
Why is the problem important to solve?
How does it arise in practice?
Why is this the right solution?

@networkimprov
Author

networkimprov commented Jul 6, 2017

Hi Russ. The motivation for this proposal is an app with potentially zillions of channels which are intermittently drained, where the average buffered queue is <10 items, but can spike to a few thousand. If every channel buffer were sized to 5,000 elements, it's >40GB of buffers for 1M chan int instances, but actual memory usage is <80MB.
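The >40GB figure follows directly from the element size on a 64-bit platform; a minimal check of the arithmetic:

```go
package main

import "fmt"

func main() {
	const (
		channels     = 1_000_000 // 1M chan int instances
		bufLen       = 5_000     // elements per channel buffer
		bytesPerElem = 8         // size of int on 64-bit platforms
	)
	total := int64(channels) * bufLen * bytesPerElem
	fmt.Printf("%d GB of buffer space\n", total/1_000_000_000) // 40 GB
}
```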

So in today's Go we must either use the goroutine-plus-two-channels method to dynamically allocate channels (which is expensive but transparent to producer & consumer), or keep a pool of large-buffer channels which are switched in when the normal channels block on write. A dedicated channel buffer pool would be memory-efficient, inexpensive, and transparent.

Thinking more broadly, channels provide a nice interface (make, write, read, select, close) which app-defined services might like to implement so they can serve any channel client. So a scheme to define channel plug-ins is a more general solution to the problem I describe.

Regarding deadlocks, did you have a scenario in mind? You would lock+unlock the internal buffer pool before/after locking the channel to add/remove an element, so I'm not seeing where you have the lock+lock sequence necessary to create a deadlock.

@cznic
Contributor

cznic commented Jul 7, 2017

@networkimprov

> The motivation for this proposal is an app with potentially millions of channels which are intermittently drained, where the average buffered queue is <10 items, but can spike to a few thousand.
>
> ...
>
> (The app is several weeks from its initial release; I will add real-world data points to this discussion as soon as I have some.)

Large numbers of goroutines have valid use cases; millions of channels, not so much, IMO. Can you please share more details about the soon-to-be-released app? To be honest, I would like to evaluate whether a design using millions of channels makes sense in this case.

@networkimprov
Author

networkimprov commented Jul 7, 2017

Here "intermittently drained" means the goroutine for each channel isn't always up. The channel is for each user who's connected in the past ~24h. Its producers are any of the active users. (Actives are 1-2% of the total, ~15K, at a given moment).

Even given a smaller number of channels, or workarounds that approximate a channel buffer pool, I believe my point about the memory inefficiency of dedicated channel buffers remains valid.

@networkimprov
Author

@cznic, any more thoughts?

> Here "intermittently drained" means the goroutine for each channel isn't always up. The channel is for each user who's connected in the past ~24h. Its producers are any of the active users. (Actives are 1-2% of the total, ~15K, at a given moment).
>
> Even given a smaller number of channels, or workarounds that approximate a channel buffer pool, I believe my point about the memory inefficiency of dedicated channel buffers remains valid.

@cznic
Contributor

cznic commented Jul 13, 2017

My impression is that the proposal's motivation is to fix a problem that was mistakenly inserted into the app's design.

@networkimprov
Author

That is a general statement of opinion, not an analysis by which the proposal can be evaluated!

Leaving buffered channels in place for intermittently active consumers means the queues in question do not have to be re-created from disk each time the consumer resumes.

@cznic
Contributor

cznic commented Jul 13, 2017

You've asked for thoughts and I delivered. You dismissed that for not being analysis. If you'd have asked for analysis, I'd have to ask for the app source code in the first place. Let's forget it, okay?

@rsc rsc added the v2 An incompatible library change label Jul 17, 2017
@ianlancetaylor
Contributor

Related to #20352, though this one does have a limit. We're going to give this the same answer. If you need to handle spikes in traffic, use mechanisms designed for that. We don't want to complicate the language-provided channels by letting them share a pool of memory. That seems likely to lead to deadlocks, and it significantly complicates the implementation in the normal case for an approach that would be used very rarely.

If we had generic types, unlimited channels could be implemented in a library with full type safety. A library would also make it possible to improve the implementation easily over time as we learn more. As various people said above, putting unlimited channels into the language seems like an attractive nuisance; most programs do need backpressure.
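With the generics that later landed in Go 1.18, the library type alluded to here can indeed be written with full type safety. A minimal sketch of such a type, the name and API are hypothetical, not a real standard library type, and Close/shutdown is omitted for brevity:

```go
package main

import "fmt"

// Unbounded is a channel-like queue with no fixed capacity, the kind of
// type-safe library type that generics make possible.
type Unbounded[T any] struct {
	in  chan T
	out chan T
}

func NewUnbounded[T any]() *Unbounded[T] {
	u := &Unbounded[T]{in: make(chan T), out: make(chan T)}
	go u.run()
	return u
}

func (u *Unbounded[T]) run() {
	var q []T
	for {
		var outCh chan T // nil (never selected) while the queue is empty
		var head T
		if len(q) > 0 {
			outCh = u.out
			head = q[0]
		}
		select {
		case v := <-u.in:
			q = append(q, v)
		case outCh <- head:
			q = q[1:]
		}
	}
}

func (u *Unbounded[T]) Send(v T) { u.in <- v }
func (u *Unbounded[T]) Recv() T  { return <-u.out }

func main() {
	u := NewUnbounded[string]()
	u.Send("hello")
	u.Send("world")
	fmt.Println(u.Recv(), u.Recv()) // hello world
}
```

Note that `Send` here never blocks for long, which is exactly the attractive-nuisance concern: the backpressure that a fixed-size channel buffer provides is gone.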

Closing on the assumption that we will get some type of generics. We can reopen if we decide that that is definitely not happening.
