Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Shared futures regularly allocate 3kb #5990

Closed
stuhood opened this issue Jun 20, 2018 · 5 comments
Closed

Shared futures regularly allocate 3kb #5990

stuhood opened this issue Jun 20, 2018 · 5 comments

Comments

@stuhood
Copy link
Member

stuhood commented Jun 20, 2018

From profiling memory usage on OSX (using Instruments, which has become useful since we switched to the system allocator via cdylib), it appears that Shared futures very regularly (perhaps all of them) result in a 3kb memory allocation for one of their internal HashMaps. This appears to account for about 1/3 of our allocations.

3kb-alloc

@stuhood
Copy link
Member Author

stuhood commented Jun 21, 2018

After rust-lang/futures-rs#1041, this allocation seems to fit in a 320 byte slot:
320-byte

@illicitonion
Copy link
Contributor

Nice!

Are we going to end up forking futures for a bit? IIRC bumping to 0.2.* is going to be non-trivial because of transitive deps (in particular, grpcio)

@stuhood
Copy link
Member Author

stuhood commented Jun 21, 2018

@illicitonion : My patch to use the PR was... awkward, but maybe not too bad? It's possible that most of the diff could be pruned away: master...twitter:stuhood/futures-shared-allocs

@stuhood
Copy link
Member Author

stuhood commented Jun 21, 2018

But I think that since this isn't a high priority now (and since there are potentially some different approaches available than the "sorted Vec" approach I tried), waiting to see what happens upstream for a bit should be ok.

stuhood pushed a commit that referenced this issue Jul 3, 2018
### Problem

#4558 introduces a new state for a `Node`: it may be `dirty`, which requires that we remember the previous value of a `Node` while simultaneously computing its next value. I began implementing this state machine with our existing `future::Shared` usage, but ran into the need to either hide values in the closure of the running future, or store values atomically outside of the future's value. This complexity represents a classic case for a state machine.

Additionally, we have seen that `future::Shared` has a relatively high cost in terms of complexity and memory (see #5990).

### Solution

In order to make #4558 easier to implement correctly (and to reduce memory usage), implement the storage of a `Node`'s value in the `Graph` via a state machine.

One of the largest complexities of `Shared` is that it needs to elect one of the waiters of the `Shared` future as the one who will actually do the work of `poll`ing the wrapped future. To avoid that complexity, we introduce usage of the `tokio` `Runtime` for _all_ `Node` executions. This allows waiters to stay very simple, and receive the result value via a `oneshot::channel`.

### Result

#4558 should be much easier to implement. I'll likely wait to land this until #4558 is further along. 

This change reduces memory usage about 200MB further than the fix from #5990 (down to about 2.5GB for `./pants list ::` in Twitter's repo).  Unfortunately, in its current form it also represents a performance regression of about 10% for that same usecase. Although I believe I have a handle on why, I'd like to defer fixing that regression until after #4558 is implemented.

This also fixes #3695, as since we now fail slowly, we are able to render multiple distinct failures at once.
@stuhood
Copy link
Member Author

stuhood commented Jul 3, 2018

This is "fixed"/avoided by #6013. Closing.

@stuhood stuhood closed this as completed Jul 3, 2018
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants