Async fn doubles argument size #62958

4teap · 2019-07-24T22:47:05Z

Generator optimization in #60187 by @tmandry reused generator locals, but arguments are still duplicated whenever used across yield points.

For example (playground):

#![feature(async_await)]

async fn wait() {}

async fn test(arg: [u8; 8192]) {
    wait().await;
    drop(arg);
}

fn main() {
    println!("{}", std::mem::size_of_val(&test([0; 8192])));
}

Expected: 8200
Actual: 16392

When passing in futures, the future size can grow exponentially (playground):

#![feature(async_await)]

async fn test(_arg: [u8; 8192]) {}

async fn use_future(fut: impl std::future::Future<Output = ()>) {
    fut.await
}

fn main() {
    println!(
        "{}",
        std::mem::size_of_val(&use_future(use_future(use_future(use_future(use_future(
            use_future(use_future(use_future(use_future(use_future(test(
                [0; 8192]
            ))))))
        ))))))
    );
}

Expected: 8236
Actual: 8396796

I didn't find any note on this. But given how common arguments are used, I think it might be useful if they are included in the optimization.

tmandry · 2019-07-31T02:35:55Z

Implementation notes: In the compiler, these arguments get converted to generator upvars, which are then immediately moved into locals at the top of the generator MIR. The values are afterward used from these local vars.

We don't support overlapping upvar storage. @cramertj had a suggestion which makes sense to me, which is to make upvars part of the generator Unresumed variant. Then when we move them into locals in the first resume, we'll be able to overlap those locals with the upvar storage. As long as we hold invariant that every upvar is moved into a local at the top of the generator resume function, this should always work.

I know there's some code which makes assumptions about where upvars live in closures and generators. I'm not sure how much will have to change if we do this.

cc @eddyb @mw @Zoxc

eddyb · 2019-08-09T13:49:41Z

@tmandry I think we can just add them as regular locals, and whether they will only be in Unresumed would depend on whether they are moved or not.
Keep in mind you can do e.g. async fn async_drop<T>(_: T) {}.

nikomatsakis · 2019-08-13T17:12:17Z

Check-in from async-await wg meeting: marking as "deferred" as this does not block stabilization (though it'd still be great to fix!).

steveklabnik · 2020-04-09T18:46:58Z

Triage: first playpen now reports 16386, second reports 8390655

tmandry · 2020-04-21T23:10:40Z

In theory I think we can get away with not treating upvars specially, and remapping them to locals immediately, inside the generator transform.

We should remove prefix_tys and audit any uses of upvar_tys to make sure they don't make an assumption of where upvars are stored in the generator layout. In particular, the generator layout code will need to change so it doesn't handle upvars specially anymore (the upvar types should be included as field types in GeneratorLayout now).

There are probably some details I'm forgetting, but hopefully that's enough to get started.

Based on a couple of sources: - https://swatinem.de/blog/future-size/ - rust-lang#62958

…est, r=compiler-errors Add tests to assert current behavior of large future sizes Based on a couple of sources: - https://swatinem.de/blog/future-size/ - rust-lang#62958

…te-in-print-type-sizes, r=compiler-errors Extend `-Z print-type-sizes` to distinguish generator upvars+locals from "normal" fields. For example, for this code: ```rust async fn wait() {} async fn test(arg: [u8; 8192]) { wait().await; drop(arg); } async fn test_ideal(_rg: [u8; 8192]) { wait().await; // drop(arg); } fn main() { let gen_t = test([0; 8192]); let gen_i = test_ideal([0; 8192]); println!("expect {}, got: {}", std::mem::size_of_val(&gen_i), std::mem::size_of_val(&gen_t)); } ``` the `-Z print-type-sizes` output used to start with: ``` print-type-size type: `[async fn [email protected]:3:32: 6:2]`: 16386 bytes, alignment: 1 bytes print-type-size discriminant: 1 bytes print-type-size variant `Suspend0`: 16385 bytes print-type-size field `.arg`: 8192 bytes, offset: 0 bytes, alignment: 1 bytes print-type-size field `.arg`: 8192 bytes print-type-size field `.__awaitee`: 1 bytes ... print-type-size type: `std::mem::ManuallyDrop<[u8; 8192]>`: 8192 bytes, alignment: 1 bytes print-type-size field `.value`: 8192 bytes ... ``` but with this change, it now instead prints: ``` print-type-size type: `[async fn [email protected]:3:32: 6:2]`: 16386 bytes, alignment: 1 bytes print-type-size discriminant: 1 bytes print-type-size variant `Suspend0`: 16385 bytes print-type-size upvar `.arg`: 8192 bytes, offset: 0 bytes, alignment: 1 bytes print-type-size local `.arg`: 8192 bytes print-type-size local `.__awaitee`: 1 bytes ... print-type-size type: `std::mem::ManuallyDrop<[u8; 8192]>`: 8192 bytes, alignment: 1 bytes print-type-size field `.value`: 8192 bytes ``` (spawned off of investigation of rust-lang#62958 )

Based on a couple of sources: - https://swatinem.de/blog/future-size/ - rust-lang#62958

…est, r=compiler-errors Add tests to assert current behavior of large future sizes Based on a couple of sources: - https://swatinem.de/blog/future-size/ - rust-lang#62958

This adds one more test that should track improvements to generator layout, like rust-lang#62958 and rust-lang#62575. In particular, this test highlights suboptimal layout, as the storage for the argument future is not being reused across its usage as `upvar`, `local` and `awaitee` (being polled to completion).

Add test for Future inflating arg size to 3x This adds one more test that should track improvements to generator layout, like rust-lang#62958 and rust-lang#62575. In particular, this test highlights suboptimal layout, as the storage for the argument future is not being reused across its usage as `upvar`, `local` and `awaitee` (being polled to completion). This is on top of rust-lang#107692 (as those would conflict with each other) It is a minimal repro for code mentioned in moka-rs/moka#212 (comment) (CC `@tatsuya6502)`

This patch adds a few tests to assert the current behavior when passing data across an await point. This will help to test out an upcoming fix for the issue of arguments in async functions growing in size because of the generator upvar that is generated when we desugar the async function. See rust-lang/rust#62958 Also relates to rust-lang/rust#107500

pnkfelix · 2023-02-16T04:02:21Z

Okay, I have now prototyped a pair of simple MIR optimizations that, when applied prior to the generator layout pass in MIR, will deal with the examples shown on this issue. They are over in this branch here: https://github.com/pnkfelix/rust/tree/copy-prop-upvar-to-elim-local-for-62958

I will report back more tomorrow or Friday after I get a chance to run the optimizations through their paces against the Rust test suite. (And I will enlist @bryangarza 's help in coming up pathological tests.)

cjhopman · 2023-02-16T21:24:15Z

What surprised me about this is that these two produce differently sized futures:

async fn a<F: Future<Output=()>, G: Future<Output=()>>(f: F, g: G) -> () {
    f.await;
    g.await;
}

fn b<F: Future<Output=()>, G: Future<Output=()>>(f: F, g: G) -> impl Future<Output = ()> {
    async move {
        f.await;
        g.await;
    }
}

See: https://play.rust-lang.org/?version=stable&mode=debug&edition=2021&gist=e2f68fb25d5bf33511a35b9c343efb28

pnkfelix · 2023-02-17T18:30:25Z

@cjhopman indeed, the example you give there shows one fundamental piece of the problem.

Its because this:

async fn a<F: Future<Output=()>, G: Future<Output=()>>(f: F, g: G) -> () {
    f.await;
    g.await;
}

desugars to:

fn a_desugared<F: Future<Output=()>, G: Future<Output=()>>(f: F, g: G) -> impl Future<Output = ()> {
    async move {
        let f = f;
        let g = g;
        f.await;
        g.await;
    }
}

(and then those let <local> = <upvar>; get compiled in a very naive way that causes the layouts to blow up.)

Its not the only problem here, but its an easy one to address in the compiler.

This patch adds a few tests to assert the current behavior when passing data across an await point. This will help to test out an upcoming fix for the issue of arguments in async functions growing in size because of the generator upvar that is generated when we desugar the async function. See rust-lang/rust#62958 Also relates to rust-lang/rust#107500 Co-authored-by: Ralf Jung <[email protected]>

Add tests for moving data across await point This patch adds a few tests to assert the current behavior when passing data across an await point. This will help to test out an upcoming fix for the issue of arguments in async functions growing in size because of the generator upvar that is generated when we desugar the async function. See rust-lang/rust#62958 Also relates to rust-lang/rust#107500 FYI `@oli-obk` `@pnkfelix`

This patch adds a few tests to assert the current behavior when passing data across an await point. This will help to test out an upcoming fix for the issue of arguments in async functions growing in size because of the generator upvar that is generated when we desugar the async function. See rust-lang#62958 Also relates to rust-lang#107500 Co-authored-by: Ralf Jung <[email protected]>

Add tests for moving data across await point This patch adds a few tests to assert the current behavior when passing data across an await point. This will help to test out an upcoming fix for the issue of arguments in async functions growing in size because of the generator upvar that is generated when we desugar the async function. See rust-lang#62958 Also relates to rust-lang#107500 FYI `@oli-obk` `@pnkfelix`

…<try> Relocate coroutine upvars into Unresumed state Related to rust-lang#62958 This PR is an attempt to address the async/coroutine size issue by allowing independent def-use/liveness analysis on individual upvars in coroutines. It has appeared to address partially the size doubling issue introduced by the use of upvars. However, there are caveats detailed in the following list that I would like to address before turning this draft in. - The treatment here towards the `ty::Coroutine` in MIR passes is unfortunately "messier" than my liking, which is something I definitely want to change. I propose to promote upvars into `Body<'tcx>` along with `local_decls`, so that we can safely handle them safely. I would happily open a new separate PR to improve the upvar management. - It is not a generic solution, yet. For instance, we are still doubling the size in the example of rust-lang#62958. If we insert a pass before MIR type analysis to remove unnecessary drops, which we can, that particular size doubling will be solved. However, if a `Future` upvar is alive across more than one yield points, that upvar is still ineligible. It makes sense because we would like to minimize moving of variant fields. How to handle these upvars is not the focus of this PR for now. Out of expectation of possible change in the high level plan, I am keeping this as a draft in hope of invoking conversations. 🙇 cc `@pnkfelix` for the context.

Add tests for moving data across await point This patch adds a few tests to assert the current behavior when passing data across an await point. This will help to test out an upcoming fix for the issue of arguments in async functions growing in size because of the generator upvar that is generated when we desugar the async function. See rust-lang/rust#62958 Also relates to rust-lang/rust#107500 FYI `@oli-obk` `@pnkfelix`

Centril added the F-async_await label Jul 28, 2019

tmandry mentioned this issue Aug 3, 2019

Use MaybeUninit to produce more correct layouts #63035

Merged

nikomatsakis added the AsyncAwait-Triaged Async-await issues that have been triaged during a working group meeting. label Aug 13, 2019

jonas-schievink mentioned this issue Mar 8, 2020

Tracking Issue for Coroutine Memory Usage #69826

Open

7 tasks

jonas-schievink mentioned this issue May 22, 2020

Simple async wrapper doubles the size of a future #72454

Closed

patrickfreed mentioned this issue Aug 4, 2021

RUST-793 Reduce size of returned futures mongodb/mongo-rust-driver#417

Merged

jonas-schievink removed the F-async_await label Aug 29, 2021

Kobzol mentioned this issue Sep 24, 2021

WIP: Avoid storing captured upvars in generators twice if possible #89213

Closed

tatsuya6502 mentioned this issue Jan 25, 2023

Moka get_with inflates future size by ~7x moka-rs/moka#212

Closed

bryangarza mentioned this issue Jan 31, 2023

Add tests to assert current behavior of large future sizes #107500

Merged

bryangarza added a commit to bryangarza/rust that referenced this issue Jan 31, 2023

Add tests to assert current behavior of large future sizes

e177ed3

Based on a couple of sources: - https://swatinem.de/blog/future-size/ - rust-lang#62958

pnkfelix mentioned this issue Jan 31, 2023

Extend -Z print-type-sizes to distinguish generator upvars+locals from "normal" fields. #107533

Merged

bryangarza added a commit to bryangarza/rust that referenced this issue Feb 2, 2023

Add tests to assert current behavior of large future sizes

cc8f990

Based on a couple of sources: - https://swatinem.de/blog/future-size/ - rust-lang#62958

compiler-errors mentioned this issue Feb 4, 2023

Don't explicitly include upvars in the DINode produced by build_generator_variant_struct_type_di_node #107653

Closed

Swatinem mentioned this issue Feb 5, 2023

Add test for Future inflating arg size to 3x #107695

Merged

bryangarza mentioned this issue Feb 11, 2023

Add tests for moving data across await point rust-lang/miri#2784

Merged

pnkfelix mentioned this issue Mar 1, 2023

[WIP] mir-opt for generators/futures: copy propagate upvar into locals #108590

Closed

6 tasks

nolanderc mentioned this issue Mar 8, 2023

async block occupies twice the size of its captures (causing explosion in size). #108906

Open

workingjubilee added the C-optimization Category: An issue highlighting optimization opportunities or PRs implementing such label Oct 8, 2023

dingxiangfei2009 mentioned this issue Jan 20, 2024

Relocate coroutine upvars into Unresumed state #120168

Closed

jplatte mentioned this issue Feb 17, 2024

eyeball-im: Add VectorDiff::async_map jplatte/eyeball#45

Closed

dingxiangfei2009 mentioned this issue Nov 10, 2024

Relocate upvars to Unresumed state and make coroutine prefix trivial #127522

Open

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Async fn doubles argument size #62958

Async fn doubles argument size #62958

4teap commented Jul 24, 2019

tmandry commented Jul 31, 2019

eddyb commented Aug 9, 2019

nikomatsakis commented Aug 13, 2019

steveklabnik commented Apr 9, 2020

tmandry commented Apr 21, 2020 •

edited

Loading

pnkfelix commented Feb 16, 2023 •

edited

Loading

cjhopman commented Feb 16, 2023

pnkfelix commented Feb 17, 2023 •

edited

Loading

Async fn doubles argument size #62958

Async fn doubles argument size #62958

Comments

4teap commented Jul 24, 2019

tmandry commented Jul 31, 2019

eddyb commented Aug 9, 2019

nikomatsakis commented Aug 13, 2019

steveklabnik commented Apr 9, 2020

tmandry commented Apr 21, 2020 • edited Loading

pnkfelix commented Feb 16, 2023 • edited Loading

cjhopman commented Feb 16, 2023

pnkfelix commented Feb 17, 2023 • edited Loading

tmandry commented Apr 21, 2020 •

edited

Loading

pnkfelix commented Feb 16, 2023 •

edited

Loading

pnkfelix commented Feb 17, 2023 •

edited

Loading