Compilation of a crate using a large static map fails on latest i686-pc-windows-gnu Beta #36799

urschrei · 2016-09-28T14:45:24Z

I'm trying to build a cdylib (https://github.com/urschrei/lonlat_bng) which requires the https://github.com/urschrei/ostn15_phf crate.

building on AppVeyor, on i686-pc-windows-gnu, using the latest beta, is failing with an OOM error:

Details

The ostn15_phf crate is essentially just a very big static map, built using PHF (the generated, uncompiled map is around 42.9mb)

The build passed when running cargo test, using rustc 1.12.0-beta.3 (341bfe43c 2016-09-16):
https://ci.appveyor.com/project/urschrei/lonlat-bng/build/105/job/3y1llt6luqs3phs3

It's now failing when running cargo test, using rustc 1.12.0-beta.6 (d3eb33ef8 2016-09-23):
https://ci.appveyor.com/project/urschrei/lonlat-bng/build/job/27pgrkx2cnn2gw50

The failure occurs when compiling ostn15_phf with fatal runtime error: out of memory

The text was updated successfully, but these errors were encountered:

sfackler · 2016-09-28T21:32:44Z

cc @brson just want to make sure you're aware of this.

nikomatsakis · 2016-10-04T09:38:01Z

Related to #36926 perhaps? ("1.12.0: High memory usage when linking in release mode with debug info")

nikomatsakis · 2016-10-06T13:33:24Z

@urschrei have you tried this on platforms other than windows, out of curiosity?

nikomatsakis · 2016-10-06T13:36:14Z

I see. It seems to work on x86_64-pc-windows-gnu, but not i686.

urschrei · 2016-10-06T13:45:13Z

I haven't tried to build i686 on Linux or OSX, but I easily could…

nikomatsakis · 2016-10-06T13:48:47Z

Well, I just did a run on my linux box. The memory usage is certainly through the roof: https://gist.github.com/nikomatsakis/ea771dd69f12ebc5d3d5848fa59fb43a

This is using nightly (rustc 1.13.0-nightly (a059cb2f3 2016-09-27)). The peak is around 4GB.

nikomatsakis · 2016-10-06T17:10:36Z

So @alexcrichton has executed a massif run: https://gist.github.com/alexcrichton/d20d685dd7475b1801a2ccac6ba15b08

~~The peak result~~ Measurement 58 shows something like this:

Percentage	Area
36%	HIR
9%	MIR
14%	type/region arenas
5%	region maps
2%	constants

The peak result (measurement 48) looks pretty similar. More memory used by MIR:

Percentage	Area
26%	HIR
14%	MIR
5%	type/region arenas
4%	region maps
3%	constants

These numbers are just based on a kind of quick scan of the gist.

nikomatsakis · 2016-10-06T17:16:02Z

It seems clear we need to revisit our handling of statics and constants in a big way. But then this has been clear for a long time. =) I'm wondering if we can find some kind of "quick fix" here.

I also haven't compared with historical records -- but most of that memory we see above, we would have been allocating before too, so I'm guessing this is a case of being pushed just over the threshold on i686, versus a sudden spike.

nikomatsakis · 2016-10-07T17:15:34Z

Sizes of some types from MIR (gathered from play):

statement:      192
statement-kind: 176
lvalue:         16
rvalue:         152
local-decl:     48

nikomatsakis · 2016-10-12T19:42:37Z

I am feeling torn here. It seems the best we can do short-term is to make some small changes to MIR/HIR and try to bring the peak down below 4GB. The long term fix would be to revisit the overall representation particularly around constants and see what we can do to bring the size down. One thing I was wondering about (which is probably answerable from @alexcrichton's measurements) is what percentage of memory is being used just in the side-tables vs the HIR itself.

In any case, it seems like 152 bytes for a mir rvalue is really quite big.

nikomatsakis · 2016-10-12T19:48:53Z

Looking again at the massif results, it looks like MIR is taking more memory than I initially thought. One place that seems to use quite a bit is the list of scopes.

nikomatsakis · 2016-10-12T20:19:17Z

Better numbers:

percent	pass
35.09%	HIR
34.96%	MIR
11.77%	mk_region, node_type_insert
6.24%	region_maps
2.42%	consts
1.39%	spans

brson · 2016-10-20T16:12:26Z

Just pinging @nikomatsakis to keep this P-high bug on his radar.

nikomatsakis · 2016-10-20T17:49:33Z

I've made basically no progress here, I'm afraid. I think the most likely path forward in short term is to try and reduce memory usage in various data structures. Not very satisfying though. I'm not sure if we can make up the gap that way.

nikomatsakis · 2016-10-20T20:23:34Z

Discussed in compiler meeting. Conclusion: miri would be the proper fix, but maybe we can shrink MIR a bit in the short term, perhaps enough to push us over the edge. I personally probably don't have time for this just now (have some other regr to examine). Hence re-assigning to @pnkfelix -- @arielb1 maybe also interested in doing something?

pnkfelix · 2016-10-27T20:16:52Z

cc @nnethercote in case he might have input on ways to attack this

nnethercote · 2016-10-28T04:20:19Z

Massif is slow and the output isn't the easy to read, but it's the ideal tool for tackling this. The peak snapshot in @alexcrichton's data is number 48, and I've made a nicer, cut-down version of it here: https://gist.github.com/nnethercote/935db34ff2da854df8a69fa28c978497

You can see that ~30% of the memory usage comes from lower_expr. I did a DHAT run (more on that in a moment) and this is the main culprit:

    ExprKind::Tup(ref elts) => {
        hir::ExprTup(elts.iter().map(|x| self.lower_expr(x)).collect())
    }

push_scope/pop_scope account for another ~15%.

arielb1 · 2016-10-28T05:50:16Z

@nnethercote

The push_scope/pop_scope is MIR things. I think interning lvalues (they take 16 bytes, but occur multiple times per statement/terminator - they should take 4) and boxing rvalues (many statements are storage statements that don't have an rvalue) should claw back some performance.

Also, enum Operand is very common, in most contexts just an lvalue, and takes 72 bytes. Something should be done about this, even just boxing constants (which would reduce it to 24 bytes, or 16 bytes with interning lvalues, or 8 bytes with interning lvalues and bitpacking).

arielb1 · 2016-10-28T06:24:13Z

Having a 32-bit MIR "shared object representation" MirValue with 3/4 tag bits and the rest an index to an interner would basically apply all of these suggestions (where MirValue can represent all of Constant, Lvalue, Rvalue, Operand, and we keep the structs as strongly-typed newtypes).

We can then intern MirValues and/or occasionally GC-intern them, for further memory gain.

@eddyb

put empty generic lists behind a pointer This reduces the size of hir::Expr from 128 to 88 bytes (!) and shaves 200MB out of #36799. This is a performance-sensitive PR so please don't roll it up. r? @eddyb

michaelwoerister · 2017-09-25T08:17:38Z

@arielb1 Oh right, with one CGU memory usage should be the same unless there's some kind of bug.

@eddyb

encode region::Scope using fewer bytes Now that region::Scope is no longer interned, its size is more important. This PR encodes region::Scope in 8 bytes instead of 12, which should speed up region inference somewhat (perf testing needed) and should improve the margins on #36799 by 64MB (that's not a lot, I did this PR mostly to speed up region inference). This is a perf-sensitive PR. Please don't roll me up. r? @eddyb This is based on #44743 so I could get more accurate measurements on #36799.

arielb1 · 2017-09-27T16:44:45Z

@petrochenkov's span reform PR had lost us 400MB of space here, which means making this compile is that much harder. Maybe we should just use enough bits for the crate when encoding spans?

arielb1 · 2017-09-27T19:28:39Z

or use 40 bit spans, which should be enough to handle 64MB crates?

michaelwoerister · 2017-09-28T19:59:25Z

@arielb1 I'm not sure I understand: The span PR makes this test case use 400 MB more -- or less?

arielb1 · 2017-09-28T20:06:47Z

it makes the test case use 400MB more, because the span just overflows the 24-bit length we have.

arielb1 · 2017-09-28T20:21:29Z

The current peak is during LLVM translation where we are using memory from both MIR and LLVM. I think that after miri lands, we could easily store constants as byte arrays, which should bring this well into the green zone.

nikomatsakis · 2017-11-09T21:14:00Z

triage: P-medium

It seems like we are going to wait until we can fix this properly.

oli-obk · 2018-04-17T14:51:38Z

miri has landed. I'm not entirely sure how to reproduce this though.

urschrei · 2018-04-17T15:05:59Z

Is it on the latest Nightly? If so, I’ll kick off an Appveyor build in a couple of hours.

…

-- steph

On 17 Apr 2018, at 15:52, Oliver Schneider ***@***.***> wrote: miri has landed. I'm not entirely sure how to reproduce this though. — You are receiving this because you were mentioned. Reply to this email directly, view it on GitHub, or mute the thread.

oli-obk · 2018-04-17T15:14:07Z

jup. it's even in beta (although somewhat broken)

urschrei · 2018-04-17T19:44:23Z

Just ran into #49930, which causes…some failures. Will hold off and try again when the backport lands.

oli-obk · 2018-04-19T06:59:25Z

we have new nightlies!

urschrei · 2018-04-20T08:13:39Z

i686 is failing on ac3c228 (2018-04-18) with exit code: 3221225501: https://ci.appveyor.com/project/urschrei/lonlat-bng/build/job/qaipuqj88243xs4c#L115

(Which is a memory exhaustion error I think?)

oli-obk · 2018-04-20T09:22:59Z

looks like it. Because the x86_64 ~~target~~ host works. You can probably build for the i686 target on the x86_64 host.

urschrei · 2018-04-20T16:26:38Z

@oli-obk Oh, I had no idea – can you point me at some details?

oli-obk · 2018-04-21T08:29:45Z

You need to install the cross toolchain via rustup and invoke cargo with the target flag for that cross target

dkg · 2022-04-22T13:54:33Z

Not sure whether this is exactly the same issue, but compiling the test suite for v0.8.5 of the xxhash-rust crate also failed on 32-bit platforms due to rustc memory exhaustion. in that crate, test-vector.rs contained a series of ~5K assert_eq! expansions, and rustc was consuming > 6GiB of RAM during the build on 64-bit platforms (which obviously won't be possible for a 32-bit rustc process).

Future versions of that crate will work around the failure by changing the test.

I don't know how to run the more detailed RAM diagnostics @nikomatsakis and @alexcrichton report above, but i offer this as another example of excessive memory allocation that fails on 32-bit platforms. I can also report this as a separate issue, if that would be useful.

sfackler added regression-from-stable-to-beta Performance or correctness regression from stable to beta. T-compiler Relevant to the compiler team, which will review and decide on the PR/issue. labels Sep 28, 2016

urschrei changed the title ~~Compilation of a crate using a large static map fails on latest i686 Beta~~ Compilation of a crate using a large static map fails on latest i686-pc-windows-gnu Beta Sep 28, 2016

pnkfelix added regression-from-stable-to-stable Performance or correctness regression from one stable version to another. P-high High priority and removed regression-from-stable-to-beta Performance or correctness regression from stable to beta. labels Sep 29, 2016

brson assigned nikomatsakis Oct 6, 2016

brson mentioned this issue Oct 14, 2016

[stable] Backports for 1.12.1 #37173

Merged

nikomatsakis assigned pnkfelix and unassigned nikomatsakis Oct 20, 2016

nnethercote mentioned this issue Oct 28, 2016

Shrink Expr_::ExprInlineAsm. #37445

Merged

petrochenkov mentioned this issue Nov 3, 2017

Simplify Span by explicitly not supporting ctxt #45747

Closed

rust-highfive added P-medium Medium priority and removed P-high High priority labels Nov 9, 2017

petrochenkov mentioned this issue Jan 3, 2018

RFC: libsyntax2.0 rust-lang/rfcs#2256

Closed

urschrei mentioned this issue Apr 17, 2018

Constant evaluation regression in beta 1.26 #49930

Closed

2 tasks

petrochenkov mentioned this issue Apr 3, 2019

Tweak Span encoding. #58458

Merged

pnkfelix added the I-compilemem Issue: Problems and improvements with respect to memory usage during compilation. label Apr 12, 2019

workingjubilee added the O-x86 label Jul 22, 2023

Noratrieb added O-x86_32 Target: x86 processors, 32 bit (like i686-*) and removed O-x86-all labels Oct 25, 2023

Compilation of a crate using a large static map fails on latest i686-pc-windows-gnu Beta #36799

Compilation of a crate using a large static map fails on latest i686-pc-windows-gnu Beta #36799

Comments

urschrei commented Sep 28, 2016 • edited Loading

Details

sfackler commented Sep 28, 2016

nikomatsakis commented Oct 4, 2016

nikomatsakis commented Oct 6, 2016

nikomatsakis commented Oct 6, 2016

urschrei commented Oct 6, 2016

nikomatsakis commented Oct 6, 2016 • edited Loading

nikomatsakis commented Oct 6, 2016 • edited Loading

nikomatsakis commented Oct 6, 2016

nikomatsakis commented Oct 7, 2016

nikomatsakis commented Oct 12, 2016

nikomatsakis commented Oct 12, 2016

nikomatsakis commented Oct 12, 2016 • edited Loading

brson commented Oct 20, 2016

nikomatsakis commented Oct 20, 2016

nikomatsakis commented Oct 20, 2016

pnkfelix commented Oct 27, 2016

nnethercote commented Oct 28, 2016 • edited by eddyb Loading

arielb1 commented Oct 28, 2016 • edited Loading

arielb1 commented Oct 28, 2016

michaelwoerister commented Sep 25, 2017

arielb1 commented Sep 27, 2017

arielb1 commented Sep 27, 2017

michaelwoerister commented Sep 28, 2017

arielb1 commented Sep 28, 2017

arielb1 commented Sep 28, 2017

nikomatsakis commented Nov 9, 2017

oli-obk commented Apr 17, 2018

urschrei commented Apr 17, 2018 via email

oli-obk commented Apr 17, 2018

urschrei commented Apr 17, 2018

oli-obk commented Apr 19, 2018

urschrei commented Apr 20, 2018

oli-obk commented Apr 20, 2018 • edited Loading

urschrei commented Apr 20, 2018

oli-obk commented Apr 21, 2018 • edited Loading

dkg commented Apr 22, 2022

urschrei commented Sep 28, 2016 •

edited

Loading

nikomatsakis commented Oct 6, 2016 •

edited

Loading

nikomatsakis commented Oct 6, 2016 •

edited

Loading

nikomatsakis commented Oct 12, 2016 •

edited

Loading

nnethercote commented Oct 28, 2016 •

edited by eddyb

Loading

arielb1 commented Oct 28, 2016 •

edited

Loading

oli-obk commented Apr 20, 2018 •

edited

Loading

oli-obk commented Apr 21, 2018 •

edited

Loading