Preliminary support for MIR in trans #29217

nikomatsakis · 2015-10-21T22:35:19Z

This branch implements a variant of trans that is based on MIR. It is very incomplete (intentionally), and had only the goal of laying out enough work to enable more incremental follow-on patches. Currently, only fns tagged with #[rustc_mir] use the new trans code. I plan to build up a meta-issue as well that tracks the various "not-yet-implemented" points. The only fn that has been tested so far is this amazingly complex "spike" fn:

#[rustc_mir]
fn sum(x: i32, y: i32) -> i32 {
    x + y
}

In general, the most interesting commit is the last one. There are some points on which I would like feedback from @rust-lang/compiler:

I did not use Datum. Originally, I thought that maybe just a ValueRef would be enough but I wound up with two very simple structures, LvalueRef and OperandRef, that just package up a ValueRef and a type. Because of MIR's structure, you don't wind up mixing by-ref and by-value so much, and I tend to think that a thinner abstraction layer is better here, but I'm not sure.
Related to the above, I expect that sooner or later we will analyze temps (and maybe variables too) to find those whose address is never taken and which are word-sized and which perhaps meet a few other criteria. For those, we'll probably want to avoid the alloca, just because it means prettier code.
I generally tried to re-use data structures from elsewhere in trans, though I'm sure we can trim these down.
I didn't do any debuginfo primarily because it seems to want node-ids and we have only spans. I haven't really read into that code so I don't know what's going on there.

r? @nrc

alexcrichton · 2015-10-21T23:09:38Z

⛵

nikomatsakis · 2015-10-22T22:46:03Z

@nrc says he won't have time to review this just now, so I'm going to pick on @dotdash. Let me know if you don't have time. :)

r? @dotdash

nikomatsakis · 2015-10-22T22:48:33Z

Note though that the code is intentionally a WIP :)

nagisa · 2015-10-23T13:05:47Z

src/librustc_typeck/check/regionck.rs

            }
        };

        let old_region_bounds_pairs_len = self.region_bound_pairs.len();

+        // Collect the types from which we create inferred bounds.
+        // For the return type, if diverging, substitute `bool` just
+        // because it will have no effect.


Sounds something like what would get fixed by making diverging a first class type, no?

bors · 2015-10-25T18:36:00Z

☔ The latest upstream changes (presumably #29254) made this pull request unmergeable. Please resolve the merge conflicts.

dotdash · 2015-10-26T19:18:13Z

src/librustc_trans/trans/mir/lvalue.rs

+                                                         from_end: false,
+                                                         min_length: _ } => {
+                        let base_ty = tr_base.ty.to_ty(tcx);
+                        let lloffset = common::C_u32(bcx.ccx(), offset);


Should this be using C_uint and ConstantIndex be adjusted to use a uint as well?

Hmm, thinking about it, that would make the available values differ when cross-compiling. 32bit at least gives consistent results for now.

dotdash · 2015-10-26T19:43:30Z

A few things ended up a commit early or late, LGTM otherwise and feels a lot cleaner than our current trans, though that's probably to be expected :-)

I like the more thinner abstraction compared to Datum, but that's just a gut feeling. Besides providing neater IR. analyzing temps and avoiding allocas will probably be a requirement for proper usage of the expect intrinsic IIRC. Maybe @Aatch can comment on that.

I can't really comment on the debuginfo story either. While I know a bit about the DI types, I have no idea how the mapping to source locations work.

r=me with the tidy error fixed

Aatch · 2015-10-27T03:20:31Z

I'd like to see rvalues translated in a way that doesn't always require a reified destination. We have a distinction in the MIR between real locations and temporaries, we should make use of it. Trying to deal with the issue of (almost) always needing a real destination causes enough problems in the current trans code that I'd hate to see the same mistake repeated here.

Looking at the code, extending LValueRef to indicate by-value or by-ref, then passing that directly into trans_rvalue is probably enough. With the ValueRef being write-once in the by-value case. In the creation of the MirContext, the LValueRefs can then be created with the appropriate setting. Even if right now we just keep it as "always by-ref", it provides a good base for handling by-value-only lvalues in the future.

Right now the code is set up to assume that LValueRefs are always allocas, which seems inconsistent with the idea that might want to avoid unnecessary allocas in the future. I'd like to see at least a basic handling of that case (i.e. by-value LValueRef) before this lands, even if it's peppered with unimplemented!() or similar.

nikomatsakis · 2015-10-28T14:01:37Z

On Mon, Oct 26, 2015 at 08:21:02PM -0700, James Miller wrote:

Right now the code is set up to assume that LValueRefs are always allocas, which seems inconsistent with the idea that might want to avoid unnecessary allocas in the future. I'd like to see at least a basic handling of that case (i.e. by-value LValueRef) before this lands, even if it's peppered with unimplemented!() or similar.

I was debating about the best way to handle this. I am inclined to
think an LvalueRef should always be an alloca -- that is, an Lvalue
is a memory location, full stop.

My plan was to identify write-once temporaries whose address is never
used and which are of immediate type and then throw those into a set
of "immediate temporaries". Then when we saw an assignment to an
immediate temporary, we would store an OperandRef into a different
table (i.e., there would never be an LvalueRef for that temporary,
because there is no lvalue).

You're right that this will require isolating a path for processing
rvalues that does not take a destination. I figured we would use a
similar delegating scheme, where for many rvalue types (e.g., a+b)
this is the primary version, and the "into" version just does a store.

Anyway, I can try to mock this up to show you what I mean. I'm not
sure whether it makes sense to do it in this PR or not, though.

Aatch · 2015-10-29T00:37:33Z

@nikomatsakis well either way I think basic "support" for it should be in the initial PR. Just having the delegation scheme in place should be enough to start. I just really don't want to end up with a massive amount of refactoring work because it was delayed until later and further changes didn't consider it. It'll be way easier to say "use this API instead" in a code review if said API actually exists.

eddyb · 2015-10-29T07:16:59Z

I have to agree with @Aatch, refactoring the existing trans code to support that optimization ended up reaching effectively all of trans and I have no idea how that could have been reviewed or merged.

nikomatsakis · 2015-10-29T09:11:40Z

OK, seems reasonable, I'll take a stab at implementing what I had in mind.

On Thu, Oct 29, 2015 at 3:17 AM, Eduard-Mihai Burtescu <
[email protected]> wrote:

I have to agree with @Aatch https://github.com/Aatch, refactoring the
existing trans code to support that optimization ended up reaching
effectively all of trans and I have no idea how that could have been
reviewed or merged.

—
Reply to this email directly or view it on GitHub
#29217 (comment).

pcwalton · 2015-10-31T17:16:25Z

src/librustc_trans/trans/mir/rvalue.rs

+                        //
+                        // Note that this is currently duplicated with src/libcore/ops.rs
+                        // which does the same thing, and it would be nice to perhaps unify
+                        // these two implementations on day! Also note that we call `fmod`


typo: "on day" -> "one day"

bors · 2015-11-01T16:19:32Z

☔ The latest upstream changes (presumably #29486) made this pull request unmergeable. Please resolve the merge conflicts.

michaelwoerister · 2015-11-02T08:14:05Z

I just now saw the debuginfo-related things here. Debuginfo source locations in LLVM not only need the span but also the lexical scope of each source location. This information is later on used in the debugger to determine which variables are currently in scope. For this purpose we currently build a "scope map" for each function, which maps NodeIDs to the appropriate DI/DWARF scope descriptor.

nikomatsakis · 2015-11-02T11:43:32Z

@michaelwoerister ok -- we do preserve scope information in MIR at the moment -- I envisioned this as temporary, but it seems like there are more and more reasons for it to stay around -- we should chat briefly about how best to connect those two pieces.

nikomatsakis · 2015-11-02T14:46:02Z

@dotdash @eddyb @Aatch the last commit now illustrates the process I had in mind. As you can see, this path is never (currently) used.

luqmana · 2015-11-02T19:04:27Z

src/librustc_trans/trans/mir/mod.rs

+               llarg
+           } else if common::type_is_fat_ptr(tcx, arg_ty) {
+               // we pass fat pointers as two words, but we want to
+               // represent them internally as a pointer two two words,


Typo: 'to two words'

Aatch · 2015-11-03T00:19:10Z

Looks good. Analysing the MIR can be done at a later date, but now the code has to be able to handle non-alloca LValues, even if they don't come up right now.

nikomatsakis · 2015-11-03T00:58:35Z

@Aatch hmm, I had a commit to that effect, I must have accidentally dropped the hunks from the patch. I'll maybe just add one that builds up the hashmap, it's not a lot more code, it'd be nice to see the system working.

nikomatsakis · 2015-11-03T11:40:13Z

@Aatch ok, see the latest pushes (I rebased, but I kept the commits separate so you could see what has changed). In particular d328072 adds the code you were referring to (handle references to temps that have no alloca), and 2b9233a actually skips allocas when possible. Under these changes, my sample test case produces the following IR:

"BB(0)":                                          ; preds = %entry-block
  %2 = load i32, i32* %arg0, align 4
  store i32 %2, i32* %x, align 4
  %3 = load i32, i32* %arg1, align 4
  store i32 %3, i32* %y, align 4
  %4 = load i32, i32* %x
  %5 = load i32, i32* %y
  %6 = add i32 %4, %5
  br label %"BB(1)"

There is still a bit of indirection due to the named variables x and y, but you can see that the temporaries don't have allocas anymore. Instead, they are represented directly by %4 and %5.

nikomatsakis · 2015-11-03T14:05:25Z

Ok, so I added enough new code that somebody ought to review it, I think. @dotdash are you up to reviewing those last four or five commits?

dotdash · 2015-11-03T14:07:54Z

I'll take a look this evening

2015-11-03 15:05 GMT+01:00 Niko Matsakis [email protected]:

Ok, so I added enough new code that somebody ought to review it, I
think. @dotdash https://github.com/dotdash are you up to reviewing
those last four or five commits?

—
Reply to this email directly or view it on GitHub
#29217 (comment).

dotdash · 2015-11-03T21:23:59Z

The alloca avoidance looks like I had expected, so r=me with nits and that fat-pointer thing fixed.

If you feel like it, it might be nice to check which codegen tests already survive being built through MIR and having them duplicated to cover both current trans and MIR trans, but I don't feel that's something that has to be done right now.

nikomatsakis · 2015-11-03T21:42:58Z

@dotdash

If you feel like it, it might be nice to check which codegen tests already survive being built through MIR and having them duplicated to cover both current trans and MIR trans, but I don't feel that's something that has to be done right now.

@brson and I were discussing an alternative of possibly setting up some kind of "arewemiryet.com" infrastructure that would regularly test with MIR enabled and display which tests are failing with MIR. And once all tests work, it would test which crates are failing in the ecosystem.

In any case, I think the support is as yet still a bit incomplete.

nikomatsakis · 2015-11-03T23:03:06Z

@bors r=dotdash

bors · 2015-11-03T23:03:06Z

📌 Commit b46c0fc has been approved by dotdash

bors · 2015-11-04T00:03:44Z

⌛ Testing commit b46c0fc with merge 22d8bc5...

bors · 2015-11-04T00:30:04Z

💔 Test failed - auto-linux-64-x-android-t

nikomatsakis · 2015-11-04T01:40:49Z

@bors r=dotdash

bors · 2015-11-04T01:40:50Z

📌 Commit e787863 has been approved by dotdash

@nrc

This branch implements a variant of trans that is based on MIR. It is very incomplete (intentionally), and had only the goal of laying out enough work to enable more incremental follow-on patches. Currently, only fns tagged with `#[rustc_mir]` use the new trans code. I plan to build up a meta-issue as well that tracks the various "not-yet-implemented" points. The only fn that has been tested so far is this amazingly complex "spike" fn: ```rust #[rustc_mir] fn sum(x: i32, y: i32) -> i32 { x + y } ``` In general, the most interesting commit is the last one. There are some points on which I would like feedback from @rust-lang/compiler: - I did not use `Datum`. Originally, I thought that maybe just a `ValueRef` would be enough but I wound up with two very simple structures, `LvalueRef` and `OperandRef`, that just package up a `ValueRef` and a type. Because of MIR's structure, you don't wind up mixing by-ref and by-value so much, and I tend to think that a thinner abstraction layer is better here, but I'm not sure. - Related to the above, I expect that sooner or later we will analyze temps (and maybe variables too) to find those whose address is never taken and which are word-sized and which perhaps meet a few other criteria. For those, we'll probably want to avoid the alloca, just because it means prettier code. - I generally tried to re-use data structures from elsewhere in trans, though I'm sure we can trim these down. - I didn't do any debuginfo primarily because it seems to want node-ids and we have only spans. I haven't really read into that code so I don't know what's going on there. r? @nrc

bors · 2015-11-04T02:13:07Z

⌛ Testing commit e787863 with merge a216e84...

bors · 2015-11-04T06:07:23Z

☀️ Test successful - auto-linux-32-nopt-t, auto-linux-32-opt, auto-linux-64-debug-opt, auto-linux-64-nopt-t, auto-linux-64-opt, auto-linux-64-x-android-t, auto-linux-cross-opt, auto-linux-musl-64-opt, auto-mac-32-opt, auto-mac-64-nopt-t, auto-mac-64-opt, auto-win-gnu-32-nopt-t, auto-win-gnu-32-opt, auto-win-gnu-64-nopt-t, auto-win-gnu-64-opt, auto-win-msvc-32-opt, auto-win-msvc-64-opt

alexcrichton · 2015-11-04T06:39:25Z

❇️

rust-highfive assigned nrc Oct 21, 2015

rust-highfive assigned dotdash and unassigned nrc Oct 22, 2015

nagisa reviewed Oct 23, 2015
View reviewed changes

dotdash reviewed Oct 26, 2015
View reviewed changes

nikomatsakis force-pushed the mir-trans branch from 64d4f3c to c4ed88a Compare October 29, 2015 09:50

pcwalton reviewed Oct 31, 2015
View reviewed changes

nikomatsakis force-pushed the mir-trans branch 2 times, most recently from 4ad49d7 to 98bd698 Compare November 2, 2015 14:45

nikomatsakis force-pushed the mir-trans branch from 98bd698 to 1efa4a8 Compare November 2, 2015 14:49

luqmana reviewed Nov 2, 2015
View reviewed changes

nikomatsakis added 2 commits November 3, 2015 04:35

New trans codepath that builds fn body from MIR instead.

02017b3

Plumbing to omit allocas for temps when possible (currently unused)

e84829d

nikomatsakis force-pushed the mir-trans branch from 1efa4a8 to 087c72d Compare November 3, 2015 09:37

Add a MIR visitor

544b06d

nikomatsakis force-pushed the mir-trans branch from 087c72d to d328072 Compare November 3, 2015 11:38

nikomatsakis force-pushed the mir-trans branch from 7e6a79f to 441273c Compare November 3, 2015 11:47

nikomatsakis added 3 commits November 3, 2015 18:00

Add (and use) an analysis to determine which temps can forgo an alloca.

6a5b263

correct typos

9c9f4be

address nits from dotdash

b46c0fc

nikomatsakis force-pushed the mir-trans branch from 2c8340c to b46c0fc Compare November 3, 2015 23:02

remove unused import

e787863

bors merged commit e787863 into rust-lang:master Nov 4, 2015

bors mentioned this pull request Nov 4, 2015

Fix various bugs around empty structs and patterns #29383

Merged

nikomatsakis deleted the mir-trans branch March 30, 2016 16:16

Preliminary support for MIR in trans #29217

Preliminary support for MIR in trans #29217

Conversation

nikomatsakis commented Oct 21, 2015

alexcrichton commented Oct 21, 2015

nikomatsakis commented Oct 22, 2015

nikomatsakis commented Oct 22, 2015

nagisa Oct 23, 2015

Choose a reason for hiding this comment

bors commented Oct 25, 2015

dotdash Oct 26, 2015

Choose a reason for hiding this comment

dotdash Oct 26, 2015

Choose a reason for hiding this comment

dotdash commented Oct 26, 2015

Aatch commented Oct 27, 2015

nikomatsakis commented Oct 28, 2015

Aatch commented Oct 29, 2015

eddyb commented Oct 29, 2015

nikomatsakis commented Oct 29, 2015

pcwalton Oct 31, 2015

Choose a reason for hiding this comment

bors commented Nov 1, 2015

michaelwoerister commented Nov 2, 2015

nikomatsakis commented Nov 2, 2015

nikomatsakis commented Nov 2, 2015

luqmana Nov 2, 2015

Choose a reason for hiding this comment

Aatch commented Nov 3, 2015

nikomatsakis commented Nov 3, 2015

nikomatsakis commented Nov 3, 2015

nikomatsakis commented Nov 3, 2015

dotdash commented Nov 3, 2015

dotdash commented Nov 3, 2015

nikomatsakis commented Nov 3, 2015

nikomatsakis commented Nov 3, 2015

bors commented Nov 3, 2015

bors commented Nov 4, 2015

bors commented Nov 4, 2015

nikomatsakis commented Nov 4, 2015

bors commented Nov 4, 2015

bors commented Nov 4, 2015

bors commented Nov 4, 2015

alexcrichton commented Nov 4, 2015