[ppc64] ICE: Region parameter out of range when substituting in region'b #42778

cuviper opened this issue Jun 20, 2017 · 27 comments · Fixed by #60588
Labels
C-bug Category: This is a bug. I-ICE Issue: The compiler panicked, giving an Internal Compilation Error (ICE) ❄️ O-PowerPC Target: PowerPC processors O-SystemZ Target: SystemZ processors (s390x) T-compiler Relevant to the compiler team, which will review and decide on the PR/issue.


@cuviper
Member

cuviper commented Jun 20, 2017

Trying to natively build powerpc64 as of 380100c, I get an ICE in stage 1:

Copying stage1 compiler (powerpc64-unknown-linux-gnu)
Building stage1 std artifacts (powerpc64-unknown-linux-gnu -> powerpc64-unknown-linux-gnu)
   Compiling core v0.0.0 (file:///home/jistone/rust/src/libcore)
error: internal compiler error: src/librustc/ty/subst.rs:411: Region parameter out of range when substituting in region'b (root type=Some((&&'b str,))) (index=1)

note: the compiler unexpectedly panicked. this is a bug.

note: we would appreciate a bug report: https://github.com/rust-lang/rust/blob/master/CONTRIBUTING.md#bug-reports

note: rustc 1.20.0-dev (380100c56 2017-06-20) running on powerpc64-unknown-linux-gnu

note: run with `RUST_BACKTRACE=1` for a backtrace

thread 'rustc' panicked at 'Box<Any>', src/librustc_errors/lib.rs:426
stack backtrace:
   0: rust_metadata_std_1ef34ef8b50021bf3e2b2bca3dfc146e
   1: rust_metadata_std_1ef34ef8b50021bf3e2b2bca3dfc146e
   2: rust_metadata_std_1ef34ef8b50021bf3e2b2bca3dfc146e
   3: rust_metadata_std_1ef34ef8b50021bf3e2b2bca3dfc146e
   4: rust_metadata_std_1ef34ef8b50021bf3e2b2bca3dfc146e
   5: rust_metadata_rustc_e1561b4cf34be67f9773866f88239afd
   6: rust_metadata_rustc_e1561b4cf34be67f9773866f88239afd
   7: rust_metadata_rustc_e1561b4cf34be67f9773866f88239afd
   8: rust_metadata_rustc_e1561b4cf34be67f9773866f88239afd
   9: rust_metadata_rustc_e1561b4cf34be67f9773866f88239afd
  10: rust_metadata_rustc_e1561b4cf34be67f9773866f88239afd
  11: rust_metadata_rustc_e1561b4cf34be67f9773866f88239afd
  12: rust_metadata_rustc_e1561b4cf34be67f9773866f88239afd
  13: rust_metadata_rustc_e1561b4cf34be67f9773866f88239afd
  14: rust_metadata_rustc_e1561b4cf34be67f9773866f88239afd
  15: rust_metadata_rustc_e1561b4cf34be67f9773866f88239afd
  16: rust_metadata_rustc_e1561b4cf34be67f9773866f88239afd
  17: rust_metadata_rustc_e1561b4cf34be67f9773866f88239afd
  18: rust_metadata_rustc_e1561b4cf34be67f9773866f88239afd
  19: rust_metadata_rustc_e1561b4cf34be67f9773866f88239afd
  20: rust_metadata_rustc_e1561b4cf34be67f9773866f88239afd
  21: <unknown>
  22: <unknown>
  23: rust_metadata_rustc_e1561b4cf34be67f9773866f88239afd
  24: rust_metadata_rustc_e1561b4cf34be67f9773866f88239afd
  25: rust_metadata_rustc_e1561b4cf34be67f9773866f88239afd
  26: rust_metadata_rustc_e1561b4cf34be67f9773866f88239afd
  27: rust_metadata_rustc_e1561b4cf34be67f9773866f88239afd
  28: rust_metadata_rustc_e1561b4cf34be67f9773866f88239afd
  29: rust_metadata_rustc_e1561b4cf34be67f9773866f88239afd
  30: rust_metadata_rustc_e1561b4cf34be67f9773866f88239afd
  31: rust_metadata_rustc_e1561b4cf34be67f9773866f88239afd
  32: rust_metadata_rustc_e1561b4cf34be67f9773866f88239afd
  33: rust_metadata_rustc_e1561b4cf34be67f9773866f88239afd
  34: rust_metadata_rustc_e1561b4cf34be67f9773866f88239afd
  35: rust_metadata_rustc_e1561b4cf34be67f9773866f88239afd
  36: rust_metadata_rustc_e1561b4cf34be67f9773866f88239afd
  37: <unknown>
  38: <unknown>
  39: <unknown>
  40: <unknown>
  41: <unknown>
  42: <unknown>
  43: rust_metadata_std_1ef34ef8b50021bf3e2b2bca3dfc146e
  44: <unknown>
  45: rust_metadata_std_1ef34ef8b50021bf3e2b2bca3dfc146e
  46: <unknown>
  47: <unknown>

error: Could not compile `core`.

sanxiyn added the O-PowerPC Target: PowerPC processors label Jun 20, 2017
@cuviper
Member Author

cuviper commented Jun 21, 2017

FWIW beta (4795a8f) builds fine. I'll see if I can bisect it.

@cuviper
Member Author

cuviper commented Jun 28, 2017

I seem to be hitting unrelated issues when bisecting, so it may be regressing in and out lately...

It still ICEs on powerpc64 as of 47faf1d. I found that powerpc64le builds fine though, so this could be a general big-endian issue. If I get time I'll try s390x.

Also, the reported region sometimes changes, though still the same location. Most often I get:

error: internal compiler error: src/librustc/ty/subst.rs:411: Region parameter out of range when
    substituting in region 'b (root type=Some((&&'b str,))) (index=1)

But a few times I've seen:

error: internal compiler error: src/librustc/ty/subst.rs:411: Region parameter out of range when
    substituting in region 'a (root type=Some((str::pattern::StrSearcher<'a, 'b>,))) (index=0)

And the backtraces are still useless...

@arielb1
Contributor

arielb1 commented Jun 29, 2017

@cuviper

Could you post the results of ../x.py build --verbose?

@cuviper
Member Author

cuviper commented Jun 29, 2017

rust42778-powerpc64-build-verbose.txt

That's doing a rebuild from the working dir of my prior attempts, so the only interesting piece is the stage1 libcore that ICEs. If you want the log of a complete build from scratch, I can get that too.

arielb1 self-assigned this Jun 29, 2017
@arielb1
Contributor

arielb1 commented Jun 29, 2017

I'm seeing a different error:

thread 'rustc' panicked at 'index out of bounds: the len is 17586 but the index is 270859', src/liballoc/vec.rs:1552

Could you compile with debuginfo to get a backtrace? That's the debuginfo setting in config.toml.

@cuviper
Member Author

cuviper commented Jun 29, 2017

Fresh build directory, debuginfo enabled: rust42778-powerpc64-build-verbose-2.txt

error: internal compiler error: src/librustc/ty/subst.rs:466: Type parameter `T/#0` (T/0) out of range when substituting (root type=Some((&mut [T], _))) substs=[]

Same dir just building again: rust42778-powerpc64-build-verbose-3.txt

error: internal compiler error: src/librustc/ty/subst.rs:411: Region parameter out of range when substituting in region 'a (root type=Some((str::pattern::StrSearcher<'a, 'b>,))) (index=0)

@arielb1
Contributor

arielb1 commented Jun 29, 2017

Very odd. rustc compiled on x86-64 (both a rustc I hand-compiled and the official nightlies) works without any problems, but rustc compiled on powerpc64 crashes with all sorts of random errors. I wonder whether there is some sort of LLVM bug - the next beta uses LLVM 4.0.1, so I should check whether that fixes things.

Also, there shouldn't be any LLVM IR differences between a rustc compiled on powerpc and a rustc compiled on x86, so maybe I could do a diff.

@cuviper
Member Author

cuviper commented Jun 29, 2017

We've seen weird differences between cross- and native-compiled rustc before for big-endian machines, which seems confirmed by powerpc64le working fine here. I'm still trying to get s390x access...

cc @nagisa who has found a few of these.

@arielb1
Contributor

arielb1 commented Jun 29, 2017

@cuviper

The usual problem is that cross-compiled and native-compiled metadata end up being incompatible, so an x86-built ppc64 libcore is incompatible with a ppc-built ppc64 libcore. Though maybe we've ended up in a case where ppc can't decode its own metadata.

@nagisa
Member

nagisa commented Jun 30, 2017

Yes, my first suspect would be endianness issue. It could be fairly plausible that the metadata decoder is correct, but the encoder is not, hence the bug showing up on ppc-built stuff only.
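
To illustrate the kind of encoder/decoder mismatch being suspected here, the following is a minimal sketch and not rustc's actual metadata code. The function names are hypothetical. It contrasts writing an integer in the host's native byte order with writing it in a fixed little-endian order; only the fixed order round-trips on every host.

```rust
/// Encode a u32 using the host's native byte order.
/// The resulting bytes differ between little-endian and big-endian hosts.
pub fn encode_native(value: u32) -> [u8; 4] {
    value.to_ne_bytes()
}

/// Encode a u32 in a fixed little-endian byte order, independent of the host.
pub fn encode_le(value: u32) -> [u8; 4] {
    value.to_le_bytes()
}

/// Decode a u32 that was written in little-endian byte order.
pub fn decode_le(bytes: [u8; 4]) -> u32 {
    u32::from_le_bytes(bytes)
}
```

With these helpers, `decode_le(encode_le(x))` returns `x` on any host, while `decode_le(encode_native(x))` only returns `x` on a little-endian host. A decoder that is "correct" in this sense would still read garbage from a native-order encoder on powerpc64 or s390x.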

@cuviper
Member Author

cuviper commented Jun 30, 2017

I got an s390x build, and it fared even worse. Tons of move errors on Copy types like this:

error[E0382]: use of moved value: `self`
    --> src/libcore/num/mod.rs:1412:28
     |
1412 |               (self << n) | (self >> (($BITS - n) % $BITS))
     |                ----          ^^^^ value used here after move
     |                |
     |                value moved here
...
2261 | /     uint_impl! { u16, u16, 16,
2262 | |         intrinsics::ctpop,
2263 | |         intrinsics::ctlz,
2264 | |         ctlz_nonzero,
...    |
2268 | |         intrinsics::sub_with_overflow,
2269 | |         intrinsics::mul_with_overflow }
     | |_______________________________________- in this macro invocation
     |
     = note: move occurs because `self` has type `u16`, which does not implement the `Copy` trait

before it finally ICEd:

thread 'rustc' panicked at 'assertion failed: self.qualif.intersects(Qualif::MUTABLE_INTERIOR)', src/librustc_mir/transform/qualify_consts.rs:732

I'm guessing that these particular errors and ICEs are not significant in themselves, only as signs of some more general corruption.

@arielb1
Contributor

arielb1 commented Jul 1, 2017

Oddly enough, if I use an x86 stage0 to compile a powerpc stage1, or a powerpc stage0 to compile an x86 stage1, everything works fine, but if I use a powerpc stage0 to compile a powerpc stage1, the powerpc stage1 created is broken.

@arielb1
Contributor

arielb1 commented Jul 2, 2017

I can't figure out a good way to debug this without a powerpc computer. So I'm going to leave this alone unless anyone can help me track it down (e.g., ssh access to a fast(!) powerpc computer).

@cuviper
Member Author

cuviper commented Jul 3, 2017

I don't know if I can provide any machine access -- I'll check -- but I will keep working on this myself. Any tips for how to approach this? I tried valgrind, but I got so many errors that I think either I'm using it wrong or that build is just hopelessly broken.

@cuviper
Member Author

cuviper commented Jul 4, 2017

Bisecting took me to #40454's merge commit 27650ee. The parent commit 07a2dd4 builds fine. I didn't drill down into the individual commits for that PR, as it just seems to be iterating different ideas.

If you look on master, #42819 also moved that code to ptr::swap_nonoverlapping to use in more places.

I don't know what that code would have to do with endianness or cross-compiling weirdness per se, but I do worry about the memory alignment of these:

    // The approach here is to utilize simd to swap x & y efficiently. Testing reveals
    // that swapping either 32 bytes or 64 bytes at a time is most efficient for intel
    // Haswell E processors. LLVM is more able to optimize if we give a struct a
    // #[repr(simd)], even if we don't actually use this struct directly.
    //
    // FIXME repr(simd) broken on emscripten and redox
    #[cfg_attr(not(any(target_os = "emscripten", target_os = "redox")), repr(simd))]
    struct Block(u64, u64, u64, u64);
    struct UnalignedBlock(u64, u64, u64, u64);

Later the temp block t gets cast to *mut u8, as do the input *mut T pointers in the first place, which would seem to imply that alignment won't matter. But if repr(simd) is still having an effect as stated, then I worry that perhaps the extra alignment constraints are just being assumed in the code generated to move between x, y, and t. The input T could have a smaller alignment than Block.
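
The alignment concern can be sketched as follows. This is only an illustration under stated assumptions: `repr(simd)` is unstable, so `repr(align(32))` is used here as a stand-in to get a type stricter than `u8`, and `swap_block_unaligned` and `block_alignment` are hypothetical helpers, not the libcore code. The point is that a `*mut u8` carries no alignment guarantee, so any access through a `Block`-typed pointer derived from it has to use the unaligned read/write functions.

```rust
use std::mem::{align_of, size_of};
use std::ptr;

// Stand-in for the `Block` struct quoted above. `repr(align(32))` is an
// assumption used only to give the type a stricter alignment than `u8`.
#[repr(align(32))]
pub struct Block(u64, u64, u64, u64);

/// Alignment the compiler may assume for any `&Block` or aligned `*mut Block` access.
pub fn block_alignment() -> usize {
    align_of::<Block>()
}

/// Swap one `Block`-sized chunk between two byte pointers without assuming
/// that either pointer satisfies `Block`'s alignment.
///
/// Safety: `x` and `y` must each be valid for reads and writes of
/// `size_of::<Block>()` bytes, and the two regions must not overlap.
pub unsafe fn swap_block_unaligned(x: *mut u8, y: *mut u8) {
    unsafe {
        // Read the chunk at `x` into a temporary with no alignment requirement.
        let t: Block = ptr::read_unaligned(x as *const Block);
        // Byte-wise copy from `y` to `x`; `u8` pointers need no alignment.
        ptr::copy_nonoverlapping(y, x, size_of::<Block>());
        // Write the saved chunk to `y`, again without assuming alignment.
        ptr::write_unaligned(y as *mut Block, t);
    }
}
```

If the generated code instead used aligned loads and stores of `Block` on pointers that only have the alignment of `T`, that would be undefined behavior, which is the scenario the comment above worries about.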

@cuviper
Member Author

cuviper commented Jul 4, 2017

I replaced swap_nonoverlapping_bytes with a naive byte-by-byte swap on master, and it works!

Now to figure out the right fix...
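
For reference, a naive byte-by-byte swap of the kind described above might look like the sketch below. The exact replacement used in that experiment is not shown in this thread, so the function name and signature are assumptions.

```rust
/// Swap `len` bytes between `x` and `y`, one byte at a time.
///
/// Safety: `x` and `y` must each be valid for reads and writes of `len`
/// bytes, and the two regions must not overlap.
pub unsafe fn swap_bytes_naive(x: *mut u8, y: *mut u8, len: usize) {
    for i in 0..len {
        unsafe {
            let a = x.add(i);
            let b = y.add(i);
            // Single-byte accesses have no alignment requirement.
            let tmp = a.read();
            a.write(b.read());
            b.write(tmp);
        }
    }
}
```

This gives up the wide SIMD moves entirely, so it is useful as a diagnostic rather than as a fix: it shows the surrounding logic is sound and narrows the problem to the block-wise path.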

@cuviper
Member Author

cuviper commented Jul 6, 2017

Just removing that repr(simd) gets a working build!

Regarding endianness and cross-compiling, could there be a bug in the way the simd type is translated to LLVM? Especially since it's being seen through a u8 pointer.

I'm going to do some comparisons of the builds with and without that repr(simd). @arielb1 if it would help, I could share those builds somewhere for you to grab.

@cuviper
Member Author

cuviper commented Jul 6, 2017

cc @djzin @scottmcm since I'm blaming your PRs... 😉

cuviper added a commit to cuviper/rust that referenced this issue Jul 11, 2017
This is a workaround for rust-lang#42778, which was git-bisected to rust-lang#40454's
optimizations to `mem::swap`, later moved to `ptr` in rust-lang#42819.  Natively
compiled rustc couldn't even compile stage1 libcore on powerpc64 and
s390x, but they work fine without this `repr(simd)`.  Since powerpc64le
works OK, it seems probably related to being big-endian.

The underlying problem is not yet known, but this at least makes those
architectures functional again in the meantime.

cc @arielb1
frewsxcv added a commit to frewsxcv/rust that referenced this issue Jul 15, 2017
Disable big-endian simd in swap_nonoverlapping_bytes

This is a workaround for rust-lang#42778, which was git-bisected to rust-lang#40454's
optimizations to `mem::swap`, later moved to `ptr` in rust-lang#42819.  Natively
compiled rustc couldn't even compile stage1 libcore on powerpc64 and
s390x, but they work fine without this `repr(simd)`.  Since powerpc64le
works OK, it seems probably related to being big-endian.

The underlying problem is not yet known, but this at least makes those
architectures functional again in the meantime.

cc @arielb1
Mark-Simulacrum added C-bug Category: This is a bug. I-ICE Issue: The compiler panicked, giving an Internal Compilation Error (ICE) ❄️ labels Jul 27, 2017
cuviper added the O-SystemZ Target: SystemZ processors (s390x) label Apr 10, 2018
@infinity0
Contributor

The Debian bug is little-endian not big-endian. This bug is the other way around, and is very old.

@glaubitz
Contributor

Right, the failure on big-endian was a different one for the previous 1.30 upload to experimental.

@infinity0
Contributor

Er for the record, no the previous beta-7 was also problematic for little-endian but OK for big-endian https://buildd.debian.org/status/logs.php?pkg=rustc - the big-endian one failed mostly due to debuginfo-gdb tests, otherwise it would be fine (20 test failures).

Anyway we have some clues on the other bug, more comments there.

@arielb1
Contributor

arielb1 commented Dec 15, 2018

Isn't this bug fixed by #43159? - i.e., what is the Rust situation on powerpc?

@sanxiyn
Member

sanxiyn commented Feb 28, 2019

This is likely fixed. Rust 1.32.0 succeeded on ppc64el, see https://buildd.debian.org/status/logs.php?pkg=rustc&arch=ppc64el.

@infinity0
Contributor

This bug is on big-endian, the link you pasted is little-endian, but if you remove the "el" at the end of the URL you get to the big-endian results which are also OK.

We do things slightly differently from Fedora though.

jonas-schievink added the T-compiler Relevant to the compiler team, which will review and decide on the PR/issue. label May 2, 2019
@jonas-schievink
Contributor

@cuviper do you know what the status on Fedora is?

@cuviper
Member Author

cuviper commented May 2, 2019

It was "fixed" by #43159 by disabling SIMD in that function -- not really a satisfactory solution IMO. So in that sense, we don't have any issue on Fedora any more. But perhaps we should see what happens to codegen these days if we revert that and try allowing SIMD there again.

edit: to be clear -- I will try changing that back myself to see what happens.
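
As a rough sketch of what gating a layout attribute on endianness looks like, assuming `repr(align(32))` as a stable stand-in for the unstable `repr(simd)` (the exact attribute from #43159 is not quoted in this thread, and the helper names are hypothetical):

```rust
use std::mem::align_of;

// The attribute is applied on little-endian targets only; big-endian targets
// such as powerpc64 and s390x get the plain struct layout.
#[cfg_attr(not(target_endian = "big"), repr(align(32)))]
pub struct Block(u64, u64, u64, u64);

/// Alignment of `Block` on the current target.
pub fn block_alignment() -> usize {
    align_of::<Block>()
}

/// Whether the current compilation target is big-endian.
pub fn is_big_endian() -> bool {
    cfg!(target_endian = "big")
}
```

Reverting the workaround amounts to dropping the `target_endian` condition again, so that big-endian and little-endian targets take the same code path.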

cuviper added a commit to cuviper/rust that referenced this issue May 6, 2019
This reverts commit 77bd4dc.

Issue rust-lang#42778 was formerly easy to reproduce on two big-endian targets,
`powerpc64` and `s390x`, so we disabled SIMD on this function for all
big-endian targets as a workaround.

I have re-tested this code on `powerpc64` and `s390x`, each with the
bundled LLVM 8 and with external LLVM 7 and LLVM 6, and the problems no
longer appear. So it seems safe to remove this workaround, although I'm
still a little uncomfortable that we never found a root-cause...
bors added a commit that referenced this issue May 10, 2019
Revert "Disable big-endian simd in swap_nonoverlapping_bytes"

This reverts commit 77bd4dc (#43159).

Issue #42778 was formerly easy to reproduce on two big-endian targets,
`powerpc64` and `s390x`, so we disabled SIMD on this function for all
big-endian targets as a workaround.

I have re-tested this code on `powerpc64` and `s390x`, each with the
bundled LLVM 8 and with external LLVM 7 and LLVM 6, and the problems no
longer appear. So it seems safe to remove this workaround, although I'm
still a little uncomfortable that we never found a root-cause...

Closes #42778.
r? @arielb1