[ppc64] ICE: Region parameter out of range when substituting in region'b #42778
Comments
FWIW beta (4795a8f) builds fine. I'll see if I can bisect it. |
I seem to be hitting unrelated issues when bisecting, so it may be regressing in and out lately... It still ICEs on powerpc64 as of 47faf1d. I found that powerpc64le builds fine though, so this could be a general big-endian issue. If I get time I'll try s390x. Also, the reported region sometimes changes, though still the same location. Most often I get:
But a few times I've seen:
And the backtraces are still useless... |
Could you post the results of |
rust42778-powerpc64-build-verbose.txt That's doing a rebuild from the working dir of my prior attempts, so the only interesting piece is the stage1 libcore that ICEs. If you want the log of a complete build from scratch, I can get that too. |
I'm seeing a different error:
Could you compile with debuginfo to get a backtrace? That's the |
Fresh build directory, debuginfo enabled: rust42778-powerpc64-build-verbose-2.txt
Same dir just building again: rust42778-powerpc64-build-verbose-3.txt
|
Very odd. rustc compiled on x86-64 (both a rustc I hand-compiled and the official nightlies) works without any problems, but rustc compiled on powerpc64 crashes with all sorts of random errors. I wonder whether there is some sort of LLVM bug - the next beta uses LLVM 4.0.1, so I should check whether that fixes things. Also, there shouldn't be any LLVM IR differences between a rustc compiled on powerpc and a rustc compiled on x86, so maybe I could do a diff |
We've seen weird differences between cross- and native-compiled rustc before for big-endian machines, which seems confirmed by powerpc64le working fine here. I'm still trying to get s390x access... cc @nagisa who has found a few of these. |
The usual problem is that cross-compiled and native-compiled metadata end up being incompatible, so an x86-built ppc64 libcore is incompatible with a ppc-built ppc64 libcore. Though maybe we've ended up in a case where ppc can't decode its own metadata. |
Yes, my first suspect would be an endianness issue. It is fairly plausible that the metadata decoder is correct but the encoder is not, hence the bug showing up on ppc-built stuff only. |
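To make the encoder/decoder hypothesis concrete, here is a minimal sketch of how a byte-order mismatch corrupts a serialized integer. This is not rustc's metadata code; the function names (`encode_fixed`, `decode_fixed`, `encode_native`, `decode_mismatched`) are hypothetical and only illustrate the general failure mode being suspected.

```rust
/// Encode in a fixed little-endian byte order, independent of the host.
/// (Hypothetical helper, not taken from rustc.)
pub fn encode_fixed(value: u32) -> [u8; 4] {
    value.to_le_bytes()
}

/// Decode the fixed little-endian encoding produced by `encode_fixed`.
pub fn decode_fixed(bytes: [u8; 4]) -> u32 {
    u32::from_le_bytes(bytes)
}

/// Encode in the host's native byte order. The output differs between
/// little-endian and big-endian hosts, which is how an encoder could
/// accidentally produce data that only some builds can read back.
pub fn encode_native(value: u32) -> [u8; 4] {
    value.to_ne_bytes()
}

/// A decoder that assumes big-endian input. Feeding it little-endian
/// bytes yields the byte-swapped value rather than an error.
pub fn decode_mismatched(bytes: [u8; 4]) -> u32 {
    u32::from_be_bytes(bytes)
}
```

A matched pair round-trips on any host, while the mismatched pair silently returns a byte-swapped value. That kind of quiet corruption would be consistent with seemingly random downstream errors, though nothing here shows it is what actually happens in this issue.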
I got an s390x build, and it fared even worse. Tons of move errors on
before it finally ICEd:
I'm guessing that these particular errors and ICEs are not significant in themselves, only as signs of some more general corruption. |
Oddly enough, if I use an x86 stage0 to compile a powerpc stage1, or a powerpc stage0 to compile an x86 stage1, everything works fine, but if I use a powerpc stage0 to compile a powerpc stage1, the powerpc stage1 created is broken. |
I can't figure out a good way to debug this without a powerpc computer. So I'm going to leave this alone unless someone can help me track it down (e.g., ssh access to a fast(!) powerpc computer). |
I don't know if I can provide any machine access -- I'll check -- but I will keep working on this myself. Any tips for how to approach this? I tried valgrind, but I got so many errors that I think either I'm using it wrong or that build is just hopelessly broken. |
Bisecting took me to #40454's merge commit 27650ee. The parent commit 07a2dd4 builds fine. I didn't drill down into the individual commits for that PR, as it just seems to be iterating different ideas. If you look on master, #42819 also moved that code to
I don't know what that code would have to do with endianness or cross-compiling weirdness per se, but I do worry about the memory alignment of these:
// The approach here is to utilize simd to swap x & y efficiently. Testing reveals
// that swapping either 32 bytes or 64 bytes at a time is most efficient for intel
// Haswell E processors. LLVM is more able to optimize if we give a struct a
// #[repr(simd)], even if we don't actually use this struct directly.
//
// FIXME repr(simd) broken on emscripten and redox
#[cfg_attr(not(any(target_os = "emscripten", target_os = "redox")), repr(simd))]
struct Block(u64, u64, u64, u64);
struct UnalignedBlock(u64, u64, u64, u64);
Later the temp block |
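For readers without the libcore source at hand, the following is a rough sketch of the block-wise swap technique the quoted comment describes, written for stable Rust. It is not the libcore implementation: `swap_slices_blockwise` is a hypothetical name, the block type deliberately has no `repr(simd)` (that attribute is unstable), and all copies go through byte pointers so no alignment beyond `u8` is assumed for the data being swapped.

```rust
use std::mem::{self, MaybeUninit};
use std::ptr;

/// Plain 32-byte stand-in for the block type; intentionally no `repr(simd)`.
#[allow(dead_code)]
struct UnalignedBlock(u64, u64, u64, u64);

/// Swap the contents of two equal-length byte slices, 32 bytes at a time.
/// (Hypothetical helper for illustration only.)
pub fn swap_slices_blockwise(a: &mut [u8], b: &mut [u8]) {
    assert_eq!(a.len(), b.len(), "slices must have equal length");
    let len = a.len();
    let block_size = mem::size_of::<UnalignedBlock>();
    let mut i = 0;

    // Swap whole blocks through an uninitialized temporary buffer.
    while i + block_size <= len {
        let mut tmp = MaybeUninit::<UnalignedBlock>::uninit();
        // SAFETY: `a` and `b` are distinct `&mut` slices, so they cannot
        // overlap, and `i + block_size <= len` keeps every access in bounds.
        // The copies use `*mut u8`, so only byte alignment is required.
        unsafe {
            let t = tmp.as_mut_ptr() as *mut u8;
            let pa = a.as_mut_ptr().add(i);
            let pb = b.as_mut_ptr().add(i);
            ptr::copy_nonoverlapping(pa, t, block_size);
            ptr::copy_nonoverlapping(pb, pa, block_size);
            ptr::copy_nonoverlapping(t, pb, block_size);
        }
        i += block_size;
    }

    // Swap any remaining tail bytes one at a time.
    while i < len {
        mem::swap(&mut a[i], &mut b[i]);
        i += 1;
    }
}
```

Because the temporary is only ever touched through byte copies, this sketch sidesteps the alignment question raised above. Whether the original code's use of a `repr(simd)` temporary is actually the source of the breakage is not established here.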
I replaced Now to figure out the right fix... |
Just removing that Regarding endianness and cross-compiling, could there be a bug in the way the simd type is translated to LLVM? Especially since it's being seen through a I'm going to do some comparisons of the builds with and without that |
Disable big-endian simd in swap_nonoverlapping_bytes

This is a workaround for rust-lang#42778, which was git-bisected to rust-lang#40454's optimizations to `mem::swap`, later moved to `ptr` in rust-lang#42819. Natively compiled rustc couldn't even compile stage1 libcore on powerpc64 and s390x, but they work fine without this `repr(simd)`. Since powerpc64le works OK, it seems probably related to being big-endian. The underlying problem is not yet known, but this at least makes those architectures functional again in the meantime.

cc @arielb1
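As an illustration of the kind of conditional attribute such a workaround relies on, here is a small sketch using `cfg_attr` with `target_endian`. The exact attribute used in the PR is not quoted in this thread, so this is an assumption about its general shape. Since `repr(simd)` is unstable, the sketch substitutes the stable `repr(align(32))` purely as a stand-in so the effect can be observed; `block_alignment` is a hypothetical helper.

```rust
use std::mem;

// Stand-in: `repr(align(32))` replaces the unstable `repr(simd)` so that this
// compiles on stable Rust. The attribute is applied only when the target is
// NOT big-endian, mirroring the idea of disabling the special representation
// on big-endian targets.
#[cfg_attr(not(target_endian = "big"), repr(align(32)))]
#[allow(dead_code)]
pub struct Block(u64, u64, u64, u64);

/// Report the alignment the compiler actually chose for `Block`.
pub fn block_alignment() -> usize {
    mem::align_of::<Block>()
}
```

On a little-endian target `block_alignment()` reflects the conditional attribute, while on a big-endian target `Block` falls back to the natural alignment of its `u64` fields.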
Affects both powerpc64le and powerpc64 in Debian now:
Correction, the latest upload for ppc64 builds fine: |
The Debian bug is little-endian not big-endian. This bug is the other way around, and is very old. |
Right, the failure on big-endian was a different one for the previous 1.30 upload to experimental. |
Er for the record, no the previous beta-7 was also problematic for little-endian but OK for big-endian https://buildd.debian.org/status/logs.php?pkg=rustc - the big-endian one failed mostly due to debuginfo-gdb tests, otherwise it would be fine (20 test failures). Anyway we have some clues on the other bug, more comments there. |
Isn't this bug fixed by #43159? - i.e., what is the Rust situation on powerpc? |
This is likely fixed. Rust 1.32.0 succeeded on ppc64el, see https://buildd.debian.org/status/logs.php?pkg=rustc&arch=ppc64el. |
This bug is on big-endian, the link you pasted is little-endian, but if you remove the "el" at the end of the URL you get to the big-endian results which are also OK. We do things slightly differently from Fedora though. |
@cuviper do you know what the status on Fedora is? |
It was "fixed" by #43159 by disabling SIMD in that function -- not really a satisfactory solution IMO. So in that sense, we don't have any issue on Fedora any more. But perhaps we should see what happens to codegen these days if we revert that and try allowing SIMD there again. edit: to be clear -- I will try changing that back myself to see what happens. |
Revert "Disable big-endian simd in swap_nonoverlapping_bytes"

This reverts commit 77bd4dc (#43159).

Issue #42778 was formerly easy to reproduce on two big-endian targets, `powerpc64` and `s390x`, so we disabled SIMD on this function for all big-endian targets as a workaround. I have re-tested this code on `powerpc64` and `s390x`, each with the bundled LLVM 8 and with external LLVM 7 and LLVM 6, and the problems no longer appear. So it seems safe to remove this workaround, although I'm still a little uncomfortable that we never found a root-cause...

Closes #42778.

r? @arielb1
Trying to natively build powerpc64 as of 380100c, I get an ICE in stage 1: