Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[beta] librand unreachable pattern on armv7-unknown-linux-gnueabihf #37630

Closed
cuviper opened this issue Nov 7, 2016 · 21 comments · Fixed by #37691
Closed

[beta] librand unreachable pattern on armv7-unknown-linux-gnueabihf #37630

cuviper opened this issue Nov 7, 2016 · 21 comments · Fixed by #37691
Labels
O-Arm Target: 32-bit Arm processors (armv6, armv7, thumb...), including 64-bit Arm in AArch32 state regression-from-stable-to-stable Performance or correctness regression from one stable version to another.

Comments

@cuviper
Copy link
Member

cuviper commented Nov 7, 2016

I just tried builds of 1.13.0-beta.3 on Fedora 24 and Rawhide (with LLVM 3.8 and 3.9 respectively), and both builds failed only on armv7hl, in the exact same place:

CFG_LLVM_LINKAGE_FILE=/builddir/build/BUILD/rustc-beta/armv7-unknown-linux-gnueabihf/rt/llvmdeps.rs LD_LIBRARY_PATH=/builddir/build/BUILD/rustc-beta/armv7-unknown-linux-gnueabihf/stage2/lib:/usr/lib:$LD_LIBRARY_PATH   armv7-unknown-linux-gnueabihf/stage2/bin/rustc --cfg stage2 -Clink-args=-Wl,-z,relro,-z,now -O --cfg rtopt -g  --target=armv7-unknown-linux-gnueabihf  -C prefer-dynamic -L "armv7-unknown-linux-gnueabihf/rt" -L native="/usr/lib"     --out-dir armv7-unknown-linux-gnueabihf/stage2/lib/rustlib/armv7-unknown-linux-gnueabihf/lib -C extra-filename=-f02899e6 -C metadata=f02899e6 src/librand/lib.rs
error[E0001]: unreachable pattern
  --> src/librand/distributions/gamma.rs:93:13
   |
93 |             0.0...1.0 => Small(GammaSmallShape::new_raw(shape, scale)),
   |             ^^^^^^^^^ this is an unreachable pattern

That full function is:

impl Gamma {
    /// Construct an object representing the `Gamma(shape, scale)`
    /// distribution.
    ///
    /// Panics if `shape <= 0` or `scale <= 0`.
    pub fn new(shape: f64, scale: f64) -> Gamma {
        assert!(shape > 0.0, "Gamma::new called with shape <= 0");
        assert!(scale > 0.0, "Gamma::new called with scale <= 0");

        let repr = match shape {
            1.0 => One(Exp::new(1.0 / scale)),
            0.0...1.0 => Small(GammaSmallShape::new_raw(shape, scale)),
            _ => Large(GammaLargeShape::new_raw(shape, scale)),
        };
        Gamma { repr: repr }
    }
}

This particular code hasn't changed in over a year, and my i686, x86_64, and aarch64 builds were fine with it. Only armv7hl has the error, and note even then it wasn't until --cfg stage2.

I had no problem with armv7hl on prior releases, so this appears to be a beta regression. I do have one patch applied from #36933 to disable NEON, but I also had that on 1.12.1 and it went fine. I'm baffled what else could be specific to armv7hl to cause this...

@TimNN
Copy link
Contributor

TimNN commented Nov 7, 2016

This could be related to #37559 and it's fix (rust-lang/compiler-rt#25), now maybe the opposite situation (if the compiler uses some of the builtins during the unreachability checking).

cc @japaric

@TimNN
Copy link
Contributor

TimNN commented Nov 7, 2016

@cuviper: Just to check:

  • is armv7hl your host architecture?
  • which compiler are you bootstrapping from?

@cuviper
Copy link
Member Author

cuviper commented Nov 7, 2016

  • Yes, Fedora builds always use native hosts.
  • These are bootstrapping from Fedora's own 1.12.1.
    • I could start a build using upstream bootstrap binaries, if that's a useful data point

@TimNN
Copy link
Contributor

TimNN commented Nov 7, 2016

Mh, no I think 1.12.1 was not affected by that bug, so using the upstream binaries wouldn't help, I think.

@TimNN
Copy link
Contributor

TimNN commented Nov 7, 2016

If you want, you could try with rust-lang/compiler-rt@592717e reverted, although I'm not really sure how much that will help, the code responsible for pattern checking does not seem to need any builtins from a quick glance.

@cuviper
Copy link
Member Author

cuviper commented Nov 7, 2016

I started a build with that reverted, if you care to watch paint dry...
http://koji.fedoraproject.org/koji/taskinfo?taskID=16341256

However, assuming your fix was correct, there are two more places with explicit "aapcs": __aeabi_cdcmpeq_check_nan and __aeabi_cfcmpeq_check_nan, used in support of the asm __aeabi_cdcmpeq and __aeabi_cfcmpeq. These sound like functions that could be used in pattern checking, but I don't know how you're checking this.

Of course the ABI of each check_nan needs to match what the asm caller expects. But this seems to open a big can of worms if all of those arm/*.S are using the wrong ABI, not matching their callers.

@TimNN
Copy link
Contributor

TimNN commented Nov 7, 2016

However, assuming your fix was correct, there are two more places with explicit "aapcs":

Then that is probably the real problem.

but I don't know how you're checking this.

I just looked around a bit in check_match.rs, and found that from a glance, only <= was used on floats.

I expected the <= to not use any builtins since this is a hard float target (is it not?).

@cuviper
Copy link
Member Author

cuviper commented Nov 7, 2016

Yes, it's hard float, but not NEON if that matters to your point.

Actually, I have another build going without that #36933 patch, in case NEON does make a difference.
http://koji.fedoraproject.org/koji/taskinfo?taskID=16340627

@TimNN
Copy link
Contributor

TimNN commented Nov 7, 2016

Alright, so parsing floats seems to be broken (printing constants works fine):

Test program:

use std::io::{self, BufRead};

fn main() {
    let inp = io::stdin();
    let mut inp = inp.lock();

    let mut line = String::new();

    inp.read_line(&mut line).unwrap();

    println!("{}", line.trim().parse::<f64>().unwrap());
}

For input 1, the output is 0.000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000005.

So the comparisons / match checking are probably fine but the parsed floats are bad.

Edit: This was compiled with the official rust beta binaries.

cc @alexcrichton

@TimNN TimNN added I-wrong O-Arm Target: 32-bit Arm processors (armv6, armv7, thumb...), including 64-bit Arm in AArch32 state labels Nov 7, 2016
@TimNN
Copy link
Contributor

TimNN commented Nov 7, 2016

This affects nightly-2016-11-06 but not nightly-2016-11-03 (Changes), notably that change set does not include rust-lang/compiler-rt#25.

Edit: forget that, I must have confused something, I cannot reproduce on either of those nightlies.

@cuviper
Copy link
Member Author

cuviper commented Nov 8, 2016

The NEON build (withholding #36933) still failed in the same way.

The build with reverted rust-lang/compiler-rt@592717e is still going, but it just got past that failure point of librand --cfg stage2, so it seems in good shape so far.

@TimNN
Copy link
Contributor

TimNN commented Nov 8, 2016

Alright, so with the test program above, the current beta, running on an RPi 2 with armv7-unknown-linux-gnu, I observer the following:

  • the parsing code calls __floatundidf
  • __floatundidf returns the (correct) value in d0;
    the callee of __floatundidf expects the result in r0, r1
  • the callee is RawFloat::from_int, which is just an x: u64 as f64

@japaric
Copy link
Member

japaric commented Nov 8, 2016

So, this is ... interesting.

According to this bug, for eabihf ("hard float") target, libgcc calls floating point intrinsics, like "floatundidf", using the hard float convention (argument passing over FPU registers (s0, d0, etc.)) (FWIW, this seems consistent to me, it respects the global calling convention, the "hf" in gnueabihf) whereas clang/llvm calls them using the soft float convention (argument passing over "general purpose" registers (r0, r1)). It seems that llvm does that because libgcc used to do it that way but libgcc (apparently) broke its ABI and changed from soft float to the current hard float ABI.

So, what to do. For gnueabihf, we should compile these floating point intrinsics using the soft float convention because that's how LLVM call these intrinsics. Except for the powisf2 intrinsic, that should be compiled using the hard float convention to avoid re-introducing #37559.

It's weird that LLVM calls some floating point intrinsics using the soft float convention but others with the hard float one ... but we have to do what LLVM dictates.

@cuviper
Copy link
Member Author

cuviper commented Nov 8, 2016

It's weird that LLVM calls some floating point intrinsics using the soft float convention but others with the hard float one ... but we have to do what LLVM dictates.

Is that really LLVM's doing? I see the powi intrinsics explicitly declared by rustc here and here, but I haven't found the same for __floatundidf / UINTTOFP_I64_F64.

That makes me wonder if perhaps LLVM is internally using consistent soft float for these, when they are implicitly added, whereas the calls generated for intrinsics by Rust are using the default hard float ABI. Maybe there's some attribute missing on Rust's declarations to get the right code generation out of it.

@japaric
Copy link
Member

japaric commented Nov 8, 2016

That makes me wonder if perhaps LLVM is internally using consistent soft float for these, when they are implicitly added, whereas the calls generated for intrinsics by Rust are using the default hard float ABI.

Yeah, that seems to be the case.

Maybe there's some attribute missing on Rust's declarations to get the right code generation out of it.

You mean to call powi using the soft float convention? extern "aapcs" would work I think but the intrinsic is already extern "intrinsic".

IMO, all the intrinsics should use hard float because that's the global setting. It'd be better if we could make that work.

@kali
Copy link
Contributor

kali commented Nov 8, 2016

I have found new discreps in stable vs beta in #37559

@alexcrichton
Copy link
Member

IIRC you can tag a call instruction in LLVM with the calling convention, so presumably if compiler-rt compiles all intrinsics (not just those in C) with the aapcs calling convention then for these platforms we should tag the calls to intrinsics with that calling convention?


Hm... I'm not so sure any more. Compiling this:

double a(double b, int c) {
  return __builtin_powi(b, c);
}

with clang yields:

$ clang -emit-llvm -S --target=armv7-unknown-linux-gnueabihf foo.c -c -mfloat-abi=hard -O && cat foo.ll
; ModuleID = 'foo.c'
target datalayout = "e-m:e-p:32:32-i64:64-v128:64:128-a:0:32-n32-S64"
target triple = "armv7-unknown-linux-gnueabihf"

; Function Attrs: norecurse nounwind readnone
define double @a(double %b, i32 %c) #0 {
  %1 = tail call double @llvm.powi.f64(double %b, i32 %c)
  ret double %1
}

; Function Attrs: nounwind readnone
declare double @llvm.powi.f64(double, i32) #1

attributes #0 = { norecurse nounwind readnone "disable-tail-calls"="false" "less-precise-fpmad"="false" "no-frame-pointer-elim"="true" "no-frame-pointer-elim-non-leaf" "no-infs-fp-math"="false" "no-nans-fp-math"="false" "stack-protector-buffer-size"="8" "target-cpu"="cortex-a8" "target-features"="+dsp,+neon,+vfp3" "unsafe-fp-math"="false" "use-soft-float"="false" }
attributes #1 = { nounwind readnone }

!llvm.module.flags = !{!0, !1}
!llvm.ident = !{!2}

!0 = !{i32 1, !"wchar_size", i32 4}
!1 = !{i32 1, !"min_enum_size", i32 4}
!2 = !{!"clang version 3.8.0-2ubuntu4 (tags/RELEASE_380/final)"}

That is, nothing weird with calling conventions here.

What's going on?!

@cuviper
Copy link
Member Author

cuviper commented Nov 8, 2016

IMO, all the intrinsics should use hard float because that's the global setting. It'd be better if we could make that work.

I think we should be consistent. Going fully hard float would be ideal, but we'd have to figure out how to influence the LLVM internal calls, as well as fork all the asm implementations in arm/*.S for hard float. Getting Rust's calls like powi to use soft float seems like it will be an easier task for now.

@brson brson added the regression-from-stable-to-beta Performance or correctness regression from stable to beta. label Nov 8, 2016
@brson
Copy link
Contributor

brson commented Nov 8, 2016

This is a beta regression that will be on stable Thursday.

alexcrichton added a commit to alexcrichton/rust that referenced this issue Nov 10, 2016
This update of compiler-rt includes rust-lang/compiler-rt#26 which provides a
targeted fix to the powisf2 intrinsics to keep rust-lang#37559 fixed but also address
the new issue of rust-lang#37630. I've also [written up my thoughts][1] on why it appears
that this is the correct fix for now (hoepfully at least).

Closes rust-lang#37630

[1]: rust-lang/compiler-rt#26 (comment)
@alexcrichton
Copy link
Member

Ok I've done some investigation and believe @TimNN's patch ironically fixes everything, and I've submitted a PR to update to that compiler-rt.

@brson brson added regression-from-stable-to-stable Performance or correctness regression from one stable version to another. and removed regression-from-stable-to-beta Performance or correctness regression from stable to beta. labels Nov 11, 2016
bors added a commit that referenced this issue Nov 11, 2016
std: Update compiler-rt for more ABI fixes

This update of compiler-rt includes rust-lang/compiler-rt#26 which provides a
targeted fix to the powisf2 intrinsics to keep #37559 fixed but also address
the new issue of #37630. I've also [written up my thoughts][1] on why it appears
that this is the correct fix for now (hoepfully at least).

Closes #37630

[1]: rust-lang/compiler-rt#26 (comment)
@cuviper
Copy link
Member Author

cuviper commented Nov 15, 2016

Note that since the official Rust builds were not fixed, those armhf 1.13.0 binaries won't be usable as a host stage0 to build 1.14.0. I just tried such a build for the new beta, and it failed right away, segfault compiling libcore. Using Fedora's own 1.13.0 binary (patched with rust-lang/compiler-rt#26) does appear to work so far as stage0 for the beta.

japaric pushed a commit to japaric/compiler-builtins that referenced this issue Nov 16, 2016
bors added a commit to rust-lang/compiler-builtins that referenced this issue Nov 16, 2016
alexcrichton added a commit to alexcrichton/rust that referenced this issue Nov 16, 2016
This update of compiler-rt includes rust-lang/compiler-rt#26 which provides a
targeted fix to the powisf2 intrinsics to keep rust-lang#37559 fixed but also address
the new issue of rust-lang#37630. I've also [written up my thoughts][1] on why it appears
that this is the correct fix for now (hoepfully at least).

Closes rust-lang#37630

[1]: rust-lang/compiler-rt#26 (comment)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
O-Arm Target: 32-bit Arm processors (armv6, armv7, thumb...), including 64-bit Arm in AArch32 state regression-from-stable-to-stable Performance or correctness regression from one stable version to another.
Projects
None yet
Development

Successfully merging a pull request may close this issue.

6 participants