Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

rust should not assume NEON on armhf #35590

Closed
sylvestre opened this issue Aug 11, 2016 · 32 comments
Closed

rust should not assume NEON on armhf #35590

sylvestre opened this issue Aug 11, 2016 · 32 comments

Comments

@sylvestre
Copy link
Contributor

Reported here:
https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=834003
To quote Steve:

None of the Linux distros doing ARMv7 hard-float assume
NEON. The spec was agreed across the distros to explicitly not
depend on NEON for these ports because it's optional and there
are/were a number of ARMv7 CPUs in the wild without it.

@lucab
Copy link
Contributor

lucab commented Aug 11, 2016

/cc @alexcrichton @fabricedesre @petevine

This is seems to be coming from #30948 (comment). I guess this may have gone unnoticed so far because original porting has been done on a RPi2 with NEON.

@MagaTailor
Copy link

That's correct, strictly speaking, as originally there was just one armhf triple and it defaulted to ARMv6. (the current arm-unknown-). It was never a problem because rust std library remains minimalistic and most of the code has to be compiled from crates. (that's when -C target-cpu and -C target-feature kick in and the resulting code suffers no penalty)

However, if the official distribution methods start detecting the ISA correctly, it should be possible to add as many combinations as required. Personally, I don't like the potential fragmentation, so if it were up to me, I'd advise using the arm- triple and tweaking the codegen, either via patches or a wrapper script.

Oh, and I fully expect vfpv3-d16 coming next.

@Amanieu
Copy link
Member

Amanieu commented Aug 11, 2016

Only the armv7-unknown-linux-gnueabihf target uses neon. In particular, the arm-unknown-linux-gnueabihf (note the lack of v7) target does not. It seems to me that this issue can be solved by having debian armhf use the arm-unknown-linux-gnueabihf target instead.

@infinity0
Copy link
Contributor

That optimises for arm v6 so it wouldn't be ideal.

mk/cfg/arm-unknown-linux-gnueabihf.mk:CFG_JEMALLOC_CFLAGS_arm-unknown-linux-gnueabihf := -D__arm__ $(CFLAGS) -march=armv6 -marm
mk/cfg/arm-unknown-linux-gnueabihf.mk:CFG_GCCISH_CFLAGS_arm-unknown-linux-gnueabihf := -Wall -g -fPIC -D__arm__ $(CFLAGS) -march=armv6 -marm
mk/cfg/arm-unknown-linux-gnueabihf.mk:RUSTC_FLAGS_arm-unknown-linux-gnueabihf := -C target-feature=+v6,+vfp2

@MagaTailor
Copy link

On the other hand, I might be privvy to one little tidbit why the targets might not be equivalent nowadays, but only in the case of mixed C,C++/Rust codebases, namely the gcc-rs crate.

It forces -march=armv6 for the arm- target which is obviously necessary for cross-compilation but leads to benchmarkable performance loss on an ARMv7 system (see parity) so an unsuspecting user might get tripped.

@MagaTailor
Copy link

MagaTailor commented Aug 11, 2016

@infinity0 This is why you'd have to tweak the codegen, preferably via a wrapper script, in which case the target can be used as is.

@lucab
Copy link
Contributor

lucab commented Aug 11, 2016

What we are arguing here is that rustc has two profiles for gnueabihf targets, none of which corresponds to the traditional definition of Linux ARMv7-HF ports.

While above suggestions of probing the ISA, picking another target, and tweaking codegen are perfectly fine solution for specific users/machines, this doesn't work in the context of packaged binaries (which are moved across machines) and need to agree on the baseline architecture definition.

In this specific case, rust should decide which profile is the canonical target for a Linux ARMv7-HF port, and align it to the agreed definition of that architecture baseline: armv7+thumb2+VFP3D16

@Amanieu
Copy link
Member

Amanieu commented Aug 11, 2016

The Rust ARM targets don't line up with the debian/gnu ones at all.

Rust:

  • arm-unknown-linux-gnueabi: ARMv6, soft float
  • arm-unknown-linux-gnueabihf: ARMv6, VFPv2
  • armv7-unknown-linux-gnueabihf: ARMv7, VFPv3 + NEON

Debian/GNU:

  • armel/arm-linux-gnueabi: ARMv4t, soft float
  • armhf/arm-linux-gnueabihf: ARMv7, VFPv3 (no NEON)

I still think the best solution in this case would be for Debian armhf to use the arm-unknown-linux-gnueabihf Rust target.

@MagaTailor
Copy link

@lucab That's precisely what myself and @Amanieu meant - binaries would have to be built with the following command:

rustc --target=arm-unknown-linux-gnueabihf -C target-cpu=cortex-a8 -C target-feature=+v7,+thumb2,+vfp3,+d16

@alexcrichton
Copy link
Member

Thanks for the report @lucab and @sylvestre! I believe @Amanieu correctly summarized the Rust state of affairs here, but could you confirm the Debian/GNU state of affairs? It'd be helpful to understand what exactly the targets Linux distros are expecting and how they're compiled.

This was also discussed previously here I believe.

cc @larsbergstrom, I think Servo wants NEON on for sure, right?
cc @japaric

@anguslees
Copy link
Contributor

@alexcrichton: I confirm @Amanieu's Debian summary above. Ignoring obsolete/historical ARM ports, Debian now has:

@sanxiyn
Copy link
Member

sanxiyn commented Aug 12, 2016

This was also discussed in #31800.

As I understand, the only reason Rust ARM is v6 is to support original RPi. I objected this does not match the traditional definition. @alexcrichton decided to just ignore the definition, since RPi needs to be supported. See #31800 for details.

@glandium
Copy link
Contributor

Why not support the original RPi with an explicit armv6 triplet?

@sanxiyn
Copy link
Member

sanxiyn commented Aug 12, 2016

Indeed. My proposal is: 1. Align Rust gnueabi/gnueabihf to v4/v7. 2. Keep (now misnamed) Rust armv7 for v7+neon. 3. Since RPi is weird, create new target just for RPi.

@Amanieu
Copy link
Member

Amanieu commented Aug 12, 2016

Rust requires at least ARMv6, since previous versions of the architecture don't support atomic operations.

@glandium
Copy link
Contributor

GCC provides a (I think lock-based) std::atomic. Yes, that's slow, but I don't see a reason why Rust shouldn't allow something similar (not saying that you should all be working on that, but you shouldn't prevent people from doing it if they want to).

@sanxiyn
Copy link
Member

sanxiyn commented Aug 12, 2016

Here is my understanding, excluding soft float targets for simplicity. Three targets would be best, v6/v7/v7+neon. Rust currently has v6/v7+neon. Debian wants v7. In short term, Debian can use v6 target without introducing a new target. Long term, adding v7 target seems preferable.

Then there is matter of naming. Rust currently calls v6/v7/v7+neon as armhf/???/armv7. I think they should be called rpi/armhf/armv7 or armv6/armhf/armv7 to avoid confusion. Using armv7 for v7+neon is unfortunate, but it doesn't collide with existing names, and armv7 does mean v7+neon on iOS, so there's that.

@larsbergstrom
Copy link
Contributor

@alexcrichton We would certainly like some build of the rust standard libraries with Neon on by default for Servo. Without it, Servo is unbearably slow on any modern Android device, even if we turn it on for the Rust code in Servo itself.

@lucab
Copy link
Contributor

lucab commented Aug 12, 2016

@larsbergstrom What is the baseline without it for comparison? The request here is to swap NEON for VFPv3-D16, not just to drop it.

Moreover, Android targets fall under androideabi, which we are not touching here.

For a comparison and historical view of the whole arm hardfloats story, see https://wiki.debian.org/ArmHardFloatPort/VfpComparison.

@Amanieu
Copy link
Member

Amanieu commented Aug 12, 2016

@lucab: One of the key points of NEON, which isn't mentioned in that page is that it implies VFPv3-D32, whereas standard VFPv3 is VFPv3-D16. The difference here is that the former has 32-bit double-precision floating-point registers but the latter only has 16. Having more registers can help code generation in f64-heavy code.

@larsbergstrom
Copy link
Contributor

@lucab I don't have strong opinions about the ARM non-Android situation, and am happy to defer to the experts in this thread.

So far as baseline goes, this was for Neon on some specific hardware and benchmarks that were pretty computationally intensive. We did see quite a bit of difference between soft/hard vp3/neon in that case, though clearly just getting to any hard float got the order-of-magnitude perf gains.

TBH, ideally we would be able to self-build the standard library as part of Servo builds. For many of these embedded platforms the h/w provider would like to specify exactly the chipset, etc. that they have available, and asking Rust to provide every possible set of targets seems quite demanding to me.

@sanxiyn
Copy link
Member

sanxiyn commented Aug 13, 2016

First, this issue is about ARM on Linux, not ARM on Android. My understanding is that Rust is simply matching Google's documentation for ARM on Android, which is great.

Second, as far as I know Rust does not enable NEON for ARM on Android. So Servo must be doing something extra. This matches Google's documentation, see also #33414.

@larsbergstrom
Copy link
Contributor

@sanxiyn Without going into details, these builds were done with a full custom rust compiler rebuild + servo rebuild.

@alexcrichton
Copy link
Member

So for the three Debian targets we've got:

  • armel - no solution in Rust today
  • armhf - armv7-unknown-linux-gnueabihf minus NEON
  • arm64 - aarch64-unknown-linux-gnu

It sounds like we can keep armv7 with VFPv3-D16, though, and it also wouldn't affect Servo as there it's primarily used on Android. If we make that change it means that armel will not be supported, but perhaps that's good enough for now?

@anguslees
Copy link
Contributor

anguslees commented Aug 15, 2016

Just to put a counter-proposal on the table, another option would be to do
the gcc-ish thing: default to the most conservative milestones and use
options to enable additional hardware features. Specifically:

  • arm (v6 or whatever our minimum is), with non-default flags to enable v7,
    neon, vfp, etc (perhaps even hard/soft float calling convention).
  • aarch64, with presumably additional flags for future iterations of arm64
    chips that don't exist yet.

It sounds instead like we're trying to create targets that match the most
popular platforms we see around today. That's nice for user experience,
but obviously highly ephemeral, and I suggest doomed to failure over the
long term. If we want to try to pick winners, I suggest the target triples
we create for them should be super-specific (mention "android", "rpi", etc
instead of "unknown") - unlike the current triples - and we make sure our
"generic" triples are indeed very generic and merely good bases on which to
build user-specific tweaks.

@japaric
Copy link
Member

japaric commented Aug 16, 2016

@anguslees

default to the most conservative milestones

That would be what the arm-unknown-linux-gnueabi and arm-unknown-linux-gnueabihf targets are in
the context of this issue.

use options to enable additional hardware features.

Like those GCC configure flags that let you pick the "default" arch-level, float-abi,
target-cpu, etc, which are also the options used to compile the libraries (libc, libgcc_s,
libatomic, etc) that are part of the compiler bootstrap process?

I like that idea. The Rust equivalent would be slightly tweaking the specification of a built-in
target like arm-unknown-linux-gnueabihf. Currently, we don't have a mechanism to do that, but with
it, $DISTRO would be able to e.g. tweak the arm-unknown-linux-gnueabihf target to mean ARMv7 with
VFPv3-D16 FPU but without NEON and then build rustc/std for that tweaked target.

@alexcrichton thoughts on having a rustbuild mechanism to tweak "built-in" targets before the
bootstrap process starts? I agree with @anguslees that we shouldn't be changing the specification of
existing targets to match $DISTRO's definition nor should we be adding one target per
$DISTRO/$PLATFORM.


Thinking long term about the configurability of targets, the final state could look like this:

  • We only provide binary releases (rustc + std) for "baseline" targets (*), e.g.
    arm-unknown-linux-gnueabi (ARMv6+, no FPU). These releases are meant to run on as many systems
    as possible and have minimal optimizations applied to them because of that -- IOW, the target
    specification requests very few optimizations. These rustc releases also build, by default,
    binaries with few optimizations that can run on as many systems as possible.
  • Using the "on the fly compilation of std" feature and some yet to be determined configuration
    mechanism (just .cargo/config?), rustc users can build (std and) executables that are heavily
    optimized for $ARCH/$CPU/$DEVICE using one of our baseline rustc releases (or a distro release,
    for that matter).
  • Advanced users and distro package maintainers can use rustbuild to build a
    rustc with more optimizations (for e.g. faster compile times) and/or to tweak the specification of
    a baseline target, which is currently "built-in"/hard-coded in the compiler (though this may
    not be necessary if something like Target Bundles rfcs#1711 is accepted).

(*) This also means we don't have to release a binary for every single variation of a target: "this
one is for ARMv7", "this one is for ARMv7 but with NEON support", "this one is optimized for
$DISTRO", etc.

In that world, the armv7-unknown-linux-gnueabihf target would be eliminated because its an
optimization over the arm-unknown-linux-gnuebihf target. In general, it would up to distros to
provide optimized releases of rustc.

@sanxiyn
Copy link
Member

sanxiyn commented Aug 16, 2016

I agree with @anguslees that we shouldn't be changing the specification of existing targets to match $DISTRO's definition nor should we be adding one target per $DISTRO/$PLATFORM.

I disagree with @japaric and I think Rust Linux targets should match Linux distros' definition (it's not like there are multiple definitions, all major Linux distros agree), and I think it is reasonable to add one more target to Rust Linux targets, growing the number from 2 to 3. After all, Rust Android targets match Google's definition, and Rust iOS targets match Apple's definition.

@japaric
Copy link
Member

japaric commented Aug 16, 2016

FWIW, removing the neon requirement from the armv7 target is in agreement with the policy of making rustc/std more portable. The target that results from that policy happens to match "major" distros' definition and, in general, that policy will yield a rustc/std that works on most distributions because portability is its very own definition. With the policy of 'match "major" distro definition', we'd have to change the arm-gnueabi target to support ARMv4 but that's not possible because the architecture doesn't support atomics and those required to compile std (and pretty much any other standard crate that's not core).

After all, Rust Android targets match Google's definition, and Rust iOS targets match Apple's definition.

Can't really comment on android or ios but I guess there's simply no other definition in those spaces?

@alexcrichton
Copy link
Member

I'd be personally wary of having the definition of a target vary depending on how the compiler was built. I feel like we get a lot of mileage of having a common vocabulary when talking about targets. That is, arm-unknown-linux-gnueabi means the same thing no matter what compiler you're using, what platform you're compiling from, etc.

It seems that we don't necessarily have a reason to not follow what distros are doing. In theory they're already quite conservative in feature selection, and we don't necessarily have to support everything under the sun (e.g. ARMv4 could be delayed until later).

@alexcrichton
Copy link
Member

Ok, I've created #35814 which disables NEON but enables VFP3D16 on the armv7-unknown-linux-gnueabihf target. That should better match distros I believe, right? We can try to add an armv4 target later, but seems good to do that in the meantime.

@Amanieu
Copy link
Member

Amanieu commented Aug 19, 2016

ARMv4 is tricky. It's possible to get atomic support working for it, but only on Linux. You'd need to use the kernel user helpers to implement various __sync_* functions that are going to be emitted by LLVM.

@lucab
Copy link
Contributor

lucab commented Aug 19, 2016

I'm personally happy with having just armhf and arm64 aligned after this. Rationale:

  • armv4 may need some complex work, due to the above mentioned lack of atomics
  • ARM/Linaro efforts are mostly focused on v7+
  • there have been some discussions in Debian to discontinue the armel port in a not too far future

alexcrichton added a commit to alexcrichton/rust that referenced this issue Aug 19, 2016
One of the primary platforms for the `armv7-unknown-linux-gnueabihf` target,
Linux distributions, do not enable NEON extensions by default. This PR disables
that feature by defualt but enables the `d16` feature which enables VFP3D16 that
distributions do enable.

Closes rust-lang#35590
bors added a commit that referenced this issue Aug 25, 2016
rustc: Don't enable NEON by default on armv7 Linux

One of the primary platforms for the `armv7-unknown-linux-gnueabihf` target,
Linux distributions, do not enable NEON extensions by default. This PR disables
that feature by defualt but enables the `d16` feature which enables VFP3D16 that
distributions do enable.

Closes #35590
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests