Document and update i686 triples #31632

ranma42 · 2016-02-13T17:18:36Z

They now (should) match the behaviour of Clang and there is a brief comment in each documenting this.

As per discussion in #31110

Keep track of differences between Clang and rust choices about default CPUs for 32-bit x86 targets.

Try to be consistently replicate Clang default CPU for i686 triples.

rust-highfive · 2016-02-13T17:18:47Z

r? @alexcrichton

(rust_highfive has picked a reviewer for you, use r? to override)

alexcrichton · 2016-02-13T18:49:04Z

@bors: r+ 8c840ee

Thanks again for the investigation!

MagaTailor · 2016-02-13T21:36:44Z

@dhuseby
The FreeBSD change looks slightly suspect - why call this target i686 and not i486 then? I could understand why clang wants to default to P4 on i586 but the obverse sounds misleading.

Either way, to avoid any potential user disappointment, the current BSD stage0 snapshot was probably built with the old codegen settings and won't run on the newly supported "i686" systems.

ranma42 · 2016-02-14T14:50:00Z

I reproduced locally the failure in #31646. This comparison from src/test/run-pass/issue-21634.rs

    if let Ok(mut x) = "3.1415".parse::<f64>() {
        assert_eq!(8.1415, { x += 5.0; x });
    }

fails because the f64 value 8.1415 is compared to the 80-bit result of the addition:

    1e17:       dd 44 24 5c             fldl   0x5c(%esp) ; load the result of parse
    1e1b:       d8 86 a7 20 00 00       fadds  0x20a7(%esi) ; add 5.0 from memory
    1e21:       dd 54 24 10             fstl   0x10(%esp) ; store the result
    1e25:       8d 44 24 10             lea    0x10(%esp),%eax
    1e29:       89 44 24 4c             mov    %eax,0x4c(%esp)
    1e2d:       8d 86 cf 20 00 00       lea    0x20cf(%esi),%eax
    1e33:       89 44 24 50             mov    %eax,0x50(%esp)
    1e37:       dd 86 9f 20 00 00       fldl   0x209f(%esi) ; load 8.1415
    1e3d:       d9 c9                   fxch   %st(1) ; exchange the two top elements of the FP stack (why???)
    1e3f:       da e9                   fucompp ; compare the two top elements and pop both

It is possible to fix this in several ways:

the value used for the test could be changed (for example to 3.125) so that adding 5.0 does not involve rounding
it might be possible to set the FPU control word so that results are always rounded to 53 bits of mantissa

Since it does not look like the purpose of the test was to check for the rounding behaviour, I think that the test should be fixed/made more robust.

hanna-kruppe · 2016-02-14T15:13:21Z

This shouldn't be the only test that fails. src/libcoretest/num/dec2flt/mod.rs has a test that fails without SSE2. If it doesn't fail, that's a bug in the test, because that test was specifically written to break on x87. The different rounding in x87 breaks the fast path of float parsing (in that it stops being completely accurate and round-trip-safe).

Somehow you need to deal with this if you want a non-SSE2 target. In fact, the discussion of this problem prompted some platforms being changed to a pentium4 base CPU (though I don't know if 32 bit Darwin was among them or if it was already "yonah"). You're gonna have to deal with this somehow. The simplest possibility would be to disable the fast path on 32 bit Darwin, though this is a significant performance regression (at least an order of magnitude or two IIRC, there are float parsing benchmarks in libcoretest).

bors · 2016-02-14T15:33:19Z

⌛ Testing commit 8c840ee with merge ec1f6c7...

bors · 2016-02-14T16:18:47Z

💔 Test failed - auto-mac-32-opt

nagisa · 2016-02-14T19:40:19Z

src/librustc_back/target/i686_apple_darwin.rs

@@ -12,7 +12,9 @@ use target::Target;

 pub fn target() -> Target {
    let mut base = super::apple_base::opts();
-    base.cpu = "yonah".to_string();
+    // Use i686 as default CPU. Clang uses the same default.
+    base.cpu = "i686".to_string();


According to the commit message that made this a yonah

Use more specific target CPUs on Darwin

Macs don't come with anything older than a Yonah (32bit) or Core2 (64bit), so we can default to those targets. Clang does the same.

I’m not sure why clang would change their default, but macs not existing with pre-yonah hardware seems like a pretty good reason to just use a yonah.

cc @dotdash

See my comment above: Clang defaults to yonah on i386-apple-darwin, but not on i686-apple-darwin.

@ranma42 but if there can’t possibly be such a combination of darwin+x86 which uses anything pre-yonah, why bother (EDIT: or, rather, restrict ourselves to) targeting a decade-older CPU?

@ranma42 Hm, where does clang do that distinction? Did you check the code or is there a command I could use to reproduce/check this (without owning a Mac that is ;-))

Nevermind, found the command in the other PR.

So if I'm reading the code correctly, there's some special handling for Darwin that disables the automatic CPU selection for any x86 target except for the i386 one. I wonder whether that's actually intentional.

@nagisa I do not know why Clang restricts itself to a decade-older CPU; we are about to do it in order to be consistent with Clang. I agree with you that it is surprising to have a sub-optimal default on Mac and that is the reason why I was suggesting to provide an i386-apple-darwin triple as default target for 32-bit Mac.

Clang in this regard seems... somewhat inconsistent? At the very least it seems fine to leave this as-is and perhaps document the oddity (to allow this PR to land)

ranma42 · 2016-02-14T20:24:50Z

@rkruppe I confirm that num::dec2flt::fast_path_correct fails. I experimented with fldcw and I can confirm that setting the rounding mode to 64 bits (instead of the default 80 bits) makes the test pass. Would this be ok?
EDIT: Obviously the 64-bits rounding would only be enabled for the fast path of the parser and afterwards the original control word would be restored.

hanna-kruppe · 2016-02-14T20:55:00Z

I'm of two minds regarding fiddling with the FPU control word. On the one hand, it's nice to not leave the performance on the table. On the other hand, it's a very low level trick that has clear disadvantages especially when the target does have SSE2 (among other things, it's slightly slower, might inhibit compiler optimizations, and might break if optimizations get better). On the gripping hand, if we had a way to guarantee that this code path is only taken on targets without SSE2, even as target specs evolve, then I'd feel a lot better about it.

ranma42 · 2016-02-14T21:49:11Z

It might be possible to only touch the FPU control word on x86 targets when SIMD operations are not available (SSE for f32, SSE2 for f64) using cfg_target_feature.

hanna-kruppe · 2016-02-14T22:00:01Z

Wait, cfg can check target CPU features? That'd be exactly what I wished for!

ranma42 · 2016-02-14T22:03:11Z

I should mention that LLVM does a similar fiddling with the control word when casting floating point types to integer in order to ensure truncation (see the lines after 22872 in lib/Target/X86/X86ISelLowering.cpp, I cannot link it because the file is 1+ MB of C++ code 😭 )

hanna-kruppe · 2016-02-14T23:04:20Z

Aw, target_feature="sse" does not actually work :( See #31662

ranma42 · 2016-02-15T08:45:24Z

It does not work yet ;)
I am happy that this PR exposed other issues, we can fix them before they affect distros

- Successful merges: #31551, #31581, #31614, #31626, #31632, #31642 - Failed merges:

nagisa · 2016-02-16T19:03:50Z

Note, that IMHO claiming to follow “what Clang does” is a really brittle way forward. We should…

Have some rules for what is acceptable common denominator to us (which might or might not match Clang’s lower-bound), so the PRs like this do not pop-up after every LLVM release modifying 10 or so target files;
Decide what i386, i686 et al mean to us since their actual meaning seems not really match what i386, i686 et al really are (both here and elsewhere).

ranma42 · 2016-02-16T22:55:59Z

I think it would be convenient if the meaning of rustc triples was consistent with that of Clang or with the GNU one as much as possible.
In any case, our choices (and their rationale) should be documented, in order to avoid PRs like this and to guide future choices when adding architectures and targets.
What is the best place to discuss the points listed by @nagisa ?

nagisa · 2016-02-16T23:04:46Z

Internals forum would be a good place.

alexcrichton · 2016-03-08T01:42:14Z

Closing due to inactivity, but feel free to resubmit with the tests fixed!

ranma42 added 2 commits February 13, 2016 17:55

Add comments about the differences between Clang and rustc target CPUs

0062f72

Keep track of differences between Clang and rust choices about default CPUs for 32-bit x86 targets.

Synchronize i686 rustc triples with Clang

8c840ee

Try to be consistently replicate Clang default CPU for i686 triples.

rust-highfive assigned alexcrichton Feb 13, 2016

ranma42 mentioned this pull request Feb 13, 2016

Do not use SIMD instructions on i686 #31110

Closed

Manishearth mentioned this pull request Feb 14, 2016

Rollup of 6 pull requests #31646

Merged

nagisa reviewed Feb 14, 2016
View reviewed changes

bors added a commit that referenced this pull request Feb 15, 2016

Auto merge of #31646 - Manishearth:rollup, r=Manishearth

81c7024

- Successful merges: #31551, #31581, #31614, #31626, #31632, #31642 - Failed merges:

bors added a commit that referenced this pull request Feb 15, 2016

Auto merge of #31646 - Manishearth:rollup, r=Manishearth

13675a5

- Successful merges: #31551, #31581, #31614, #31626, #31632, #31642 - Failed merges:

alexcrichton closed this Mar 8, 2016

ranma42 mentioned this pull request Apr 27, 2016

--target should ignore the machine part of the triplet in most cases #33147

Open

ranma42 mentioned this pull request May 5, 2016

Fix fast path of float parsing on x87 #33429

Merged

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Document and update i686 triples #31632

Document and update i686 triples #31632

ranma42 commented Feb 13, 2016

rust-highfive commented Feb 13, 2016

alexcrichton commented Feb 13, 2016

MagaTailor commented Feb 13, 2016

ranma42 commented Feb 14, 2016

hanna-kruppe commented Feb 14, 2016

bors commented Feb 14, 2016

bors commented Feb 14, 2016

nagisa Feb 14, 2016

ranma42 Feb 14, 2016

nagisa Feb 14, 2016

dotdash Feb 14, 2016

dotdash Feb 14, 2016

dotdash Feb 14, 2016

ranma42 Feb 14, 2016

alexcrichton Feb 16, 2016

ranma42 commented Feb 14, 2016

hanna-kruppe commented Feb 14, 2016

ranma42 commented Feb 14, 2016

hanna-kruppe commented Feb 14, 2016

ranma42 commented Feb 14, 2016

hanna-kruppe commented Feb 14, 2016

ranma42 commented Feb 15, 2016

nagisa commented Feb 16, 2016

ranma42 commented Feb 16, 2016

nagisa commented Feb 16, 2016

alexcrichton commented Mar 8, 2016

Document and update i686 triples #31632

Document and update i686 triples #31632

Conversation

ranma42 commented Feb 13, 2016

rust-highfive commented Feb 13, 2016

alexcrichton commented Feb 13, 2016

MagaTailor commented Feb 13, 2016

ranma42 commented Feb 14, 2016

hanna-kruppe commented Feb 14, 2016

bors commented Feb 14, 2016

bors commented Feb 14, 2016

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

ranma42 commented Feb 14, 2016

hanna-kruppe commented Feb 14, 2016

ranma42 commented Feb 14, 2016

hanna-kruppe commented Feb 14, 2016

ranma42 commented Feb 14, 2016

hanna-kruppe commented Feb 14, 2016

ranma42 commented Feb 15, 2016

nagisa commented Feb 16, 2016

ranma42 commented Feb 16, 2016

nagisa commented Feb 16, 2016

alexcrichton commented Mar 8, 2016