Document and update i686 triples #31632
Keep track of the differences between Clang's and Rust's choices of default CPU for 32-bit x86 targets.
Try to consistently replicate Clang's default CPU for i686 triples.
(rust_highfive has picked a reviewer for you, use r? to override) |
@dhuseby Either way, to avoid any potential user disappointment, the current BSD stage0 snapshot was probably built with the old codegen settings and won't run on the newly supported "i686" systems. |
I reproduced locally the failure in #31646. This comparison from
if let Ok(mut x) = "3.1415".parse::<f64>() {
    assert_eq!(8.1415, { x += 5.0; x });
}
fails because the
1e17: dd 44 24 5c fldl 0x5c(%esp) ; load the result of parse
1e1b: d8 86 a7 20 00 00 fadds 0x20a7(%esi) ; add 5.0 from memory
1e21: dd 54 24 10 fstl 0x10(%esp) ; store the result
1e25: 8d 44 24 10 lea 0x10(%esp),%eax
1e29: 89 44 24 4c mov %eax,0x4c(%esp)
1e2d: 8d 86 cf 20 00 00 lea 0x20cf(%esi),%eax
1e33: 89 44 24 50 mov %eax,0x50(%esp)
1e37: dd 86 9f 20 00 00 fldl 0x209f(%esi) ; load 8.1415
1e3d: d9 c9 fxch %st(1) ; exchange the two top elements of the FP stack (why???)
1e3f: da e9 fucompp ; compare the two top elements and pop both
It is possible to fix this in several ways:
Since it does not look like the purpose of the test was to check for the rounding behaviour, I think that the test should be fixed/made more robust. |
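One way the test could be made more robust is to compare within a tolerance instead of requiring bit-exact equality. The following is only a sketch: the helper names `approx_eq` and `parse_plus_five` and the tolerance `1e-12` are assumptions made for illustration, not part of the test suite, and it assumes the mismatch comes from the x87 unit keeping extra precision in registers, as the later comments about the FPU control word suggest.

```rust
/// Hypothetical helper: true when `a` and `b` differ by at most `tol`.
fn approx_eq(a: f64, b: f64, tol: f64) -> bool {
    (a - b).abs() <= tol
}

/// Hypothetical helper mirroring the failing test: parse a float and add 5.0.
/// Returns `None` when the input is not a valid float literal.
fn parse_plus_five(s: &str) -> Option<f64> {
    s.parse::<f64>().ok().map(|x| x + 5.0)
}
```

With these helpers the check becomes `assert!(approx_eq(parse_plus_five("3.1415").unwrap(), 8.1415, 1e-12))`, which no longer depends on whether the intermediate sum was rounded to 64 bits before the comparison.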
This shouldn't be the only test that fails. Somehow you need to deal with this if you want a non-SSE2 target. In fact, the discussion of this problem prompted some platforms being changed to a pentium4 base CPU (though I don't know if 32 bit Darwin was among them or if it was already "yonah"). You're gonna have to deal with this somehow. The simplest possibility would be to disable the fast path on 32 bit Darwin, though this is a significant performance regression (at least an order of magnitude or two IIRC, there are float parsing benchmarks in |
⌛ Testing commit 8c840ee with merge ec1f6c7... |
💔 Test failed - auto-mac-32-opt |
@@ -12,7 +12,9 @@ use target::Target;

 pub fn target() -> Target {
     let mut base = super::apple_base::opts();
-    base.cpu = "yonah".to_string();
+    // Use i686 as default CPU. Clang uses the same default.
+    base.cpu = "i686".to_string();
According to the commit message that made this a yonah:
Use more specific target CPUs on Darwin
Macs don't come with anything older than a Yonah (32bit) or Core2 (64bit), so we can default to those targets. Clang does the same.
I’m not sure why clang would change their default, but macs not existing with pre-yonah hardware seems like a pretty good reason to just use a yonah.
cc @dotdash
See my comment above: Clang defaults to yonah on i386-apple-darwin, but not on i686-apple-darwin.
@ranma42 but if there can’t possibly be such a combination of darwin+x86 which uses anything pre-yonah, why bother (EDIT: or, rather, restrict ourselves to) targeting a decade-older CPU?
@ranma42 Hm, where does clang make that distinction? Did you check the code, or is there a command I could use to reproduce/check this (without owning a Mac, that is ;-))?
Nevermind, found the command in the other PR.
So if I'm reading the code correctly, there's some special handling for Darwin that disables the automatic CPU selection for any x86 target except for the i386 one. I wonder whether that's actually intentional.
@nagisa I do not know why Clang restricts itself to a decade-older CPU; we are about to do it in order to be consistent with Clang. I agree with you that it is surprising to have a sub-optimal default on Mac, and that is the reason why I was suggesting to provide an i386-apple-darwin triple as default target for 32-bit Mac.
Clang in this regard seems... somewhat inconsistent? At the very least it seems fine to leave this as-is and perhaps document the oddity (to allow this PR to land)
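To document the oddity, the defaults described in this thread could be written down as a small lookup. This is a sketch only: `default_cpu` is a hypothetical helper, and the two entries simply restate what the comments above and the diff claim about Clang (yonah for i386-apple-darwin, i686 for i686-apple-darwin); they have not been checked against Clang's source here.

```rust
/// Hypothetical lookup of the default CPU per triple, restating the claims
/// made in this thread. Triples not discussed here return `None`.
fn default_cpu(triple: &str) -> Option<&'static str> {
    match triple {
        // Per the thread: Clang picks yonah only for the i386 Darwin triple.
        "i386-apple-darwin" => Some("yonah"),
        // Per the diff comment: Clang uses plain i686 for this triple.
        "i686-apple-darwin" => Some("i686"),
        _ => None,
    }
}
```

Keeping the mapping in one table like this would make any future divergence from Clang visible in a single place.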
@rkruppe I confirm that |
I'm of two minds regarding fiddling with the FPU control word. On the one hand, it's nice to not leave the performance on the table. On the other hand, it's a very low level trick that has clear disadvantages especially when the target does have SSE2 (among other things, it's slightly slower, might inhibit compiler optimizations, and might break if optimizations get better). On the gripping hand, if we had a way to guarantee that this code path is only taken on targets without SSE2, even as target specs evolve, then I'd feel a lot better about it. |
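A compile-time check is one way to make sure a special code path is only taken on 32-bit x86 targets without SSE2, even as target specs evolve. The sketch below is an assumption-laden illustration, not the standard library's parser: `X87_ONLY` and `fast_path` are hypothetical names, and here the fast path is simply skipped on such targets (the "disable the fast path" option mentioned earlier) rather than touching the FPU control word.

```rust
/// True only when compiling for 32-bit x86 without SSE2, i.e. when float
/// arithmetic goes through the x87 unit.
const X87_ONLY: bool = cfg!(all(target_arch = "x86", not(target_feature = "sse2")));

/// Hypothetical fast path: computes `mantissa * 10^exp10` with a single
/// rounding when both operands are exactly representable as f64.
/// Returns `None` when the caller must fall back to a slower exact algorithm.
fn fast_path(mantissa: u64, exp10: i32) -> Option<f64> {
    if X87_ONLY {
        // Excess x87 precision could cause double rounding; bail out.
        return None;
    }
    // Integers up to 2^53 and powers of ten up to 10^22 are exact in f64.
    if mantissa > (1u64 << 53) || exp10.abs() > 22 {
        return None;
    }
    let mut pow = 1.0f64;
    for _ in 0..exp10.abs() {
        pow *= 10.0; // every intermediate power of ten stays exact
    }
    let m = mantissa as f64;
    Some(if exp10 < 0 { m / pow } else { m * pow })
}
```

Because `X87_ONLY` is evaluated from the target's features at compile time, the guard follows whatever base CPU the target spec selects.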
It might be possible to only touch the FPU control word on |
Wait, |
I should mention that LLVM does a similar fiddling with the control word when casting floating point types to integer in order to ensure truncation (see the lines after 22872 in |
Aw, |
It does not work yet ;) |
Note that, IMHO, claiming to follow “what Clang does” is a really brittle way forward. We should…
|
I think it would be convenient if the meaning of |
Internals forum would be a good place. |
Closing due to inactivity, but feel free to resubmit with the tests fixed! |
They now (should) match the behaviour of Clang and there is a brief comment in each documenting this.
As per discussion in #31110