rust: fails on clang32 #9091
I tried patching this, but the error still occurs. |
very strange, the .o and .a still have the |
@mati865 ? I don't even know where 'upstream' is for that file to report or send a pull request, nor do I know why patching the |
I've discovered D'oh
|
This seems to be working. Will try to make a PR (draft cause it's pretty clear that rust doesn't want you to patch their vendored code) |
Eventually got this error:
|
This was merged and Published as 0.1.14 |
All packages which successfully built on clang64, for which all of their dependencies are already available for clang32. Except for rust (#9091). Batch 7
All packages which successfully built on clang64, for which all of their dependencies are already available for clang32. Except for rust (#9091). Batch 8
I hacked source code (by disabling what fails to build) and it passed the build but the compiler is totally broken. |
I've tried aborting instead of unwinding but this doesn't help. Seems we need proper unwinding because code that runs proc-macros relies on catching the exceptions. |
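A minimal sketch of why catching matters here, not taken from rustc itself: `run_guarded` is a hypothetical helper that uses `std::panic::catch_unwind` to turn a panic into an `Err`. This only works when the unwinder can walk the stack; with `panic = "abort"`, or with broken unwind info, the process dies before the `Err` is produced.

```rust
use std::panic;

/// Hypothetical helper: runs `f` and converts a panic into an `Err`
/// carrying the panic message. Relies on working stack unwinding.
fn run_guarded<F>(f: F) -> Result<u32, String>
where
    F: FnOnce() -> u32 + panic::UnwindSafe,
{
    panic::catch_unwind(f).map_err(|payload| {
        // Panic payloads are usually `&str` or `String`.
        if let Some(s) = payload.downcast_ref::<&str>() {
            s.to_string()
        } else if let Some(s) = payload.downcast_ref::<String>() {
            s.clone()
        } else {
            "unknown panic payload".to_string()
        }
    })
}
```

Calling `run_guarded(|| -> u32 { panic!("boom") })` should give back an `Err` instead of taking the process down, which is roughly the behaviour the proc-macro machinery depends on.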
Did some debugging. Somehow the |
For reference, the In my test program, the return address from the call to
|
For a test, I am hacking the libgcc hack and clang subsystem patch to always use dynamic libunwind |
That gets it a stack frame further into unwinding but then can't find unwind info for https://github.com/rust-lang/rust/blob/1cf8fdd4f0be26bcfa9e3b1e10d4bf80107ba492/library/panic_unwind/src/gcc.rs#L62 I guess there's something wrong with the writing of |
Kind of feels like https://reviews.llvm.org/D84607, but when trying to look via |
That didn't help. @mstorsjo hate to ping you on such a meandering thread, but do you know any way to figure out why |
I think I found how to output linker command line, and add an option to linker, for rust build. |
Turns out RUSTFLAGS don't seem to apply to 'host' tools, which is what is failing. I have not yet seen anything like HOSTRUSTFLAGS. https://github.com/rust-lang/cargo/blob/18751dd3f238d94d384a7fe967abfac06cbfe0b9/src/cargo/core/compiler/build_context/target_info.rs#L589-L613 |
I was hoping to get a repro.tar from lld, but not being able to add flags to the 'build script' build (and it apparently cleaning up its .o files) is making that not possible. It appears that
|
There is some oddity in
I have inspected the crashing binary and it contains |
This difference probably is about whether there is a section symbol for the section or not. Technically, at the lowest level, those section symbols are pretty much unnecessary. I guess there is none, while gnu nm maybe does print one anyway (as there is a section still). Llvm’s
As for the root cause, dwarf unwinding is kinda tricky. (Sorry I never commented on this earlier when I was pinged.) The main contents of the
With libgcc, there are marker symbols in crtbegin/crtend.o, named like
With libunwind, there’s no such static registration, but when necessary, libunwind iterates over the loaded dlls/exes, and finds all the
However, if the marker symbols are non-empty (I remember them being nonempty in libgcc’s crtbegin.o, even if it would seem empty when I try to read the source right now), libunwind would see 4 unexpected zero bytes (indicating a terminator of the section) at the start of the
A different possible failure mode is if the unwinder used does expect the data to be registered, but it isn’t, or if the section contents is outside of the markers, so it doesn’t get included.
At least libunwind has got some debug logging included that should be possible to turn on, which helps for getting at least some idea of what’s going on. |
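To make the "4 unexpected zero bytes" point concrete, here is a rough sketch of walking an `.eh_frame` buffer. It assumes the usual layout (a 4-byte little-endian length, then a 4-byte ID that is zero for a CIE and nonzero for an FDE) and ignores 64-bit extended lengths and all other fields. `Record` and `walk_eh_frame` are made-up names for illustration, not libunwind API.

```rust
/// Kind of record found while walking an `.eh_frame` byte buffer.
#[derive(Debug, PartialEq)]
enum Record {
    Cie { offset: usize, len: usize },
    Fde { offset: usize, len: usize },
}

/// Reads a little-endian u32 at `pos`; the caller guarantees 4 bytes exist.
fn read_u32(data: &[u8], pos: usize) -> u32 {
    u32::from_le_bytes([data[pos], data[pos + 1], data[pos + 2], data[pos + 3]])
}

/// Walks length-prefixed records until the buffer ends or a zero length
/// (the section terminator) is read.
fn walk_eh_frame(data: &[u8]) -> Vec<Record> {
    let mut records = Vec::new();
    let mut pos = 0;
    while pos + 4 <= data.len() {
        let len = read_u32(data, pos) as usize;
        // Zero length terminates the section; 0xffffffff announces a
        // 64-bit length, which this sketch does not handle.
        if len == 0 || len == 0xffff_ffff {
            break;
        }
        let body = pos + 4;
        // A record must at least hold its ID and fit inside the buffer.
        if len < 4 || len > data.len() - body {
            break;
        }
        let id = read_u32(data, body);
        records.push(if id == 0 {
            Record::Cie { offset: pos, len }
        } else {
            Record::Fde { offset: pos, len }
        });
        pos = body + len;
    }
    records
}
```

With four zero bytes in front of otherwise valid data, the walk stops at offset 0 and reports no records at all, so the unwinder would not find info for any function in that module.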
FWIW Rust has its own CRT begin and end files: https://github.com/rust-lang/rust/tree/master/library/rtstartup and does not use those from C toolchain. |
Maybe someday we can get back to trying to figure this out... What I remember from debugging at the time was that unwind information for many functions was missing from |
Possibly coincidentally, ran into an unwinding issue trying to get rustc bootstrapped on arm64 (#13278 (comment)). Don't know if x86_64 is the odd man out and both i686 and aarch64 are plagued by some underlying unwinding issue, or if it's just a coincidence that both happened to fail in the unwinding code. |
Ok, great! So unwinding mostly works, but here, it hits a stack frame where something goes wrong. That changes what we need to look for. Which function does this correspond to, and where does |
Unfortunately I couldn't find out what is under that failing address. |
Ok. I guess I could try to sit down with it and debug what's going wrong too, some day. The ideal for me, for doing that, would be a build with debug info present, with libunwind linked as a DLL (so it's easy for me to add more debugging printouts in it - for dwarf unwinding, running it in wine doesn't really help at all, as the OS facilities aren't used for unwinding). |
I think I was able to pinpoint the binary with the problematic frame and perhaps unsurprisingly it's the binary I'm calling itself... Part of the log with my own prints to list all the modules in which libunwind searches for unwind sections:
Which might sound nice but if we compare the working and broken binaries:
The problematic one is quite big and has all kinds of different symbols that are absent from the working one: I still plan to debug it once I find some time but providing a link to the archive just in case: https://1drv.ms/u/s!AgMYIlqTF8b9gus1M-aZcFfphKAzsg?e=OeDwg4 |
I have started looking at this now, and I have almost nailed it down, but not quite. I noticed that when running this test in wine, it didn't hit the crash. ( When comparing logs from the case in wine (and/or without dynamicbase) and the crashing case, there's a Now why doesn't this happen in the To figure out why, I'd kinda want to look at how the code generation happens. Are you able to provide LLVM IR (bitcode file or textual) for the object file that contains the symbol |
I've tried to see how the dwarf code generation would intentionally generate a null pointer for the personality field, and it does look like that really shouldn't be happening. So I think the unwind info actually does have a reference to a personality function symbol there. But after linking, it's null - but still considered a symbol reference, not an absolute address, so there's a base relocation in it. I'm not quite sure how this can happen. I had a suspicion that It's possible that there's some other special symbol trickery that would provide a null In any case, the whole |
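A small sketch of the effect described above, under the assumption that the personality field holds 0 after linking and is covered by a 32-bit base relocation. `apply_base_reloc` is a hypothetical helper and the base addresses are illustrative values only.

```rust
/// Applies a 32-bit base relocation the way a PE loader does for a
/// relocated image: the stored value is adjusted by the difference
/// between the actual and the preferred load address.
fn apply_base_reloc(stored: u32, preferred_base: u32, actual_base: u32) -> u32 {
    stored.wrapping_add(actual_base.wrapping_sub(preferred_base))
}
```

If the image loads at its preferred base, a stored 0 stays 0 and reads as "no personality". If the image is moved (for example with dynamicbase), the same field becomes the load delta, which is a non-null but meaningless pointer. That would be consistent with the crash not showing up when the image is not relocated.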
Not at all, IIRC that would mean unhandled ICE.
I was planning to test this this or next weekend 😄
Yeah, I was thinking about null pointer but couldn't understand how it could become non-null without memory corruption. Now that makes sense!
Oh, I think I saw literal pointer in some std code but surely it wasn't null...
It shouldn't be hard but it will take me some time to figure out. Without any changes I can only show stage 1 build dirs (over 9 GiB unpacked) with object files (rlibs) among the other things. Inside there is https://1drv.ms/u/s!AgMYIlqTF8b9gus1M-aZcFfphKAzsg?e=rc6inR
That can be painful even after I get IR because of the reasons above.
AFAICT no Fat or Thin LTO is involved here, Rust can do its own kind of LTO before linking step but I think it's also disabled for those builds.
Shouldn't be the case as Rust disables weak symbols for Windows.
Yeah, it's lang item so the machinery around it is not exactly trivial to follow.
Ouch, I didn't see that coming. Could it be
Sure, I'll try to obtain it soon. |
Right, ok - and that's also consistent with this seeming to be a missing personality function, not intended null.
Well not really, since as we realized later, the issue is that it should be a nonnull
Thanks, I downloaded this. Unfortunately I don't quite know how to try to rerun the linking of rustdoc.exe here... I at least managed to locate the object file which contains
Ok, good, they're a bit of a mess (even though they should work fine in llvm based mingw environments).
More or less (or maybe vice versa - I think the code in
Ok, great! Are you able to check what the state is for other architecture builds (x86_64 and aarch64?), for |
This now includes all temporary files so it's ~24 GiB unpacked... Here are all arguments passed to |
Thanks! The clang arguments included a file However, linking C++ code for i686 with |
Normally, a relocation against a symbol that has been GC'd away like this would end up as a linker error: https://github.com/llvm/llvm-project/blob/llvmorg-15.0.0/lld/COFF/Chunks.cpp#L352
However due to how the DWARF sections work, it's expected to have such dangling references left out there, if things are GC'd, so it ignores that error for such sections: https://github.com/llvm/llvm-project/blob/llvmorg-15.0.0/lld/COFF/Chunks.cpp#L328-L333 (Also the error is entirely skipped for anything in mingw mode)
But now I found how this is fixed for the existing C++ cases: llvm/llvm-project@7e07683
I made a patch for LLD that should fix it similarly for this symbol: https://reviews.llvm.org/D136879 |
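As a toy model of the GC behaviour above (not LLD's actual code): the collector marks sections reachable from the roots through relocations it follows. References that only exist inside `.eh_frame` are left out of `edges` here, on the assumption that the collector does not parse that section. `live_sections` and all section names in the usage note are hypothetical.

```rust
use std::collections::{HashMap, HashSet};

/// Returns the sections kept by a mark phase starting from `roots`.
/// `edges` maps a section to the sections it references through
/// relocations that the collector actually follows.
fn live_sections<'a>(
    roots: &[&'a str],
    edges: &HashMap<&'a str, Vec<&'a str>>,
) -> HashSet<&'a str> {
    let mut live = HashSet::new();
    let mut stack: Vec<&'a str> = roots.to_vec();
    while let Some(section) = stack.pop() {
        // `insert` returns true only the first time a section is seen.
        if live.insert(section) {
            if let Some(targets) = edges.get(section) {
                stack.extend(targets.iter().copied());
            }
        }
    }
    live
}
```

With `main -> helper` as the only edge and `main` as the only root, a section named `demo_personality` that is referenced solely from unwind info is dropped, leaving a dangling reference behind. Adding it to the roots keeps it alive, which is the general shape of treating personality functions specially.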
Terrific! Trying to run it manually:
So there is still more work to do on my end. |
…y functions These need to have special treatment wrt to .eh_frame sections and GC - as long as we don't have a full parser of the .eh_frame section in the COFF linker. This fixes Rust unwind issues on i686 mingw as discussed in msys2/MINGW-packages#9091. Differential Revision: https://reviews.llvm.org/D136879
Maybe a missing or incorrect manifest? Without a manifest, the Windows compatibility heuristic might be seeing "install" in the name of the exe and assuming it needs elevation. |
I think Rust binaries have no manifest, except for host compiler built for msvc which enables long paths support. |
Huh, I thought you needed to have an |
hm, I just found a similar issue where clang32 exe is auto-elevated, while mingw32 isn't: mesonbuild/wrapdb#513 |
Probably add test stuff to https://github.com/msys2/MINGW-packages/blob/master/mingw-w64-cjson/PKGBUILD to replicate. |
|
yeah, gcc and rust in gcc envs apply our default manifest (https://packages.msys2.org/base/mingw-w64-windows-default-manifest) while that's not the case in the clang ones |
Interesting it only seems to be cropping up on i686... I thought the setup heuristic applied to x86_64 too... |
yeah, see https://learn.microsoft.com/en-us/previous-versions/aa905330(v=msdn.10)#installer-detection |
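A simplified sketch of the installer detection heuristic from the linked page, as far as it matters for this thread: it only applies to 32-bit executables without a requested execution level, and it looks at keywords in the file name. The real heuristic checks more than the name (resources, byte sequences) and also depends on UAC being enabled; the function names and the keyword list below are a reduced illustration, not an exhaustive copy.

```rust
/// True if the file name contains one of the keywords that installer
/// detection looks for (reduced list, for illustration only).
fn name_suggests_installer(exe_name: &str) -> bool {
    let lower = exe_name.to_lowercase();
    ["install", "setup", "update"]
        .iter()
        .any(|kw| lower.contains(*kw))
}

/// Rough approximation: only 32-bit executables whose manifest does not
/// request an execution level are subject to the name check.
fn may_auto_elevate(exe_name: &str, is_32_bit: bool, manifest_has_execution_level: bool) -> bool {
    is_32_bit && !manifest_has_execution_level && name_suggests_installer(exe_name)
}
```

Under this model a 32-bit `install-tool.exe` (a made-up name) without a manifest would be flagged, while the same binary built for x86_64, or built with a default manifest that requests an execution level, would not. That would explain why only the i686 clang build is affected.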
Oddly,
https://github.com/rust-lang/stacker/blob/master/psm/src/arch/x86_windows_gnu.s#L63-L69
What's up with that @20? Maybe that's the cause?