debuginfo/pretty-std-collections.rs test sometimes fails on macOS #78665
Oh, I just found the related Zulip topic: https://rust-lang.zulipchat.com/#narrow/stream/242791-t-infra/topic/apple-x86_64.20gha.20checks
Try running lldb_batchmode.py with PYTHONUNBUFFERED
When reporting fatal errors, LLVM calls abort() to exit the program. There is a chance that this interferes with Python printing to stdout, which by default relies on buffering to increase performance. This commit tries to disable Python buffering, to hopefully get useful logs while debugging rust-lang#78665.
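The buffering concern in the commit message above can be illustrated with a minimal sketch. It is not taken from lldb_batchmode.py: the child program and the `run_child` helper are hypothetical, and `os.abort()` merely stands in for LLVM calling abort(). The child prints one line and then aborts; with stdout redirected to a pipe, the line normally sits in Python's buffer and is lost, while `PYTHONUNBUFFERED=1` writes it out immediately.

```python
import os
import subprocess
import sys

# Hypothetical child program: print one line, then die the way abort() would.
CHILD = 'import os; print("before abort"); os.abort()'


def run_child(unbuffered: bool) -> str:
    """Run the child with stdout piped and return whatever it managed to write."""
    env = dict(os.environ)
    # Start from a known state so the comparison is fair.
    env.pop("PYTHONUNBUFFERED", None)
    if unbuffered:
        env["PYTHONUNBUFFERED"] = "1"
    result = subprocess.run(
        [sys.executable, "-c", CHILD],
        env=env,
        stdout=subprocess.PIPE,
        stderr=subprocess.DEVNULL,
        text=True,
    )
    return result.stdout


if __name__ == "__main__":
    print("buffered:  ", repr(run_child(unbuffered=False)))
    print("unbuffered:", repr(run_child(unbuffered=True)))
```

If the lost-output theory applies, the buffered run shows an empty string and the unbuffered run shows the printed line. This only demonstrates the general mechanism; whether it explains the missing logs in this issue is a separate question.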
Please ping me if another failure happens.
This just happened in #78448.
Show more error information in lldb_batchmode
Even more information to try and debug rust-lang#78665.
Another instance: https://github.com/rust-lang-ci/rust/runs/1356979047
Copying the full log from a recent failure:
The The error I've tried running the test in a loop locally, but have been unable to get it to fail. It looks like it is failing while trying to print out
Another instance: #78804 (comment)
I'm not sure if anyone else is looking into this issue. I've tried many experiments with little success. I've never been able to reproduce locally, but I can pretty reliably reproduce on GitHub Actions. The only change that seems to reliably avoid the error is to remove
If you search for the error messages on Google, you can find many other people running into similar situations. Most of them seem to be related to somehow dynamically linking the wrong LLVM. I'm not sure how that fits this situation. If lldb is somehow linking the wrong LLVM, I'm also unsure why it doesn't fail 100% of the time. I'm also uncertain exactly how the error originates (does lldb launch clang somehow?).
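Hunting an intermittent failure like this usually means rerunning the same command until it breaks. A minimal sketch of such a loop follows; `run_until_failure` is a hypothetical helper, and the command is passed in by the caller because the exact test invocation is not given in this thread.

```python
import subprocess
import sys
from typing import Optional, Sequence


def run_until_failure(cmd: Sequence[str], max_runs: int) -> Optional[int]:
    """Run cmd up to max_runs times.

    Returns the 1-based index of the first failing run, or None if every
    run exited with status 0.
    """
    for attempt in range(1, max_runs + 1):
        result = subprocess.run(
            cmd,
            stdout=subprocess.PIPE,
            stderr=subprocess.STDOUT,  # keep stdout and stderr interleaved
            text=True,
        )
        if result.returncode != 0:
            # Dump the combined output of the failing run for inspection.
            print(f"run {attempt} failed with exit code {result.returncode}")
            print(result.stdout)
            return attempt
    return None


if __name__ == "__main__":
    # Placeholder command; substitute the real test invocation.
    print(run_until_failure([sys.executable, "-c", "pass"], max_runs=5))
```

Capturing stderr together with stdout matters here, since the interesting diagnostics may only appear on the failing run.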
I should get an actual macOS device early next week, which will let me investigate this more productively.
Failed in #78590. |
Minor update of other clues I've discovered:
@ortem as the author of the lldb_providers, do you have much experience with debugging lldb issues? Do you think you could provide any insight or help here? Just to summarize, there is a test (
This happens when calling
Failed in #78631.
Are you sure that is the problem, other than that it is the only one of the 3 errors displayed that has not yet been debunked here? In my experience (with gdb, not lldb!) stderr is suppressed if the test succeeds, so you need to explicitly fail the test to see whether this error is also normal on that platform. What is displayed in the report is the contents of a file that, earlier in the test execution, contains other messages, at least the rustc warnings about the compiled test case code. I had the impression there were 5 stages, each of which might be overwriting useful diagnostics from an earlier stage.
@ehuss Looks like the problem occurs only when using LLDB without native Rust support, isn't it? If so, it may be helpful to disable this specific test from Regarding removing
Thanks for the tip on the logger, I was assuming setting
I'm concerned about just disabling the test, as it feels like sweeping the problem under the rug without understanding what is wrong. It does seem likely it is an issue with lldb, but I think it would be good to at least have some vague understanding of what is wrong, and why it only affects this test.
Sorry, I'm not too familiar with lldb or the Rust integration. What do you mean by "LLDB without native Rust support"? What little information I can find at https://rustc-dev-guide.rust-lang.org/debugging-support-in-rustc.html#lldb seems to point to a repository that is archived and hasn't been updated in 3 years. AFAIK, building a custom lldb was removed over a year ago (#62592). I can see a small amount of Rust support in upstream LLDB, but it seems to be just a stub. It is a bit difficult to compare, but the lldb in rust-lang/llvm-project is a fair bit different from apple/llvm-project.
With logging, the last line printed is consistently just before this line. Looking through the LLDB source, if I'm reading it correctly, this triggers creating a TypeSystem which I believe for Rust is TypeSystemClang. Unfortunately that is a pretty large amount of code.
@shepmaster added a new data point that they were able to reproduce on the Apple DTK (
I'm going to be away for a week or two, so I won't be able to look at this anymore. I would recommend that if nobody wants to investigate this further, someone should add
This also reproduced locally on a practically clean macOS 10.15 installation by running
The test file already has ignore-windows, ignore-freebsd, ignore-android... I think an ignore-macos won't be much of an issue.
→ #79094
You're right, seems like "LLDB with native Rust support" (which is basically LLDB with Rust support patches) is not used in
BTW, probably @tromey could tell something about
@est31 There are really lots of ignored debugging tests, so I agree with you that disabling these tests on macOS won't make it much worse.
Yes, but maybe not in the way you may be thinking. In LLDB, each language provides its own code to analyze DWARF. So, the Rust-specific DWARF analysis handles this case without issuing an error. See
Add //ignore-macos to pretty-std-collections.rs
On macOS the test is flaky: on CI it sometimes fails and sometimes succeeds. This is no fix for the underlying issue, but I feel the workaround is worth it as the issue makes it harder to get things merged into master. cc rust-lang#78665
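To make the effect of an `ignore-<os>` header concrete, here is a rough sketch of how a harness could read such comments and skip a test for one target. This is not compiletest's implementation (which is written in Rust). The regular expression, the function names, and the choice to accept the comment with or without a space after `//` are all assumptions made for illustration.

```python
import re
from typing import Set

# Assumed directive shape: a `//` comment starting with `ignore-<name>`,
# where <name> is a single lowercase word such as `macos`.
DIRECTIVE = re.compile(r"^\s*//\s*ignore-([a-z0-9]+)(?:\s|$)")

# Hypothetical test file header, modelled on the directives named in this thread.
EXAMPLE = """\
// ignore-windows
// ignore-freebsd
// ignore-android
//ignore-macos
fn main() {}
"""


def ignored_targets(source: str) -> Set[str]:
    """Collect every target named by an ignore directive in the source text."""
    targets = set()
    for line in source.splitlines():
        match = DIRECTIVE.match(line)
        if match:
            targets.add(match.group(1))
    return targets


def should_run(source: str, target_os: str) -> bool:
    """A test runs on target_os unless a directive names that target."""
    return target_os not in ignored_targets(source)


if __name__ == "__main__":
    print(sorted(ignored_targets(EXAMPLE)))
    print(should_run(EXAMPLE, "macos"), should_run(EXAMPLE, "linux"))
```

Note that a directive of this kind skips the whole file on the named target, so any debugger coverage that only ran on that target is lost along with the flaky failure.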
@ehuss my PR only ignored the test on macOS. It should still be tested on Linux, no?
@ehuss oh that's unfortunate. It was definitely not my intent when doing that. The gdb tests should still be run. At least when I change something in the gdb related text on my local Linux machine I get a test failure (otherwise a success). For some reason I don't get such a reaction when altering the lldb related text though.
(To make it clear, the premise of #79094 was that it does still run all the tests on Linux, but apparently the lldb ones were only run on macOS... I installed lldb locally on my Linux machine and it's broken... maybe the solution is really to make the Linux box do
CI on macOS sometimes fails with this error:
You can find the full log here: https://github.com/rust-lang-ci/rust/runs/1340649828
The failure occurs on #78501, #78489, and #78661 (at least).