poky-patched cargo segfaults on cargo build --bin helloworld #128492

Open
Yashinde145 opened this issue Aug 1, 2024 · 18 comments
Labels
A-incr-comp Area: Incremental compilation C-gub Category: the reverse of a compiler bug is generally UB E-needs-mcve Call for participation: This issue has a repro, but needs a Minimal Complete and Verifiable Example I-crash Issue: The compiler crashes (SIGSEGV, SIGABRT, etc). Use I-ICE instead when the compiler panics. I-ICE Issue: The compiler panicked, giving an Internal Compilation Error (ICE) ❄️ T-compiler Relevant to the compiler team, which will review and decide on the PR/issue.

Comments

@Yashinde145

cargo build for a simple hello world program segfaults when run in an SDK environment built from poky sources.
This was first observed in rustc v1.78 and continued in v1.79 and v1.80 also.
(Note: There's no change in the process of sdk build env when tested between the versions).

rustc --version --verbose:

rustc 1.79.0 (129f3b996 2024-06-10) (built from a source tarball)
binary: rustc
commit-hash: 129f3b9964af4d4a709d1383930ade12dfe7c081
commit-date: 2024-06-10
host: x86_64-pokysdk-linux-gnu
release: 1.79.0
LLVM version: 18.1.7

Error output

error: the compiler unexpectedly panicked. this is a bug.

note: we would appreciate a bug report: https://github.com/rust-lang/rust/issues/new?labels=C-bug%2C+I-ICE%2C+T-compiler&template=ice.md

note: rustc 1.79.0 (129f3b996 2024-06-10) (built from a source tarball) running on x86_64-pokysdk-linux-gnu

note: compiler flags: --crate-type bin -C embed-bitcode=no -C debuginfo=2 -C incremental=[REDACTED]

note: some of the compiler flags provided by cargo are hidden

query stack during panic:
end of query stack
error: rustc interrupted by SIGSEGV, printing backtrace

The backtrace generated is the same with "RUST_BACKTRACE=1" and "RUST_BACKTRACE=full".
I suspect the following out-of-bounds index access is the main reason for the segfault here.

at compiler/rustc_metadata/src/creader.rs:193:31:
index out of bounds: the len is 20 but the index is 60747757

fbc9b94 and 0025c9c are the recent commits related to this.
Maybe @oli-obk can help to understand the error better?

Backtrace

{"$message_type":"artifact","artifact":"/home/poky/build/tmp/work/qemux86_64-poky-linux/core-image-sato/1.0/testimage-sdk/hello/target/debug/build/hello-69a92b98b70371ba/build_script_build-69a92b98b70371ba.d","emit":"dep-info"}
thread 'rustc' panicked at compiler/rustc_metadata/src/creader.rs:193:31:
index out of bounds: the len is 20 but the index is 60747757
stack backtrace:
   0:     0x7fa9a2bcf31f - <std::sys_common::backtrace::_print::DisplayBacktrace as core::fmt::Display>::fmt::hdd8826f6b9d3bb6e
   1:     0x7fa9a2c0246b - core::fmt::write::hce77028645369722
   2:     0x7fa9a2bca2ce - <unknown>
   3:     0x7fa9a2bcf0ee - <unknown>
   4:     0x7fa9a2bb889a - <unknown>
   5:     0x7fa9a2bb8594 - std::panicking::default_hook::h49af0c7febe67f8d
   6:     0x7fa9a37e7187 - <unknown>
   7:     0x7fa9a2bb90e9 - std::panicking::rust_panic_with_hook::h14a0ca211eb21fbf
   8:     0x7fa9a2bcf6e2 - <unknown>
   9:     0x7fa9a2bcf529 - <unknown>
  10:     0x7fa9a2bb8cc6 - rust_begin_unwind
  11:     0x7fa9a2b71422 - core::panicking::panic_fmt::hb4b7de66d883fcc4
  12:     0x7fa9a2b715f6 - core::panicking::panic_bounds_check::h4eecb12f9bb341c4
  13:     0x7fa9a83c8ffd - <rustc_metadata[acbd193db608c71e]::creader::CStore as rustc_session[2c1f89e6504c4cfe]::cstore::CrateStore>::stable_crate_id
  14:     0x7fa9a89931e9 - <unknown>
  15:     0x7fa9a89978ed - <rustc_middle[fd6644fcde453e4f]::query::on_disk_cache::OnDiskCache>::serialize
  16:     0x7fa9a8891541 - <rustc_middle[fd6644fcde453e4f]::ty::context::TyCtxt>::serialize_query_result_cache
  17:     0x7fa9a826b2ea - <unknown>
  18:     0x7fa9a82640cc - <unknown>
  19:     0x7fa9a826b5c0 - <unknown>
  20:     0x7fa9a827c0f1 - <unknown>
  21:     0x7fa9a822acd9 - rustc_incremental[7e9745f68514030c]::persist::save::save_dep_graph
  22:     0x7fa9a37f6045 - <unknown>
  23:     0x7fa9a37a30bd - <unknown>
  24:     0x7fa9a37c7ba7 - <unknown>
  25:     0x7fa9a37c968d - <unknown>
  26:     0x7fa9a2bbc73b - <unknown>
  27:     0x7fa9a29c9b62 - <unknown>
  28:     0x7fa9a2a4463c - <unknown>
  29:                0x0 - <unknown>

error: the compiler unexpectedly panicked. this is a bug.

note: we would appreciate a bug report: https://github.com/rust-lang/rust/issues/new?labels=C-bug%2C+I-ICE%2C+T-compiler&template=ice.md

note: rustc 1.79.0 (129f3b996 2024-06-10) (built from a source tarball) running on x86_64-pokysdk-linux-gnu

note: compiler flags: --crate-type bin -C embed-bitcode=no -C debuginfo=2 -C incremental=[REDACTED]

note: some of the compiler flags provided by cargo are hidden

query stack during panic:
end of query stack
error: rustc interrupted by SIGSEGV, printing backtrace
@Yashinde145 Yashinde145 added C-bug Category: This is a bug. I-ICE Issue: The compiler panicked, giving an Internal Compilation Error (ICE) ❄️ T-compiler Relevant to the compiler team, which will review and decide on the PR/issue. labels Aug 1, 2024
@rustbot rustbot added the needs-triage This issue may need triage. Remove it if it has been sufficiently triaged. label Aug 1, 2024
@oli-obk
Contributor

oli-obk commented Aug 1, 2024

sdk environment using poky sources.

I don't know what this means. How can we reproduce what you are doing? Can you maybe create a repository that reproduces your issue?

@oli-obk oli-obk added E-needs-mcve Call for participation: This issue has a repro, but needs a Minimal Complete and Verifiable Example and removed needs-triage This issue may need triage. Remove it if it has been sufficiently triaged. labels Aug 1, 2024
@Yashinde145
Author

Sure, I will create a repo and share the link here in some time.

@matthiaskrgr
Member

looks like incr comp bug? :/

@oli-obk oli-obk added the A-incr-comp Area: Incremental compilation label Aug 1, 2024
@Yashinde145
Author

Here is the link for the repo, https://github.com/Yashinde145/Cargo_seg_fault_1_79
You can fork it and follow the steps below.

Steps to reproduce the seg fault:

  1. $ cd Cargo_seg_fault_1_79
  2. $ source oe-init-build-env (a new build directory is generated and the working directory becomes /home/Cargo_seg_fault_1_79/build)
  3. Add the following lines (config changes) to the build/conf/local.conf file:
TOOLCHAIN_TARGET_TASK = "cargo rust"
TOOLCHAIN_HOST_TASK:append = " packagegroup-rust-cross-canadian-${MACHINE}"
TOOLCHAIN_TARGET_TASK:append = " libstd-rs"
IMAGE_CLASSES += "testimage testsdk"
TESTIMAGE_AUTO:qemuall = "1"
SANITY_TESTED_DISTROS=""
  4. $ bitbake core-image-sato -c do_testsdk (the build may take around 30-40 minutes)

The following error logs will be seen:

NOTE: test_cargo_build (rust.RustCompileTest)
NOTE:  ... ERROR
Traceback (most recent call last):
  File "/home/poky/meta/lib/oeqa/sdk/cases/rust.py", line 34, in test_cargo_build
    self._run('cd %s/hello; cargo build' % self.tc.sdk_dir)
  File "/home/poky/meta/lib/oeqa/sdk/case.py", line 15, in _run
    return subprocess.check_output(". %s > /dev/null; %s;" % \
  File "/usr/lib/python3.10/subprocess.py", line 421, in check_output
    return run(*popenargs, stdout=PIPE, timeout=timeout, check=True,
  File "/usr/lib/python3.10/subprocess.py", line 526, in run
    raise CalledProcessError(retcode, process.args,
oeqa.utils.subprocesstweak.OETestCalledProcessError
// Above is the python scripts backtrace running the sdk test scripts


Command: . /home/poky/build/tmp/work/qemux86_64-poky-linux/core-image-sato/1.0/testimage-sdk/environment-setup-core2-64-poky-linux > /dev/null; 
// this cmd invokes the sdk env

cd /home/poky/build/tmp/work/qemux86_64-poky-linux/core-image-sato/1.0/testimage-sdk//hello; 
// hello world rust src

cargo build --target x86_64-pokysdk-linux-gnu;' returned non-zero exit status 101
// cargo build cmd

After this, the backtrace mentioned above is seen.
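
The three commands in the failing test step can also be rerun by hand, which avoids a full bitbake cycle while collecting logs. A minimal sketch, assuming the testimage-sdk directory is still present after the failed task; the function name `run_sdk_cargo_build` and the default log file name are hypothetical:

```bash
# Re-runs the failing SDK test step by hand and keeps the full compiler output.
# Usage: run_sdk_cargo_build <sdk_dir> [log_file]
run_sdk_cargo_build() {
  local sdk_dir="$1" log_file="${2:-cargo-build.log}"
  (
    # Same three steps as the oeqa test: load the SDK environment,
    # enter the hello-world crate, build for the SDK host triple.
    . "$sdk_dir/environment-setup-core2-64-poky-linux" > /dev/null &&
    cd "$sdk_dir/hello" &&
    RUST_BACKTRACE=full cargo build --target x86_64-pokysdk-linux-gnu
  ) > "$log_file" 2>&1
}
```

The subshell keeps the SDK environment from leaking into the calling shell, and the function returns cargo's exit status (101 in the report above), so it can be used in a loop while bisecting patches.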

@workingjubilee
Member

All of the changes in the yocto patches seem like they are designed to make it more likely that an incremental compilation failure occurs by

  • removing build hashes
  • removing layout verification
  • removing target checks

This does not mean "yocto did it". But it is harder to see what did.

@Yashinde145 Your problem would be significantly easier to diagnose if this build system at least guaranteed sufficient symbol tables and/or frame-pointers in the build dependencies so that we get back a full backtrace to work with. All I can guess from here is that you should set RUSTFLAGS=-Cincremental=false forever.
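
To put a number on how much symbol information is missing, the frames of a saved backtrace can be tallied. A small sketch, assuming the frame format shown in the report (index, address, then either a symbol or `<unknown>`); the function name `backtrace_symbol_stats` is hypothetical:

```bash
# Reads a rustc backtrace on stdin and reports how many frames were symbolized.
backtrace_symbol_stats() {
  awk '
    # Frame lines look like: "  13:     0x7fa9a83c8ffd - <symbol or <unknown>>"
    /^ *[0-9]+: +0x[0-9a-f]+ - / {
      if ($0 ~ /- <unknown> *$/) unknown++; else resolved++
    }
    END { printf "resolved=%d unknown=%d\n", resolved, unknown }
  '
}
```

Running it before and after a change to the SDK's debug-symbol or frame-pointer settings (for example `backtrace_symbol_stats < cargo-build.log`) gives a quick check of whether the change helped.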

@Noratrieb
Member

https://github.com/Yashinde145/Cargo_seg_fault_1_79/blob/c6956f2339e5b30d8ae00a23377e5ed7117f47e9/meta/recipes-devtools/rust/files/0001-cargo-do-not-write-host-information-into-compilation.patch from a cursory look at your patches, this one seems really suspicious. it doesn't just remove the host, it also removes the rustc version. so maybe you're getting overlapping build directories for different rustc versions somewhere, which is guaranteed to cause crashes?
not sure if that's what's happening, but it might be. doesn't even have to be caused by incremental, could just be a dependency rlib.
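
One way to rule out the overlapping-build-directory theory is to record which compiler last wrote to a target directory and wipe the directory when the compiler changes. A rough sketch, assuming a stamp file named `.rustc-version-stamp` (a made-up name, not something cargo or rustc creates) and the hypothetical helpers `rustc_stamp_matches` and `refresh_target_dir`:

```bash
# Hypothetical stamp file; cargo and rustc do not create or read it.
STAMP_NAME=".rustc-version-stamp"

# Succeeds if DIR was last stamped by the rustc currently on PATH.
rustc_stamp_matches() {
  local dir="$1"
  [ -f "$dir/$STAMP_NAME" ] &&
    [ "$(cat "$dir/$STAMP_NAME")" = "$(rustc --version --verbose)" ]
}

# Empties DIR when the stamp is missing or stale, then records the current compiler.
refresh_target_dir() {
  local dir="$1"
  [ -n "$dir" ] || return 2   # refuse to run rm -rf on an empty path
  if ! rustc_stamp_matches "$dir"; then
    rm -rf -- "$dir"
    mkdir -p -- "$dir"
    rustc --version --verbose > "$dir/$STAMP_NAME"
  fi
}
```

Calling `refresh_target_dir hello/target` before each `cargo build` means stale rlibs or incremental caches from another rustc never get read. If the crash persists with a freshly wiped directory, this theory can be set aside.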

@Noratrieb
Member

It would be useful to know if it still crashes without these patches.

@workingjubilee
Member

Does this actually need to be core-image-sato to reproduce?

Why isn't core-image-minimal sufficient?

It would be nice if this reproducer was turnkey. As in one command. Not "source this, write this conf file, then run this build command, and still have no idea how to inject environment variables into a build system that is deliberately trying to be impervious to external settings, without reading a treatise about 'layers' and how much I would absolutely love them if I was building a custom distribution for SBCs."

@Yashinde145
Author

All of the changes in the yocto patches seem like they are designed to make it more likely that an incremental compilation failure occurs by

  • removing build hashes
  • removing layout verification
  • removing target checks

This does not mean "yocto did it". But it is harder to see what did.

The commit Yashinde145/Cargo_seg_fault_1_79@c6956f2 carries the incremental Rust version updates from v1.75 to v1.79, which include:

  • removing build hashes and layout verification: these were removed because Rust builds were not reproducible (the reproducibility test checks whether the same host and build configuration yield identical binaries when built in different build directories).

  • removing target checks: some of the Rust tests were unsupported or failed in yocto's oe-selftest (yocto's test framework for testing toolchains and packages), so they were skipped/excluded.

@Yashinde145 Your problem would be significantly easier to diagnose if this build system at least guaranteed sufficient symbol tables and/or frame-pointers in the build dependencies so that we get back a full backtrace to work with. All I can guess from here is that you should set RUSTFLAGS=-Cincremental=false forever.

Agreed. Even RUST_BACKTRACE=full didn't give a detailed backtrace here. I will check if there's any way to get symbol tables/frame pointers in the build system.

I guess RUSTFLAGS=-Cincremental is used here to speed up the build process and compilation time. I will check again by disabling it.
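
A minimal sketch of turning incremental compilation off for a single cargo invocation. It relies on Cargo's documented `CARGO_INCREMENTAL` environment variable rather than the `RUSTFLAGS` spelling quoted above, on the understanding that rustc's `-C incremental` expects a directory path rather than a boolean; the wrapper name `cargo_no_incremental` is hypothetical:

```bash
# Hypothetical wrapper: runs cargo with incremental compilation forced off
# for this one invocation via Cargo's CARGO_INCREMENTAL environment variable.
cargo_no_incremental() {
  CARGO_INCREMENTAL=0 cargo "$@"
}
```

For example, `cargo_no_incremental build --target x86_64-pokysdk-linux-gnu` inside the SDK environment. If cargo honours the setting, the `-C incremental=...` entry should no longer appear among the compiler flags in the ICE note.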

@Yashinde145
Author

Does this actually need to be core-image-sato to reproduce?
Why isn't core-image-minimal sufficient?

The issue is seen when running do_testsdk task which is provided by core-image-sato.

It would be nice if this reproducer was turnkey. As in one command. Not "source this, write this conf file, then run this build command, and still have no idea how to inject environment variables into a build system that is deliberately trying to be impervious to external settings, without reading a treatise about 'layers' and how much I would absolutely love them if I was building a custom distribution for SBCs."

I will try to share the reproducer with a bash script in a while.

@workingjubilee
Member

@Yashinde145 making binaries more "reproducible" by making them look more similar to each other, when those differences are often what rustc and cargo use to prevent compiling and linking code that is not ABI-compatible, is dangerous.

And this patch unconditionally disables a check instead of unconditionally enabling it. Which means rustc no longer checks if it can produce correct code, ever: yoctoproject/poky@c08c522

Do you see the problem with yocto repeatedly simply disabling safety checks in the name of "reproducibility", and then filing bugs against rustc? Why is yocto waiting until they've busted the compiler's output instead of asking how the compiler can be patched to make something more reproducible?

@Yashinde145
Author

https://github.com/Yashinde145/Cargo_seg_fault_1_79/blob/c6956f2339e5b30d8ae00a23377e5ed7117f47e9/meta/recipes-devtools/rust/files/0001-cargo-do-not-write-host-information-into-compilation.patch from a cursory look at your patches, this one seems really suspicious. it doesn't just remove the host, it also removes the rustc version. so maybe you're getting overlapping build directories for different rustc versions somewhere, which is guaranteed to cause crashes? not sure if that's what's happening, but it might be. doesn't even have to be caused by incremental, could just be a dependency rlib.

It would be useful to know if it still crashes without these patches.

Yes, I will check by reverting the 0001-cargo-do-not-write-host-information-into-compilation.patch and applying the actual fix rust-lang/cargo#14107 for the problem.

@Yashinde145
Author

@Yashinde145 making binaries more "reproducible" by making them look more similar to each other, when those differences are often what rustc and cargo use to prevent compiling and linking code that is not ABI-compatible, is dangerous.

And this patch unconditionally disables a check instead of unconditionally enabling it. Which means rustc no longer checks if it can produce correct code, ever: yoctoproject/poky@c08c522

Do you see the problem with yocto repeatedly simply disabling safety checks in the name of "reproducibility", and then filing bugs against rustc? Why is yocto waiting until they've busted the compiler's output instead of asking how the compiler can be patched to make something more reproducible?

I am not sure what the actual cause is here.
Let me cross-check with everyone's input here and then the picture might be clearer.

@Yashinde145
Author

Reproducer script:
repro.txt

#!/bin/bash

# Exit immediately if a command exits with a non-zero status
set -e

# Clone the repository
git clone https://github.com/Yashinde145/Cargo_seg_fault_1_79
cd Cargo_seg_fault_1_79

# Source the OE build environment
source oe-init-build-env

# Define the local.conf file path
LOCAL_CONF_FILE="conf/local.conf"

# Append necessary configurations to local.conf
echo 'TOOLCHAIN_TARGET_TASK = "cargo rust"' >> $LOCAL_CONF_FILE
echo 'TOOLCHAIN_HOST_TASK:append = " packagegroup-rust-cross-canadian-${MACHINE}"' >> $LOCAL_CONF_FILE
echo 'TOOLCHAIN_TARGET_TASK:append = " libstd-rs"' >> $LOCAL_CONF_FILE
echo 'IMAGE_CLASSES += "testimage testsdk"' >> $LOCAL_CONF_FILE
echo 'TESTIMAGE_AUTO:qemuall = "1"' >> $LOCAL_CONF_FILE
echo 'SANITY_TESTED_DISTROS=""' >> $LOCAL_CONF_FILE

# Run the bitbake command
bitbake core-image-sato -c do_testsdk

@bjorn3
Member

bjorn3 commented Aug 3, 2024

Does this reproduce with the latest rustc nightly and all obsolete patches dropped? Half the patches seem to be obsolete and for some of them the original patch is not entirely correct I think.

@workingjubilee
Member

I updated the script to be repeatable from the root of the repository:

#!/bin/bash

# Exit immediately if a command exits with a non-zero status
set -e

# Source the OE build environment
source oe-init-build-env

# Define the local.conf file path
LOCAL_CONF_FILE="conf/local.conf"

# Append necessary configurations to local.conf unless a previous run already did.
# grep is tested directly in the `if` so that a non-match does not abort the script under `set -e`.
if grep -q -e 'TOOLCHAIN' "$LOCAL_CONF_FILE"; then
  echo "Already set up conf file..."
else
  echo 'TOOLCHAIN_TARGET_TASK = "cargo rust"' >> $LOCAL_CONF_FILE
  echo 'TOOLCHAIN_HOST_TASK:append = " packagegroup-rust-cross-canadian-${MACHINE}"' >> $LOCAL_CONF_FILE
  echo 'TOOLCHAIN_TARGET_TASK:append = " libstd-rs"' >> $LOCAL_CONF_FILE
  echo 'IMAGE_CLASSES += "testimage testsdk"' >> $LOCAL_CONF_FILE
  echo 'TESTIMAGE_AUTO:qemuall = "1"' >> $LOCAL_CONF_FILE
  echo 'SANITY_TESTED_DISTROS=""' >> $LOCAL_CONF_FILE
fi

# Run the bitbake command
bitbake core-image-sato -c do_testsdk

I then removed a number of patches, including the "hardcodepaths.patch" file (in reality, it now only disables checking Rust's data layouts against LLVM's data layouts), and got this during the build:

| error: data-layout for target x86_64-poky-linux-gnu, e-m:e-i64:64-f80:128-n8:16:32:64-S128, differs from LLVM target's x86_64-poky-linux-gnu default layout, e-m:e-p270:32:32-p271:32:32-p272:64:64-i64:64-i128:128-f80:128-n8:16:32:64-S128

@Yashinde145 Please remove the hardcodepaths.patch file and commit the necessary fixes to correctly build the poky targets. The version as-of your update commit does not remove any paths embedded into the binary anymore, it simply disables a safety check for codegen. Rust has quite enough codegen-correctness problems, against what are often unclearly-documented-at-best ABIs, without your patches introducing more.

I do not think we can help you further with your problems until that is done.
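
The two layout strings in the error above are long enough that the difference is easy to miss by eye. A small sketch that splits both on `-` and prints the components found in only one of them; the function name `layout_diff` is hypothetical, and the comparison is purely textual, so it does not interpret the components:

```bash
# Prints components present in only one of two '-'-separated data-layout strings.
# Lines starting with '<' exist only in the first string, '>' only in the second.
layout_diff() {
  local a="$1" b="$2" part
  local IFS='-'   # split unquoted expansions on '-' inside this function only
  for part in $a; do
    case "-$b-" in *"-$part-"*) ;; *) echo "< $part" ;; esac
  done
  for part in $b; do
    case "-$a-" in *"-$part-"*) ;; *) echo "> $part" ;; esac
  done
}
```

Passing the Rust-side string first and the LLVM default second shows that the LLVM layout carries three pointer address-space entries (`p270`, `p271`, `p272`) and an `i128:128` entry that the patched target's string lacks, which is the gap the target definition would need to close.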

@workingjubilee workingjubilee added C-gub Category: the reverse of a compiler bug is generally UB and removed C-bug Category: This is a bug. labels Aug 4, 2024
@workingjubilee workingjubilee changed the title from "Cargo build gives segmentation fault in sdk build env" to "poky-patched cargo segfaults on cargo build --bin helloworld" Aug 4, 2024
@workingjubilee workingjubilee added the I-crash Issue: The compiler crashes (SIGSEGV, SIGABRT, etc). Use I-ICE instead when the compiler panics. label Aug 4, 2024
@Yashinde145
Author

Update for:

Yes, I will check by reverting the 0001-cargo-do-not-write-host-information-into-compilation.patch and applying the actual fix rust-lang/cargo#14107 for the problem.

and

I guess RUSTFLAGS=-Cincremental is used here to speed up the build process and compilation time. I will check again by disabling it.

I made both changes and still get the same error.

@Yashinde145
Author

@workingjubilee,
Thanks for the script updates.

@Yashinde145 Please remove the hardcodepaths.patch file and commit the necessary fixes to correctly build the poky targets.

I will commit the changes as you suggested, but it may take some time to fix the data-layout difference error.

I will update here once I have the changes ready.
