[AMDGPU][OpenMP] OpenMC debug build process crashes #64163

Closed
yashssh opened this issue Jul 27, 2023 · 9 comments

@yashssh
Contributor

yashssh commented Jul 27, 2023

As the title says, I can't build the OpenMC app with the -DCMAKE_BUILD_TYPE=Debug or -DCMAKE_BUILD_TYPE=RelWithDebInfo CMake options mentioned here. -DCMAKE_BUILD_TYPE=Release builds fine.

I'm using the build scripts from https://github.com/jtramm/openmc_offloading_builder for the MI100 target. I don't know which compiler commit introduced this behavior; I'm trying to build with the latest compiler.

Attaching the stack trace of the crash:

[ 66%] Linking CXX shared library lib/libopenmc.so                                                                                                                                                                                                                   
/usr/local/lib/python3.8/dist-packages/cmake/data/bin/cmake -E cmake_link_script CMakeFiles/libopenmc.dir/link.txt --verbose=1                                                                                                                                       
/long_pathname_so_that_rpms_can_package_the_debug_info/src/extlibs/openmc_offloading_builder/llvm-install/bin/clang++ -fPIC -fopenmp-targets=amdgcn-amd-amdhsa -Xopenmp-target=amdgcn-amd-amdhsa -march=gfx908  -fopenmp -fopenmp-cuda-mode -Dgsl_CONFIG_CONTRACT_CHE
CKING_OFF -Wno-tautological-constant-compare -Wno-openmp-mapping -g -shared -Wl,-soname,libopenmc.so -o lib/libopenmc.so CMakeFiles/libopenmc.dir/Unity/unity_0_cxx.cxx.o  -Wl,-rpath,/usr/lib/x86_64-linux-gnu/hdf5/serial::::::::::::::::::::::::::::::::::::::::::
:::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::: /usr/lib/x86_64-linux-gnu/hdf5/serial/libhdf5.so /usr/lib/x86_64-linux-gnu/libpthread.so /usr/lib/x86_64-linux-gnu/libsz.so /usr/lib/x86_64-linux-gnu/libz.so /usr/lib/x86_64-linux-gnu/libdl.
so -lm /usr/lib/x86_64-linux-gnu/hdf5/serial/libhdf5_hl.so lib/libpugixml.a /usr/local/lib/libfmt.a                                                                                                                                                                  
clang-linker-wrapper: /long_pathname_so_that_rpms_can_package_the_debug_info/src/extlibs/openmc_offloading_builder/llvm-project/llvm/lib/Target/AMDGPU/SIOptimizeVGPRLiveRange.cpp:532: void (anonymous namespace)::SIOptimizeVGPRLiveRange::optimizeLiveRange(Register, MachineBasicBlock *, MachineBasicBlock *, MachineBasicBlock *, SmallSetVector<MachineBasicBlock *, 16> &) const: Assertion `!O.readsReg()' failed.
PLEASE submit a bug report to https://github.com/llvm/llvm-project/issues/ and include the crash backtrace.                                                                                                                                                          
Stack dump:                                                                                                                                                                                                                                                          
0.      Program arguments: /long_pathname_so_that_rpms_can_package_the_debug_info/src/extlibs/openmc_offloading_builder/llvm-install/bin/clang-linker-wrapper --host-triple=x86_64-unknown-linux-gnu --device-debug --linker-path=/usr/bin/ld -- -z relro --hash-styl
e=gnu --eh-frame-hdr -m elf_x86_64 -shared -o lib/libopenmc.so /lib/x86_64-linux-gnu/crti.o /usr/lib/gcc/x86_64-linux-gnu/9/crtbeginS.o -L/long_pathname_so_that_rpms_can_package_the_debug_info/src/extlibs/openmc_offloading_builder/llvm-install/bin/../lib/x86_64
-unknown-linux-gnu -L/long_pathname_so_that_rpms_can_package_the_debug_info/src/extlibs/openmc_offloading_builder/llvm-install/lib/clang/17/lib/x86_64-unknown-linux-gnu -L/usr/lib/gcc/x86_64-linux-gnu/9 -L/usr/lib/gcc/x86_64-linux-gnu/9/../../../../lib64 -L/lib
/x86_64-linux-gnu -L/lib/../lib64 -L/usr/lib/x86_64-linux-gnu -L/usr/lib/../lib64 -L/lib -L/usr/lib -L/long_pathname_so_that_rpms_can_package_the_debug_info/src/extlibs/openmc_offloading_builder/llvm-install/lib -L/long_pathname_so_that_rpms_can_package_the_deb
ug_info/src/extlibs/openmc/llvm-install/lib -L/long_pathname_so_that_rpms_can_package_the_debug_info/src/extlibs/openmc/llvm-install/lib -L. -soname libopenmc.so CMakeFiles/libopenmc.dir/Unity/unity_0_cxx.cxx.o -rpath /usr/lib/x86_64-linux-gnu/hdf5/serial::::::
:::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::: /usr/lib/x86_64-linux-gnu/hdf5/serial/libhdf5.so /usr/lib/x86_64-linux-gnu/libpthread.so /usr/lib/x86_64-linux-gnu/libsz.so /usr/lib/x86_64-linux-gnu/libz
.so /usr/lib/x86_64-linux-gnu/libdl.so -lm /usr/lib/x86_64-linux-gnu/hdf5/serial/libhdf5_hl.so lib/libpugixml.a /usr/local/lib/libfmt.a -lstdc++ -lm -lomp -lomptarget -lomptarget.devicertl -L/long_pathname_so_that_rpms_can_package_the_debug_info/src/extlibs/ope
nmc_offloading_builder/llvm-install/lib -lgcc_s -lgcc -lpthread -lc -lgcc_s -lgcc /usr/lib/gcc/x86_64-linux-gnu/9/crtendS.o /lib/x86_64-linux-gnu/crtn.o                                                                                                             
1.      Running pass 'CallGraph Pass Manager' on module 'ld-temp.o'.                                                                                                                                                                                                 
2.      Running pass 'SI Optimize VGPR LiveRange' on function '@__omp_offloading_39_35d4096__ZN6openmc31process_advance_particle_eventsEv_l252'                                                                                                                      
 #0 0x0000000002fb1d38 llvm::sys::PrintStackTrace(llvm::raw_ostream&, int) (/long_pathname_so_that_rpms_can_package_the_debug_info/src/extlibs/openmc_offloading_builder/llvm-install/bin/clang-linker-wrapper+0x2fb1d38)                                            
 #1 0x0000000002fafb3e llvm::sys::RunSignalHandlers() (/long_pathname_so_that_rpms_can_package_the_debug_info/src/extlibs/openmc_offloading_builder/llvm-install/bin/clang-linker-wrapper+0x2fafb3e)                                                                 
 #2 0x0000000002fb24ed SignalHandler(int) Signals.cpp:0:0                                                                                                                                                                                                            
 #3 0x00007ff0aa421420 __restore_rt (/lib/x86_64-linux-gnu/libpthread.so.0+0x14420)                                                                                                                                                                                  
 #4 0x00007ff0a9eb400b raise /build/glibc-SzIz7B/glibc-2.31/signal/../sysdeps/unix/sysv/linux/raise.c:51:1                                                                                                                                                           
 #5 0x00007ff0a9e93859 abort /build/glibc-SzIz7B/glibc-2.31/stdlib/abort.c:81:7                                                                                                                                                                                      
 #6 0x00007ff0a9e93729 get_sysdep_segment_value /build/glibc-SzIz7B/glibc-2.31/intl/loadmsgcat.c:509:8                                                                                                                                                               
 #7 0x00007ff0a9e93729 _nl_load_domain /build/glibc-SzIz7B/glibc-2.31/intl/loadmsgcat.c:970:34                                                                                                                                                                       
 #8 0x00007ff0a9ea4fd6 (/lib/x86_64-linux-gnu/libc.so.6+0x33fd6)                                                                                                                                                                                                     
 #9 0x0000000002405184 (anonymous namespace)::SIOptimizeVGPRLiveRange::runOnMachineFunction(llvm::MachineFunction&) SIOptimizeVGPRLiveRange.cpp:0:0                                                                                                                  
#10 0x00000000030b09f0 llvm::MachineFunctionPass::runOnFunction(llvm::Function&) (/long_pathname_so_that_rpms_can_package_the_debug_info/src/extlibs/openmc_offloading_builder/llvm-install/bin/clang-linker-wrapper+0x30b09f0)                                      
#11 0x00000000029d1627 llvm::FPPassManager::runOnFunction(llvm::Function&) (/long_pathname_so_that_rpms_can_package_the_debug_info/src/extlibs/openmc_offloading_builder/llvm-install/bin/clang-linker-wrapper+0x29d1627)                                            
#12 0x0000000002be0a51 (anonymous namespace)::CGPassManager::runOnModule(llvm::Module&) CallGraphSCCPass.cpp:0:0                                                                                                                                                     
#13 0x00000000029d20a7 llvm::legacy::PassManagerImpl::run(llvm::Module&) (/long_pathname_so_that_rpms_can_package_the_debug_info/src/extlibs/openmc_offloading_builder/llvm-install/bin/clang-linker-wrapper+0x29d20a7)                                              
#14 0x0000000003545a5b codegen(llvm::lto::Config const&, llvm::TargetMachine*, std::function<llvm::Expected<std::unique_ptr<llvm::CachedFileStream, std::default_delete<llvm::CachedFileStream>>> (unsigned int, llvm::Twine const&)>, unsigned int, llvm::Module&, llvm::ModuleSummaryIndex const&) LTOBackend.cpp:0:0
#15 0x0000000003544a62 llvm::lto::backend(llvm::lto::Config const&, std::function<llvm::Expected<std::unique_ptr<llvm::CachedFileStream, std::default_delete<llvm::CachedFileStream>>> (unsigned int, llvm::Twine const&)>, unsigned int, llvm::Module&, llvm::ModuleSummaryIndex&) (/long_pathname_so_that_rpms_can_package_the_debug_info/src/extlibs/openmc_offloading_builder/llvm-install/bin/clang-linker-wrapper+0x3544a62)
#16 0x0000000003516f90 llvm::lto::LTO::runRegularLTO(std::function<llvm::Expected<std::unique_ptr<llvm::CachedFileStream, std::default_delete<llvm::CachedFileStream>>> (unsigned int, llvm::Twine const&)>) (/long_pathname_so_that_rpms_can_package_the_debug_info/src/extlibs/openmc_offloading_builder/llvm-install/bin/clang-linker-wrapper+0x3516f90)
#17 0x00000000035165b7 llvm::lto::LTO::run(std::function<llvm::Expected<std::unique_ptr<llvm::CachedFileStream, std::default_delete<llvm::CachedFileStream>>> (unsigned int, llvm::Twine const&)>, std::function<llvm::Expected<std::function<llvm::Expected<std::unique_ptr<llvm::CachedFileStream, std::default_delete<llvm::CachedFileStream>>> (unsigned int, llvm::Twine const&)>> (unsigned int, llvm::StringRef, llvm::Twine const&)>) (/long_pathname_so_that_rpms_can_package_the_debug_info/src/extlibs/openmc_offloading_builder/llvm-install/bin/clang-linker-wrapper+0x35165b7)
#18 0x0000000002098508 (anonymous namespace)::linkBitcodeFiles(llvm::SmallVectorImpl<llvm::object::OffloadFile>&, llvm::SmallVectorImpl<llvm::StringRef>&, llvm::opt::ArgList const&) ClangLinkerWrapper.cpp:0:0
#19 0x000000000209234e llvm::Error (anonymous namespace)::linkAndWrapDeviceFiles(llvm::SmallVectorImpl<llvm::object::OffloadFile>&, llvm::opt::InputArgList const&, char**, int)::$_0::operator()<llvm::SmallVector<llvm::object::OffloadFile, 3u>>(llvm::SmallVector<llvm::object::OffloadFile, 3u>&) const ClangLinkerWrapper.cpp:0:0
#20 0x0000000002089c2d (anonymous namespace)::linkAndWrapDeviceFiles(llvm::SmallVectorImpl<llvm::object::OffloadFile>&, llvm::opt::InputArgList const&, char**, int) ClangLinkerWrapper.cpp:0:0
#21 0x000000000208573a main (/long_pathname_so_that_rpms_can_package_the_debug_info/src/extlibs/openmc_offloading_builder/llvm-install/bin/clang-linker-wrapper+0x208573a)
#22 0x00007ff0a9e95083 __libc_start_main /build/glibc-SzIz7B/glibc-2.31/csu/../csu/libc-start.c:342:3
#23 0x000000000208452e _start (/long_pathname_so_that_rpms_can_package_the_debug_info/src/extlibs/openmc_offloading_builder/llvm-install/bin/clang-linker-wrapper+0x208452e)
 #0 0x0000000002fb1d38 llvm::sys::PrintStackTrace(llvm::raw_ostream&, int) (/long_pathname_so_that_rpms_can_package_the_debug_info/src/extlibs/openmc_offloading_builder/llvm-install/bin/clang-linker-wrapper+0x2fb1d38)
 #1 0x0000000002fafb75 llvm::sys::RunSignalHandlers() (/long_pathname_so_that_rpms_can_package_the_debug_info/src/extlibs/openmc_offloading_builder/llvm-install/bin/clang-linker-wrapper+0x2fafb75)
 #2 0x0000000002fb24ed SignalHandler(int) Signals.cpp:0:0
 #3 0x00007ff0aa421420 __restore_rt (/lib/x86_64-linux-gnu/libpthread.so.0+0x14420)
 #4 0x00007ff0a9eb400b raise /build/glibc-SzIz7B/glibc-2.31/signal/../sysdeps/unix/sysv/linux/raise.c:51:1
 #5 0x00007ff0a9e93859 abort /build/glibc-SzIz7B/glibc-2.31/stdlib/abort.c:81:7
 #6 0x00007ff0a9e93729 get_sysdep_segment_value /build/glibc-SzIz7B/glibc-2.31/intl/loadmsgcat.c:509:8
 #7 0x00007ff0a9e93729 _nl_load_domain /build/glibc-SzIz7B/glibc-2.31/intl/loadmsgcat.c:970:34
 #8 0x00007ff0a9ea4fd6 (/lib/x86_64-linux-gnu/libc.so.6+0x33fd6)
 #9 0x0000000002405184 (anonymous namespace)::SIOptimizeVGPRLiveRange::runOnMachineFunction(llvm::MachineFunction&) SIOptimizeVGPRLiveRange.cpp:0:0
#10 0x00000000030b09f0 llvm::MachineFunctionPass::runOnFunction(llvm::Function&) (/long_pathname_so_that_rpms_can_package_the_debug_info/src/extlibs/openmc_offloading_builder/llvm-install/bin/clang-linker-wrapper+0x30b09f0)
#11 0x00000000029d1627 llvm::FPPassManager::runOnFunction(llvm::Function&) (/long_pathname_so_that_rpms_can_package_the_debug_info/src/extlibs/openmc_offloading_builder/llvm-install/bin/clang-linker-wrapper+0x29d1627)
#12 0x0000000002be0a51 (anonymous namespace)::CGPassManager::runOnModule(llvm::Module&) CallGraphSCCPass.cpp:0:0
#13 0x00000000029d20a7 llvm::legacy::PassManagerImpl::run(llvm::Module&) (/long_pathname_so_that_rpms_can_package_the_debug_info/src/extlibs/openmc_offloading_builder/llvm-install/bin/clang-linker-wrapper+0x29d20a7)
#14 0x0000000003545a5b codegen(llvm::lto::Config const&, llvm::TargetMachine*, std::function<llvm::Expected<std::unique_ptr<llvm::CachedFileStream, std::default_delete<llvm::CachedFileStream>>> (unsigned int, llvm::Twine const&)>, unsigned int, llvm::Module&, llvm::ModuleSummaryIndex const&) LTOBackend.cpp:0:0
#15 0x0000000003544a62 llvm::lto::backend(llvm::lto::Config const&, std::function<llvm::Expected<std::unique_ptr<llvm::CachedFileStream, std::default_delete<llvm::CachedFileStream>>> (unsigned int, llvm::Twine const&)>, unsigned int, llvm::Module&, llvm::ModuleSummaryIndex&) (/long_pathname_so_that_rpms_can_package_the_debug_info/src/extlibs/openmc_offloading_builder/llvm-install/bin/clang-linker-wrapper+0x3544a62)
#16 0x0000000003516f90 llvm::lto::LTO::runRegularLTO(std::function<llvm::Expected<std::unique_ptr<llvm::CachedFileStream, std::default_delete<llvm::CachedFileStream>>> (unsigned int, llvm::Twine const&)>) (/long_pathname_so_that_rpms_can_package_the_debug_info/src/extlibs/openmc_offloading_builder/llvm-install/bin/clang-linker-wrapper+0x3516f90)
#17 0x00000000035165b7 llvm::lto::LTO::run(std::function<llvm::Expected<std::unique_ptr<llvm::CachedFileStream, std::default_delete<llvm::CachedFileStream>>> (unsigned int, llvm::Twine const&)>, std::function<llvm::Expected<std::function<llvm::Expected<std::unique_ptr<llvm::CachedFileStream, std::default_delete<llvm::CachedFileStream>>> (unsigned int, llvm::Twine const&)>> (unsigned int, llvm::StringRef, llvm::Twine const&)>) (/long_pathname_so_that_rpms_can_package_the_debug_info/src/extlibs/openmc_offloading_builder/llvm-install/bin/clang-linker-wrapper+0x35165b7)
#18 0x0000000002098508 (anonymous namespace)::linkBitcodeFiles(llvm::SmallVectorImpl<llvm::object::OffloadFile>&, llvm::SmallVectorImpl<llvm::StringRef>&, llvm::opt::ArgList const&) ClangLinkerWrapper.cpp:0:0
#19 0x000000000209234e llvm::Error (anonymous namespace)::linkAndWrapDeviceFiles(llvm::SmallVectorImpl<llvm::object::OffloadFile>&, llvm::opt::InputArgList const&, char**, int)::$_0::operator()<llvm::SmallVector<llvm::object::OffloadFile, 3u>>(llvm::SmallVector<llvm::object::OffloadFile, 3u>&) const ClangLinkerWrapper.cpp:0:0
#20 0x0000000002089c2d (anonymous namespace)::linkAndWrapDeviceFiles(llvm::SmallVectorImpl<llvm::object::OffloadFile>&, llvm::opt::InputArgList const&, char**, int) ClangLinkerWrapper.cpp:0:0
#21 0x000000000208573a main (/long_pathname_so_that_rpms_can_package_the_debug_info/src/extlibs/openmc_offloading_builder/llvm-install/bin/clang-linker-wrapper+0x208573a)
#22 0x00007ff0a9e95083 __libc_start_main /build/glibc-SzIz7B/glibc-2.31/csu/../csu/libc-start.c:342:3
#23 0x000000000208452e _start (/long_pathname_so_that_rpms_can_package_the_debug_info/src/extlibs/openmc_offloading_builder/llvm-install/bin/clang-linker-wrapper+0x208452e)
clang++: error: unable to execute command: Aborted (core dumped)
clang++: error: linker command failed due to signal (use -v to see invocation)
make[2]: *** [CMakeFiles/libopenmc.dir/build.make:106: lib/libopenmc.so] Error 1
make[2]: Leaving directory '/long_pathname_so_that_rpms_can_package_the_debug_info/src/extlibs/openmc_offloading_builder/openmc/build'
make[1]: *** [CMakeFiles/Makefile2:188: CMakeFiles/libopenmc.dir/all] Error 2
make[1]: Leaving directory '/long_pathname_so_that_rpms_can_package_the_debug_info/src/extlibs/openmc_offloading_builder/openmc/build'
make: *** [Makefile:136: all] Error 2
@llvmbot
Collaborator

llvmbot commented Jul 27, 2023

@llvm/issue-subscribers-backend-amdgpu

@arsenm
Contributor

arsenm commented Jul 28, 2023

Can you attach the failing temps?

@yashssh
Contributor Author

yashssh commented Jul 28, 2023

What do you mean by temps here? Is there a CMAKE option that can be used to get intermediate files (also won't those be a lot?)?

@arsenm
Contributor

arsenm commented Jul 28, 2023

> What do you mean by temps here? Is there a CMAKE option that can be used to get intermediate files (also won't those be a lot?)?

I mean the .bc, .s and .i files for the failing case.

There's no CMake option. Sometimes the clang assertion points you to a preprocessed source and a script to repeat the compile, which works well enough.

Better would be to just get the failing clang invocation (e.g. run ninja -v -j 1) and copy the command at the failure point. Then you can add -save-temps, or extract the exact cc1 or lld command from the -### output.

@yashssh
Contributor Author

yashssh commented Jul 31, 2023

inputs.zip
Attaching the original IR file on which the build crashes, as well as the llvm-reduce output. Both fire the same assertion as the one in the issue description. Both can be run through llc, for example:
llc -mtriple=amdgcn-amd-amdhsa -mcpu=gfx908 libopenmc.so.amdgcn-amd-amdhsa.gfx908.postopt.bc -o /dev/null

Haven't looked into any potential source of error yet.

yashssh self-assigned this Aug 1, 2023
@arsenm
Contributor

arsenm commented Aug 2, 2023

for (auto &O : make_early_inc_range(MRI->use_operands(Reg))) {

I think this just needs to be use_nodbg_operands

@arsenm
Contributor

arsenm commented Aug 2, 2023

> for (auto &O : make_early_inc_range(MRI->use_operands(Reg))) {
>
> I think this just needs to be use_nodbg_operands

Actually, we do want to update the debug instructions; I think it just needs to skip the debug uses for the assert.
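
To make the idea in this comment concrete, here is a minimal, self-contained C++ sketch. It is a toy model only: it is not LLVM code and not the patch that eventually landed. `ToyOperand`, `rewriteUses`, and the `IsDebug`/`ReadsReg` flags are hypothetical stand-ins for the real `MachineOperand` queries. The sketch shows just the shape of the suggestion: keep visiting debug uses so they can still be updated, but exempt them from the `readsReg` check. The check is reported as a `bool` instead of calling `assert`, so the failing case can be exercised without aborting.

```cpp
#include <vector>

// Hypothetical stand-in for a register use; NOT llvm::MachineOperand.
struct ToyOperand {
  unsigned Reg;   // register this operand refers to
  bool IsDebug;   // true if the use belongs to a debug-only instruction
  bool ReadsReg;  // true if the use really reads the register value
};

// Walks every use of OldReg (debug uses included, as with use_operands).
// Debug uses are rewritten to NewReg and never checked.
// Non-debug uses are left in place and must not read the register.
// Returns true if that invariant held for every non-debug use.
bool rewriteUses(std::vector<ToyOperand> &Ops, unsigned OldReg,
                 unsigned NewReg) {
  bool InvariantHeld = true;
  for (ToyOperand &O : Ops) {
    if (O.Reg != OldReg)
      continue;  // not a use of the register being replaced
    if (O.IsDebug) {
      O.Reg = NewReg;  // still update debug uses, but skip the check
      continue;
    }
    if (O.ReadsReg)
      InvariantHeld = false;  // this is where the real pass would assert
  }
  return InvariantHeld;
}
```

In this model, a debug use that "reads" the old register no longer trips the check, which mirrors why a build with `-g` would hit the assertion while a Release build would not. Iterating with a nodbg-style range instead would also avoid the failure, but it would leave the debug uses pointing at the old register.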

@yashssh
Contributor Author

yashssh commented Aug 2, 2023

Opened D156893

yashssh pushed a commit that referenced this issue Aug 7, 2023
… reg in SIOptimizeVGPRLiveRange

This will prevent the `assert(!O.readsReg())` from firing in
SIOptimizeVGPRLiveRange::optimizeLiveRange

Fix for #64163

Reviewed By: arsenm, #amdgpu

Differential Revision: https://reviews.llvm.org/D156893
@arsenm
Contributor

arsenm commented Aug 9, 2023

This was fixed by 3dc413e, but apparently GitHub only noticed the reference, not the "fix" part.

arsenm closed this as completed Aug 9, 2023