UCX and enable mca dso do not mix #11632

Closed
hppritcha opened this issue Apr 29, 2023 · 1 comment

Comments

@hppritcha
Member

Thank you for taking the time to submit an issue!

Background information

What version of Open MPI are you using? (e.g., v3.0.5, v4.0.2, git branch name and hash, etc.)

main and v5.0.x

RHEL8
x86_64
connectx5

ucx-rdmacm-1.12.0-1.55103.x86_64
ucx-cma-1.12.0-1.55103.x86_64
ucx-xpmem-1.12.0-1.55103.x86_64
ucx-1.12.0-1.55103.x86_64
ucx-ib-1.12.0-1.55103.x86_64
ucx-devel-1.12.0-1.55103.x86_64
ucx-knem-1.12.0-1.55103.x86_64

to reproduce:

./configure --with-ucx --enable-mca-dso --prefix=favoritepath
make install

the resulting ompi_info will segfault with a traceback like:

[er-head:1511505:0:1511505] Caught signal 11 (Segmentation fault: address not mapped to object at address 0x7fffeded5298)
==== backtrace (tid:1511505) ====
 0 0x0000000000012c20 .annobin_sigaction.c()  sigaction.c:0
 1 0x0000000000043529 var_destructor() XZXX /opal/mca/base/mca_base_var.c:1859
 2 0x000000000003ec96 opal_obj_run_destructors()  XXXXXX/ompi-er2/opal/mca/base/../../../opal/class/opal_object.h:472
 3 0x000000000004136b mca_base_var_finalize()  XXXXXX/opal/mca/base/mca_base_var.c:1122
 4 0x000000000002ed5a opal_finalize_cleanup_domain()  XXXXXXXX/opal/runtime/opal_finalize_core.c:128
 5 0x000000000002eeba opal_finalize_util()  XXXXXXXX/opal/runtime/opal_finalize_core.c:143
 6 0x0000000000402121 main()  XXXXXXZ/ompi/tools/ompi_info/ompi_info.c:195
 7 0x0000000000023493 __libc_start_main()  ???:0
 8 0x00000000004014de _start()  ???:0

If one configures with --with-ucx=no on this system, Open MPI runs nominally with the --enable-mca-dso configure option.
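When comparing the two configurations, it can help to check for the crash mechanically rather than by eye. The following is a minimal sketch, not part of the original report: the helper name `segv_detected` and the `OMPI_PREFIX` variable are hypothetical, and the default prefix is a placeholder for whatever was passed to `--prefix`. It treats a run as a segfault if the shell reports exit status 139 (128 + signal 11) or if the output contains the "Caught signal 11" banner shown in the backtrace above.

```shell
#!/bin/sh
# Sketch: detect whether a command crashed with SIGSEGV.
# segv_detected is a hypothetical helper name, not an Open MPI tool.
segv_detected() {
    # Capture stdout and stderr; $? is the exit status of the command itself.
    out=$("$@" 2>&1)
    status=$?
    # A shell reports death by signal N as 128 + N, so SIGSEGV (11) gives 139.
    [ "$status" -eq 139 ] && return 0
    # Fall back to the banner text that appears in the backtrace in this issue.
    printf '%s\n' "$out" | grep -q 'Caught signal 11'
}

# Assumption: OMPI_PREFIX points at the install prefix given to ./configure.
prefix="${OMPI_PREFIX:-$HOME/ompi-install}"

if [ -x "$prefix/bin/ompi_info" ]; then
    if segv_detected "$prefix/bin/ompi_info"; then
        echo "ompi_info segfaulted"
    else
        echo "ompi_info ran without a segfault"
    fi
fi
```

Running this once against a `--with-ucx --enable-mca-dso` install and once against a `--with-ucx=no --enable-mca-dso` install should distinguish the failing build from the working one.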

roiedanino added a commit to roiedanino/ompi that referenced this issue May 7, 2023
…ion fix

Signed-off-by: Roie Danino <[email protected]>

(cherry picked from commit d79d5e8)
janjust pushed a commit to roiedanino/ompi that referenced this issue May 9, 2023
…ion fix

Signed-off-by: Roie Danino <[email protected]>

(cherry picked from commit d79d5e8)
roiedanino added a commit to roiedanino/ompi that referenced this issue May 15, 2023
…ion fix

Signed-off-by: Roie Danino <[email protected]>

(cherry picked from commit d79d5e8)
janjust pushed a commit to roiedanino/ompi that referenced this issue May 25, 2023
…ion fix

Signed-off-by: Roie Danino <[email protected]>

(cherry picked from commit d79d5e8)
roiedanino added a commit to roiedanino/ompi that referenced this issue May 29, 2023
roiedanino added a commit to roiedanino/ompi that referenced this issue Jun 1, 2023
roiedanino added a commit to roiedanino/ompi that referenced this issue Jun 5, 2023
janjust added a commit that referenced this issue Jun 6, 2023
…gfault

OPAL/MCA/COMMON/UCX: #11632 bugfix - mca string variables registration
roiedanino added a commit to roiedanino/ompi that referenced this issue Jun 6, 2023
bugfix open-mpi#11632

Signed-off-by: Roie Danino <[email protected]>

(cherry picked from commit 32aba0b)
@hppritcha
Member Author

fixed via #11738 and #11640

bosilca pushed a commit to yli137/ompi that referenced this issue Aug 8, 2023
bosilca pushed a commit to bosilca/ompi that referenced this issue Aug 15, 2023