{lang}[foss/2021.04] SciPy-bundle v2021.05 #12935

Merged

boegel
Member

@boegel boegel commented May 23, 2021

(created using eb --new-pr)

@boegel boegel added the update label May 23, 2021
@boegel boegel force-pushed the 20210523193010_new_pr_SciPy-bundle202105 branch from 00e3b40 to 79fa6a9 Compare May 23, 2021 19:00
@easybuilders easybuilders deleted a comment from boegelbot May 23, 2021
@boegel boegel added this to the 4.4.0 milestone May 23, 2021
@boegel boegel force-pushed the 20210523193010_new_pr_SciPy-bundle202105 branch from 79fa6a9 to f1b3bae Compare May 24, 2021 07:10
@easybuilders easybuilders deleted a comment from boegelbot May 24, 2021
@boegel
Member Author

boegel commented May 24, 2021

@boegelbot please test @ generoso

@boegelbot
Collaborator

@boegel: Request for testing this PR well received on generoso

PR test command 'EB_PR=12935 EB_ARGS= /apps/slurm/default/bin/sbatch --job-name test_PR_12935 --ntasks=4 ~/boegelbot/eb_from_pr_upload_generoso.sh' executed!

  • exit code: 0
  • output:
Submitted batch job 17297

Test results coming soon (I hope)...

- notification for comment with ID 847013338 processed

Message to humans: this is just bookkeeping information for me,
it is of no use to you (unless you think I have a bug, which I don't).

@verdurin
Member

Test report by @verdurin
FAILED
Build succeeded for 31 out of 39 (3 easyconfigs in total)
easybuild-c7.novalocal - Linux centos linux 7.9.2009, x86_64, Intel Xeon Processor (Skylake, IBRS), Python 3.6.8
See https://gist.github.com/54c8b9e7268413bc805d184dfdc7f5c0 for a full test report.

@boegel
Member Author

boegel commented May 24, 2021

@verdurin Looks like we need to add the OpenSSL wrapper as a dependency of Rust? Can you verify whether that fixes this problem for you?

cargo: error while loading shared libraries: libssl.so.1.1: cannot open shared object file: No such file or directory
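
A loader error like the one above can be checked for directly on the build host. The following is a minimal sketch, not part of this PR: it uses Python's standard `ctypes` module to test whether the dynamic loader can resolve a given shared library name. The helper name `can_load` is hypothetical, and the result only reflects the environment of the Python process that runs it (for example its `LD_LIBRARY_PATH`), which may differ from what `cargo` sees inside an EasyBuild session.

```python
import ctypes


def can_load(soname: str) -> bool:
    """Return True if the dynamic loader can open the shared library `soname`."""
    try:
        ctypes.CDLL(soname)
    except OSError:
        # Raised when the library cannot be found or cannot be loaded.
        return False
    return True


if __name__ == "__main__":
    # libssl.so.1.1 is the library named in the cargo error message.
    name = "libssl.so.1.1"
    status = "found" if can_load(name) else "NOT found"
    print(f"{name}: {status}")
```

If this reports the library as missing on the host where the build failed, that would be consistent with Rust needing the OpenSSL wrapper as a dependency, as suggested in the comment above.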

@boegelbot
Collaborator

Test report by @boegelbot
SUCCESS
Build succeeded for 3 out of 3 (3 easyconfigs in total)
generoso-c1-s-1 - Linux centos linux 8.2.2004, x86_64, Intel(R) Xeon(R) CPU E5-2667 v3 @ 3.20GHz (haswell), Python 3.6.8
See https://gist.github.com/397a5ffe5bfb932023c7beaf21a01783 for a full test report.

@boegel
Member Author

boegel commented May 24, 2021

Test report by @boegel
SUCCESS
Build succeeded for 3 out of 3 (3 easyconfigs in total)
node3161.skitty.os - Linux centos linux 7.9.2009, x86_64, Intel(R) Xeon(R) Gold 6140 CPU @ 2.30GHz, Python 3.6.8
See https://gist.github.com/87cf1d1752e0a62cc0e0d31905f22a2c for a full test report.

@boegel
Member Author

boegel commented May 24, 2021

Test report by @boegel
SUCCESS
Build succeeded for 3 out of 3 (3 easyconfigs in total)
node2617.swalot.os - Linux centos linux 7.9.2009, x86_64, Intel(R) Xeon(R) CPU E5-2660 v3 @ 2.60GHz (haswell), Python 3.6.8
See https://gist.github.com/76ebfb350ff7dd0e042e31d296fafad3 for a full test report.

@boegel
Member Author

boegel commented May 24, 2021

Test report by @boegel
SUCCESS
Build succeeded for 3 out of 3 (3 easyconfigs in total)
node3532.doduo.os - Linux RHEL 8.2, x86_64, AMD EPYC 7552 48-Core Processor (zen2), Python 3.6.8
See https://gist.github.com/81d6648614a3e51ef5b44bbf056df1a0 for a full test report.

@verdurin
Member

Test report by @verdurin
FAILED
Build succeeded for 27 out of 28 (3 easyconfigs in total)
nuc.lan - Linux Fedora 33, x86_64, Intel(R) Core(TM) i7-8650U CPU @ 1.90GHz, Python 3.9.4
See https://gist.github.com/dabae0a1f54ff156219a56ac71e29cb9 for a full test report.

@boegel
Member Author

boegel commented May 24, 2021

@verdurin The following module(s) are unknown: "Rust/1.52.1-GCCcore-10.3.0"

Multiple sessions in parallel?

@verdurin
Member

Test report by @verdurin
SUCCESS
Build succeeded for 1 out of 1 (3 easyconfigs in total)
nuc.lan - Linux Fedora 33, x86_64, Intel(R) Core(TM) i7-8650U CPU @ 1.90GHz, Python 3.9.4
See https://gist.github.com/a46c0443a3884af5ca43ca6b6adabe2c for a full test report.

@verdurin
Member

@boegel the most recent build is with Rust modified with the OpenSSL wrapper as a dependency.

@boegel
Member Author

boegel commented May 24, 2021

Looks ready to go on x86_64, but there are known problems on aarch64 and ppc64le:

Neither of these issues should block this PR though, especially since the problem is not in numpy itself, but in OpenBLAS and/or FlexiBLAS...

Member

@verdurin verdurin left a comment

Looks fine.

@verdurin
Member

Going in, thanks @boegel!

@verdurin verdurin merged commit c7b7b4c into easybuilders:develop May 25, 2021
@boegel boegel deleted the 20210523193010_new_pr_SciPy-bundle202105 branch May 25, 2021 11:29
@Flamefire
Contributor

Test report by @Flamefire
FAILED
Build succeeded for 20 out of 21 (3 easyconfigs in total)
taurusml20 - Linux RHEL 7.6, POWER, 8335-GTX (power9le), Python 2.7.5
See https://gist.github.com/7766b6b58fd10ed162f936a67a0553a0 for a full test report.

@boegel
Member Author

boegel commented May 25, 2021

@Flamefire Anything useful in the core dump?

@Flamefire
Contributor

The segfault I get is a double-free:
*** Error in `/beegfs/global0/ws/s3248973-easybuild/easybuild-ml/software/Python/3.9.5-GCCcore-10.3.0/bin/python': double free or corruption (out): 0x0000000013434ed0 ***

(gdb) bt
#0  0x0000200000b6fbf0 in raise () from /lib64/libc.so.6
#1  0x0000200000b71f6c in abort () from /lib64/libc.so.6
#2  0x0000200000bb8d10 in __libc_message () from /lib64/libc.so.6
#3  0x0000200000bc9be0 in free () from /lib64/libc.so.6
#4  0x0000200b0abf9920 in release_dgeev (params=0x7ffffffdcf00) at numpy/linalg/umath_linalg.c.src:2229
#5  DOUBLE_eig_wrapper (JOBVL=JOBVL@entry=78 'N', JOBVR=JOBVR@entry=86 'V', args=0x200b1f804120, dimensions=<optimized out>, steps=<optimized out>) at numpy/linalg/umath_linalg.c.src:2324
#6  0x0000200b0abf9a70 in DOUBLE_eig (args=<optimized out>, dimensions=<optimized out>, steps=<optimized out>, __NPY_UNUSED_TAGGEDfunc=<optimized out>) at numpy/linalg/umath_linalg.c.src:2336

@boegel
Member Author

boegel commented May 26, 2021

@Flamefire Please report that in mpimd-csc/flexiblas#17 (keeping track of things in merged PRs is impossible)

That seems to suggest the problem may be in numpy rather than FlexiBLAS though...

@Flamefire
Contributor

Not sure. It doesn't happen with OpenBLAS alone, and there are some failing LAPACK calls with FlexiBLAS, so it might just as well be memory corruption.
And yes, reported there.

@boegel
Member Author

boegel commented May 27, 2021

I opened a dedicated issue for the segfault on POWER, see #12968.

For the failing numpy tests on aarch64, we can track that in #11959.

@Flamefire
Contributor

Test report by @Flamefire
SUCCESS
Build succeeded for 9 out of 9 (3 easyconfigs in total)
taurusi8006 - Linux centos linux 7.9.2009, x86_64, AMD EPYC 7352 24-Core Processor (zen2), Python 2.7.5
See https://gist.github.com/592685426ccac14e05e81dc8ad116794 for a full test report.
