-
Notifications
You must be signed in to change notification settings - Fork 701
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
{lang}[iimkl/2023a] SciPy-bundle v2023.07 #18875
{lang}[iimkl/2023a] SciPy-bundle v2023.07 #18875
Conversation
I am trying to make a SciPy-bundle for the newest iimkl toolchain. I have copied the differences between the foss and intel versions from previous bundles (mainly just a single patch). However, it segfaults in the I am tempted to patch out the test and hope for the best, but I suspect somebody with more expertise than me could perhaps debug and fix it. The test looks like this:
and the
|
OK, patching out the segfaulting test leads to a bunch of other FORTRAN errors. It looks like something is wrong in how to interface FORTRAN and Python with the Intel compilers. I have no idea how to proceed. The errors are of this kind:
and the code it tries to compile is this autogenerated code:
|
numpy/numpy#20157 At least this error message seems like it isn't a new thing. @akesandgren how have we circumvented this issue with scipy bundle so far? |
easybuild/easyconfigs/s/SciPy-bundle/SciPy-bundle-2023.07-iimkl-2023a.eb
Show resolved
Hide resolved
Now there is a C compile error. It looks like a function has changed signature from accepting an int* to accepting a long* and that has broken some other code. It is one of those things that it is scary to change without understanding it, but I'll dig into it a bit more... The error is this:
and the relevant function in
The error is when |
It looks like the second pointer should be
In any case, there are two other tests that fail, I suspect that has something to do with FORTRAN data types as well:
I must admit that I do not have the expertise to take this further. |
Progress: Now numpy builds. The main problem was that Intel fortran does not support 10-byte reals and 16-byte integers, and the python code in f2py needs to know the supported data type to build extensions that do not crash. Perhaps this was not tested properly before. Now scipy fails to build as meson cannot find mkl. I'll look at that later, unless somebody has some cool insight that could help me... |
Mayday! I will need help with this by somebody who knows easybuild a lot better than me, perhaps @Micket or @boegel ? I finally got numpy to build with intel/2023a, but scipy fails to build. I suspect it may be as simple as passing the right option to the build step. The Meson build system cannot find MKL, it says that it fails to find it using pkg-config or cmake:
And indeed pkg-config cannot find mkl, there is no mkl.pc file. There are "only" these:
I do not know meson at all, and mkl only very superficially, so I do not know how to make pip install scipy correctly. Probably just some option that needs to be passed. |
so a quick look at mkl has never supplied anything named plainly A quick check with this toolchain + pkgconf
it does work. Skimming the meson.build a bit it looks like they are aware of that, and it should be possible to tell it to use blas_name == mkl-dynamic-lp64-seq. Looks to me that the easyblock just sets "mkl" and expects the meson build to figure it out (a fair assumption, most build tools wouldn't rely on pkg-config for this, but just look at MKL_ROOT and pick the build config themselves) I guess this worked in the past, but no longer |
Thank you, @Micket. Can I assume that Is the scipy.py easyblock still used, when SciPy-bundle is a PythonBundle - I mean, does the PythonBundle easyblock call sub-easyblocks so to speak? It is relevant for where to hack :-) |
OK, this is an ugly hack, but it actually gets scipy compiled. Now it just needs to pass its test suite. The saga continues... |
@schiotz: Tests failed in GitHub Actions, see https://github.com/easybuilders/easybuild-easyconfigs/actions/runs/6623690740
bleep, bloop, I'm just a bot (boegelbot v20200716.01) |
Yes. The rule for PythonBundle is if a custom easyblock matching the extension name exists (and it does for scipy, numpy and some others), it is used, else default to PythonPackage
That is what scipy mentions in their meson file, though it's not strict. |
Now scipy compiles, and gets two thirds through its test suite, before Python crashes with a segmentation fault:
The crashing function seems to be
Edit: I am hallucinating... |
+# Hack for MKL in easybuild | ||
+if blas_name == 'mkl' | ||
+ blas_name = 'mkl-dynamic-lp64-seq' | ||
+endif | ||
+if lapack_name == 'mkl' | ||
+ lapack_name = 'mkl-dynamic-lp64-seq' | ||
+endif |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we should just fix the scipy easyblock instead. The most annoying part would just be to check if this behavior is backwards compatible with older scipy's or if we need to make a check with LooseVersion and only use the full "mkl-dynamic-lp64-seq" name going forward.
I also wonder how this plays together with RPATH users .. sigh. Maybe more involved logic that uses the environment variables that the toolchain sets up should be attempted instead of these prepackages PC files.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
scipy only builds with Meson from SciPy-bundle-2023.02
and later, and there are no Intel versions of these, so I do not think there will be any problem with backwards compatibility.
I have no clue about RPATH, but usually pkg-config sets that correctly.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have made a PR for the easyblock, I'll test it later today.
It seems that the above-mentioned test is already skipped on 32-bit cpus; and on windows the corresponding complex128 test is skipped because it crashes. Apparently the test is just skipped in these cases, perhaps it is OK for us to skip them, too. From
|
…ed test tolerances.
Now it works. Updates:
|
Test report by @boegel |
Test report by @boegelbot |
Test report by @boegel |
Co-authored-by: Simon Branford <[email protected]>
@branfosj I used
Others have definitely seen this too, see scipy/scipy#17075 and scipy/scipy#11339 |
@boegelbot please test @ generoso |
@boegel: Request for testing this PR well received on login1 PR test command '
Test results coming soon (I hope)... - notification for comment with ID 1912382353 processed Message to humans: this is just bookkeeping information for me, |
I've run it twice with |
Test report by @branfosj |
Test report by @branfosj |
@boegelbot please test @ jsc-zen3 |
@boegel: Request for testing this PR well received on jsczen3l1.int.jsc-zen3.fz-juelich.de PR test command '
Test results coming soon (I hope)... - notification for comment with ID 1912427908 processed Message to humans: this is just bookkeeping information for me, |
Test report by @boegel |
Test report by @branfosj |
Test report by @boegelbot |
Test report by @boegel |
Test report by @boegelbot |
Test report by @boegel |
requested changes made
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
lgtm
Test report by @boegel |
It's time to call this one, we've burned enough dead dinosaurs over this. Test suite with a handful of tests being tweaked/disabled (see patches) for:
Should any more issues arise, they'll have to be dealt with in a subsequent PR... |
Going in, thanks @schiotz! |
Test report by @boegel |
After having tried this without lowopt/strict with intel compilers and flexiblas/OpenBLAS I see the same type of problems. |
(created using
eb --new-pr
)