Add auto-detection of Intel MKL and OpenBLAS #1316

Merged: 6 commits, Sep 25, 2022

ischoegl
Member

@ischoegl ischoegl commented Jun 5, 2022

Changes proposed in this pull request

BLAS/LAPACK libraries are detected using the following priority:

  1. Use Intel MKL if installed
  2. Use OpenBLAS if installed
  3. Use lapack,blas if installed
  4. Use prior behavior
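
A minimal Python sketch of this priority order, for illustration only. It is not the SConstruct code from this pull request: the function `choose_blas_lapack` and its `is_available` hook are hypothetical names, and probing with `ctypes.util.find_library` is an assumption standing in for the build system's own library checks.

```python
import ctypes.util

# Priority order taken from the list above; each entry names the libraries
# that must all be found for that candidate to be selected.
CANDIDATES = [
    ("Intel MKL", ("mkl_rt",)),
    ("OpenBLAS", ("openblas",)),
    ("standard LAPACK/BLAS", ("lapack", "blas")),
]


def choose_blas_lapack(is_available=None):
    """Return the label of the first fully available candidate, else None.

    `is_available` is a hypothetical hook (library name -> bool) that keeps
    the function testable; by default the host's library search is probed.
    """
    if is_available is None:
        def is_available(name):
            return ctypes.util.find_library(name) is not None
    for label, libs in CANDIDATES:
        if all(is_available(lib) for lib in libs):
            return label
    return None  # nothing detected: fall back to the prior behavior
```

Calling `choose_blas_lapack()` without arguments probes the current machine; passing a custom `is_available` lets the ordering be checked without any of the libraries installed.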

If applicable, fill in the issue number this pull request is fixing

Closes Cantera/enhancements#144

If applicable, provide an example illustrating new features this pull request is introducing

$ scons build locate_lapack=auto
[...]
INFO: Using private installation of Sundials version 5.3.
Checking for C library mkl_rt... (cached) yes
INFO: Using MKL 2022.0.1 for Intel(R) 64 architecture
INFO: Skipping compilation of the Fortran 90 interface.
[...]

Checklist

  • The pull request includes a clear description of this code change
  • Commit messages have short titles and reference relevant issues
  • Build passes (scons build & scons test) and unit tests address code coverage
  • Style & formatting of contributed code follows contributing guidelines
  • The pull request is ready for review

@ischoegl ischoegl force-pushed the check-for-mkl-and-openblas branch 5 times, most recently from 6c0e8b0 to df07bd4 Compare June 5, 2022 18:05
@ischoegl ischoegl marked this pull request as ready for review June 5, 2022 19:04
@ischoegl ischoegl requested a review from a team June 5, 2022 19:36
@bryanwweber
Member

Thanks, @ischoegl. I think this is an interesting proposal. I'm concerned that turning these things on by default will cause more issues like Cantera/conda-recipes#29, especially since we also default to adding the conda directories to the path. What do you think?

@ischoegl
Member Author

ischoegl commented Jun 6, 2022

@bryanwweber … packaging is an interesting aspect here; my primary concern was local installations. For local installations, I don’t see any potential issues, whereas packaging is certainly more involved. What do you mean by ‘especially since we also default to adding the conda directories to the path’? Having conda paths included should be desirable, no?

@bryanwweber
Member

The problem I can foresee is that you most likely don't want to scons install into the same environment with all your build dependencies, or at least, I wouldn't want to, for instance by setting python_cmd. In that case, MKL might not be installed (or a different version is installed) in the environment where python_cmd is located, and you'd get errors loading the module.

Similarly, if someone has the conda directories added into their linker path, but installs into the system directories, MKL might not be found. This should be prevented by the fact that setting prefix would disable automatic conda paths, but then they might add the conda paths manually to pick up one of our other dependencies.
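
One rough way to check for this kind of mismatch before installing is to ask the target interpreter whether it can locate the library at all. The sketch below is an assumption-laden illustration, not part of Cantera's build system: `library_visible_to` is a hypothetical helper, and `ctypes.util.find_library` only consults the platform's usual search mechanisms, so a library that sits in an environment-specific directory may still be missed.

```python
import subprocess
import sys


def library_visible_to(python_cmd, libname):
    """Return True if the interpreter at `python_cmd` can locate `libname`.

    Hypothetical helper: runs a one-line probe in the target interpreter
    and reports the result through the exit code.
    """
    probe = (
        "import ctypes.util, sys; "
        "sys.exit(0 if ctypes.util.find_library(sys.argv[1]) else 1)"
    )
    result = subprocess.run([python_cmd, "-c", probe, libname], check=False)
    return result.returncode == 0
```

For example, `library_visible_to(sys.executable, "mkl_rt")` checks the current interpreter; passing the path used for `python_cmd` checks the environment the module would be installed into.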

@ischoegl
Member Author

ischoegl commented Jun 6, 2022

> The problem I can foresee is that you most likely don't want to scons install into the same environment with all your build dependencies, or at least, I wouldn't want to, for instance by setting python_cmd. In that case, MKL might not be installed (or a different version is installed) in the environment where python_cmd is located, and you'd get errors loading the module.

I believe most users would want to install into the same environment, see environment.yaml discussion. I believe there are a couple of edge cases, which need to be identified and where auto detection may need to be disabled.

> Similarly, if someone has the conda directories added into their linker path, but installs into the system directories, MKL might not be found. This should be prevented by the fact that setting prefix would disable automatic conda paths, but then they might add the conda paths manually to pick up one of our other dependencies.

I think that if the conda layout is used, MKL is certainly desirable, but the same is true for other layouts? Linking against dependencies that are not available in a different context is imho user error, where the same arguments apply to various ‘system’ libraries, e.g. yaml-cpp, etc.? We likewise check for existence and provide fallback options only if nothing is found?

@ischoegl
Member Author

ischoegl commented Jun 6, 2022

@bryanwweber ... I ended up adding an option that allows for control of the detection of 'optimized' BLAS/LAPACK.

In addition, I ran speed tests for ignition simulations, which gave some odd results for MKL:

Without MKL (private Sundials 5.3)

In [2]: %run custom_reactions.py
Average time of 100 simulation runs for 'gri30.yaml' (CH4)
- New framework (YAML): 92.27 ms (T_final=2736.60)
- One Python reaction: 99.00 ms (T_final=2736.60) ... +7.30%

With MKL (private Sundials 5.3)

In [2]: %run custom_reactions.py
Average time of 100 simulation runs for 'gri30.yaml' (CH4)
- New framework (YAML): 102.01 ms (T_final=2736.60)
- One Python reaction: 99.66 ms (T_final=2736.60) ... -2.31%

With OpenBLAS (private Sundials 5.3)

In [2]: %run custom_reactions.py
Average time of 100 simulation runs for 'gri30.yaml' (CH4)
- New framework (YAML): 100.79 ms (T_final=2736.60)
- One Python reaction: 103.94 ms (T_final=2736.60) ... +3.12%

Without MKL (system Sundials 6.0)

In [2]: %run custom_reactions.py
Average time of 100 simulation runs for 'gri30.yaml' (CH4)
- New framework (YAML): 92.19 ms (T_final=2736.60)
- One Python reaction: 99.34 ms (T_final=2736.60) ... +7.76%

With MKL (system Sundials 6.0)

In [3]: %run custom_reactions.py
Average time of 100 simulation runs for 'gri30.yaml' (CH4)
- New framework (YAML): 104.20 ms (T_final=2736.60)
- One Python reaction: 102.95 ms (T_final=2736.60) ... -1.20%

These results indicate that the 'optimized' libraries slow things down: for the YAML-only case, average run times increase from about 92 ms to 101-104 ms (roughly 9-13%). Further, it looks like the inclusion of Python calls (which should slow things down) is advantageous for MKL.

PS: There's likely overhead involved that may require fine-tuning beyond just enabling/disabling some libraries.
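
For reference, the relative differences can be computed directly from the timings reported above. This is a small sketch using only the "New framework (YAML)" averages; the dictionary layout, the "baseline" label for the runs without MKL or OpenBLAS, and the helper name `slowdown_percent` are illustrative choices.

```python
# Average run times in ms for the "New framework (YAML)" case, copied from
# the outputs above. Keys are (Sundials build, linear algebra backend);
# "baseline" denotes the runs without MKL or OpenBLAS.
timings_ms = {
    ("private 5.3", "baseline"): 92.27,
    ("private 5.3", "MKL"): 102.01,
    ("private 5.3", "OpenBLAS"): 100.79,
    ("system 6.0", "baseline"): 92.19,
    ("system 6.0", "MKL"): 104.20,
}


def slowdown_percent(timings, sundials, backend):
    """Percent change in run time relative to the baseline for that Sundials build."""
    base = timings[(sundials, "baseline")]
    return 100.0 * (timings[(sundials, backend)] / base - 1.0)


for (sundials, backend) in timings_ms:
    if backend != "baseline":
        change = slowdown_percent(timings_ms, sundials, backend)
        print(f"Sundials {sundials}, {backend}: {change:+.1f}%")
```

Positive values mean the run with the given library was slower than the corresponding baseline.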

@ischoegl ischoegl marked this pull request as draft June 6, 2022 16:45
@ischoegl
Member Author

@speth ... in light of the recent UG posting about speed differences, I am curious about your opinion on the slow-downs I observed when testing changes this PR is proposing. While the mechanism is small, I'd appreciate any clues?

Member

@speth speth left a comment

I think as implemented now, with the default being to just use Eigen, this is a reasonable addition.

Regarding the performance differences you noted, I'm not really sure what's behind them. For systems as small as GRI 3.0, the linear algebra really should not be taking that much of the compute time. The place where MKL and OpenBLAS can really beat Eigen is more likely to be in larger systems where multithreading is actually helpful.

@ischoegl ischoegl force-pushed the check-for-mkl-and-openblas branch 10 times, most recently from 7b9ea5b to f06dd93 Compare September 11, 2022 12:18
@ischoegl ischoegl marked this pull request as ready for review September 11, 2022 12:42
@ischoegl
Member Author

Thanks for your input on this, @speth!

I updated a couple of things, most notably a new locate_lapack=standard option that looks for lapack,blas, which is also part of the auto sequence.

Outside of packaging, I personally don’t see major issues where auto configuration could go awry, but I’m happy to leave things ‘off’.

Member

@speth speth left a comment

I think this just needs a small fix to the documentation. Otherwise, this looks good to me.

@ischoegl
Member Author

ischoegl commented Sep 11, 2022

@speth … thanks! Before I fix the docstring, I believe that at this point the auto option does the correct thing in all cases I can think of. So it may be safe to make it the default after all?

@speth
Member

speth commented Sep 11, 2022

🤷 I'm willing to give it a try.

@ischoegl ischoegl force-pushed the check-for-mkl-and-openblas branch 3 times, most recently from 7fea1f0 to 17b8137 Compare September 12, 2022 01:38
@ischoegl ischoegl marked this pull request as draft September 12, 2022 02:51
@ischoegl ischoegl marked this pull request as ready for review September 12, 2022 02:53
@ischoegl ischoegl marked this pull request as draft September 12, 2022 04:32
@ischoegl ischoegl force-pushed the check-for-mkl-and-openblas branch 3 times, most recently from 219c01f to 5cb2550 Compare September 12, 2022 15:09
@ischoegl
Member Author

ischoegl commented Sep 12, 2022

While updating the docstrings to use 'automatic' configuration of BLAS/LAPACK by default, I realized that system_blas_lapack would be a better option name, as it is consistent with what is done for other external dependencies (especially sundials). I came up with the following:

 % scons help --option=system_blas_lapack
scons: Reading SConscript files ...

* system_blas_lapack: [ 'n' | 'y' | 'default' ]
    Select whether to use BLAS/LAPACK from a system installation ('y'), use
    Eigen linear algebra support ('n'), or to decide automatically based on
    libraries detected on the system ('default'). Specifying 'blas_lapack_libs'
    or 'blas_lapack_dir' changes the default to 'y', whereas installing the
    Matlab toolbox changes the default to 'n'. On macOS, the 'default' option
    uses the Accelerate framework, whereas on other operating systems the
    preferred option depends on the CPU manufacturer. In general, OpenBLAS
    ('openblas') is prioritized over standard libraries ('lapack,blas'), with
    Eigen being used if no suitable BLAS/LAPACK libraries are detected. On Intel
    CPU's, MKL (Windows: 'mkl_rt' / Linux: 'mkl_rt,dl') has the highest
    priority, followed by the other options. Note that Eigen is required whether
    or not BLAS/LAPACK libraries are used.
    - default: 'default'
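
The selection rules in this docstring can be paraphrased as a short Python sketch. This is one reading of the help text, not the SConstruct implementation: the function names and arguments are hypothetical, and two points the docstring leaves open are treated as labeled assumptions (MKL is not tried on non-Intel CPUs, and explicitly given libraries take precedence over the Matlab toolbox rule).

```python
def candidate_order(platform, cpu_vendor):
    """Library sets to try, in order, when auto-detecting.

    `platform` is one of 'macos', 'windows', 'linux'; `cpu_vendor` is
    'intel' or anything else. An empty result would mean Eigen is used.
    """
    if platform == "macos":
        return ["Accelerate"]  # the Accelerate framework
    order = []
    if cpu_vendor == "intel":
        # MKL has the highest priority on Intel CPUs.
        # Assumption: it is not tried at all on other CPUs.
        order.append("mkl_rt" if platform == "windows" else "mkl_rt,dl")
    order += ["openblas", "lapack,blas"]
    return order


def resolve_setting(value="default", libs_given=False, matlab_toolbox=False):
    """Resolve the effective option value: 'y', 'n', or 'default' (auto-detect)."""
    if value != "default":
        return value  # an explicit 'y' or 'n' is respected
    if libs_given:  # 'blas_lapack_libs' or 'blas_lapack_dir' was specified
        return "y"  # assumption: this rule wins if both conditions hold
    if matlab_toolbox:
        return "n"
    return "default"
```

With these rules, `candidate_order("linux", "intel")` gives MKL first, then OpenBLAS, then the standard libraries, matching the order stated in the help text.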

@ischoegl ischoegl force-pushed the check-for-mkl-and-openblas branch 3 times, most recently from 9bebf78 to 2a11eb4 Compare September 12, 2022 15:24
@ischoegl ischoegl marked this pull request as ready for review September 12, 2022 16:31
@ischoegl ischoegl requested a review from speth September 12, 2022 16:31
@speth
Member

speth commented Sep 22, 2022

The new docstring looks good to me. What's behind the change to disable the Fortran interface on the OneAPI builders? Can you update the commit message to explain why that was needed?

Unit tests fail for optimized LAPACK libraries (see Cantera#1393)
@ischoegl
Member Author

ischoegl commented Sep 22, 2022

> The new docstring looks good to me. What's behind the change to disable the Fortran interface on the OneAPI builders? Can you update the commit message to explain why that was needed?

Done - I created issue #1393 to document the reason for this change: the f77-demo test fails for optimized LAPACK libraries (MKL is now automatically detected on the OneAPI tool chain).

@ischoegl
Member Author

@speth … let me know if there’s anything else to do here. I don’t want to address #1393 here as it is imho a pre-existing condition.

Member

@speth speth left a comment

Sure, looks good to me.

@speth speth merged commit 0cd37f0 into Cantera:main Sep 25, 2022
@ischoegl ischoegl deleted the check-for-mkl-and-openblas branch September 25, 2022 04:12
Development

Successfully merging this pull request may close these issues.

Use BLAS/LAPACK by default if installed