Conference call notes 20220608

Kenneth Hoste edited this page Jul 6, 2022 · 10 revisions

Notes on the 199th EasyBuild conference call, Wednesday 8 June 2022 (15:00 UTC)

Attendees

Alphabetical list of attendees (17):

  • Sebastian Achilles (Jülich Supercomputing Centre, Germany)
  • Jordi Camps (CNAG-CRG)
  • Em Dragowsky (Case Western Reserve University, Ohio, US)
  • Jasper Grimm (University of York, UK)
  • Thomas Hayward-Schneider (Max Planck Institute for Plasma Physics (IPP))
  • Kenneth Hoste (HPC-UGent, Belgium)
  • Terje Kvernes (University of Oslo, Norway)
  • Kurt Lust (Univ. of Antwerpen, Belgium + LUMI User Support Team)
  • Sam Moors (Vrije Universiteit Brussel, Belgium)
  • Sebastien Moretti (SIB, Switzerland)
  • Mikael Öhman (Chalmers University of Technology, Sweden)
  • Bart Oldeman (Digital Research Alliance of Canada)
  • Jurij Pečar (EMBL, Germany)
  • Jörg Saßmannshausen (Imperial College London, UK)
  • Alexandre Strube (Jülich Supercomputing Centre, Germany)
  • Davide Vanzo (Microsoft Azure)
  • Lars Viklund (Umeå University, Sweden)

Agenda

  • overview of recent developments
  • progress on 2022a update of common toolchains
  • Q&A

Recent developments

  • release timeline
    • latest release: EasyBuild v4.5.5 (8 June 2022)
    • ETA next release: early July 2022
  • recent changes
    • framework
      • bug fixes
        • correctly identify Apple Silicon M1 as Arm 64-bit by also considering arm64 next to aarch64 (PR #4014)
        • fix 'eb --show-system-info' on Apple M1 system (PR #4015)
      • enhancements
        • set $FFT(W)_LIB_DIR to imkl-FFTW's lib path in build environment if usempi toolchain option is enabled (PR #4011)
          • should
        • add support for FFTW.MPI toolchain component ($FFT*DIR variables) (PR #4012)
        • add support for customizing EasyBuild command used in jobs via --job-eb-cmd (PR #4016)
      • changes
        • ...
    • easyblocks
      • bug fixes
        • also symlink cert.pem in from-source OpenSSL installation (if it exists) (PR #2735)
      • enhancements
        • enhance SuperLU easyblock to support building on top of FlexiBLAS and be compatible with SuperLU v5.3 (PR #2722)
        • enhance Clang easyblock to support also installing Python bindings (PR #2721 + PR #2725)
        • modify FFTW's sanity check step to allow checking only for MPI parts of FFTW installation (PR #2724)
        • add support to ConfigureMake for tweaking (first part of) test command via 'test_cmd' (PR #2726 + PR #2737)
        • enhance MrBayes easyblock with custom sanity check command (PR #2727)
        • update cudnnarch string templates used to compose source tarball names from cuDNN 8.3.3 onwards (PR #2728)
        • add sanity check command to OpenSSL wrapper easyblock to verify that system certificates are available to OpenSSL (PR #2735)
        • ignore exit code of pkg-config command in OpenSSL wrapper easyblock, since pkgconf exits with a non-zero exit code if the OS package is not installed (PR #2736)
      • updates
        • update NEURON easyblock to use CMakeMake for recent versions (PR #2304)
        • update sanity check in OpenMPI easyblock to support OpenMPI v5.0.0 (PR #2709)
        • update ABAQUS easyblock for ABAQUS 2022 (PR #2716)
        • update TensorFlow easyblock for version 2.8.0 (PR #2723)
      • changes
        • ...
      • new software
        • add custom easyblock for FFTW.MPI (PR #2724)
    • easyconfigs
      • bug fixes
        • fix RepeatMasker-4.1.2-p1 easyconfig by moving the database configure step to be after installation (PR #15280)
        • add hwloc dependency to recent tbb easyconfigs (PR #15294)
        • OpenMPI 4.1.1: patch and build --with-cuda=internal (PR #15528 + PR #15589)
          • fixes segfaults when using CUDA buffers (issue #14801)
          • should UCX-CUDA re-enable the OpenMPI warning if libcuda.so.1 (provided with GPU driver) is missing?
          • performance-related patch is being pushed upstream by Bart, cfr. https://github.com/open-mpi/ompi/pull/10364
          • Bart is planning to also propose inclusion of small cuda.h upstream
        • add missing dependencies + switch to non-static build for Arriba v2.1.0 (PR #15623)
        • add alternative checksums for class, nnet, spatial extensions in R v4.2.0 easyconfig (PR #15619)
        • also build shared library + fix $PYTHONPATH for gmsh 4.9.0 (PR #15579)
        • fix download of thrift 0.12.0 for Arrow 0.16.0 (PR #15597)
      • enhancements
        • allow external tools to be located elsewhere for ETE (PR #15578)
        • add additional sanity check commands for IQ-TREE v2.2.1 (PR #15596)
        • add csh -> tcsh symlink in recent tcsh easyconfigs (PR #15571)
      • (noteworthy) new software
      • noteworthy software updates
      • changes
        • switch from pkg-config to pkgconf as build dependency for OpenSSL wrapper easyconfigs (PR #15616)
        • install sklearn meta-package with scikit-learn v1.0.1 (PR #15613)
  • to merge/fix/tackle soon
    • framework
      • reported bugs / bug fixes
        • make sure that ARCH constant has 'aarch64' (rather than 'arm64') as value on macOS ARM (PR #4018)
        • tweak eb wrapper script to correctly handle errors when python command fails to run (PR #4019)
        • easyblock PR patches never applied when running in dry run mode (issue #4017)
      • enhancements
        • add Generation module naming scheme (PR #3547)
        • update prepare_rpath_wrappers to enable shipping wrappers with a module (WIP) (PR #4003)
          • relevant for EESSI project
          • see also companion PR for GCC easyblock (PR #2638)
        • time to switch to icx/icpx for Intel C/C++ compilers (issue #4009)
          • should we look into adding support for additional toolchain options to control which C/C++/Fortran compiler is used?
        • add support for easystack file that contains easyconfig filenames (PR #4021)
      • changes
        • ...
    • easyblocks
      • bug reports/fixes
        • fix extension filter for Perl packages (PR #2699)
        • make Amber easyblock aware of FlexiBLAS (PR #2720)
        • Bundle easyblock ignores make_module_req_guess() from components (issue #2733)
      • enhancements
        • enable building of shared library for Libint 2.7+ (PR #2738)
        • don't allow an easyblock that overrides run_all_steps() to be used in a bundle (PR #2732)
          • context: GROMACS does not play nice as a component (PR #2731)
        • warn when building CUDA-enabled software if libcuda is not present (issue #4022)
      • updates
        • update LAMMPS easyblock for LAMMPS/29Oct20 (PR #2213)
      • new software
        • ...
      • changes
        • ...
    • easyconfigs
      • still over 600 open easyconfig PRs, we're way overdue a significant cleanup round...
      • bug fixes
        • pygmo tries to install stuff in Python install directory (PR #15631)
          • should EasyBuild configuration on generoso be changed to use read-only installation directories to help catch this?
        • switch to Rust 1.60.0 build dependency for bamtofastq, since the build fails with Rust 1.52.1 (E0658) (PR #15636)
      • enhancements
        • support offline installation of Rust (see also issue #13548)
      • new software
      • noteworthy software updates
        • RoseTTAFold v1.0.0 (PR #13795)
        • TensorFlow v2.7.1 (WIP) (PR #14990)
          • failing tests are for new features of TensorFlow
          • ignore these failing tests for now, report upstream
        • hipSYCL v0.9.2 (PR #15074)
        • Qt6 (PR #15096)
        • AlphaFold v2.2.0 (PR #15129)
          • failing jax tests on A100?
          • Sam: easy way out, just use older jax version that is still supported
        • PyTorch v1.11.0 (PR #15137)
        • CP2K v9.1 with foss/2021a (PR #15146) and intel/2021a (PR #15147)
      • changes
        • suggestion: always use PythonBundle instead of PythonPackage (issue #15639)
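
The Apple Silicon items under the framework bug fixes above (PR #4014, PR #4018) come down to treating arm64 and aarch64 as the same 64-bit Arm architecture. The following is a minimal sketch of that idea, not the code from those PRs: the helper name, the alias table, and the "unknown" fallback are hypothetical. It relies only on `platform.machine()` from the Python standard library.

```python
import platform

# Canonical names used in this sketch only (hypothetical, not EasyBuild's constants).
AARCH64 = "aarch64"
X86_64 = "x86_64"
UNKNOWN = "unknown"

# Raw machine strings mapped to one canonical name per architecture.
_MACHINE_ALIASES = {
    "aarch64": AARCH64,  # commonly reported on Linux for 64-bit Arm
    "arm64": AARCH64,    # reported by macOS on Apple Silicon (M1)
    "x86_64": X86_64,
    "amd64": X86_64,
}


def normalize_cpu_arch(machine=None):
    """Map a raw machine string to a canonical architecture name.

    If no string is given, the value of platform.machine() is used.
    Unrecognized values map to UNKNOWN.
    """
    if machine is None:
        machine = platform.machine()
    return _MACHINE_ALIASES.get(machine.lower(), UNKNOWN)


if __name__ == "__main__":
    print(normalize_cpu_arch())
```

With a mapping like this, code that compares against a single canonical value keeps working regardless of whether the operating system reports arm64 or aarch64.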

2022a update of common toolchains

  • ~55 PRs merged already for easyconfigs in this generation
  • candidates for 2022a toolchains included with EasyBuild v4.5.5:
    • foss/2022.05 (see PR #15561)
      • GCC 11.3.0 (latest 11.x) + binutils 2.38
        • with ld.bfd as default linker (rather than ld.gold, which is no longer actively maintained)
      • OpenMPI 4.1.4 + UCX 1.12.1 + PMIx 4.1.2 + libfabric 1.15.1
      • FlexiBLAS 3.2.0 + OpenBLAS 0.3.20 + BLIS 0.9.0
      • FFTW 3.3.10
      • ScaLAPACK 2.2.0
    • intel/2022.05 (see PR #15485)
      • GCCcore 11.3.0 + binutils 2.38 as base
      • intel-compilers 2022.1.0 (latest)
      • impi 2021.6.0 (latest)
      • imkl 2022.1.0 (latest)
  • "major apps" with 2022.05 candidate toolchains:
  • TODO: test more software on top of candidate toolchains, incl. CP2K, OpenFOAM, TensorFlow, PyTorch, ...

Q&A

  • problematic test suites for PyTorch, TensorFlow, AlphaFold
    • should we still be running the tests by default?
      • probably yes, even though they can take a lot of time
    • looking into allowing a handful of tests to fail
      • for PyTorch: allow max. of 10 "failed!" lines in output of test suite
      • similar approach for TensorFlow makes sense
  • PETSc PR #15519: should we have all those dependencies included?
    • we usually try to provide "fat" installations by default
    • should be using existing PETSc easyblock though...
  • Sebastian: should MultilevelEstimators PR #15630 use a proper easyblock for installing Julia packages?
    • JSC has an easyblock to install Julia packages "properly"
    • related (Mikael): should we get back to building Julia from source?
      • (Bart) that's being done at ComputeCanada, but you need to be very careful, and make sure you pick up their patched dependencies
      • probably not worth the effort in terms of performance of the installation (since Julia does lots of JIT compilation)
    • we should also hide the CPU arch versionsuffix in Julia easyconfigs using same approach as we use for Java (multi-source easyconfig based on CPU arch)
    • note that the easyconfig in the MultilevelEstimators PR doesn't have any source files specified, so the sources are just pulled in during the installation... (which we should definitely avoid)