WeeklyTelcon_20171003

Geoffrey Paulsen edited this page Jan 9, 2018 · 1 revision

Open MPI Weekly Telcon


  • Dialup Info: (Do not post to public mailing list or public wiki)

Attendees

  • Geoff Paulsen (IBM)
  • Geoffroy Vallee (ORNL)
  • Brian Barrett
  • Howard Pritchard
  • Joshua Hursey
  • Todd Kordenbrock
  • Joshua Ladd (Mellanox)
  • Thomas Naughton

Agenda

Review v2.0.x Milestones v2.0.4

  • Going to switch v2.0.x to critical fixes only.
    • The only critical fix we know of now is the MAdvise fix.
    • IN.
  • Ask people to move to v2.1.x or v3.0.0
  • If nothing else critical comes up, Howard and Jeff will make an RC soon.
  • Targeting Oct 21st for release.
    • Still on target.
  • Iterating a bit on disabling CUDA inside of hwloc (PR 4249 on this branch).
    • Issue 4248 - disabling CUDA in hwloc

Review v2.x Milestones v2.1.2

  • v2.1.3 (unscheduled, but probably Jan 19, 2018)
    • PR4172 - a mix between feature / bugfix.
  • Not much happening now. Some PRs outstanding:
    • 4229 - Geoff to review.

Review v3.0.x Milestones v3.0

  • v3.0.1 - Opened the branch for bugfixes Sep 18th.
    • Still targeting End of October for release of v3.0.1
    • Everything ready to push has been pushed.
  • Branched v3.1 last night, but forgot to build nightly tarballs.
    • Building now.
    • Still getting a ton of TCP errors.
      • Mohan and George are going back and forth on a solution.
      • Both on master PR 4263.
  • Want to create the first RC soon, but need PR 4263 in first.
  • PMIx 2.1 should get in on time for v3.1.
    • Will talk about this on Thursday.
    • At EuroMPI, folks talked about a lesser-used PMIx feature to help support Sessions.
    • Minus this change
    • One new feature is cross-version compatibility.
    • PMIx v2.x will support one step back, i.e. PMIx v1.x. Not sure whether it supports v1.0, v1.1, and v1.2.
    • Discuss next week exactly what this supports.
    • Useful for a Slurm build with an older PMIx.

Review Master Master Pull Requests

  • Minus the TCP BTL issues, looking good.
  • A lot of hangs going on too, but can't debug until TCP is fixed.
  • The proc_hostname code has been incorrect for 3 years; git bisect traced it to a PMIx change from 2 weeks ago.
    • Gilles posted a fix.
    • IN.

MTT / Jenkins Testing Dev

  • The Python client doesn't have nightly snapshot integration.
    • Need this, since nightly snapshots are most of the release testing.

This week Discussion Points.

  • Website - openmpi.org
    • Brian is trying to make things more automated (so one can check out the repo, etc.), but the repo is too large.
    • The majority of the problem is the tarballs, and those are already stored in S3.

Oldest PR

Oldest Issue

Next face-to-face meeting

  • Jan / Feb
  • Possible locations: San Jose, Portland, Albuquerque, Dallas

Status Updates:

Status Update Rotation

  1. Mellanox, Sandia, Intel
  2. LANL, Houston, IBM, Fujitsu
  3. Amazon
  4. Cisco, ORNL, UTK, NVIDIA

Back to 2017 WeeklyTelcon-2017
