
WeeklyTelcon_20191126


Open MPI Weekly Telecon


  • Dialup Info: (Do not post to public mailing list or public wiki)

Attendees (on Web-ex)

  • Noah Evans (Sandia)
  • Geoffrey Paulsen (IBM)
  • Akshay Venkatesh (NVIDIA)
  • Austen Lauria (IBM)
  • Brian Barrett (AWS)
  • Edgar Gabriel (UH)
  • Harumi Kuno (HPE)
  • Howard Pritchard (LANL)
  • Josh Hursey (IBM)
  • Matthew Dosanjh (Sandia)
  • Michael Heinz (Intel)
  • Thomas Naughton (ORNL)
  • Todd Kordenbrock (Sandia)

not there today (I keep this for easy cut-n-paste for future notes)

  • Artem Polyakov (Mellanox)
  • William Zhang (AWS)
  • Jeff Squyres (Cisco)
  • Brendan Cunningham (Intel)
  • George Bosilca (UTK)
  • David Bernhold (ORNL)
  • Brandon Yates (Intel)
  • Charles Shereda (LLNL)
  • Erik Zeiske
  • Joshua Ladd (Mellanox)
  • Mark Allen (IBM)
  • Matias Cabral (Intel)
  • Nathan Hjelm (Google)
  • Ralph Castain (Intel)
  • Xin Zhao (Mellanox)
  • mohan (AWS)

Agenda/New Business

New Items to discuss:

3.0.5 and 3.1.5 shipped yesterday.

  • Looks like something is wonky with announce mailing list.
  • Brian and Jeff will investigate.
  • Planning for no new fixes on 3.x, unless super critical

v4.0.x PRs:

  • Cherry-pick NO_Op fix to v4.0.x
    • Geoff will do this tomorrow.
  • PR 7166
    • This is rather large.
    • Just a bugfix to make ORTE work as documented? Is it important for v4.0.x?
    • We forgot some feature. It's useful for debugging; maybe it can be pushed to v4.1.x.
  • PR 7116
    • Ensure no backwards compat issues?
    • Howard will send email to ARM.
  • PR 7149 - Geoff will go look at it.
  • Reminder: Please don't set milestone if PR is targeted to master.

Do we want a v4.1.x release?

  • A few new enhancements desirable.
  • Added a Target v4.1.x label
    • Many new enhancements / features would be useful
    • PR 7151 - This is indeed a performance enhancement.
    • PR 7173
    • Should look into amount of work back-porting features to a release branch.
    • It would be a major thing. But we always say we don't take features into a release branch that's already out there.
      • People continue to open PRs with features.
    • Two issues:
      • One - we've really stalled out v5.0.0
      • Two - are performance features really an issue to pull in?
        • PR 7151 - seems to be borderline bugfix / feature / risky

Target labels on PRs: set just one, for the branch the PR is going into.

Supercomputing Discussion:

  • The Supercomputing conference was last week

Containers

  • The Shifter containers folks spoke about challenges with HPC + containers.
    • The container community is starting to figure out that HPC + containers is not well optimized.
    • No good answer for needing an MPI built for your platform while sucking in libs from the base system.
    • Some have 4-year-old containers and are hitting libc issues.
    • Software development is "hard", and containers don't solve all of the problems.
  • Any Open MPI requests for playing better with containers?
    • Link to fewer libraries.
    • Don't link to network libraries.

PRRTE discussion at Supercomputing:

  • Probably should get down to supporting only one runtime.
  • Josh, Ralph, Jeff, Brian, and Tom
  • Met one day to talk about PRRTE / ORTE and what to do.
  • PRRTE probably makes the most sense
    • Git submodules are much better than Subversion externals.
    • Being part of the OMPI package is limiting.
    • Boxes in the runtime to prevent ORTE from taking off on its own.
    • Not a huge operation.
      • PMIx would be a first class citizen
      • Still bundle PRRTE in tarballs, so we could launch over ssh.
      • Have to add additional nightly testing to catch issues.
    • Talked about not being a bash script.
    • Ralph said he had most of this working on a branch.
  • PRRTE only has external hwloc, pmix, and libevent.
    • If you pull this in, we will need to build PRRTE with the internal versions of those.
    • May accelerate need to kill off internals in Open-MPI to simplify things.
  • Release tarballs:
    • Still drop these into the tarball for convenience?
    • Should discuss, perhaps a version of the tarball that has everything?
  • Possibly do a survey again, to just have everything external?
  • PRRTE Testing
    • Can develop some PMIx unit test(s) for the PMIx library and for resource managers (a sketch follows this list)
      • To mimic the way that Open MPI uses PMIx.
      • PMIx acceptance tests in Open MPI project
    • Currently we don't have many runtime tests.
      • Mapping, binding, output filename, etc.
      • Use these tests to
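A rough idea of what such a PMIx acceptance test could look like, mimicking the calls Open MPI makes at startup (PMIx_Init, PMIx_Get of the job size, a PMIx_Fence, PMIx_Finalize). This is a hypothetical sketch, not an existing test in the OMPI or PMIx repositories:

```c
/* Hypothetical PMIx acceptance-test skeleton mimicking Open MPI's startup
 * sequence; illustration only, not an existing test. */
#include <stdio.h>
#include <string.h>
#include <pmix.h>

int main(void)
{
    pmix_proc_t myproc, wildcard;
    pmix_value_t *val = NULL;
    pmix_status_t rc;

    /* Connect to the local PMIx server (e.g. a PRRTE/ORTE daemon or RM plugin). */
    if (PMIX_SUCCESS != (rc = PMIx_Init(&myproc, NULL, 0))) {
        fprintf(stderr, "PMIx_Init failed: %s\n", PMIx_Error_string(rc));
        return 1;
    }

    /* Look up the job size, as Open MPI does when wiring up MPI_COMM_WORLD. */
    PMIX_PROC_CONSTRUCT(&wildcard);
    (void)strncpy(wildcard.nspace, myproc.nspace, PMIX_MAX_NSLEN);
    wildcard.rank = PMIX_RANK_WILDCARD;
    if (PMIX_SUCCESS == (rc = PMIx_Get(&wildcard, PMIX_JOB_SIZE, NULL, 0, &val))) {
        printf("rank %u of %u\n", myproc.rank, (unsigned)val->data.uint32);
        PMIX_VALUE_RELEASE(val);
    }

    /* Job-wide barrier, like the fence around the modex exchange. */
    rc = PMIx_Fence(&wildcard, 1, NULL, 0);

    PMIx_Finalize(NULL, 0);
    return (PMIX_SUCCESS == rc) ? 0 : 1;
}
```

Run under any PMIx-enabled launcher (e.g. prun, or a resource manager's PMIx plugin) to exercise the same server-side code paths Open MPI relies on.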

PRRTE with/without Open-MPI was discussed at PMIx BOF

  • Questions and discussion; people were interested.

--- OLD ---

New PRRTE launcher proposal on mailing list.

Thomas - took a look to make some high-level observations

  • A comment about stability / testing.
  • There is no explicit testing for ORTE, but it gets tested via Open MPI CI / MTT.
  • PRRTE has less testing, because it's not directly tested.
  • PRRTE will be needed by the PMIx community.
    • It's largely the same community, and a shame to duplicate the effort.
  • Binding options are the same and are there in PRRTE.
  • Singleton and ??? frameworks are not in PRRTE, because they're not needed for PMIx
    • A ticket open on singletons / PMIx_Spawn()
  • George's code for reliable connections will get pushed upstream soon, when it's ready.
  • DVM has switched over to PRRTE.

PRRTE / ORTE Discussion

  • Concerns about supporting another project.
    • It adds another level of overhead coordinating / synchronizing with PRRTE/PMIX community.
  • It is valuable to have a runtime system that's divorced from MPI.
    • Don't know how to balance these two, because neither is wrong.
  • We need a launcher, but don't want to support more than one.
    • But we support many launchers: Slurm, LSF, Flux, etc.
    • Those other launchers have companies and organizations behind them, and they support them through Open MPI.
  • A compromise between the two would be to create an ORTE with all MPI removed.
    • easy to make a dist tarball
    • would skirt many political issues
  • Shouldn't support both ORTE and PRRTE.
    • Need to use the high level interfaces for PMIx, so we can move from version to version.
    • So this gets back to making PMIx the first class citizen.
      • Has to happen if we are working away from ORTE.

OLD Discussion from previous weeks:

  • All of this is in the context of v5.0

  • Intel is no longer driving PRRTE work, and Ralph won't be available for PRRTE much either.

  • PRRTE will be a good PMIx development environment, but it is no longer focused on being a scalable and robust launcher.

  • OMPI community could come into PRRTE, and put in production / scalability testing, features, etc.

  • Given that we have not been good at contributing to PRRTE (other than Ralph), there's another proposal

    • There's been a drift from ORTE / PRRTE, so transitioning is risky.
  • Step 1. Make PMIX a first class citizen

    • Still good to keep PMIx as a static framework (no more glue; still under orte/mca/pmix, but it basically just passes through and makes PMIx_ calls directly).
    • Allows us to still have an internal backup PMIx if no external PMIx is found.
  • Step 2. We can whittle down ORTE, since PMIx does much of this.

  • Two things PRRTE won't care about are scale and all binding patterns.

  • Only recent versions of SLURM have PMIx

  • Need to continue to support ssh.

    • Not just core PMIx, still need daemons for SSH to work, but they're not part of PMIx.
    • Part of ORTE that we wouldn't be deleting.
  • What do Altair PbsPro and open-source PbsPro do?

    • Torque is different from PbsPro
  • Are there old systems that we currently support but no longer care about, whose support we could discontinue in v5.x?

    • Who supports PMIx, and who doesn't
  • If PMIx becomes a first class citizen and rest of code base just makes PMIx calls, how do we support these things?

    • mpirun would still have to launch orteds via plm.
    • srun wouldn't need to.
    • But this is how it works today. Torque doesn't support PMIx at all, but TM just launches ORTEDs
    • ALPS - aprun ./a.out - requires a.out to connect up to ALPS daemons.
      • Cray still supports PMI - someone would need to write a PMI -> PMIx adapter.
    • ORTE does not have the concept of persistent daemons
  • Is there a situation where we might have a launcher launching ORTEDs and we'd need to relay PMIx calls to the correct PMIx server layer?

    • Generally we won't have that situation, since the launcher won't launch ORTEds.
  • George's work currently depends on PRRTE

    • If ORTEDs provides PMIx_Events, would that be enough?
      • No, George needs PRRTE's fault-tolerant overlay network.
      • George will scope the effort to port that feature from PRRTE to ORTE.
  • ACTION - Please gather a list of resource managers and tools that we care about supporting in Open-MPI v5.0.x

  • Today - Howard

    • Summary - make PMIx a first class citizen.
    • Then whittle away ORTE as much as possible.
    • We think the only one who uses PMI-1 and PMI-2 might be Cray.
      • Howard doesn't think Cray is even going to go that direction; they might be adopting PMIx as their future direction. A good Supercomputing question.
      • Most places will do whatever SLURM does.
      • What will MPICH do? We suspect PMIx.
    • Howard thinks that by the time Open-MPI v5 gets out
    • Is SLURM + PMIx dead? No, it's supported, just not all of the
  • George looked into scoping the amount of work to bring the reliable overlay network over from PRRTE

    • PRRTE frameworks not in
  • Howard also brought up that Sessions only works with PRRTE right now, so would need to backport this as well.

  • The only things that depend on PRRTE are Sessions, reliable connections, and resource allocation support (the thing Geoffroy Vallee was working on before). Howard will investigate.

  • William Zhang has not yet committed some graph code for reachability similar to usnic.
    • Brian/William will get with Josh Hursey to potentially test some more.
    • Not sure what the wanted behavior of the netlink reachability component is.
    • Wasn't detecting Mark's
    • Linux is always going to give you the local route before localhost. This is one place where using the reachability framework changes behavior.
    • Options - where do we want to fix this?
    1. Report it truthfully (even if you use 192.168...., Linux will route over the localhost device).
      • Can say that if these two addresses are the same, it's always "reachable" (see the sketch after this list).
      • Could put this down in the framework itself.
      • This is what the netmasks component does today: they're the same, so it reports high reachability.
    2. Can handle it in the higher layers
    3. Users could always specify localhost, or we could specify it for them.
      • You don't use a device, you give the OS a hint of what device it should use.
    • What does usnic do?
      • usnic ONLY uses reachability for remote hosts.
    • We encourage customers to say ifinclude localhost.
      • This path has worked for years, and should probably keep it working.
  • What happens if I have 3 devices: loopback and 2 ethernet devices not wired together?
    • What if I say my source is the 192.x address, but the dest is on the 10.x path?
    • Regardless of what reachability tells us, will this actually work?
      • netlink will say it WON'T work, but the OS will just make it work by routing over the loopback device.
  • Probably the right answer is to special-case this in the netlink module, to return a
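As a purely illustrative sketch of option 1 above (this is not the actual opal reachable/netlink component, and the helper name is made up): a destination address that the host itself owns can simply be treated as reachable, since Linux will route it over the loopback device regardless of which interface owns it.

```c
/* Illustrative sketch only (hypothetical helper, not Open MPI code):
 * treat a destination that matches one of this host's own IPv4 addresses
 * as reachable, because the kernel will route it over the lo device. */
#include <arpa/inet.h>
#include <ifaddrs.h>
#include <netinet/in.h>
#include <stdbool.h>
#include <stdio.h>
#include <sys/socket.h>

static bool dest_is_local_address(const struct in_addr *dest)
{
    struct ifaddrs *ifap = NULL;
    bool local = false;

    if (getifaddrs(&ifap) != 0) {
        return false;
    }
    for (struct ifaddrs *ifa = ifap; ifa != NULL; ifa = ifa->ifa_next) {
        if (ifa->ifa_addr == NULL || ifa->ifa_addr->sa_family != AF_INET) {
            continue;
        }
        const struct sockaddr_in *sin = (const struct sockaddr_in *)ifa->ifa_addr;
        if (sin->sin_addr.s_addr == dest->s_addr) {
            local = true;   /* same host: kernel uses the loopback route */
            break;
        }
    }
    freeifaddrs(ifap);
    return local;
}

int main(int argc, char **argv)
{
    struct in_addr dest;

    if (argc < 2 || inet_pton(AF_INET, argv[1], &dest) != 1) {
        fprintf(stderr, "usage: %s <ipv4-address>\n", argv[0]);
        return 1;
    }
    printf("%s %s a local address\n", argv[1],
           dest_is_local_address(&dest) ? "is" : "is not");
    return 0;
}
```

The real check could live inside the reachability framework itself or be handled at a higher layer (option 2); either way, the test is this simple.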

Face to face

  • It's official! Portland, Oregon, Feb 17, 2020.
    • Safe to begin booking travel now.
  • Please register on Wiki page, since Jeff has to register you.
  • Date looks good. Feb 17th right before MPI Forum
    • 2pm Monday, and maybe most of Tuesday
    • Cisco has a Portland facility and is happy to host.
    • But willing to step aside if others want to host.
    • About a 20-30 minute drive from the MPI Forum; attendees will probably need a car.

Infrastructure

Submodule prototype

  • No update 11/12

  • Can we just turn on lock bot / probot until we can get the AWS bot online?

  • OMPI has been waiting for some git submodule work in Jenkins on AWS.

    • Need someone to figure out why Jenkins doesn't like Jeff's PR.
      • Anyone with a GitHub account on the ompi team should have access.
      • PR 6821
      • Apparently Jenkins isn't behaving as it should.
    • Three pieces: Jenkins, CI, bot.
      • AWS has a libfabric setup like this for testing.
      • Issue is that they're reworking the design, and will roll it out for both libfabric and Open MPI.
    • William Zhang talked to Brian
      • Not something AWS team will work on, but Brian will work on it.
    • Jeff will talk to Brian as well.
  • Howard and Jeff have access to Jenkins on AWS. Part of the problem is that we don't have much expertise on Jenkins/AWS.

    • William will probably be administering Jenkins/AWS or communicating with those who will.
  • Merged the --recurse-submodules update into the ompi-scripts Jenkins script as a first step. Let's see if that works.

  • Modular thread re-write (Noah)

    • UGNI and Vader BTLs were getting better performance, not sure why.
    • For the modular threading library, it might be interesting to decide at compile time vs. runtime.
    • Previously, similar effects seemed to be related to ICACHE.
    • Howard will look at it.

Release Branches

Review v3.0.x Milestones v3.0.4

Review v3.1.x Milestones v3.1.4

  • release v3.0.5 and v3.1.5 tomorrow.

Review v4.0.x Milestones v4.0.2

  • v4.0.3 in the works.
    • Schedule: Originally end of January.
      • PR 1752 may drive an earlier release if UCX is released sooner.
  • PR 7151 - enhancement -
  • UCX 1.7 release schedule - there was an RC 1.
    • Artem can check.
    • There's a problem in Open MPI v4.0.2 that packagers will hit with UCX 1.7.
      • PR 1752 may drive an earlier release if UCX is released sooner.

v5.0.0

  • Schedule: April 2020?
    • Wiki - go look at items, and we should discuss a bit in weekly calls.
    • Some items:
      • MPI-1 removed stuff.

Review Master Pull Requests

CI status

  • IBM's PGI test has NEVER worked. Is it a real issue, or local to IBM?
    • Austen is looking into it.
  • Absoft 32-bit Fortran failures.

Dependencies

PMIx Update

ORTE/PRRTE

  • No discussion this week.

MTT


Back to 2019 WeeklyTelcon-2019
