One method of debugger attach broken #1225
Comments
@rhc54 have you had a chance to look at this?
I've looked at it and have been working on a fix, but no ETA for committing it.
Discussion from today's call... There was a webex discussing this issue, and what to do about it. Two main options emerged:
Restoring the OOB functionality for this one message seems like a lot of work, and it also seems like a step backwards. The consensus seems to be to move forward and use the PMIx notification system (which means: finish implementing the PMIx notification system). @rhc54 is working on it. This will likely take a little time to finish and test. @gpaulsen thinks that IBM may be able to contribute some resources to help. Note, too, that the PMIx notification stuff will be part of PMIx v1.2. Meaning: if we want TotalView attach to work in Open MPI v2.0.0 (as of today: we do), we'll need to update the v2.x branch with PMIx v1.
From Telcon Call: https://github.com/open-mpi/ompi/wiki/WeeklyTelcon_20160112
When this issue is fixed, please also revert the change (on v2.x) from open-mpi/ompi-release#905.
Per discussion on 24 Feb 2016, moving this milestone back to v2.0.1. Rationale: it's a bug fix, and it does not affect our backwards compatibility promises for the 2.x series.
@jsquyres @hppritcha @hjelmn Just an FYI: I noticed that something is broken on the pmix120 component - I'm getting hangs during regular init (i.e., no debugger) on my Mac. Not sure what may have broken, but I'll fix it on Fri.
Any update? Is this a 2.0 blocker? |
We're iterating with Totalview. The first version we sent to them didn't work; we sent them another one last night. |
…thus fixing show_help aggregation. Fixes open-mpi#1467 Restore debugger attach operations Fixes open-mpi#1225 (cherry picked from commit open-mpi/ompi@c146c49) Fix the debugger attach - previous commit had fixed one instance of a check prior to sending the release message, but there was a second code path that included a similar check that was missed. Thanks to John DelSignore for spotting it! (cherry picked from commit open-mpi/ompi@4a62377) Very minor typo (cherry picked from commit open-mpi/ompi@6e6bbfd)
v2.x: ompi/datatype: Fix args of DARRAY
…supported by the pmix120 component, which is not selected by default. All other components will ignore error registration requests, and thus do not support debugger attach when launched via mpirun. Note that direct launched applications will support such attachment, but may not do so in a scalable fashion. Fixes open-mpi#1225
…thus fixing show_help aggregation. Fixes open-mpi#1467 Restore debugger attach operations Fixes open-mpi#1225
In master and v2.x, debugger attach for TotalView is currently broken. The problem is that when we upgraded to PMIx, we removed the OOB support from apps as it was no longer necessary. However, mpirun still sends a message to rank0 indicating that the debugger is ready, thereby releasing rank0 to complete a barrier. With OOB support gone from the apps, that message no longer has a way to reach rank0.
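For anyone not familiar with the release handshake, here is a toy model of it in Python. It is only a sketch and not Open MPI or PMIx code: `Rank0`, `on_debugger_ready`, `wait_for_release`, and `launch` are hypothetical names made up for illustration, and a `threading.Event` stands in for whatever transport carries the release message. It shows why rank 0 stays stuck when the launcher has no way to deliver that message.

```python
import threading


class Rank0:
    """Toy stand-in for rank 0 (hypothetical; not Open MPI code)."""

    def __init__(self):
        # Set once the launcher reports that the debugger is ready.
        self._released = threading.Event()

    def on_debugger_ready(self):
        # Delivery path for the launcher's release message.
        self._released.set()

    def wait_for_release(self, timeout):
        # True if the release arrived, False if the wait timed out.
        return self._released.wait(timeout)


def launch(deliver_release, timeout=0.2):
    """Run one toy launch; report whether rank 0 was released."""
    rank0 = Rank0()

    def launcher():
        # Stand-in for the launcher: the debugger is ready, so try to
        # release rank 0. With no delivery path, nothing reaches it.
        if deliver_release:
            rank0.on_debugger_ready()

    thread = threading.Thread(target=launcher)
    thread.start()
    released = rank0.wait_for_release(timeout)
    thread.join()
    return released


if __name__ == "__main__":
    print("with delivery path:   ", launch(True))
    print("without delivery path:", launch(False))
```

In the second call rank 0 only comes back because the toy uses a timeout; a process waiting without one would simply sit there.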
There are several ways of fixing this; there's ongoing discussion to pick the best one.
Just to summarize: