ORTE -> PRRTE changes for users need to be documented, users educated #7668

Closed
awlauria opened this issue Apr 29, 2020 · 12 comments

Comments

@awlauria
Contributor

awlauria commented Apr 29, 2020

From ompi email chain, sent by @rhc54:

So here is an interesting consequence of moving from ORTE to PRRTE. In ORTE, you 
could express any mapping policy as an MCA param - e.g., the following:

OMPI_MCA_rmaps_base_mapping_policy=core
OMPI_MCA_rmaps_base_display_map=1

would be the equivalent of a cmd line that included "--map-by core --display-map"

When defining what we wanted on the OMPI v5 cmd line, we removed some options like 
--display-map and replaced them with modifiers, so the above would have been replaced with:

OMPI_MCA_rmaps_base_mapping_policy=core:display

The move to PRRTE, however, means more than just changing the "OMPI" to "PRRTE".
PRRTE doesn't support setting the default mapping policy to include "display" as that
would mean we would be reporting the map for every job that was ever launched.
Definitely not something the persistent DVM users would appreciate!

So if you put:

PRRTE_MCA_rmaps_default_mapping_policy=core:display

(note the name change!!!) in your environment, you are going to get an error when you
execute "mpirun":

=====
A mapping policy modifier was provided that is not supported as a default value:

 Modifier:  display

You can provide this modifier on a per-job basis, but it cannot
be the default setting.
=======

And you will error out. However, it is perfectly okay to put "--map-by core:display" on your
cmd line - that is legit and understood as it only applies to that specific job.

It's these changes, plus the name changes (e.g., we replace "base" with "default" to
emphasize these are ONLY the default settings), that will need to be communicated.
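
The rule described in the email can be sketched as a small check: a modifier such as "display" is accepted for a single job but rejected in a default setting. This is only an illustration, not PRRTE's actual code. The function name, the set of per-job-only modifiers (only "display" is confirmed by the error message above), and the simplified "policy:modifier,modifier" parsing are all assumptions.

```python
# Hypothetical sketch: only "display" is confirmed by the quoted error
# message; the real set of per-job-only modifiers may be larger.
PER_JOB_ONLY_MODIFIERS = {"display"}


def check_default_mapping_policy(value):
    """Split a default mapping policy and reject per-job-only modifiers.

    Parsing is deliberately simplified (assumption): the text before the
    first ':' is the policy, everything after it is a list of modifiers.
    """
    policy, _, rest = value.partition(":")
    modifiers = [
        m.strip().lower()
        for m in rest.replace(":", ",").split(",")
        if m.strip()
    ]
    rejected = [m for m in modifiers if m in PER_JOB_ONLY_MODIFIERS]
    if rejected:
        raise ValueError(
            "modifier not supported as a default value: " + ", ".join(rejected)
        )
    return policy, modifiers
```

Under these assumptions, "core" passes as a default, while "core:display" raises an error and would have to be given per job with "--map-by core:display".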
@jsquyres
Member

jsquyres commented Jun 2, 2020

@jsquyres volunteered to take a first pass at this.

@jjhursey
Member

jjhursey commented Jun 2, 2020

Related PRRTE items:

@gpaulsen
Member

gpaulsen commented Apr 5, 2021

What's the status of this documentation now?

How much more documentation do we need to write before rc1?

@naughtont3
Contributor

Dropping a note here based on a user interaction today: the details on map/bind have been mentioned, but we also need to document how to set those via MCA envvars (e.g., PRTE_MCA_rmaps_default_mapping_policy=core, PRTE_MCA_hwloc_default_binding_policy=package, etc.).
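
A minimal sketch of setting those two envvars from a launcher script, assuming the PRTE_MCA_ names given in this comment. The helper name and its arguments are hypothetical; it only builds the environment and does not start anything.

```python
import os


def with_prte_defaults(mapping="core", binding="package", base_env=None):
    """Return a copy of the environment with the two default policies set.

    The variable names come from the comment above; the helper itself is
    a hypothetical illustration.
    """
    env = dict(os.environ if base_env is None else base_env)
    env["PRTE_MCA_rmaps_default_mapping_policy"] = mapping
    env["PRTE_MCA_hwloc_default_binding_policy"] = binding
    return env


# Possible use (not executed here; "./a.out" is a placeholder program):
#   import subprocess
#   subprocess.run(["mpirun", "-n", "4", "./a.out"], env=with_prte_defaults())
```

Because these are defaults, per-job modifiers such as "display" should stay on the command line rather than in these values.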

@rhc54
Contributor

rhc54 commented Mar 7, 2023

You can document those, but that won't help users with existing scripts and/or default param files. I'd suggest you also (or instead - your choice) add the required translation logic to the schizo/ompi component to make that transparent.
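
To make the idea of translation concrete, here is a rough sketch of a rename table applied to an environment. It is not the schizo/ompi component (which is C code) and does not reflect its design. The table holds only the one rename this thread confirms, and the function name and the rule that an explicitly set new-style name wins are assumptions.

```python
# Hypothetical rename table: only this pair is confirmed in the thread.
RENAMES = {
    "OMPI_MCA_rmaps_base_mapping_policy": "PRTE_MCA_rmaps_default_mapping_policy",
}


def translate_env(env):
    """Return (translated_env, notes) with known legacy names renamed.

    Unknown variables are passed through untouched. If the new-style name
    is already set, its value is kept (an assumed precedence rule).
    """
    translated = dict(env)
    notes = []
    for old, new in RENAMES.items():
        if old in translated:
            value = translated.pop(old)
            translated.setdefault(new, value)
            notes.append(f"{old} -> {new}")
    return translated, notes
```

A table like this has to be maintained by hand for every release, which is the main cost of the approach.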

@naughtont3
Contributor

Good suggestion. The specific case today was that the envvars will have a different prefix, namely OMPI_MCA_xxx moving to PRTE_MCA_xxx. The xxx part is likely different as well, but that can be found via prte_info and friends. Just wanted to have a note on this ticket related to the RTE docs.

@rhc54
Contributor

rhc54 commented Mar 7, 2023

It's a touchy issue to navigate. If you require users to look for things by layer (e.g., using ompi_info, prte_info, or pmix_info), then you are asking a naive MPI user to have to know the OMPI code architecture, which is a pretty big ask. Someone was supposedly going to "unify" the user-facing side of things to help alleviate that burden, but that never happened.

Likewise, telling users what prefix to use for which param can lead to confusion - e.g., if I need to set the include/exclude on TCP transports, I now have to set that for three different prefixes, each of them having a different framework name and/or component. Pretty burdensome. And it again forces the user to become familiar with the OMPI code architecture.

Translation would at least help alleviate things, though it can be fragile as params in the underlying layers can come/go, especially between releases. Still, better than having a user thrash as they can't figure out why their MCA param no longer works.

No perfect answer, I fear.

@qkoziol
Contributor

qkoziol commented Jun 16, 2023

I've read this issue's history and believe that the requirement here is to document the change in OMPI's online documentation. Does anyone think otherwise?

@qkoziol
Contributor

qkoziol commented Jun 28, 2023

@naughtont3 - What are the acceptance criteria for moving this ticket to the "done" column?

@qkoziol qkoziol self-assigned this Aug 29, 2023
qkoziol added a commit to qkoziol/ompi that referenced this issue Sep 2, 2023
Document MCA parameter changes from move from ORTE -> PRRTE.

Addresses Github issue open-mpi#7668

Signed-off-by: Quincey Koziol <[email protected]>
@qkoziol
Contributor

qkoziol commented Sep 2, 2023

PR that documents the MCA parameter changes up for review: #11890

@jsquyres @naughtont3 @gpaulsen @janjust @rhc54

jsquyres pushed a commit to qkoziol/ompi that referenced this issue Sep 6, 2023
Document MCA parameter changes from move from ORTE -> PRRTE.

Addresses Github issue open-mpi#7668

Signed-off-by: Quincey Koziol <[email protected]>
qkoziol added a commit to qkoziol/ompi that referenced this issue Sep 9, 2023
Addresses Github issue open-mpi#7668

Co-authored-by: [email protected]

Signed-off-by: Quincey Koziol <[email protected]>
qkoziol added a commit to qkoziol/ompi that referenced this issue Sep 11, 2023
Addresses Github issue open-mpi#7668

Co-authored-by: [email protected]

Signed-off-by: Quincey Koziol <[email protected]>
qkoziol added a commit to qkoziol/ompi that referenced this issue Sep 11, 2023
Addresses Github issue open-mpi#7668

Co-authored-by: [email protected]

Signed-off-by: Quincey Koziol <[email protected]>
(cherry picked from commit 864caf3)
Signed-off-by: Quincey Koziol <[email protected]>
@qkoziol
Contributor

qkoziol commented Sep 11, 2023

PR for merge to 5.0 branch: #11926

@jsquyres jsquyres modified the milestones: v5.0.0, v5.0.1 Oct 30, 2023
@wenduwan
Contributor

5.0.0 released. Closing.

bosilca pushed a commit to bosilca/ompi that referenced this issue Feb 14, 2024
Addresses Github issue open-mpi#7668

Co-authored-by: [email protected]

Signed-off-by: Quincey Koziol <[email protected]>