Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

hydra: refactor kvs and fix duplicate keys #6564

Merged
merged 4 commits into from
Jun 23, 2023
Merged

Conversation

hzhou
Copy link
Contributor

@hzhou hzhou commented Jun 20, 2023

Pull Request Description

Refactor kvs to hide internals in lib/pmiserv_common.c.

Always add new keypair to the head, so that we always find the latest
key pair in the case of duplicate keys. Although PMI require users to
not put duplicate keys, this fix makes the behavior more expected.

Note: this fixes the session_re_init test. Session reinit skips
PMI_Finalize between sessions. The proper fix should introduce new API
e.g. PMI_KVS_Clear. For now, we'll simply make the later key overwirte
the previous duplicates.

Also rename the HYD_pmcd_pmi_kvs_ prefix to HYD_kvs_

Fixes #6446

[skip warnings]

Author Checklist

  • Provide Description
    Particularly focus on why, not what. Reference background, issues, test failures, xfail entries, etc.
  • Commits Follow Good Practice
    Commits are self-contained and do not do two things at once.
    Commit message is of the form: module: short description
    Commit message explains what's in the commit.
  • Passes All Tests
    Whitespace checker. Warnings test. Additional tests via comments.
  • Contribution Agreement
    For non-Argonne authors, check contribution agreement.
    If necessary, request an explicit comment from your companies PR approval manager.

@hzhou
Copy link
Contributor Author

hzhou commented Jun 20, 2023

test:mpich/ch3/tcp ✔️
test:mpich/ch4/ofi ✔️
test:mpich/pmi ✔️ except session_re_init on pmix. We called PMIx_Init 4 times but PMIx_Finalize only once at atexit

@hzhou
Copy link
Contributor Author

hzhou commented Jun 21, 2023

test:mpich/pmi ✔️

@raffenet
Copy link
Contributor

Do we want to also incorporate this change while we are at it? #5370

@hzhou
Copy link
Contributor Author

hzhou commented Jun 21, 2023

Do we want to also incorporate this change while we are at it? #5370

Sure. Let me pick that PR.

Jayyee-HPC and others added 4 commits June 21, 2023 10:02
Refactor kvs to hide internals in lib/pmiserv_common.c.

Always add new key pair to the head, so that we always find the latest
key pair in the case of duplicate keys. Although PMI require users to
not put duplicate keys, this fix makes the behavior more expected.

Note: this fixes the session_re_init test. Session reinit skips
PMI_Finalize between sessions. The proper fix should introduce new API
e.g. PMI_KVS_Clear. For now, we'll simply make the later key overwrite
the previous duplicates.
Since now we only call PMIx_Finalize once in the at exit handler, we
need make sure that we only call PMIx_Init once. Otherwise, the openpmix
server may complain abnormal exit.
@hzhou
Copy link
Contributor Author

hzhou commented Jun 21, 2023

test:mpich/pmi

@hzhou hzhou merged commit b578077 into pmodels:main Jun 23, 2023
@hzhou hzhou deleted the 2306_hydra_kvs branch June 23, 2023 15:32
@@ -137,25 +156,18 @@ HYD_status HYD_pmcd_pmi_add_kvs(const char *key, const char *val, struct HYD_pmc

if (kvs->key_pair == NULL) {
kvs->key_pair = key_pair;
kvs->tail = key_pair;
} else {
#ifdef PMI_KEY_CHECK
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do you think it would make sense to enable duplicate keys also with debugging of MPICH/ hydra enabled, e.g., by changing the PMI_KEY_CHECK into a debug message of hydra for duplicate keys instead of an error?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, it makes sense. Do you want to make a PR for it?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

bug/jenkins: init/session_re_init fails with pmi2
4 participants