Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fs/ufs: change default locking protocol - v4.1.x #12760

Merged

Conversation

edgargabriel
Copy link
Member

The fs/ufs component by default disabled all file locking before read/write operations (except for NFS file systems). This was based on the assumption, that the OS itself performs the required locking operation and hence we don't have to add to it.

This assumption is incorrect when using data sieving. In data sieving, the code 'ignore' small gaps when we write to a file, and perform instead a read-modify-write sequence ourselves for performance reasons. The problem is however that even within a collective operation not all aggregators might want to use data sieving. Hence, enabling locking just for the data-sieving routines is insufficient, all processes have to perform the locking. Therefore, our two options are: a) either disable write data-sieving by default, or b) enable range-locking by default.

After some testing, I think enabling range-locking be default is the safer and better approach. It doesn't seem to show any significant performance impact on my test systems.

Note, that on Lustre file systems, we can keep the default to no-locking as far as I can see, since the collective algorithm used by Lustre is unlikely to produce this pattern. I did add in however an mca parameter that allows us to control the locking algorithm used by the Lustre component as well, in case we need to change that for a particular use-case or platform.

Fixes Issue #12718

Signed-off-by: Edgar Gabriel [email protected]
(cherry picked from commit c697f28)

The fs/ufs component by default disabled all file locking before
read/write operations (except for NFS file systems). This was based on
the assumption, that the file system itself performs the required
locking operation and hence we don't have to add to it.

This assumption is incorrect when using data sieving. In data sieving,
the code 'ignore' small gaps when we write to a file, and perform
instead a read-modify-write sequence ourselves for performance reasons.
The problem is however that even within a collective operation not all
aggregators might want to use data sieving. Hence, enabling locking just
for the data-sieving routines is insufficient, all processes have to
perform the locking. Therefore, our two options are: a) either disable
write data-sieving by default, or b) enable range-locking by default.

After some testing, I think enabling range-locking be default is the
safer and better approach. It doesn't seem to show any significant
performance impact on my test systems.

Note, that on Lustre file systems, we can keep the default to no-locking
as far as I can see, since the collective algorithm used by Lustre is
unlikely to produce this pattern. I did add in however an mca parameter
that allows us to control the locking algorithm used by the Lustre
component as well, in case we need to change that for a particular
use-case or platform.

Fixes Issue open-mpi#12718

Signed-off-by: Edgar Gabriel <[email protected]>
(cherry picked from commit c697f28)
@github-actions github-actions bot added this to the v4.1.7 milestone Aug 14, 2024
@edgargabriel edgargabriel changed the title fs/ufs: change default locking protocol fs/ufs: change default locking protocol - v4.1.x Aug 14, 2024
(void) mca_base_component_var_register(&mca_fs_lustre_component.fsm_version,
"lock_algorithm", "Locking algorithm used by the fs ufs component. "
" 0: auto (default), 1: skip locking, 2: always lock entire file, "
"3: lock only specific ranges",
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If this MCA parameter only accepts specific values, is there any reason not to use the enum MCA type? (such that the MCA base will take care of ensuring that only allowable values are set)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@jsquyres you are technically correct that we could enum MCA types, this is more a feature of what I am used to doing vs. what is possible. That being said, the same pr has already been merged to 5.0.x as well, do we really want to go back and change that as well? We could still fix it in main branch for future releases.

@bwbarrett bwbarrett merged commit 1d02355 into open-mpi:v4.1.x Sep 30, 2024
12 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants