Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[object store refactor 2/n] More refactor on PlasmaAllocator, and add unit tests #17313

Merged
merged 1 commit into from
Aug 2, 2021

Conversation

scv119
Copy link
Contributor

@scv119 scv119 commented Jul 24, 2021

This the 2nd PR that refactor PlasmaStore for modularization and testability. Previous PR: #17307 , Next PR: #17332

In this PR, we further refactor PlasmaAllocator for better testability. Specifically

  • we move the initial mmap/unmmap call into PlasmaAllocator constructor
  • we move the dlmalloc setting into PlasmaAllocator constructor, so that we "almost" hide dlmalloc fully into PlasmaAllocator (except for one outstanding SetMallocGranularity call in store_runner.cc/and dlmalloc.cc reads RayConfig::instance().preallocate_plasma_memory()).
  • Fix a concurrency issue where object_manager reads fallback allocated bytes without synchronization
  • Added unit tests for basic behavior of PlasmaStoreAllocator. (note we have to separate them into different files as we can't reset dlmalloc for now).

Test Plan

-[x] existing tests
-[x] added unit tests

@scv119 scv119 added @author-action-required The PR author is responsible for the next step. Remove tag to send back to the reviewer. do-not-merge Do not merge this PR! labels Jul 24, 2021
@scv119 scv119 force-pushed the test-wheels/allocator-refactor-test branch from 943f74a to 900fde1 Compare July 26, 2021 01:20
@scv119 scv119 changed the title [object store refactor 2/n][WIP] More refactor on PlasmaAllocator, and add unit tests [object store refactor 2/n] More refactor on PlasmaAllocator, and add unit tests Jul 26, 2021
@scv119 scv119 removed the @author-action-required The PR author is responsible for the next step. Remove tag to send back to the reviewer. label Jul 26, 2021
@scv119 scv119 force-pushed the test-wheels/allocator-refactor-test branch from 900fde1 to 3c845eb Compare July 26, 2021 01:50
@scv119 scv119 force-pushed the test-wheels/allocator-refactor-test branch from 3c845eb to 5e62e41 Compare July 26, 2021 04:56
@scv119 scv119 changed the base branch from test-wheels/introduce-allocator to master July 26, 2021 05:43
@scv119 scv119 changed the base branch from master to test-wheels/introduce-allocator July 26, 2021 05:45
@scv119 scv119 force-pushed the test-wheels/allocator-refactor-test branch from 5e62e41 to 73c6b97 Compare July 26, 2021 05:58
@scv119 scv119 force-pushed the test-wheels/allocator-refactor-test branch from 73c6b97 to 5bca4bf Compare July 28, 2021 23:46
Copy link
Contributor

@rkooo567 rkooo567 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM! Nit comments

ci/travis/determine_tests_to_run.py Show resolved Hide resolved
return p.set_value(allocator_ ? allocator_->FallbackAllocated() : 0);
},
"PlasmaStoreRunner.GetFallbackAllocatedMemory");
return p.get_future().get();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Isn't this blocking btw?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yup it is

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

an alternative is to make those counters std::atomic...

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This sounds like an better option to me! Hopefully we can remove these issues when solving the threading model problrm…

src/ray/object_manager/plasma/common.h Show resolved Hide resolved
src/ray/object_manager/object_manager.cc Show resolved Hide resolved
bool fallback_enabled) {
dlmalloc_config.hugepages_enabled = hugepage_enabled;
dlmalloc_config.directory = plasma_directory;
dlmalloc_config.fallback_directory = fallback_directory;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe we can add a check that this directory is not /dev/shm?

@rkooo567
Copy link
Contributor

Btw, I am down to merge this once we solve the blocking FallbackAllocated call

@rkooo567 rkooo567 added the @author-action-required The PR author is responsible for the next step. Remove tag to send back to the reviewer. label Jul 30, 2021
@scv119 scv119 force-pushed the test-wheels/allocator-refactor-test branch 6 times, most recently from 9885165 to 14fa38e Compare July 30, 2021 21:08
@scv119 scv119 force-pushed the test-wheels/introduce-allocator branch from 54bb716 to 28fe635 Compare July 30, 2021 23:09
@scv119 scv119 force-pushed the test-wheels/allocator-refactor-test branch from 14fa38e to c8bdc41 Compare July 30, 2021 23:10
RAY_LOG(ERROR) << "MapViewOfFile() failed. GetLastError() = " << GetLastError();
}
if (!allocated_once) {
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

windows bug caught by unit test!

Base automatically changed from test-wheels/introduce-allocator to master July 31, 2021 02:08
@scv119 scv119 removed @author-action-required The PR author is responsible for the next step. Remove tag to send back to the reviewer. do-not-merge Do not merge this PR! labels Jul 31, 2021
fix windows

fix windows
@scv119 scv119 force-pushed the test-wheels/allocator-refactor-test branch from fab8664 to bc03a0d Compare July 31, 2021 02:21
@ericl ericl merged commit 1b89fa8 into master Aug 2, 2021
@ericl ericl deleted the test-wheels/allocator-refactor-test branch August 2, 2021 05:10
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants