-
Notifications
You must be signed in to change notification settings - Fork 3.5k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
ARROW-3958: [Plasma] Reduce number of IPCs #3124
Conversation
@@ -155,11 +158,12 @@ TEST_F(TestPlasmaStore, SealErrorsTest) { | |||
|
|||
// Create object. | |||
std::vector<uint8_t> data(100, 0); | |||
CreateObject(client_, object_id, {42}, data); | |||
CreateObject(client_, object_id, {42}, data, false); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This was a bug that was masked by the release_delay before. If we call Release before the second Seal, the result status will be "PlasmaObjectNonexistent" instead of "PlasmaObjectAlreadySealed".
ARROW_CHECK_OK(client_.Contains(object_id2, &has_object)); | ||
ASSERT_TRUE(has_object); | ||
// The Delete call will flush release cache and send the Delete request. | ||
// Delete the objects. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This code is not needed any more (there is no release cache any more)
@@ -386,7 +384,6 @@ TEST_F(TestPlasmaStore, AbortTest) { | |||
// Test that we can get the object. | |||
ARROW_CHECK_OK(client_.Get({object_id}, -1, &object_buffers)); | |||
AssertObjectBufferEqual(object_buffers[0], {42, 43}, {1, 2, 3, 4, 5}); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The following line was a bug masked by the release cache. This form of get calls Release in the destructor of the PlasmaBuffer shared pointer.
@pcmoritz can you clarify what you mean in the PR description by "since the introduction of malloc"? Do you mean "jemalloc"? Also, it is normal in a lot of situations for clients to release all of their objects, in which case the unmap makes perfect sense. Can you elaborate on the rationale here? |
@robertnishihara The main rationale of this PR is to get rid of the release delay and the additional IPCs to send over the file descriptor. There is no advantage in unmapping in remapping the memory mapped files. To be hones I would prefer to statically allocate a memory mapped file at the beginning that everybody uses, but that's not feasible with the way malloc implementations are designed (both dlmalloc and jemalloc). |
@@ -308,7 +290,6 @@ PlasmaClient::Impl::~Impl() {} | |||
uint8_t* PlasmaClient::Impl::LookupOrMmap(int fd, int store_fd_val, int64_t map_size) { | |||
auto entry = mmap_table_.find(store_fd_val); | |||
if (entry != mmap_table_.end()) { | |||
close(fd); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
file descriptor is not send any more if it had already been send, so we shouldn't close it here
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should we close all the file descriptors in the PlasmaClient destructor?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
they have already been closed by the first LookupOrMmap call (see the close call below), see also comment below
cpp/src/plasma/client.cc
Outdated
Status PerformRelease(const ObjectID& object_id); | ||
/// This is a helper method for marking an object as unused by this client. | ||
/// | ||
/// @param object_id The object ID we mark unused. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
document return
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done
cpp/src/plasma/client.cc
Outdated
/// subsequent tasks so we do not unneccessarily invalidate cpu caches. | ||
/// TODO(pcm): replace this with a proper lru cache using the size of the L3 | ||
/// cache. | ||
std::deque<ObjectID> release_history_; | ||
/// The number of bytes in the combined objects that are held in the release | ||
/// history doubly-linked list. If this is too large then the client starts | ||
/// releasing objects. | ||
int64_t in_use_object_bytes_; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
maybe we don't need this anymore?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yeah good point, I removed it
cpp/src/plasma/client.cc
Outdated
/// This is a helper method for unmapping objects for which all references have | ||
/// gone out of scope, either by calling Release or Abort. | ||
/// Check if store_fd has already been received from the store. If yes, | ||
/// return it. Otherwise, receive it from the store. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you add a comment that this logic requires analogous logic in ReturnFromGet
in store.cc
to guarantee that file descriptors are sent to a client if and only if they have not been sent before.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
And probably add a comment in the relevant place in the store referring to this logic in the client.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done
/// \param num_retries number of attempts to connect to IPC socket, default 50 | ||
/// \return The return status. | ||
Status Connect(const std::string& store_socket_name, | ||
const std::string& manager_socket_name, | ||
int release_delay = kPlasmaDefaultReleaseDelay, int num_retries = -1); | ||
const std::string& manager_socket_name, int release_delay = 0, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Want to just get rid of release_delay
or print a deprecation warning?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
At the very least, can we remove it from any documentation?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done
@@ -308,7 +290,6 @@ PlasmaClient::Impl::~Impl() {} | |||
uint8_t* PlasmaClient::Impl::LookupOrMmap(int fd, int store_fd_val, int64_t map_size) { | |||
auto entry = mmap_table_.find(store_fd_val); | |||
if (entry != mmap_table_.end()) { | |||
close(fd); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should we close all the file descriptors in the PlasmaClient destructor?
cpp/src/plasma/store.cc
Outdated
WarnIfSigpipe(send_fd(get_req->client->fd, store_fd), get_req->client->fd); | ||
if (get_req->client->used_fds.find(store_fd) == get_req->client->used_fds.end()) { | ||
WarnIfSigpipe(send_fd(get_req->client->fd, store_fd), get_req->client->fd); | ||
get_req->client->used_fds.emplace(store_fd); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why emplace
instead of insert
? same comment below.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
they are equivalent here (I replaced it with insert, maybe that's a bit more idiomatic here, not sure)
Looks good to me! Thanks! |
Codecov Report
@@ Coverage Diff @@
## master #3124 +/- ##
=========================================
+ Coverage 87.23% 88.1% +0.87%
=========================================
Files 494 437 -57
Lines 69615 65900 -3715
=========================================
- Hits 60729 58064 -2665
+ Misses 8787 7836 -951
+ Partials 99 0 -99
Continue to review full report at Codecov.
|
This PR also removes the client unmap, which is not necessary any more since the introduction of malloc (since there is only few memory mapped files and they typically stay around for the lifetime of the application).
The PR also gets rid of a bunch of code that is not needed any more now (the release buffer, yay!).
Benchmarks: