Inline objects #3756

Merged
merged 44 commits into from
Feb 7, 2019

Conversation

istoica
Contributor

@istoica istoica commented Jan 12, 2019

What do these changes do?

For small objects, this pull request stores the object data directly in the object's GCS entry. This improves both performance and fault tolerance: as long as the entry is in the GCS, the object does not need to be reconstructed.
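
To make the idea concrete, here is a minimal sketch of an object directory that keeps small payloads inline with the entry. It is a toy in-memory model, not the code in this PR: the class names, the method names, and the `INLINE_SIZE_LIMIT_BYTES` cutoff are all hypothetical. Only the field names `inline_object_flag` and `inline_object_data` are taken from the diff hunks quoted in the review.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional, Set

# Hypothetical cutoff for illustration only; the PR description does not
# state the actual size limit for inlining.
INLINE_SIZE_LIMIT_BYTES = 512


@dataclass
class ToyObjectEntry:
    """Toy stand-in for an object's directory entry."""

    client_ids: Set[str] = field(default_factory=set)
    inline_object_flag: bool = False
    inline_object_data: bytes = b""


class ToyObjectDirectory:
    """In-memory stand-in for the object table (hypothetical API)."""

    def __init__(self) -> None:
        self._entries: Dict[str, ToyObjectEntry] = {}

    def report_object_added(self, object_id: str, client_id: str, data: bytes) -> None:
        """Record a new location and inline the payload if it is small enough."""
        entry = self._entries.setdefault(object_id, ToyObjectEntry())
        entry.client_ids.add(client_id)
        if len(data) <= INLINE_SIZE_LIMIT_BYTES:
            entry.inline_object_flag = True
            entry.inline_object_data = data

    def lookup_inline(self, object_id: str) -> Optional[bytes]:
        """Return the inlined payload, or None if the object is unknown or not inlined."""
        entry = self._entries.get(object_id)
        if entry is not None and entry.inline_object_flag:
            return entry.inline_object_data
        return None
```

In this model a reader that finds an inlined entry never has to contact the node that created the object, which is the property the description attributes to the change.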

@AmplabJenkins

Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/Ray-PRB/10782/
Test PASSed.

@AmplabJenkins

Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/Ray-PRB/10799/
Test PASSed.

@AmplabJenkins

Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/Ray-PRB/10797/
Test PASSed.

@AmplabJenkins

Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/Ray-PRB/10801/
Test PASSed.

@AmplabJenkins

Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/Ray-PRB/10800/
Test PASSed.

@AmplabJenkins

Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/Ray-PRB/10824/
Test PASSed.

@pcmoritz
Contributor

Travis is failing with

+/home/travis/build/ray-project/ray/build/src/ray/object_manager/object_manager_test /home/travis/build/ray-project/ray/python/ray/core/src/plasma/plasma_store_server
[==========] Running 1 test from 1 test case.
[----------] Global test environment set-up.
[----------] 1 test from TestObjectManager
[ RUN      ] TestObjectManager.StartTestObjectManager
WARNING: Logging before InitGoogleLogging() is written to STDERR
W0113 15:39:45.414286 16830 object_manager.cc:324] Invalid Push request ObjectID: dd20ed210f7ca5e66aa3b7b37e3999f7aee3e896 after waiting for 1000 ms.

Can you reproduce the object_manager_test failure locally?

@pcmoritz
Contributor

Maybe valgrind gives us a hint about what's going wrong here: https://travis-ci.com/ray-project/ray/jobs/170048915

@pcmoritz
Contributor

also some minor linting: https://travis-ci.com/ray-project/ray/jobs/170048914

io_service_.post([callback, object_id, locations, has_been_created]() {
callback(object_id, locations, has_been_created);
bool inline_object_flag = it->second.inline_object_flag;
std::vector<uint8_t> inline_object_data = it->second.inline_object_data;
Contributor

This will make a copy; it might be better to use const auto& inline_object_data = ...

@AmplabJenkins

Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/Ray-PRB/10833/
Test PASSed.

@AmplabJenkins

Test FAILed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/Ray-PRB/10860/
Test FAILed.

@AmplabJenkins

Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/Ray-PRB/11275/
Test PASSed.

@stephanie-wang stephanie-wang changed the base branch from inline-objects to master January 30, 2019 06:21
@AmplabJenkins

Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/Ray-PRB/11292/
Test PASSed.

@AmplabJenkins

Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/Ray-PRB/11305/
Test PASSed.

@AmplabJenkins

Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/Ray-PRB/11323/
Test PASSed.

@AmplabJenkins

Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/Ray-PRB/11348/
Test PASSed.

test/runtest.py Outdated
def get(self):
return

def flush(actor):
Contributor

The release delay is gone as of apache/arrow#3124, so we can get rid of this, yay!

Contributor

Ah great, thanks!

Contributor

Hmm actually, if I remove the flush call, this test sometimes fails on my laptop. Do you know why that is (not sure about the semantics of ray.internal.free for plasma)?

Contributor

@pcmoritz pcmoritz Jan 31, 2019

This is probably a race condition (ray.internal.free is using the plasma client in the raylet which is different from the one in the worker that get is called on).

One way around that is to use ray.worker.global_worker.plasma_client.delete, which goes through the same plasma client that the get will use, so that should work.
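
The suspected race can be illustrated with a toy model. This is only a sketch of the ordering problem described above, assuming that a delete routed through a different client is applied some time after the call returns. It does not use or reproduce the real plasma or raylet APIs, and `ToyStore`, `QueuedClient`, and `DirectClient` are hypothetical names.

```python
from collections import deque
from typing import Deque, Dict


class ToyStore:
    """Shared in-memory object store (hypothetical)."""

    def __init__(self) -> None:
        self.objects: Dict[str, bytes] = {}


class QueuedClient:
    """Models a delete routed through another process: the request is
    queued and only takes effect when that process gets around to it."""

    def __init__(self, store: ToyStore) -> None:
        self._store = store
        self._pending: Deque[str] = deque()

    def free(self, object_id: str) -> None:
        # Returns immediately; the object is still in the store.
        self._pending.append(object_id)

    def process_pending(self) -> None:
        # Stands in for the other process eventually handling its queue.
        while self._pending:
            self._store.objects.pop(self._pending.popleft(), None)


class DirectClient:
    """Models the client that the subsequent get goes through."""

    def __init__(self, store: ToyStore) -> None:
        self._store = store

    def delete(self, object_id: str) -> None:
        # Applied before the call returns.
        self._store.objects.pop(object_id, None)

    def contains(self, object_id: str) -> bool:
        return object_id in self._store.objects
```

In this model, a `contains` check issued right after `QueuedClient.free` can still see the object, while a check issued after `DirectClient.delete` cannot. Whether the real system behaves this way is exactly what the thread is trying to establish.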

Contributor

Hmm this still seems to fail occasionally with plasma_client.delete (maybe 1/100 times). Any idea why or do you think it's a bug somewhere else?

if (!client_ids.empty()) {
const std::unordered_set<ClientID> &client_ids,
bool inline_object_flag,
const std::vector<uint8_t> inline_object_data,
Contributor

why not const std::vector<uint8_t>& inline_object_data here?

Contributor

Oops, thank you!

// Inline object is not in the local object store. Create it from
// inline_object_data, and inline_object_metadata, respectively.
//
// Since this function is called on notification or when reading the
Contributor

Good call!
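
As a rough illustration of the step the quoted code comment describes, the sketch below creates an object in a local store from its inlined data and metadata only when it is not already present. The quoted comment is cut off, so the idempotent behavior here is an assumption (that the function may run more than once for the same object), and the function name and the dictionary-based store are hypothetical.

```python
from typing import Dict, Tuple

# Hypothetical local store: object ID -> (data, metadata).
LocalStore = Dict[str, Tuple[bytes, bytes]]


def materialize_inline_object(
    local_store: LocalStore,
    object_id: str,
    inline_object_data: bytes,
    inline_object_metadata: bytes,
) -> bool:
    """Create the object locally from its inlined payload if it is missing.

    Returns True if the object was created, False if it already existed.
    """
    if object_id in local_store:
        # Already present, for example because an earlier call created it.
        return False
    local_store[object_id] = (inline_object_data, inline_object_metadata)
    return True
```

Calling the function a second time with the same object ID leaves the store unchanged, so repeated invocations are harmless in this model.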

@AmplabJenkins

Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/Ray-PRB/11362/
Test PASSed.

@AmplabJenkins

Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/Ray-PRB/11369/
Test PASSed.

@AmplabJenkins

Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/Ray-PRB/11420/
Test PASSed.

@AmplabJenkins

Test FAILed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/Ray-PRB/11581/
Test FAILed.

@stephanie-wang stephanie-wang merged commit f987572 into ray-project:master Feb 7, 2019
@guoyuhong guoyuhong mentioned this pull request Feb 10, 2019
pcmoritz added a commit to pcmoritz/ray-1 that referenced this pull request Feb 22, 2019
stephanie-wang pushed a commit that referenced this pull request Feb 23, 2019
* Revert "Inline objects (#3756)"

This reverts commit f987572.

* fix rebase problems

* more rebase fixes

* add back debug statement
5 participants