[Core][ObjectRef] Change default to not record call stack during ObjectRef creation #18078
Can we add a message in ray memory telling users to set RAY_record_ref_creation_sites=1 for additional debugging information? Maybe we should group these under a RAY_DEBUG flag in general once @stephanie-wang adds her metadata extras too. In general, we need to make sure user instructions are exposed in the right messages. |
Makes sense. Will add the necessary info. |
…us: 2021-08-25 17:10:41.494492 ========
Grouping by node address... Sorting by object size... Display allentries per group...

--- Summary for node address: 192.168.0.234 ---
Mem Used by Objects  Local References  Pinned  Pending Tasks  Captured in Objects  Actor Handles
0.0 B  1, (-1.0 B)  0, (0.0 B)  0, (0.0 B)  0, (0.0 B)  0, (0.0 B)

--- Object references for node address: 192.168.0.234 ---
IP Address | PID | Type | Call Site | Size | Reference Type | Object Ref
192.168.0.234 | 27804 | Driver | | ? | LOCAL_REFERENCE | a67dc375e60ddd1affffffffffffffffffffffff0100000001000000

To record callsite information for each ObjectRef created, set env variable RAY_record_ref_creation_sites=1

--- Aggregate object store stats across all nodes ---
Plasma memory usage 0 MiB, 0 objects, 0.0% full, 0.0% needed message
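As a rough illustration of the hint printed in the output above, here is a minimal sketch of preparing an environment with call-site recording turned on. Only the variable name RAY_record_ref_creation_sites comes from this thread; the helper env_with_call_sites and the script name my_driver.py are hypothetical, and the sketch assumes the variable has to be present in the environment of the processes that create ObjectRefs before Ray starts.

```python
import os


def env_with_call_sites(base=None):
    """Return a copy of the environment with call-site recording enabled.

    Hypothetical helper, not part of Ray. Only the variable name is taken
    from the `ray memory` message quoted in this thread.
    """
    env = dict(os.environ if base is None else base)
    env["RAY_record_ref_creation_sites"] = "1"
    return env


if __name__ == "__main__":
    env = env_with_call_sites()
    print(env["RAY_record_ref_creation_sites"])
    # A driver could then be launched with this environment, for example:
    #   subprocess.run([sys.executable, "my_driver.py"], env=env)
    # where my_driver.py is a placeholder for your own script.
```

The helper copies the environment rather than mutating os.environ, so the debugging flag stays scoped to the process being launched.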
Added a message to
After:
|
Looks good. Could we change the "?" to "disabled" for a bit of increased clarity? |
You will also likely need to fix the test suite that checks the "ray memory" output.
Co-authored-by: Eric Liang <[email protected]>
|
test_memstat seems to be failing in CI ^^ |
Looks like |
Merged, thanks! |
Why are these changes needed?
Recording the call stack accounts for 40~50% of the CPU overhead of ObjectRef creation in Python (profile). I wonder if we should make the feature default to false and ask users to enable it before running ray memory. We can also use sampling or pass call site constants to optimize this. Another, more limited change is to cache the call site just for contained object refs (#17882).
On m5.8xlarge, before this change:
After this change:
single client put calls and single client get object containing 10k refs show an 80~150% increase in ops/sec.
Related issue number
#17803
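To give a feel for the overhead described under "Why are these changes needed?", here is a small standalone sketch that times object creation with and without capturing the call stack. FakeRef and time_creation are hypothetical names used only for illustration; this is not Ray's ObjectRef implementation, and the numbers it prints will not match the benchmark figures from this PR.

```python
import timeit
import traceback


class FakeRef:
    """Hypothetical stand-in for an object reference; not Ray's ObjectRef."""

    __slots__ = ("value", "call_site")

    def __init__(self, value, record_call_site=False):
        self.value = value
        # Capturing the stack walks the frames and formats them as text,
        # which is the kind of per-creation cost discussed in this PR.
        self.call_site = (
            "".join(traceback.format_stack(limit=8)) if record_call_site else ""
        )


def time_creation(record_call_site, n=20_000):
    """Return the seconds taken to create n references."""
    return timeit.timeit(lambda: FakeRef(0, record_call_site), number=n)


if __name__ == "__main__":
    off = time_creation(False)
    on = time_creation(True)
    print(f"call site off: {off:.4f}s, on: {on:.4f}s, ratio: {on / off:.1f}x")
```

Making the recording opt-in, as this PR does for the real feature, keeps the common path cheap while leaving the extra debugging detail available when it is needed.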
Checks
I've run scripts/format.sh to lint the changes in this PR.