-
Notifications
You must be signed in to change notification settings - Fork 5.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[Data] Truncate progress bar description #46801
[Data] Truncate progress bar description #46801
Conversation
Signed-off-by: Scott Lee <[email protected]>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
One non-blocking question, otherwise lgtm.
# If True, disables name trunctating. | ||
self._display_full_name = display_full_name | ||
self._desc = self._truncate_name(name) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How would this be set to True by the user? Is the expectation that they would not want that or should we have something in data context or execution options to allow for this configuration?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ah yeah good point, i will need to expose it from DataContext
Signed-off-by: Scott Lee <[email protected]>
@@ -83,6 +87,12 @@ def __init__( | |||
needs_warning = False | |||
self._bar = None | |||
|
|||
def _truncate_name(self, name: str) -> str: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should we add a warn-once that the name is getting truncated and that the behavior can be disabled with DEFAULT_ENABLE_PROGRESS_BAR_NAME_TRUNCATION
? Not sure if users will know how to disable it otherwise
def _truncate_name(self, name: str) -> str: | ||
ctx = ray.data.context.DataContext.get_current() | ||
if ctx.enable_progress_bar_name_truncation and len(name) > self.MAX_NAME_LENGTH: | ||
return name[: self.MAX_NAME_LENGTH - 3] + "..." |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit: (Don't need to do this, just throwing it out as an idea) I'm wondering if we should include some of the text at the end. So, for example, instead of:
Map(spam)->Map(ham)->...
We could do something like
Map(spam)->...->Map(ham)
Might make it clearer which operator it ends on.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yeah sounds good. now the updated output looks like:
✔️ Dataset execution finished in 32.55 seconds: 100%|█████████████████████████████████████████████████████████████████████████| 20/20 [00:32<00:00, 1.63s/ bundle]]
- ReadCSV->SplitBlocks(20): 0 active, 0 queued, [cpu: 0.0, objects: 0.0B]: : 20 bundle [00:32, 1.63s/ bundle]]name): 3 active, 17 queued, [cpu: 3.0, objects: 768.0
- MapBatches(f_with_really_long_name)->MapBatches(f_with_really_long_name)->...->MapBatches(f_with_really_long_name): 0 active, 0 queued, [cpu: 0.0, objects: 840.0B
Signed-off-by: Scott Lee <[email protected]>
Signed-off-by: Scott Lee <[email protected]>
op_names = name.split("->") | ||
# Include as many operators as possible without exceeding `MAX_NAME_LENGTH`. | ||
# Always include the first and last operator names so | ||
# it is easy to identify the DAG. | ||
truncated_op_names = [op_names[0]] | ||
for i, op_name in enumerate(op_names[1:-1]): | ||
if len("->".join(truncated_op_names)) + len(op_name) > self.MAX_NAME_LENGTH: | ||
truncated_op_names.append("...") | ||
break | ||
truncated_op_names.append(op_name) | ||
if len(op_names) > 1: | ||
truncated_op_names.append(op_names[-1]) | ||
return "->".join(truncated_op_names) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit: Not a big deal, but I think there are some edge cases where the truncated name can exceed MAX_NAME_LENGTH
because we don't account for the last name or the additional "->"s.
op_names = name.split("->") | |
# Include as many operators as possible without exceeding `MAX_NAME_LENGTH`. | |
# Always include the first and last operator names so | |
# it is easy to identify the DAG. | |
truncated_op_names = [op_names[0]] | |
for i, op_name in enumerate(op_names[1:-1]): | |
if len("->".join(truncated_op_names)) + len(op_name) > self.MAX_NAME_LENGTH: | |
truncated_op_names.append("...") | |
break | |
truncated_op_names.append(op_name) | |
if len(op_names) > 1: | |
truncated_op_names.append(op_names[-1]) | |
return "->".join(truncated_op_names) | |
op_names = name.split("->") | |
if len(op_names) == 1: | |
return op_names[0] | |
else: | |
# Include as many operators as possible without exceeding `MAX_NAME_LENGTH`. | |
# Always include the first and last operator names so | |
# it is easy to identify the DAG. | |
truncated_op_names = [op_names[0]] | |
for op_name in op_names[1:-1]: | |
if len("->".join(truncated_op_names)) + len("->") + len(op_name) + len("->") + len(op_names[-1]) > self.MAX_NAME_LENGTH: | |
truncated_op_names.append("...") | |
break | |
truncated_op_names.append(op_name) | |
truncated_op_names.append(op_names[-1]) | |
return "->".join(truncated_op_names) |
Signed-off-by: Scott Lee <[email protected]>
Why are these changes needed?
For
ProgressBar
s used by Ray Data's executor to display operator completion progress, the description of the bar is currently the full operator name. This can become very long and unwieldy with operators with long names, or datasets with many operators (e.g. consecutiveMapBatches
operators become fused into one giant operator with a really long name).This PR adds logic to truncate the
ProgressBar
's description if it exceeds 100 characters. There is also a parameter to disable this truncation, and always show the full progress bar description.Related issue number
For the following script:
We can compare the output before and after:
Before
After (note the `...` in the last line)
Checks
git commit -s
) in this PR.scripts/format.sh
to lint the changes in this PR.method in Tune, I've added it in
doc/source/tune/api/
under thecorresponding
.rst
file.