[Data] Truncate progress bar description #46801

scottjlee · 2024-07-25T19:32:32Z

Why are these changes needed?

For the ProgressBars that Ray Data's executor uses to display operator completion progress, the bar's description is currently the full operator name. This becomes very long and unwieldy for operators with long names, or for datasets with many operators (e.g., consecutive MapBatches operators are fused into one giant operator with a really long name).

This PR adds logic to truncate the ProgressBar's description if it exceeds 100 characters. There is also a parameter to disable this truncation and always show the full progress bar description.

Related issue number

For the following script:

import ray
import time
import os

paths = ["s3://anonymous@air-example-data/iris.csv"]
ds = ray.data.read_csv(paths, override_num_blocks=20)
num_map_ops = 100

def f_with_really_long_name(batch):
    time.sleep(1)
    return batch

for _ in range(num_map_ops):
    ds = ds.map_batches(f_with_really_long_name)

ds.materialize()

We can compare the output before and after:

Before

Running 0: 0 bundle [00:00, ? bundle/s]lly_long_name)->MapBatches(f_with_really_long_name)->MapBatches(f_with_really_long_name)->MapBatch
- ReadCSV->SplitBlocks(20): 1 active, 0 queued, [cpu: 1.0, objects: 5.6KB]: : 18 bundle [00:01, 16.37 bundle/s]es(f_with_really_long_name
- MapBatches(f_with_really_long_name)->MapBatches(f_with_really_long_name)->MapBatches(f_with_really_long_name)->MapBatches(f_with_really_long_name)->MapBatches(f_with_really_long_name)->MapBatches(f_with_really_long_name)->MapBatches(f_with_really_long_name)->MapBatches(f_with_really_long_name)->MapBatches(f_with_really_long_name)->MapBatches(f_with_really_long_name)->MapBatches(f_with_really_long_name)->MapBatches(f_with_really_long_name)->MapBatches(f_with_really_long_name)->MapBatches(f_with_really_long_name)->MapBatches(f_with_really_long_name)->MapBatches(f_with_really_long_name)->MapBatches(f_with_really_long_name)->MapBatches(f_with_really_long_name)->MapBatches(f_with_really_long_name)->MapBatches(f_with_really_long_name)->MapBatches(f_with_really_long_name)->MapBatches(f_with_really_long_name)->MapBatches(f_with_really_long_name)->MapBatches(f_with_really_long_name)->MapBatches(f_with_really_long_name)->MapBatches(f_with_really_long_name)->MapBatches(f_with_really_long_name)->MapBatches(f_with_really_long_name)->MapBatches(f_with_really_long_name)->MapBatches(f_with_really_long_name)->MapBatches(f_with_really_long_name)->MapBatches(f_with_really_long_name)->MapBatches(f_with_really_long_name)->MapBatches(f_with_really_long_name)->MapBatches(f_with_really_long_name)->MapBatches(f_with_really_long_name)->MapBatches(f_with_really_long_name)->MapBatches(f_with_really_long_name)->MapBatches(f_with_really_long_name)->MapBatches(f_with_really_long_name)->MapBatches(f_with_really_long_name)->MapBatches(f_with_really_long_name)->MapBatches(f_with_really_long_name)->MapBatches(f_with_really_long_name)->MapBatches(f_with_really_long_name)->MapBatches(f_with_really_long_name)->MapBatches(f_with_really_long_name)->MapBatches(f_with_really_long_name)->MapBatches(f_with_really_long_name)->MapBatches(f_with_really_long_name)->MapBatches(f_with_really_long_name)->MapBatches(f_with_really_long_name)->MapBatches(f_with_really_long_name)->MapBatches(f_with_really_long_name)->
MapBatches(f_with_really_long_name)->MapBatches(f_with_really_long_name)->MapBatches(f_with_really_long_name)->MapBatches(f_with_really_long_name)->MapBatches(f_with_really_long_name)->MapBatches(f_with_really_long_name)->MapBatches(f_with_really_long_name)->MapBatches(f_with_really_long_name)->MapBatches(f_with_really_long_name)->MapBatches(f_with_really_long_name)->MapBatches(f_with_really_long_name)->MapBatches(f_with_really_long_name)->MapBatches(f_with_really_long_name)->MapBatches(f_with_really_long_name)->MapBatches(f_with_really_long_name)->MapBatches(f_with_really_long_name)->MapBatches(f_with_really_long_name)->MapBatches(f_with_really_long_name)->MapBatches(f_with_really_long_name)->MapBatches(f_with_really_long_name)->MapBatches(f_with_really_long_name)->MapBatches(f_with_really_long_name)->MapBatches(f_with_really_long_name)->MapBatches(f_with_really_long_name)->MapBatches(f_with_really_long_name)->MapBatches(f_with_really_long_name)->MapBatches(f_with_really_long_name)->MapBatches(f_with_really_long_name)->MapBatches(f_with_really_long_name)->MapBatches(f_with_really_long_name)->MapBatches(f_with_really_long_name)->MapBatches(f_with_really_long_name)->MapBatches(f_with_really_long_name)->MapBatches(f_with_really_long_name)->MapBatches(f_with_really_long_name)->MapBatches(f_with_really_long_name)->MapBatches(f_with_really_long_name)->MapBatches(f_with_really_long_name)->MapBatches(f_with_rRunning: 2/10 CPU, 0/0 GPU, 512.0MB/1.0GB object_store_memory: : 0 bundle [00:01, ? bundle/s]apBatches(f_with_really_long_name)->MapBatches(f_with_really_long_name)->MapBatches(f_with_really_long_name)->MapBatches(f_with_really_long_name)->MapBatches(f_with_really_long_name): 2 active, 17 queued, [cpu: 2.0, objects: 512.0MB]: : 0 bundle [00:01, ? bundle/s]

After (note the `...` in the last line)

✔️  Dataset execution finished in 32.55 seconds: 100%|█████████████████████████████████████████████████████████████████████████| 20/20 [00:32<00:00,  1.63s/ bundle]]
- ReadCSV->SplitBlocks(20): 0 active, 0 queued, [cpu: 0.0, objects: 0.0B]: : 20 bundle [00:32,  1.63s/ bundle]]name): 3 active, 17 queued, [cpu: 3.0, objects: 768.0
- MapBatches(f_with_really_long_name)->MapBatches(f_with_really_long_name)->...->MapBatches(f_with_really_long_name): 0 active, 0 queued, [cpu: 0.0, objects: 840.0B

Checks

- I've signed off every commit (by using the -s flag, i.e., git commit -s) in this PR.
- I've run scripts/format.sh to lint the changes in this PR.
- I've included any doc changes needed for https://docs.ray.io/en/master/.
  - I've added any new APIs to the API Reference. For example, if I added a
    method in Tune, I've added it in doc/source/tune/api/ under the
    corresponding .rst file.
- I've made sure the tests are passing. Note that there might be a few flaky tests, see the recent failures at https://flakey-tests.ray.io/
- Testing Strategy
  - Unit tests
  - Release tests
  - This PR is not tested :(

Signed-off-by: Scott Lee <[email protected]>

omatthew98

One non-blocking question, otherwise lgtm.

omatthew98 · 2024-07-25T20:34:38Z

python/ray/data/_internal/progress_bar.py

+        # If True, disables name truncation.
+        self._display_full_name = display_full_name
+        self._desc = self._truncate_name(name)


How would this be set to True by the user? Is the expectation that they would not want that or should we have something in data context or execution options to allow for this configuration?

ah yeah good point, i will need to expose it from DataContext
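For illustration, a minimal sketch of how a context-level flag could gate the truncation. `SketchContext`, `get_current`, and `describe` are hypothetical stand-ins, not Ray APIs; the flag name `enable_progress_bar_name_truncation` is taken from the later revision of this PR, and the limit of 100 is assumed from the PR description.

```python
from dataclasses import dataclass


@dataclass
class SketchContext:
    """Hypothetical stand-in for DataContext; only the one flag is modeled."""

    enable_progress_bar_name_truncation: bool = True


_current = SketchContext()


def get_current() -> SketchContext:
    # Stand-in for a "current context" accessor.
    return _current


MAX_NAME_LENGTH = 100  # assumption: the 100-character limit from the PR description


def describe(name: str) -> str:
    """Return the progress bar description, truncated only if the flag allows it."""
    ctx = get_current()
    if ctx.enable_progress_bar_name_truncation and len(name) > MAX_NAME_LENGTH:
        return name[: MAX_NAME_LENGTH - 3] + "..."
    return name


long_name = "->".join(["MapBatches(f)"] * 20)
truncated = describe(long_name)  # flag defaults to True, so this is shortened

# A user who wants the full description flips the flag on the context.
get_current().enable_progress_bar_name_truncation = False
full = describe(long_name)
```

With the real library, the equivalent user-facing step would presumably be setting the same attribute on the object returned by `DataContext.get_current()`.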


bveeramani · 2024-07-29T16:41:59Z

python/ray/data/_internal/progress_bar.py

@@ -83,6 +87,12 @@ def __init__(
                needs_warning = False
            self._bar = None

+    def _truncate_name(self, name: str) -> str:


Should we add a warn-once that the name is getting truncated and that the behavior can be disabled with DEFAULT_ENABLE_PROGRESS_BAR_NAME_TRUNCATION? Not sure if users will know how to disable it otherwise
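A rough sketch of what such a warn-once could look like, using only the standard `logging` module. The function name, the module-level flag, and the message wording are all hypothetical; the limit of 100 is assumed from the PR description, and the hint refers to the `enable_progress_bar_name_truncation` attribute used in this PR's diff.

```python
import logging

logger = logging.getLogger(__name__)

MAX_NAME_LENGTH = 100  # assumption: the 100-character limit from the PR description
_warned_about_truncation = False  # hypothetical module-level warn-once flag


def truncate_name_with_warning(name: str, truncation_enabled: bool = True) -> str:
    """Truncate `name` and log a hint the first time truncation happens."""
    global _warned_about_truncation
    if not truncation_enabled or len(name) <= MAX_NAME_LENGTH:
        return name
    if not _warned_about_truncation:
        # Only the first truncation in this process emits the hint.
        logger.warning(
            "Truncating long operator name in progress bar. To disable, set "
            "`DataContext.enable_progress_bar_name_truncation = False`."
        )
        _warned_about_truncation = True
    return name[: MAX_NAME_LENGTH - 3] + "..."
```

A module-level flag keeps the sketch simple; a real implementation might prefer whatever warn-once helper the codebase already has.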

bveeramani · 2024-07-29T16:44:30Z

python/ray/data/_internal/progress_bar.py

+    def _truncate_name(self, name: str) -> str:
+        ctx = ray.data.context.DataContext.get_current()
+        if ctx.enable_progress_bar_name_truncation and len(name) > self.MAX_NAME_LENGTH:
+            return name[: self.MAX_NAME_LENGTH - 3] + "..."


Nit: (Don't need to do this, just throwing it out as an idea) I'm wondering if we should include some of the text at the end. So, for example, instead of:

Map(spam)->Map(ham)->...

We could do something like

Map(spam)->...->Map(ham)

Might make it clearer which operator it ends on.

yeah sounds good. now the updated output looks like:

✔️ Dataset execution finished in 32.55 seconds: 100%|█████████████████████████████████████████████████████████████████████████| 20/20 [00:32<00:00, 1.63s/ bundle]]
- ReadCSV->SplitBlocks(20): 0 active, 0 queued, [cpu: 0.0, objects: 0.0B]: : 20 bundle [00:32, 1.63s/ bundle]]name): 3 active, 17 queued, [cpu: 3.0, objects: 768.0
- MapBatches(f_with_really_long_name)->MapBatches(f_with_really_long_name)->...->MapBatches(f_with_really_long_name): 0 active, 0 queued, [cpu: 0.0, objects: 840.0B


bveeramani · 2024-07-30T02:57:04Z

python/ray/data/_internal/progress_bar.py

+        op_names = name.split("->")
+        # Include as many operators as possible without exceeding `MAX_NAME_LENGTH`.
+        # Always include the first and last operator names so
+        # it is easy to identify the DAG.
+        truncated_op_names = [op_names[0]]
+        for i, op_name in enumerate(op_names[1:-1]):
+            if len("->".join(truncated_op_names)) + len(op_name) > self.MAX_NAME_LENGTH:
+                truncated_op_names.append("...")
+                break
+            truncated_op_names.append(op_name)
+        if len(op_names) > 1:
+            truncated_op_names.append(op_names[-1])
+        return "->".join(truncated_op_names)


Nit: Not a big deal, but I think there are some edge cases where the truncated name can exceed MAX_NAME_LENGTH because we don't account for the last name or the additional "->"s.

Suggested change

-        op_names = name.split("->")
-        # Include as many operators as possible without exceeding `MAX_NAME_LENGTH`.
-        # Always include the first and last operator names so
-        # it is easy to identify the DAG.
-        truncated_op_names = [op_names[0]]
-        for i, op_name in enumerate(op_names[1:-1]):
-            if len("->".join(truncated_op_names)) + len(op_name) > self.MAX_NAME_LENGTH:
-                truncated_op_names.append("...")
-                break
-            truncated_op_names.append(op_name)
-        if len(op_names) > 1:
-            truncated_op_names.append(op_names[-1])
-        return "->".join(truncated_op_names)
+        op_names = name.split("->")
+        if len(op_names) == 1:
+            return op_names[0]
+        else:
+            # Include as many operators as possible without exceeding `MAX_NAME_LENGTH`.
+            # Always include the first and last operator names so
+            # it is easy to identify the DAG.
+            truncated_op_names = [op_names[0]]
+            for op_name in op_names[1:-1]:
+                if len("->".join(truncated_op_names)) + len("->") + len(op_name) + len("->") + len(op_names[-1]) > self.MAX_NAME_LENGTH:
+                    truncated_op_names.append("...")
+                    break
+                truncated_op_names.append(op_name)
+            truncated_op_names.append(op_names[-1])
+            return "->".join(truncated_op_names)
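To make the edge case concrete, here is a standalone sketch that runs both versions on the operator name from the PR description's script. The free functions `truncate_original` and `truncate_suggested` are hypothetical rewrites of the method bodies above, and `MAX_NAME_LENGTH = 100` is assumed from the PR description.

```python
MAX_NAME_LENGTH = 100  # assumption: the 100-character limit from the PR description


def truncate_original(name: str, max_len: int = MAX_NAME_LENGTH) -> str:
    """Logic under review: ignores the last name and the '->' separators."""
    op_names = name.split("->")
    truncated = [op_names[0]]
    for op_name in op_names[1:-1]:
        if len("->".join(truncated)) + len(op_name) > max_len:
            truncated.append("...")
            break
        truncated.append(op_name)
    if len(op_names) > 1:
        truncated.append(op_names[-1])
    return "->".join(truncated)


def truncate_suggested(name: str, max_len: int = MAX_NAME_LENGTH) -> str:
    """Suggested logic: budgets for the separators and the last name as well."""
    op_names = name.split("->")
    if len(op_names) == 1:
        return op_names[0]
    truncated = [op_names[0]]
    for op_name in op_names[1:-1]:
        projected = (
            len("->".join(truncated))
            + len("->") + len(op_name)
            + len("->") + len(op_names[-1])
        )
        if projected > max_len:
            truncated.append("...")
            break
        truncated.append(op_name)
    truncated.append(op_names[-1])
    return "->".join(truncated)


op = "MapBatches(f_with_really_long_name)"
name = "->".join([op] * 100)

before = truncate_original(name)   # keeps two leading ops, overshoots the limit
after = truncate_suggested(name)   # keeps one leading op, stays under the limit
print(len(before), before)
print(len(after), after)
```

For this input the original logic keeps two leading operators and lands above 100 characters, while the suggested version keeps one and stays below it. Note that neither version budgets for the `...` marker itself, so the suggested check can still overshoot by a few characters in other cases.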


truncate progress bar description

5816372

Signed-off-by: Scott Lee <[email protected]>

scottjlee marked this pull request as ready for review July 25, 2024 20:12

scottjlee requested review from ericl, scv119, c21, amogkam, bveeramani, raulchen, stephanie-wang and omatthew98 as code owners July 25, 2024 20:12

omatthew98 approved these changes Jul 25, 2024

View reviewed changes

add context var

670bb30

Signed-off-by: Scott Lee <[email protected]>

bveeramani approved these changes Jul 29, 2024

View reviewed changes

scottjlee added 2 commits July 29, 2024 11:07

include first/last op

dc0095a

Signed-off-by: Scott Lee <[email protected]>

handle last op

a963e89

Signed-off-by: Scott Lee <[email protected]>

bveeramani approved these changes Jul 30, 2024

View reviewed changes

comments

02a2b30

Signed-off-by: Scott Lee <[email protected]>

bveeramani enabled auto-merge (squash) July 30, 2024 18:44

github-actions bot added the "go" label (add ONLY when ready to merge, run all tests) Jul 30, 2024

bveeramani merged commit 727139c into ray-project:master Jul 30, 2024
7 checks passed
