Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Datasets] Persist Datasets statistics to log file #30557

Merged
merged 25 commits into from
Dec 7, 2022

Conversation

scottjlee
Copy link
Contributor

@scottjlee scottjlee commented Nov 22, 2022

Why are these changes needed?

Currently, when we print Dataset stats after execution, there is no way to retrieve this information in case of job failure/crash. By persisting the logs to a separate file, we can access the stats which could be helpful for debugging. By default, this is configured to write to /logs/ray-data.log:
Screenshot at Nov 23 14-28-54

The new logger, DatasetLogger, is configured to always write logs to the ray-data.log file, and optionally also writes to stdout (this is enabled by default). The motivation behind this is so that users can easily use the specific log file to filter for Dataset logs, while still maintaining console logs for those who use them.

Related issue number

Closes #29575

Checks

  • I've signed off every commit(by using the -s flag, i.e., git commit -s) in this PR.
  • I've run scripts/format.sh to lint the changes in this PR.
  • I've included any doc changes needed for https://docs.ray.io/en/master/.
  • I've made sure the tests are passing. Note that there might be a few flaky tests, see the recent failures at https://flakey-tests.ray.io/
  • Testing Strategy
    • Unit tests
    • Release tests
    • This PR is not tested :(

Signed-off-by: Scott Lee <[email protected]>
Signed-off-by: Scott Lee <[email protected]>
Signed-off-by: Scott Lee <[email protected]>
Signed-off-by: Scott Lee <[email protected]>
Copy link
Contributor

@c21 c21 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @scottjlee, looks solid, have some comments.

python/ray/data/_internal/dataset_logger.py Outdated Show resolved Hide resolved
python/ray/data/tests/test_dataset_logger.py Show resolved Hide resolved
python/ray/data/_internal/dataset_logger.py Outdated Show resolved Hide resolved
python/ray/data/_internal/dataset_logger.py Outdated Show resolved Hide resolved
python/ray/data/_internal/dataset_logger.py Outdated Show resolved Hide resolved
python/ray/data/_internal/dataset_logger.py Outdated Show resolved Hide resolved
python/ray/data/_internal/dataset_logger.py Outdated Show resolved Hide resolved
@c21
Copy link
Contributor

c21 commented Nov 29, 2022

Hi @clarkzinzow and @jianoaix - could you help take a look? Thanks.

@jianoaix
Copy link
Contributor

So currently, this log is written to the worker log which is persisted already as .out, and in case of job failure, we can access the stats from worker log, right?

python/ray/data/_internal/dataset_logger.py Outdated Show resolved Hide resolved
python/ray/data/_internal/dataset_logger.py Outdated Show resolved Hide resolved
"""
# Logger used to logging to log file (in addition to the root logger,
# which logs to stdout as normal). We set `logger.propagate` to False
# to ensure the file logger only logs to the file, and not stdout, by default.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This documentation seems confusing to me since the class-level comment says it writes to the file in addition to stdout.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reworded the comments and documentation to clarify here and in the class level comment, let me know if things are still confusing here. thanks!

Signed-off-by: Scott Lee <[email protected]>
@c21
Copy link
Contributor

c21 commented Dec 6, 2022

The failed CI test looks irrelevant - https://flakey-tests.ray.io/ .

Screen Shot 2022-12-06 at 12 52 15 PM

@c21 c21 added the tests-ok The tagger certifies test failures are unrelated and assumes personal liability. label Dec 6, 2022
@c21
Copy link
Contributor

c21 commented Dec 7, 2022

@clarkzinzow, @jianoaix any more comments? Thanks

@clarkzinzow clarkzinzow merged commit a841c07 into ray-project:master Dec 7, 2022
WeichenXu123 pushed a commit to WeichenXu123/ray that referenced this pull request Dec 19, 2022
Currently, when we print Dataset stats after execution, there is no way to retrieve this information in case of job failure/crash. By persisting the logs to a separate file, we can access the stats which could be helpful for debugging. By default, this is configured to write to /logs/ray-data.log.

The new logger, DatasetLogger, is configured to always write logs to the ray-data.log file, and optionally also writes to stdout (this is enabled by default). The motivation behind this is so that users can easily use the specific log file to filter for Dataset logs, while still maintaining console logs for those who use them.

Signed-off-by: Weichen Xu <[email protected]>
tamohannes pushed a commit to ju2ez/ray that referenced this pull request Jan 25, 2023
Currently, when we print Dataset stats after execution, there is no way to retrieve this information in case of job failure/crash. By persisting the logs to a separate file, we can access the stats which could be helpful for debugging. By default, this is configured to write to /logs/ray-data.log.

The new logger, DatasetLogger, is configured to always write logs to the ray-data.log file, and optionally also writes to stdout (this is enabled by default). The motivation behind this is so that users can easily use the specific log file to filter for Dataset logs, while still maintaining console logs for those who use them.

Signed-off-by: tmynn <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
tests-ok The tagger certifies test failures are unrelated and assumes personal liability.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[Datasets] Persist the dataset statistics to log file
4 participants