Revert "Revert "Global logging format changes" (#34126)" #34182

peytondmurray · 2023-04-07T23:01:19Z

This PR changes how the logging configuration for Ray is set, and changes the format of log messages.

Why are these changes needed?

Changes

- Attempts to consolidate logging configuration by introducing reasonable defaults in ray/log.py.
- This new logging configuration is applied once, at the top of ray/__init__.py. Subsequent configuration calls are ignored.
- A logger for ray.rllib is configured at the WARN level, to address Revert "Simplify logging configuration. (#30863)" #31858. With this change, #31858 can itself be reverted, again simplifying and consolidating logging configuration.
- Modified test_output.py::test_logger_config to test only the logger config, without launching a Ray cluster. The test was failing intermittently, I think due to a race condition between the launch of the cluster and the reading of the subprocess's stdout. In any case, calling ray.init isn't necessary to check that logging is configured correctly.
- Modified python/ray/tune/tests/test_commands.py::test_ls_with_cfg to test the underlying data, not what gets printed to stdout (which has changed with the new logging system).
- Fixed a logging call in ray.tune.automl.search_policy.AutoMLSearcher.on_trial_complete, which in certain cases tried to format a NoneType with %f during log message formatting. This previously undetected bug surfaced because the default log level is now INFO. The fix resolves a failing test in test_automl_searcher.py::AutoMLSearcherTest.
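For readers unfamiliar with the "configure once, ignore later calls" pattern described in the list above, here is a minimal sketch using only the standard library. It is not the contents of ray/log.py: the function name `setup_default_logging`, the module-level flag, and the format string are all assumptions made for illustration.

```python
import logging
import threading

# Hypothetical module-level state; the real ray/log.py may track this differently.
_configured = False
_lock = threading.Lock()


def setup_default_logging(level=logging.INFO):
    """Configure the "ray" logger tree once; later calls are ignored.

    Returns True if this call applied the configuration, False otherwise.
    """
    global _configured
    with _lock:
        if _configured:
            return False

        ray_logger = logging.getLogger("ray")
        ray_logger.setLevel(level)

        # Illustrative format only, not the format introduced by the PR.
        handler = logging.StreamHandler()
        handler.setFormatter(
            logging.Formatter("%(asctime)s %(levelname)s %(name)s -- %(message)s")
        )
        ray_logger.addHandler(handler)

        # Child logger with a stricter level, as the PR does for ray.rllib.
        logging.getLogger("ray.rllib").setLevel(logging.WARNING)

        _configured = True
        return True
```

Calling `setup_default_logging()` a second time leaves the first configuration untouched, so repeated imports or repeated setup calls do not stack handlers or change levels.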

New changes since the revert

Added propagate = False to the ray logger, which matches the behavior of the current master branch. As a result, some tests broke: on the current master branch these tests called ray.init(setup_logging=False), so no logging configuration was generated and messages propagated up to the root logger by default. Tests that rely on caplog expect to examine log messages this way, because caplog attaches a special logging handler to the root logger.
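The effect of propagate on a root-level handler can be reproduced with the standard library alone. In the sketch below, `ListHandler` and `capture_on_root` are hypothetical helpers; `ListHandler` merely stands in for the handler that caplog installs on the root logger.

```python
import logging


class ListHandler(logging.Handler):
    """Collects records in a list, standing in for caplog's handler."""

    def __init__(self):
        super().__init__()
        self.records = []

    def emit(self, record):
        self.records.append(record)


def capture_on_root(logger_name, propagate):
    """Log one INFO message and return what a root-level handler saw."""
    root = logging.getLogger()
    logger = logging.getLogger(logger_name)
    handler = ListHandler()
    root.addHandler(handler)

    # Remember the previous state so the helper leaves no side effects.
    old_propagate, old_level = logger.propagate, logger.level
    logger.propagate = propagate
    logger.setLevel(logging.INFO)
    try:
        logger.info("hello from %s", logger_name)
    finally:
        logger.propagate = old_propagate
        logger.setLevel(old_level)
        root.removeHandler(handler)

    return [record.getMessage() for record in handler.records]
```

With `propagate=True` the root handler receives the message. With `propagate=False` it receives nothing, which is why caplog-based assertions come up empty.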

The fix for these tests is to use the propagate_logs pytest fixture, which was already used in many places, everywhere caplog is used. This ensures messages are propagated beyond the ray logger to the root logger, so that caplog can examine log messages once again.
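As a rough illustration of what such a fixture does, the sketch below temporarily enables propagation and restores the previous setting afterwards. It is written as a plain context manager so it runs without pytest. The name `propagate_logs_ctx` is hypothetical, and Ray's actual propagate_logs fixture may be implemented differently.

```python
import contextlib
import logging


@contextlib.contextmanager
def propagate_logs_ctx(logger_name="ray"):
    """Temporarily let records from `logger_name` reach root-level handlers."""
    logger = logging.getLogger(logger_name)
    previous = logger.propagate
    logger.propagate = True
    try:
        yield logger
    finally:
        # Restore whatever the logging configuration had set before.
        logger.propagate = previous
```

The same set-then-restore body could be written as a generator function decorated with `@pytest.fixture`, so that any test requesting it alongside caplog sees the ray logger's records.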

Checks

- I've signed off every commit (by using the -s flag, i.e., git commit -s) in this PR.
- I've run scripts/format.sh to lint the changes in this PR.
- I've included any doc changes needed for https://docs.ray.io/en/master/.
  - I've added any new APIs to the API Reference. For example, if I added a
    method in Tune, I've added it in doc/source/tune/api/ under the
    corresponding .rst file.
- I've made sure the tests are passing. Note that there might be a few flaky tests, see the recent failures at https://flakey-tests.ray.io/
- Testing Strategy
  - Unit tests
  - Release tests
  - This PR is not tested :(

rkooo567 · 2023-04-12T23:55:23Z

@krfricke @sihanwang41 can you guys approve this PR again? It is the revert-revert

rkooo567 · 2023-04-12T23:55:32Z

Also, @c21

python/ray/data/dataset.py

rkooo567 · 2023-04-14T14:23:14Z

cc @peytondmurray seems like it is ready to go? Can you convert it to regular PR?

@krfricke @c21 @sihanwang41 please approve the PR. It is the revert-revert, and there shouldn't be any new change (the issue was we set propagate=True to the root logger, which was the changed behavior)

peytondmurray · 2023-04-14T15:39:41Z

- linux://python/ray/tests:test_advanced_9: Flaky
- linux://python/ray/tests:test_autoscaler_e2e: Not sure about this one. No obvious connection to logging.
- Documentation: This one does not appear to be related to logging; there's some issue with a torch trainer example.
- linux://python/ray/train:test_accelerate_trainer_gpu: Test has been broken for a while
- linux://python/ray/tune:test_commands: A bit flaky, seems like it took a bit too long for a test to run.
- windows://python/ray/tests:test_object_store_metrics: Flaky

angelinalg

LGTM

rkooo567 · 2023-05-10T01:13:24Z

cc @peytondmurray can you resolve the conflict before merging it?

peytondmurray · 2023-05-10T02:52:44Z

Huh, so I rebased on master and the huggingface test merge conflict went away, but then there was another conflict about ray/data/datastream.py, which I guess got reverted a little while ago. Anyway, it all seems fixed now.

peytondmurray · 2023-05-11T19:34:51Z

Rebased to get rid of the agent.py import conflict.

peytondmurray · 2023-05-11T21:17:26Z

caplog was used without propagate_logs fixture on the new widget tests. Fixed now, will check back in ~2h to see if it passes.

peytondmurray · 2023-05-12T23:38:33Z

@rkooo567 About the test_autoscaler.py failure: I tried testing locally, and every time I ran the AutoscalingTest.testDynamicScaling<number> tests, different ones passed/failed.

So after talking with @DmitriGekhtman about this:

- It's probably not related to logging.
- You'd probably see the same intermittent failures if you run the tests against the master branch.

I tried testing on the master branch and indeed I am seeing the same flaky test failures.

peytondmurray · 2023-05-15T19:17:43Z

@rkooo567 Looks like the failing test was fixed on master. Is there any chance we can get this merged?

This reverts commit 45d5f65. Signed-off-by: pdmurray <[email protected]>


rkooo567 · 2023-05-19T00:09:39Z

Failures seem unrelated

rkooo567 · 2023-05-19T00:54:21Z

The failing tests seem to fail on master as well

Since #34182, the docs build on M1 mac had been broken


peytondmurray assigned rkooo567 Apr 7, 2023

peytondmurray force-pushed the global-logging-2 branch 9 times, most recently from 2318d38 to 3a9a520 Compare April 12, 2023 17:59

rkooo567 assigned c21, sihanwang41 and krfricke Apr 12, 2023

c21 reviewed Apr 13, 2023


python/ray/data/dataset.py (outdated, resolved)

peytondmurray force-pushed the global-logging-2 branch 2 times, most recently from edb8fbc to 42f540c Compare April 13, 2023 07:16

rkooo567 approved these changes Apr 13, 2023


peytondmurray force-pushed the global-logging-2 branch 8 times, most recently from ef402d4 to 1772154 Compare April 14, 2023 05:22

peytondmurray force-pushed the global-logging-2 branch from 053379e to 9f0f798 Compare April 14, 2023 17:18

angelinalg approved these changes May 4, 2023


rkooo567 added the @author-action-required label May 10, 2023

peytondmurray force-pushed the global-logging-2 branch from 52fce63 to a4a1fa6 Compare May 10, 2023 02:47

peytondmurray requested a review from raulchen as a code owner May 10, 2023 02:47

peytondmurray force-pushed the global-logging-2 branch from a4a1fa6 to c4a19ce Compare May 10, 2023 02:49

peytondmurray force-pushed the global-logging-2 branch 2 times, most recently from 574e1fb to a97b7fd Compare May 11, 2023 19:34

peytondmurray force-pushed the global-logging-2 branch from a97b7fd to 2bcafd7 Compare May 11, 2023 21:16

peytondmurray force-pushed the global-logging-2 branch from 2bcafd7 to 266d2fb Compare May 15, 2023 16:35

peytondmurray added 3 commits May 18, 2023 14:10

Revert "Revert "Global logging format changes" (ray-project#34126)"

8211f2e

This reverts commit 45d5f65. Signed-off-by: pdmurray <[email protected]>

Use propagate_logs with caplog fixtures instead of catch_logs

df1b01e

Signed-off-by: pdmurray <[email protected]>

Fix the kubernetes operator test

6754aac

Signed-off-by: pdmurray <[email protected]>

peytondmurray force-pushed the global-logging-2 branch from 266d2fb to 6754aac Compare May 18, 2023 21:12

rkooo567 merged commit ce7764b into ray-project:master May 19, 2023

peytondmurray deleted the global-logging-2 branch May 19, 2023 00:55

peytondmurray mentioned this pull request May 23, 2023

[Core] Logging config cleanup #33821

Closed

8 tasks

pcmoritz mentioned this pull request May 23, 2023

[Doc] Fix doc build on M1 #35689

Merged

8 tasks

pcmoritz added a commit that referenced this pull request May 24, 2023

[Doc] Fix doc build on M1 (#35689)

ec68b86

Since #34182, the docs build on M1 mac had been broken

scv119 pushed a commit to scv119/ray that referenced this pull request Jun 16, 2023

[Doc] Fix doc build on M1 (ray-project#35689)

e6ef938

Since ray-project#34182, the docs build on M1 mac had been broken

arvind-chandra pushed a commit to lmco/ray that referenced this pull request Aug 31, 2023

[Doc] Fix doc build on M1 (ray-project#35689)

76a9b81

Since ray-project#34182, the docs build on M1 mac had been broken Signed-off-by: e428265 <[email protected]>
