Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Alter framework logging.yml configuration #1491

Merged
merged 17 commits into from
May 12, 2022

Conversation

AhdraMeraliQB
Copy link
Contributor

@AhdraMeraliQB AhdraMeraliQB commented Apr 29, 2022

Signed-off-by: Ahdra Merali [email protected]

Description

Adds the changes outlined in #1470 to update the inbuilt logging setup.

Development notes

The changes were tested by running some cli commands and noting the changes in the output between this and the unchanged version, and between valid and invalid pipelines. The only divergence in output was noted when running kedro ipython

Original output

...
Python 3.7.11 (default, Jul 27 2021, 07:03:16) 
Type 'copyright', 'credits' or 'license' for more information
IPython 7.32.0 -- An enhanced Interactive Python. Type '?' for help.
2022-04-29 13:59:26,965 - root - INFO - Updated path to Kedro project: /Users/ahdra_merali/Desktop/kedrotest
2022-04-29 13:59:27,249 - kedro.framework.session.store - INFO - `read()` not implemented for `BaseSessionStore`. Assuming empty store.
2022-04-29 13:59:27,299 - kedro.framework.session.session - WARNING - Unable to git describe /Users/ahdra_merali/Desktop/kedrotest
2022-04-29 13:59:27,600 - numexpr.utils - INFO - NumExpr defaulting to 8 threads.
2022-04-29 13:59:28,576 - root - INFO - ** Kedro project kedrotest
2022-04-29 13:59:28,576 - root - INFO - Defined global variable `context`, `session`, `catalog` and `pipelines`
2022-04-29 13:59:30,174 - root - INFO - Registered line magic `run_viz`

Updated output

...
Python 3.7.11 (default, Jul 27 2021, 07:03:16) 
Type 'copyright', 'credits' or 'license' for more information
IPython 7.32.0 -- An enhanced Interactive Python. Type '?' for help.
WARNING:kedro.framework.session.session:Unable to git describe /Users/ahdra_merali/Desktop/kedrotest
2022-04-29 13:47:55,995 - numexpr.utils - INFO - NumExpr defaulting to 8 threads.
2022-04-29 13:47:56,921 - root - INFO - ** Kedro project kedrotest
2022-04-29 13:47:56,921 - root - INFO - Defined global variable `context`, `session`, `catalog` and `pipelines`
2022-04-29 13:47:58,538 - root - INFO - Registered line magic `run_viz`

Checklist

  • Read the contributing guidelines
  • Opened this PR as a 'Draft Pull Request' if it is work-in-progress
  • Updated the documentation to reflect the code changes
  • Added a description of this change in the RELEASE.md file
  • Added tests to cover my changes

@AhdraMeraliQB AhdraMeraliQB linked an issue Apr 29, 2022 that may be closed by this pull request
@@ -33,15 +33,21 @@ handlers:

loggers:
anyconfig:
level: WARNING
handlers: [console]
level: INFO
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why anyconfig level is changed from WARNING to INFO?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oops - thanks for the catch!

Copy link
Contributor

@noklam noklam left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am a bit surprise the kedro.framework.session related log seems missing after the changes?

@AhdraMeraliQB AhdraMeraliQB marked this pull request as draft April 29, 2022 14:10
@AhdraMeraliQB
Copy link
Contributor Author

AhdraMeraliQB commented Apr 29, 2022

@noklam

I am a bit surprise the kedro.framework.session related log seems missing after the changes?

As am I 🤔 ; converting to draft to address the e2e tests and investigate further

@antonymilne
Copy link
Contributor

antonymilne commented May 4, 2022

Thanks for all very thorough testing and observations noted here.

I've written a comment in #1461 to explain better how kedro logging works so that everyone can understand this better, because it is quite confusing.

Given the above let me explain the differences that I see here (please do say if I missed any!):

  1. root - INFO - Updated path to Kedro project message has disappeared
  2. kedro.framework.session.store - INFO - `read()` not implemented for `BaseSessionStore` has disappeared
  3. kedro.framework.session.session - WARNING - Unable to git describe is still there but looks gross

All the above logs are emitted before the project-side logging.yml is loaded and hence just the framework-side logging.yml is used, which is why changing the framework-side logging.yml has changed them. When you removed our custom root logger, it falls back on the Python default one which only emits messages of level WARNING and above. This explains the disappearance of 1 and 2. The Python default logger looks gross which explains 3.

Now, which (if any) of these are problems and do we need to rethink what we're doing here? I think we're on the right lines and this is ok so far, but once I've had a chance to think a bit more about it later this morning I will say if that's not the case!

@noklam
Copy link
Contributor

noklam commented May 4, 2022

Follow up on @AntonyMilneQB comments about the 3 changes.

  1. I think it is ok to not show this message, if we ever want to see this message we need to set the root logger to INFO level
  2. This should be disappear but I think not for the right cause. It should be a DEBUG level message.
  3. I think we may missed some configuration. The format of this warning message is off. Maybe we need to have a kedro library level logging for anything not specified. Same apply to point 2, I suspect if it is a warning level log it will also be printed in the wrong format.

@antonymilne antonymilne mentioned this pull request May 5, 2022
18 tasks
@antonymilne
Copy link
Contributor

antonymilne commented May 5, 2022

Following some changes I've made here and #1506, I think the behaviour after this PR will not result in any observable differences compared to how things currently run. The only (very small) change is that messages of level <= INFO emitted by things outside kedro before project-side logging config is setup used to be logged and no longer will be. This is an improvement and means we can drop the anyconfig special case we used to have.

So in reference to the 3 above messages:

  1. Should re-appear once Remove all root logging from kedro #1506 is merged
  2. Will re-appear but, as @noklam says, will disappear in the future when it gets downgraded to DEBUG level in Revisit the levels of all logging in Kedro #1473
  3. Will be nicely formatted but also should disappear in Revisit the levels of all logging in Kedro #1473

@noklam
Copy link
Contributor

noklam commented May 5, 2022

Thanks, I think changing the hook specific logger to kedro logger is the right thing to do!

@antonymilne
Copy link
Contributor

antonymilne commented May 9, 2022

Now that #1506 has been merged I've tested this again and all behaves as expected:

➜ kedro ipython
2022-05-09 10:41:38,334 - kedro.framework.cli.hooks.manager - INFO - Registered CLI hooks from 1 installed plugin(s): kedro-telemetry-0.2.0
Kedro-Telemetry is installed, but you have opted out of sharing usage analytics so none will be collected.
ipython --ext kedro.extras.extensions.ipython
Python 3.8.13 (default, Mar 28 2022, 06:16:26) 
Type 'copyright', 'credits' or 'license' for more information
IPython 7.33.0 -- An enhanced Interactive Python. Type '?' for help.
2022-05-09 10:41:42,257 - kedro.extras.extensions.ipython - INFO - Updated path to Kedro project: /Users/antony_milne/kedro_stuff/projects/spaceflights
2022-05-09 10:41:42,499 - kedro.framework.session.store - INFO - `read()` not implemented for `BaseSessionStore`. Assuming empty store.
2022-05-09 10:41:42,569 - kedro.framework.session.session - WARNING - Unable to git describe /Users/antony_milne/kedro_stuff/projects/spaceflights
2022-05-09 10:41:43,464 - kedro.extras.extensions.ipython - INFO - ** Kedro project spaceflights
2022-05-09 10:41:43,464 - kedro.extras.extensions.ipython - INFO - Defined global variable `context`, `session`, `catalog` and `pipelines`
2022-05-09 10:41:44,762 - kedro.extras.extensions.ipython - INFO - Registered line magic `run_viz`

Marking this as ready to review since I think it we're all good here now and it would be nice to get it merged in so we can make progress on the following tickets that depend on it 🙂

Looks like I need to fix up a couple of tests - working on that now...

@antonymilne antonymilne marked this pull request as ready for review May 9, 2022 09:47
Signed-off-by: Antony Milne <[email protected]>
@antonymilne antonymilne requested review from noklam and merelcht and removed request for merelcht May 9, 2022 10:06
Copy link
Contributor

@noklam noklam left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you! The log now looks good to me. Good to be merged after fixing the tests.

Signed-off-by: Antony Milne <[email protected]>
Signed-off-by: Antony Milne <[email protected]>
Signed-off-by: Antony Milne <[email protected]>
Signed-off-by: Antony Milne <[email protected]>
Signed-off-by: Antony Milne <[email protected]>
antonymilne and others added 7 commits May 9, 2022 21:55
Signed-off-by: Antony Milne <[email protected]>
Signed-off-by: Antony Milne <[email protected]>
Signed-off-by: Antony Milne <[email protected]>
Signed-off-by: Antony Milne <[email protected]>
…fig' into feat/alter-framework-logging-config

Signed-off-by: Antony Milne <[email protected]>

Signed-off-by: Antony Milne <[email protected]>
@@ -39,7 +39,7 @@ min-public-methods = 1
[tool.coverage.report]
fail_under = 100
show_missing = true
omit = ["kedro/templates/*", "kedro/framework/cli/hooks/specs.py", "kedro/framework/hooks/specs.py", "kedro/extras/datasets/tensorflow/*", "kedro/extras/datasets/holoviews/*", "tests/*"]
omit = ["kedro/templates/*", "kedro/extras/logging/color_logger.py", "kedro/framework/cli/hooks/specs.py", "kedro/framework/hooks/specs.py", "kedro/extras/datasets/tensorflow/*", "kedro/extras/datasets/holoviews/*", "tests/*"]
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've added the color logger here. I actually have no idea why we were getting 100% coverage before because this file has never been fully tested... But given that we're going to deprecate it I decided just to add it to the coverage ignore rather than write tests for it.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Isn't there a test_color_logger.py? It's probably not related but seen this from Kiyo yesterday. https://921kiyo.com/code-coverage-level-with-pytest-cov/

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There is, but it only tests how the color logger handles INFO level messages. Ideally it would go through every level of logging and test them all. I don't know how it was passing coverage before given none of the other levels are tested 🤔

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Interesting post on the different coverage levels, I didn't know about that before! I wonder what we'd get if we added --cov-branch to our tests.

Copy link
Member

@merelcht merelcht left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One question to understand the changes better, but not blocking and I'm happy with merging this!

Comment on lines +40 to +42
kedro:
level: INFO

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Originally in #1470 we needed to add handlers and propagate to all loggers. Why is this no longer needed? And what exactly do these things do? I still find the logging behaviour a bit mysterious 😅

Copy link
Contributor

@antonymilne antonymilne May 12, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, I only fully understood this after reading the logging docs for the 1000th time last week 😀 This will become clearer with examples when we complete #1474 and I'm happy to present to the team to explain further, but in brief:

  • loggers work hierarchically and activate all loggers above them. e.g. when kedro.x.y does logging it will also log to any loggers named kedro.x and kedro. root is the at the top of the hierarchy
  • the above is the case when propagate: true (the default if propagate is not specified explicitly). If you set propagate: false then a logging call to kedro.x.y will not activate any other loggers
  • if you propagate to higher up loggers then if they both have the same handler, this handler will be called separately by each logger that uses it. This results in duplicated log messages
  • our general strategy according to the Python docs should be to delegate as much as possible to the highest up logger (so root if possible), attach handlers to that logger and then allow lower level handlers just to propagate their calls up the chain

The docs on propagate are quite clear and helpful for explaining all this more.

In practice, what happens now is that any logging done anywhere inside kedro will be propagated up to the root logger, which attaches the required handlers. i.e. if kedro.x.y does a logging call then it will check if there's any logger defined explicitly for kedro.x.y (there isn't), then it will check for kedro.x (there isn't), then kedro (which is defined). kedro doesn't define any handlers, so it gets propagates up to root, which is the thing that actually calls the handlers. The point in defining kedro as a logger is just so we can set level: INFO so that any info log messages coming from kedro are shown. The default level set by root (and hence by all other logging calls outside kedro) is WARNING.

kedro.framework.cli.hooks.manager is a weird special case that will hopefully disappear in the near future if we remove the file handlers from the root logger.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for this explanation @AntonyMilneQB! It makes a lot more sense now 😄

@antonymilne antonymilne merged commit 805da32 into main May 12, 2022
@antonymilne antonymilne deleted the feat/alter-framework-logging-config branch May 12, 2022 11:14
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Alter framework logging.yml configuration
4 participants