Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Replace print with logger if they are logs #905

Merged
merged 10 commits into from
Feb 4, 2024

Conversation

shirayu
Copy link
Contributor

@shirayu shirayu commented Oct 29, 2023

This PR replaces print with logger if they are logs.

This helps distinguish logs from other outputs.
In addition, this replacement can also determine whether logs are warnings, information, or errors, and can understand what line of the file they come from.

@shirayu shirayu marked this pull request as ready for review October 29, 2023 08:27
@kohya-ss
Copy link
Owner

Thanks for the comprehensive and detailed PR. I appreciate your hard work and I am sure this will be of great help to the users.

Since the changes are so extensive, it seems to me that it would be better to wait until after we have taken care of all the other PRs that can be addressed sooner. I appreciate if you could wait for a while.

@shirayu
Copy link
Contributor Author

shirayu commented Dec 3, 2023

cf. rich

In this PR, the stream_handler is logging.StreamHandler in the python standard library.

Recently, I have used rich in my private project.
It enables color output easily.
I think that using rich is an option, after closing this PR.

Screenshot from the official repository of rich

Logging

An example of use of rich

# https://rich.readthedocs.io/en/latest/logging.html

from rich.logging import RichHandler

def get_my_logger(name: str):
    logger = logging.getLogger(name)
    logger.setLevel(logging.INFO)

    stream_handler = RichHandler()
    stream_handler.setLevel(logging.INFO)

    myformat = "%(message)s"
    date_format = "%Y-%m-%d %H:%M:%S"
    formatter = logging.Formatter(myformat, date_format)
    stream_handler.setFormatter(formatter)
    logger.addHandler(stream_handler)

    return logger

Copy link
Contributor

@akx akx left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Changing print()s to logging calls is a great idea, but I think this PR proposes a slight misuse of logging (please see my review comment within).

library/utils.py Outdated Show resolved Hide resolved
@shirayu
Copy link
Contributor Author

shirayu commented Jan 17, 2024

  • Merged current dev and resolved conflicts
  • Improved initialization of logger ( Thanks to @akx 's comments )
  • Added rich

@shirayu
Copy link
Contributor Author

shirayu commented Jan 17, 2024

@kohya-ss Hi, thanks a lot for always doing great work.
I've updated this PR to align with the latest dev branch.
I would appreciate it if you could consider merging.

@kohya-ss
Copy link
Owner

@shirayu
Thank you for updating! I think it is time to merge :) However, please give me a little time to understand this PR so that I can change something if I need to. I'd like to try the behavior in my env.

@kohya-ss
Copy link
Owner

I think the advantage of using a logging framework is that we can change the log output level, log output destination, and log format etc., perhaps by options. If we want to achieve this, should we initialize the logging framework in __main__ function immediately after argparse?

@akx
Copy link
Contributor

akx commented Jan 18, 2024

If we want to achieve this, should we initialize the logging framework in __main__ function immediately after argparse?

Logging is now initialized early in each script by

from library.utils import setup_logging
setup_logging()

I think a follow-up PR can change that to happen after argparse (to allow setting via options), if needed.

@kohya-ss
Copy link
Owner

I think a follow-up PR can change that to happen after argparse (to allow setting via options), if needed.

That's great!

However, just curious for me, what is the advantage of initializing the log at the beginning of the script instead of initializing it in main?

@akx
Copy link
Contributor

akx commented Jan 19, 2024

However, just curious for me, what is the advantage of initializing the log at the beginning of the script instead of initializing it in main?

No difference - unless something logs something "too early", in which case logging may fall back to its internal last resort logger for critical messages. For this project, that's not a big concern :)

@kohya-ss
Copy link
Owner

No difference - unless something logs something "too early", in which case logging may fall back to its internal last resort logger for critical messages. For this project, that's not a big concern :)

Thank you for clarification!

@kohya-ss
Copy link
Owner

@shirayu
I just noticed that the logging framework is initialized at the beginning of utility scripts as well as scripts executed directly from the command line.
It appears that the utility scripts are copied and used in other projects (including my own). The log output in that case seems to be troublesome. Do you have any good ideas?

In addition, it seems that sd-scripts code is sometimes incorporated into other projects. Proper use of the logging framework may actually be a difficult task...

@shirayu
Copy link
Contributor Author

shirayu commented Jan 19, 2024

@kohya-ss

It appears that the utility scripts are copied and used in other projects (including my own). The log output in that case seems to be troublesome. Do you have any good ideas?

The following is an example of ignoring errors in the log initialization section.
This way, we can simply copy the code to another project and not have to change it.

# current
from library.utils import setup_logging
setup_logging()
# an idea
try:
    from library.utils import setup_logging
    setup_logging()
except ImportError:
    pass

In addition, it seems that sd-scripts code is sometimes incorporated into other projects. Proper use of the logging framework may actually be a difficult task...

logging is a standard python module, so I think it should not be a problem.
However, if users want to maintain compatibility with conventional print, announce to add the following statement is an option.

# current
logger = logging.getLogger(__name__)
# an idea
logger = logging.getLogger(__name__)
for f in [logger.debug, logger.info, logger.warning, logger.error, logger.critical]:
     f = print

@akx
Copy link
Contributor

akx commented Jan 19, 2024

I just noticed that the logging framework is initialized at the beginning of utility scripts as well as scripts executed directly from the command line.

setup_logging() getting called multiple times should not be an issue, but I don't think any module in library/ should be calling it. (I didn't check if anything in there is doing that right now.)

It appears that the utility scripts are copied and used in other projects (including my own). The log output in that case seems to be troublesome.

Mm, troublesome how? Using logging instead of print is more easily controlled, since an application embedding those scripts can selectively turn logger levels to what they like:

logging.getLogger("library.hoge").setLevel(logging.ERROR)  # quiesce all hoge except errors
logging.getLogger("library.puyo").setLevel(logging.DEBUG)  # make puyo louder!

Whereas with print(), there's no easy way to filter it at all 😁

However, if users want to maintain compatibility with conventional print, announce to add the following statement is an option.

Unfortunately the suggested replacements are not a good idea – logger.debug and friends have a different signature than print.

Users who want log output to look like prints should just configure logging to output to stdout without formatting – and since loggers cascade, they can even do

logging.getLogger("library").addHandler(...)

and all library.whatever logging will get handled by that handler instead of the root logger.

@akx
Copy link
Contributor

akx commented Jan 19, 2024

Oh, and by the way – to avoid people needing to copy-paste these scripts into their other projects, I will gladly lend a hand to make this repository installable as a Python package (like I did yesterday with huchenlei/HandRefinerPortable#2) – but that's a matter for a different PR, of course 😄

@kohya-ss
Copy link
Owner

Thank you for suggestions!

Hmm... If I use other libraries, I don't like the detailed formatted log output from these libraries. Therefore, I personally would like the default behavior of each scripts to be the same as the current print.

So that, would the following code do it?

try:
    from library.utils import setup_logging
    setup_logging()
except ImportError:
    # setup logging handler here same as print

@shirayu
Copy link
Contributor Author

shirayu commented Jan 19, 2024

@kohya-ss

So that, would the following code do it?

I think this is what you expect.

import logging

try:
    from library.utils import setup_logging

    setup_logging()
except ImportError:

    class PrintHandler(logging.Handler):
        def emit(self, record):
            print(self.format(record))

    handler = PrintHandler()
    logging.root.addHandler(handler)
    logging.root.setLevel(logging.DEBUG) # print ALL logs

logger = logging.getLogger(__name__)

@akx
Copy link
Contributor

akx commented Jan 19, 2024

Ah, @shirayu, I'm not quite following why there would ever be an import error for library.utils? 😅

Also, could you remove the setup_logging() calls from library (and other modules that are supposed to be only imported and not run directly)? The idea with the logging module is that importing a library module in general should not be configuring logging.

Hmm... If I use other libraries, I don't like the detailed formatted log output from these libraries. Therefore, I personally would like the default behavior of each scripts to be the same as the current print.

If I understand you correctly here, you'd want logger output from the main scripts (e.g. sdxl_train) to look like regular prints (when run stand-alone)? Then we could

logger = logging.getLogger(__name__)
# ...
if __name__ == "__main__":
    from library.utils import configure_logger_to_print
    configure_logger_to_print(logger)
    # ... other initialization ...

and something like

def configure_logger_to_print(logger):
    h = logging.StreamHandler(sys.stdout)
    h.setLevel(logging.DEBUG)
    h.propagate = False  # prevent parent loggers from handling this
    logger.addHandler(h)

?

@kohya-ss
Copy link
Owner

I'm not quite following why there would ever be an import error for library.utils? 😅

In some projects (and me), some scripts are copied and reused for some features, such as model conversions, etc. In this case, an import error may occur.

If I understand you correctly here, you'd want logger output from the main scripts (e.g. sdxl_train) to look like regular prints (when run stand-alone)? Then we could

My concern is when the scripts are reused. However, well, I would like to have the option to make logger output from the main scripts as same as regular print (I think consideration should be given as to whether rich output or print should be the default).

@kohya-ss kohya-ss changed the base branch from dev to dev_improve_log February 4, 2024 08:52
@kohya-ss kohya-ss merged commit 5f6bf29 into kohya-ss:dev_improve_log Feb 4, 2024
1 check passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants