Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add info log message about duration taken to load plugins #12308

Merged
merged 1 commit into from
Nov 16, 2020

Conversation

ashb
Copy link
Member

@ashb ashb commented Nov 12, 2020

Loading plugins, particularly from setuptools entry points can be slow,
and since by default this happens per-task, it can slow down task
execution unexpectedly.

By having this log message users can know the source of the delay


^ Add meaningful description above

Read the Pull Request Guidelines for more information.
In case of fundamental code change, Airflow Improvement Proposal (AIP) is needed.
In case of a new dependency, check compliance with the ASF 3rd Party License Policy.
In case of backwards incompatible changes please leave a note in UPDATING.md.

@ashb
Copy link
Member Author

ashb commented Nov 12, 2020

/cc @mjpieters

@ashb ashb requested review from mik-laj and kaxil November 12, 2020 13:21
@github-actions
Copy link

The PR needs to run all tests because it modifies core of Airflow! Please rebase it to latest master or ask committer to re-run it!

@github-actions github-actions bot added the full tests needed We need to run full set of tests for this PR to merge label Nov 12, 2020
@ashb ashb force-pushed the show-plugin-loading-duration branch from 476a67b to c8f4dc8 Compare November 12, 2020 22:06
@ashb ashb requested review from mik-laj and kaxil November 12, 2020 22:06
@@ -90,8 +90,9 @@ def _color_record_traceback(self, record: LogRecord) -> LogRecord:

def format(self, record: LogRecord) -> str:
try:
record = self._color_record_args(record)
record = self._color_record_traceback(record)
if self.stream.isatty():
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is so that caplog fixture from pytest doesn't get poluted with escape code.

from airflow import plugins_manager

# assert not logs
with self.assertRaises(AssertionError), self.assertLogs(plugins_manager.log):
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I found this a huge code smell, and a funny/hard to follow way of saying "does not log".

So I switched it all out to pytests caplog fixture.


def test_should_warning_about_incompatible_plugins(self):
assert caplog.records[0].levelname == 'INFO'
assert caplog.records[0].msg == 'Loading %d plugin(s) took %.2f seconds'
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

his is the un-formatted message string

@ashb ashb force-pushed the show-plugin-loading-duration branch from c8f4dc8 to 330c542 Compare November 12, 2020 22:16
@@ -262,6 +263,8 @@ def ensure_plugins_loaded():

log.debug("Loading plugins")

start = time.monotonic()
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Might there be a usecase for a general timing context manager here?

with timer() as t:
    # ...

log.info("Frobbing the foonars took %.2f seconds", t.elapsed.total_seconds())

Encapsulation makes it easier to make sure everyone gets this right (using time.monotonic()!), and might make it easier to automatically export metrics to statd / prometheus if desired (with timer(metric="metric_name") as t:).

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could be. For metrics there is already a with Stats.timer("my_metric_name":) though.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Only problem I see is that context manager introduces additional level of indentation which can require a pylint disable sometimes... Apart from that I like the idea!

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can also do it like this:

timer = Stats.timer().start()
...
timer.end()
# use timer.duration

@kaxil
Copy link
Member

kaxil commented Nov 14, 2020

https://github.com/apache/airflow/pull/12308/checks?check_run_id=1399669598#step:6:1249

FAILED tests/plugins/test_plugins_manager.py::TestPluginsManager::test_should_load_plugins_from_property
FAILED tests/plugins/test_plugins_manager.py::TestPluginsManager::test_should_warning_about_incompatible_plugins
======= 2 failed, 368 passed, 8 skipped, 2 warnings in 203.14s (0:03:23) =======
=================================== FAILURES ===================================
__________ TestPluginsManager.test_should_load_plugins_from_property ___________

self = <tests.plugins.test_plugins_manager.TestPluginsManager object at 0x7efdc0b1c890>
caplog = <_pytest.logging.LogCaptureFixture object at 0x7efdc0a27d50>

    def test_should_load_plugins_from_property(self, caplog):
        class AirflowTestPropertyPlugin(AirflowPlugin):
            name = "test_property_plugin"
    
            @property
            def hooks(self):
                class TestPropertyHook(BaseHook):
                    pass
    
                return [TestPropertyHook]
    
        with mock_plugin_manager(plugins=[AirflowTestPropertyPlugin()]):
            from airflow import plugins_manager
    
            plugins_manager.ensure_plugins_loaded()
    
            assert 'AirflowTestPropertyPlugin' in str(plugins_manager.plugins)
            assert 'TestPropertyHook' in str(plugins_manager.registered_hooks)
    
>       assert caplog.records[0].levelname == 'INFO'
E       IndexError: list index out of range

tests/plugins/test_plugins_manager.py:135: IndexError
----------------------------- Captured stdout call -----------------------------
[2020-11-14 11:18:08,276] {{plugins_manager.py:282}} INFO - Loading 1 plugin(s) took 0.00 seconds
______ TestPluginsManager.test_should_warning_about_incompatible_plugins _______

self = <tests.plugins.test_plugins_manager.TestPluginsManager object at 0x7efdc0a8b310>
caplog = <_pytest.logging.LogCaptureFixture object at 0x7efdc0a8b910>

    def test_should_warning_about_incompatible_plugins(self, caplog):
        class AirflowAdminViewsPlugin(AirflowPlugin):
            name = "test_admin_views_plugin"
    
            admin_views = [mock.MagicMock()]
    
        class AirflowAdminMenuLinksPlugin(AirflowPlugin):
            name = "test_menu_links_plugin"
    
            menu_links = [mock.MagicMock()]
    
        with mock_plugin_manager(
            plugins=[AirflowAdminViewsPlugin(), AirflowAdminMenuLinksPlugin()]
        ), caplog.at_level(logging.WARNING):
            from airflow import plugins_manager
    
            plugins_manager.initialize_web_ui_plugins()
    
>       assert caplog.record_tuples == [
            (
                "airflow.plugins_manager",
                logging.WARNING,
                "Plugin 'test_admin_views_plugin' may not be compatible with the current Airflow version. "
                "Please contact the author of the plugin.",
            ),
            (
                "airflow.plugins_manager",
                logging.WARNING,
                "Plugin 'test_menu_links_plugin' may not be compatible with the current Airflow version. "
                "Please contact the author of the plugin.",
            ),
        ]
E       assert [] == [('airflow.pl...the plugin.")]
E         Right contains 2 more items, first extra item: ('airflow.plugins_manager', 30, "Plugin 'test_admin_views_plugin' may not be compatible with the current Airflow version. Please contact the author of the plugin.")
E         Use -v to get the full diff

tests/plugins/test_plugins_manager.py:156: AssertionError

@ashb
Copy link
Member Author

ashb commented Nov 14, 2020

Hmmmmm, passes for me locally. Fun.

Including when using ./breeze --github-image-id 363059764 --python 3.7 -b postgres -- some other test must be having a side-effect.

Found it:

tests/task/task_runner/test_standard_task_runner.py::TestStandardTaskRunner  tests/plugins/test_plugins_manager.py::TestPluginsManager

That fails. Any test in test_standard_task_runner causes the problem. I think the problem may be an "errant" logging.shutdown.

This test had a dictConfig in the setUpClass that changed the config, but didn't reset things back. Fixed it now

Loading plugins, particularly from setuptools entry points can be slow,
and since by default this happens per-task, it can slow down task
execution unexpectedly.

By having this log message users can know the source of the delay
@ashb ashb force-pushed the show-plugin-loading-duration branch from a46acfe to 1170c72 Compare November 15, 2020 21:53
@ashb ashb merged commit 6d05108 into apache:master Nov 16, 2020
@ashb ashb deleted the show-plugin-loading-duration branch November 16, 2020 09:03
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area:plugins full tests needed We need to run full set of tests for this PR to merge
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants