Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[AIRFLOW-6347] BugFix: Can't get task logs when serialization is enabled #7092

Merged
merged 5 commits into from
Jan 7, 2020

Conversation

kaxil
Copy link
Member

@kaxil kaxil commented Jan 7, 2020

Quoting from Jira:

Problem:
When I set next config options in airflow.cfg:

[core]
store_serialized_dags = True
min_serialized_dag_update_interval = 3

View with task logs shows infinity Js spinner, while in webserver log I see next error:

[2019-12-25 14:48:57,640] {{app.py:1891}} ERROR - Exception on /get_logs_with_metadata [GET]
Traceback (most recent call last):
  File "/usr/local/lib/python3.7/site-packages/flask/app.py", line 2446, in wsgi_app
    response = self.full_dispatch_request()
  File "/usr/local/lib/python3.7/site-packages/flask/app.py", line 1951, in full_dispatch_request
    rv = self.handle_user_exception(e)
  File "/usr/local/lib/python3.7/site-packages/flask/app.py", line 1820, in handle_user_exception
    reraise(exc_type, exc_value, tb)
  File "/usr/local/lib/python3.7/site-packages/flask/_compat.py", line 39, in reraise
    raise value
  File "/usr/local/lib/python3.7/site-packages/flask/app.py", line 1949, in full_dispatch_request
    rv = self.dispatch_request()
  File "/usr/local/lib/python3.7/site-packages/flask/app.py", line 1935, in dispatch_request
    return self.view_functions[rule.endpoint](**req.view_args)
  File "/usr/local/airflow/.local/lib/python3.7/site-packages/airflow/www_rbac/decorators.py", line 121, in wrapper
    return f(self, *args, **kwargs)
  File "/usr/local/lib/python3.7/site-packages/flask_appbuilder/security/decorators.py", line 101, in wraps
    return f(self, *args, **kwargs)
  File "/usr/local/airflow/.local/lib/python3.7/site-packages/airflow/www_rbac/decorators.py", line 56, in wrapper
    return f(*args, **kwargs)
  File "/usr/local/airflow/.local/lib/python3.7/site-packages/airflow/utils/db.py", line 74, in wrapper
    return func(*args, **kwargs)
  File "/usr/local/airflow/.local/lib/python3.7/site-packages/airflow/www_rbac/views.py", line 637, in get_logs_with_metadata
    logs, metadata = _get_logs_with_metadata(try_number, metadata)
  File "/usr/local/airflow/.local/lib/python3.7/site-packages/airflow/www_rbac/views.py", line 628, in _get_logs_with_metadata
    logs, metadatas = handler.read(ti, try_number, metadata=metadata)
  File "/usr/local/airflow/.local/lib/python3.7/site-packages/airflow/utils/log/file_task_handler.py", line 169, in read
    log, metadata = self._read(task_instance, try_number_element, metadata)
  File "/usr/local/airflow/.local/lib/python3.7/site-packages/airflow/utils/log/file_task_handler.py", line 98, in _read
    log_relative_path = self._render_filename(ti, try_number)
  File "/usr/local/airflow/.local/lib/python3.7/site-packages/airflow/utils/log/file_task_handler.py", line 75, in _render_filename
    jinja_context = ti.get_template_context()
  File "/usr/local/airflow/.local/lib/python3.7/site-packages/airflow/utils/db.py", line 74, in wrapper
    return func(*args, **kwargs)
  File "/usr/local/airflow/.local/lib/python3.7/site-packages/airflow/models/taskinstance.py", line 1149, in get_template_context
    if 'tables' in task.params:
TypeError: argument of type 'NoneType' is not iterable

Solution:
This is because we use self.params = params or {} and where default value of params is None in DAG and Operators and Dag S10n ignore all the None field.

While de-serializing params was being set to None because of the following line:

keys_to_set_none = dag.get_serialized_fields() - encoded_dag.keys() - cls._CONSTRUCTOR_PARAMS.keys()
        for k in keys_to_set_none:
            setattr(dag, k, None)

Issue link: AIRFLOW-6347

  • Description above provides context of the change
  • Commit message/PR title starts with [AIRFLOW-NNNN]. AIRFLOW-NNNN = JIRA ID*
  • Unit tests coverage for changes (not needed for documentation changes)
  • Commits follow "How to write a good git commit message"
  • Relevant documentation is updated including usage instructions.
  • I will engage committers as explained in Contribution Workflow Example.

* For document-only changes commit message can start with [AIRFLOW-XXXX].


In case of fundamental code change, Airflow Improvement Proposal (AIP) is needed.
In case of a new dependency, check compliance with the ASF 3rd Party License Policy.
In case of backwards incompatible changes please leave a note in UPDATING.md.
Read the Pull Request Guidelines for more information.

@mik-laj
Copy link
Member

mik-laj commented Jan 7, 2020

Maybe we should create a new context for Jinja used to render log filename templates?

@@ -265,12 +266,9 @@ class SerializedBaseOperator(BaseOperator, BaseSerialization):
Class specific attributes used by UI are move to object attributes.
"""

_decorated_fields = {'executor_config', }
_decorated_fields = {'executor_config', 'params'}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think adding this change now changes the format of the serialization we store, and since the previous format was in 1.10.7 I thin kthis means we now need to think about adding a v2 of the format and how we deal with upgrade etc.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yup, agree

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Removed this change

@codecov-io
Copy link

codecov-io commented Jan 7, 2020

Codecov Report

Merging #7092 into master will decrease coverage by 0.06%.
The diff coverage is 100%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master    #7092      +/-   ##
==========================================
- Coverage   84.93%   84.87%   -0.07%     
==========================================
  Files         680      680              
  Lines       38826    38824       -2     
==========================================
- Hits        32978    32951      -27     
- Misses       5848     5873      +25
Impacted Files Coverage Δ
airflow/serialization/serialized_objects.py 90.72% <100%> (-0.41%) ⬇️
airflow/utils/decorators.py 90.47% <100%> (ø) ⬆️
airflow/kubernetes/volume_mount.py 44.44% <0%> (-55.56%) ⬇️
airflow/kubernetes/volume.py 52.94% <0%> (-47.06%) ⬇️
airflow/kubernetes/pod_launcher.py 45.25% <0%> (-46.72%) ⬇️
airflow/kubernetes/refresh_config.py 50.98% <0%> (-23.53%) ⬇️
...rflow/contrib/operators/kubernetes_pod_operator.py 78.75% <0%> (-20%) ⬇️
airflow/hooks/dbapi_hook.py 91.73% <0%> (+0.82%) ⬆️
airflow/models/connection.py 68.78% <0%> (+0.97%) ⬆️
... and 4 more

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 2b37472...e2f4e24. Read the comment docs.

@ashb
Copy link
Member

ashb commented Jan 7, 2020

Maybe we should create a new context for Jinja used to render log filename templates?

We've had problems in the past (around 1.8.2->1.9 time) where we changed the logging config, and old logs were no longer visible in the UI unless someone went and renamed the old files/S3 objects etc. I think the thing to do instead (and this might be a 2.0 thing) is to:

  1. Store the rendered log file name in the TI table
  2. Have a different row for each TI, so that we have a sensible place to store this info.

@kaxil kaxil merged commit 96ee8bd into apache:master Jan 7, 2020
@kaxil kaxil deleted the AIRFLOW-6437 branch January 7, 2020 18:29
kaxil added a commit to astronomer/airflow that referenced this pull request Jan 8, 2020
kaxil added a commit that referenced this pull request Jan 10, 2020
kaxil added a commit that referenced this pull request Jan 10, 2020
kaxil added a commit to astronomer/airflow that referenced this pull request Jan 11, 2020
galuszkak pushed a commit to FlyrInc/apache-airflow that referenced this pull request Mar 5, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants