Error traceback missing from task logs when json_format=True #14950

Closed
mhaalme opened this issue Mar 23, 2021 · 1 comment
Labels: area:logging, kind:bug

mhaalme commented Mar 23, 2021

Apache Airflow version: 2.0.1

Environment:

  • OS: Oracle Linux Server 7.9

What happened:

One of our new BashOperator tasks failed, and no useful information about the cause of the failure could be found in the task logs. This seemed odd, so I temporarily turned off JSON formatting, and the non-JSON logs contained much more information. With JSON logging turned on, only the first line of the error was logged and the entire traceback was discarded.

I've replaced the values of dag_id, task_id and the bash_command path in the following logs with similar placeholders; their actual values are not relevant for reproducing the error.

ERROR level logs if json_format=True:

{"asctime": "2021-03-19 15:33:05,620", "filename": "taskinstance.py", "lineno": 1455, "levelname": "ERROR", "message": "sh /path/to/my/script.sh", "created": 1616160785.620597, "dag_id": "my_dag", "task_id": "my_task", "execution_date": "2021_03_19T13_33_04_356460", "try_number": "1", "log_id": "my_dag-my_task-2021-03-19T13:33:04.356460+00:00-1", "offset": 1616160785567723008}

ERROR level logs if json_format=False:

[2021-03-19 15:37:19,468] {{taskinstance.py:1455}} ERROR - sh /path/to/my/script.sh
Traceback (most recent call last):
  File "/app01/venv/airflowvenv/lib64/python3.6/site-packages/airflow/models/taskinstance.py", line 1112, in _run_raw_task
    self._prepare_and_execute_task_with_callbacks(context, task)
  File "/app01/venv/airflowvenv/lib64/python3.6/site-packages/airflow/models/taskinstance.py", line 1249, in _prepare_and_execute_task_with_callbacks
    self.render_templates(context=context)
  File "/app01/venv/airflowvenv/lib64/python3.6/site-packages/airflow/models/taskinstance.py", line 1750, in render_templates
    self.task.render_template_fields(context)
  File "/app01/venv/airflowvenv/lib64/python3.6/site-packages/airflow/models/baseoperator.py", line 859, in render_template_fields
    self._do_render_template_fields(self, self.template_fields, context, jinja_env, set())
  File "/app01/venv/airflowvenv/lib64/python3.6/site-packages/airflow/models/baseoperator.py", line 872, in _do_render_template_fields
    rendered_content = self.render_template(content, context, jinja_env, seen_oids)
  File "/app01/venv/airflowvenv/lib64/python3.6/site-packages/airflow/models/baseoperator.py", line 907, in render_template
    return jinja_env.get_template(content).render(**context)
  File "/app01/venv/airflowvenv/lib64/python3.6/site-packages/jinja2/environment.py", line 883, in get_template
    return self._load_template(name, self.make_globals(globals))
  File "/app01/venv/airflowvenv/lib64/python3.6/site-packages/jinja2/environment.py", line 857, in _load_template
    template = self.loader.load(self, name, globals)
  File "/app01/venv/airflowvenv/lib64/python3.6/site-packages/jinja2/loaders.py", line 115, in load
    source, filename, uptodate = self.get_source(environment, name)
  File "/app01/venv/airflowvenv/lib64/python3.6/site-packages/jinja2/loaders.py", line 197, in get_source
    raise TemplateNotFound(template)
jinja2.exceptions.TemplateNotFound: sh /path/to/my/script.sh

The actual error itself is irrelevant here: I understand it happened because the command was interpreted as a Jinja template, and I have since fixed the command. The problem is that the cause of the error was not logged at all with JSON logging turned on, which made it much more difficult to identify.
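
For reference, the usual BashOperator workaround for the template error itself (separate from the logging issue) is to add a trailing space so that a command ending in .sh is not resolved as a Jinja template path. A minimal sketch with illustrative names:

from airflow.operators.bash import BashOperator

# Sketch only: the trailing space stops Jinja from treating the ".sh" string
# as a template file, avoiding the TemplateNotFound error shown above.
run_script = BashOperator(
    task_id="my_task",
    bash_command="sh /path/to/my/script.sh ",  # note the trailing space
)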

What you expected to happen:

Log messages should contain the same information regardless of log format. The entire traceback should be included in the JSON logs, preferably within a single JSON record with escaped line breaks.
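
For illustration only, and not Airflow's actual formatter: a standard-library sketch of the expected behaviour, where the formatted traceback is folded into the message before serialization, so json.dumps escapes the line breaks and the record stays on a single line.

import json
import logging

class JsonWithTracebackFormatter(logging.Formatter):
    # Sketch: one JSON object per record, with the traceback kept in "message".
    def format(self, record: logging.LogRecord) -> str:
        message = record.getMessage()
        if record.exc_info:
            # formatException() returns the multi-line traceback; json.dumps
            # escapes the newlines, so the record remains a single line.
            message += "\n" + self.formatException(record.exc_info)
        return json.dumps({
            "asctime": self.formatTime(record),
            "filename": record.filename,
            "lineno": record.lineno,
            "levelname": record.levelname,
            "message": message,
            "created": record.created,
        })

handler = logging.StreamHandler()
handler.setFormatter(JsonWithTracebackFormatter())
log = logging.getLogger("demo")
log.addHandler(handler)

try:
    raise RuntimeError("Bash command failed. The command returned a non-zero exit code.")
except RuntimeError:
    log.exception("Task failed")  # the traceback ends up inside the same JSON record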

How to reproduce it:

  1. Set json_format=True in airflow.cfg.
  2. Create a DAG with one BashOperator task. Set bash_command to something that cannot run successfully, e.g. the wrong path to a template (a minimal DAG sketch follows this list).
  3. Run the task and check the task logs.
  4. Repeat with json_format=False and compare.
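
A minimal sketch of such a DAG (dag_id, task_id and the path are placeholders, as above):

from datetime import datetime

from airflow import DAG
from airflow.operators.bash import BashOperator

# Reproduction sketch: the command ends in ".sh", so Jinja tries to load it
# as a template file and raises TemplateNotFound when the task runs.
with DAG(
    dag_id="my_dag",
    start_date=datetime(2021, 3, 1),
    schedule_interval=None,
    catchup=False,
) as dag:
    BashOperator(
        task_id="my_task",
        bash_command="sh /path/to/my/script.sh",
    )

Run the task once with json_format=True and once with json_format=False, then compare the resulting ERROR records.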

Anything else we need to know:

I don't know if this problem is specific to BashOperator or happens with other operators as well.

The same behaviour can be observed if another kind of invalid command is supplied, e.g. a bash_command that executes a file that does not exist. Please see the ERROR level log line in the following examples.

json_format=False

[2021-03-23 09:57:29,977] {{taskinstance.py:851}} INFO - Dependencies all met for <TaskInstance: my_dag.my_task 2021-03-23T07:56:28.535261+00:00 [queued]>
[2021-03-23 09:57:29,997] {{taskinstance.py:851}} INFO - Dependencies all met for <TaskInstance: my_dag.my_task 2021-03-23T07:56:28.535261+00:00 [queued]>
[2021-03-23 09:57:29,997] {{taskinstance.py:1042}} INFO -
--------------------------------------------------------------------------------
[2021-03-23 09:57:29,997] {{taskinstance.py:1043}} INFO - Starting attempt 1 of 1
[2021-03-23 09:57:29,997] {{taskinstance.py:1044}} INFO -
--------------------------------------------------------------------------------
[2021-03-23 09:57:30,012] {{taskinstance.py:1063}} INFO - Executing <Task(BashOperator): my_task> on 2021-03-23T07:56:28.535261+00:00
[2021-03-23 09:57:30,015] {{standard_task_runner.py:52}} INFO - Started process 3461 to run task
[2021-03-23 09:57:30,021] {{standard_task_runner.py:76}} INFO - Running: ['airflow', 'tasks', 'run', 'my_dag', 'my_task', '2021-03-23T07:56:28.535261+00:00', '--job-id', '169438', '--pool', 'default_pool', '--raw', '--subdir', 'DAGS_FOLDER/dags/my_dag.py', '--cfg-path', '/tmp/tmpcgczuh72', '--error-file', '/tmp/tmpj7gc2ajl']
[2021-03-23 09:57:30,023] {{standard_task_runner.py:77}} INFO - Job 169438: Subtask my_task
[2021-03-23 09:57:30,092] {{logging_mixin.py:104}} INFO - Running <TaskInstance: my_dag.my_task 2021-03-23T07:56:28.535261+00:00 [running]> on host airflow.host
[2021-03-23 09:57:30,183] {{taskinstance.py:1257}} INFO - Exporting the following env vars:
AIRFLOW_CTX_DAG_OWNER=Teamplatform
AIRFLOW_CTX_DAG_ID=my_dag
AIRFLOW_CTX_TASK_ID=my_task
AIRFLOW_CTX_EXECUTION_DATE=2021-03-23T07:56:28.535261+00:00
AIRFLOW_CTX_DAG_RUN_ID=scheduled__2021-03-23T07:56:28.535261+00:00
[2021-03-23 09:57:30,185] {{bash.py:135}} INFO - Tmp dir root location:
 /tmp
[2021-03-23 09:57:30,185] {{bash.py:158}} INFO - Running command: /wrong/path/to/zabbix_sender -z zabbix.host -s airflow.host -k airflow.scheduler.delay -o 61
[2021-03-23 09:57:30,192] {{bash.py:169}} INFO - Output:
[2021-03-23 09:57:30,193] {{bash.py:173}} INFO - bash: /wrong/path/to/zabbix_sender: No such file or directory
[2021-03-23 09:57:30,194] {{bash.py:177}} INFO - Command exited with return code 127
[2021-03-23 09:57:30,209] {{taskinstance.py:1455}} ERROR - Bash command failed. The command returned a non-zero exit code.
Traceback (most recent call last):
  File "/app01/venv/airflowvenv/lib64/python3.6/site-packages/airflow/models/taskinstance.py", line 1112, in _run_raw_task
    self._prepare_and_execute_task_with_callbacks(context, task)
  File "/app01/venv/airflowvenv/lib64/python3.6/site-packages/airflow/models/taskinstance.py", line 1285, in _prepare_and_execute_task_with_callbacks
    result = self._execute_task(context, task_copy)
  File "/app01/venv/airflowvenv/lib64/python3.6/site-packages/airflow/models/taskinstance.py", line 1315, in _execute_task
    result = task_copy.execute(context=context)
  File "/app01/venv/airflowvenv/lib64/python3.6/site-packages/airflow/operators/bash.py", line 180, in execute
    raise AirflowException('Bash command failed. The command returned a non-zero exit code.')
airflow.exceptions.AirflowException: Bash command failed. The command returned a non-zero exit code.
[2021-03-23 09:57:30,212] {{taskinstance.py:1503}} INFO - Marking task as FAILED. dag_id=my_dag, task_id=my_task, execution_date=20210323T075628, start_date=20210323T075729, end_date=20210323T075730
[2021-03-23 09:57:30,275] {{local_task_job.py:146}} INFO - Task exited with return code 1
end_of_log

json_format=True

{"asctime": "2021-03-23 09:54:29,572", "filename": "taskinstance.py", "lineno": 851, "levelname": "INFO", "message": "Dependencies all met for <TaskInstance: my_dag.my_task 2021-03-23T07:53:28.535261+00:00 [queued]>", "created": 1616486069.5725589, "dag_id": "my_dag", "task_id": "my_task", "execution_date": "2021_03_23T07_53_28_535261", "try_number": "1", "log_id": "my_dag-my_task-2021-03-23T07:53:28.535261+00:00-1", "offset": 1616486069531250688}
{"asctime": "2021-03-23 09:54:29,597", "filename": "taskinstance.py", "lineno": 851, "levelname": "INFO", "message": "Dependencies all met for <TaskInstance: my_dag.my_task 2021-03-23T07:53:28.535261+00:00 [queued]>", "created": 1616486069.5971286, "dag_id": "my_dag", "task_id": "my_task", "execution_date": "2021_03_23T07_53_28_535261", "try_number": "1", "log_id": "my_dag-my_task-2021-03-23T07:53:28.535261+00:00-1", "offset": 1616486069531250688}
{"asctime": "2021-03-23 09:54:29,597", "filename": "taskinstance.py", "lineno": 1042, "levelname": "INFO", "message": "\n--------------------------------------------------------------------------------", "created": 1616486069.5976064, "dag_id": "my_dag", "task_id": "my_task", "execution_date": "2021_03_23T07_53_28_535261", "try_number": "1", "log_id": "my_dag-my_task-2021-03-23T07:53:28.535261+00:00-1", "offset": 1616486069531250688}
{"asctime": "2021-03-23 09:54:29,597", "filename": "taskinstance.py", "lineno": 1043, "levelname": "INFO", "message": "Starting attempt 1 of 1", "created": 1616486069.5978124, "dag_id": "my_dag", "task_id": "my_task", "execution_date": "2021_03_23T07_53_28_535261", "try_number": "1", "log_id": "my_dag-my_task-2021-03-23T07:53:28.535261+00:00-1", "offset": 1616486069531250688}
{"asctime": "2021-03-23 09:54:29,597", "filename": "taskinstance.py", "lineno": 1044, "levelname": "INFO", "message": "\n--------------------------------------------------------------------------------", "created": 1616486069.5979714, "dag_id": "my_dag", "task_id": "my_task", "execution_date": "2021_03_23T07_53_28_535261", "try_number": "1", "log_id": "my_dag-my_task-2021-03-23T07:53:28.535261+00:00-1", "offset": 1616486069531250688}
{"asctime": "2021-03-23 09:54:29,611", "filename": "taskinstance.py", "lineno": 1063, "levelname": "INFO", "message": "Executing <Task(BashOperator): my_task> on 2021-03-23T07:53:28.535261+00:00", "created": 1616486069.611475, "dag_id": "my_dag", "task_id": "my_task", "execution_date": "2021_03_23T07_53_28_535261", "try_number": "1", "log_id": "my_dag-my_task-2021-03-23T07:53:28.535261+00:00-1", "offset": 1616486069531250688}
{"asctime": "2021-03-23 09:54:29,616", "filename": "standard_task_runner.py", "lineno": 52, "levelname": "INFO", "message": "Started process 2610 to run task", "created": 1616486069.6161618, "dag_id": "my_dag", "task_id": "my_task", "execution_date": "2021_03_23T07_53_28_535261", "try_number": "1", "log_id": "my_dag-my_task-2021-03-23T07:53:28.535261+00:00-1", "offset": 1616486069531250688}
{"asctime": "2021-03-23 09:54:29,622", "filename": "standard_task_runner.py", "lineno": 76, "levelname": "INFO", "message": "Running: ['airflow', 'tasks', 'run', 'my_dag', 'my_task', '2021-03-23T07:53:28.535261+00:00', '--job-id', '169433', '--pool', 'default_pool', '--raw', '--subdir', 'DAGS_FOLDER/dags/my_dag.py', '--cfg-path', '/tmp/tmp09133rp4', '--error-file', '/tmp/tmpyr5c5hkv']", "created": 1616486069.622188, "dag_id": "my_dag", "task_id": "my_task", "execution_date": "2021_03_23T07_53_28_535261", "try_number": "1", "log_id": "my_dag-my_task-2021-03-23T07:53:28.535261+00:00-1", "offset": 1616486069531250688}
{"asctime": "2021-03-23 09:54:29,623", "filename": "standard_task_runner.py", "lineno": 77, "levelname": "INFO", "message": "Job 169433: Subtask my_task", "created": 1616486069.6236992, "dag_id": "my_dag", "task_id": "my_task", "execution_date": "2021_03_23T07_53_28_535261", "try_number": "1", "log_id": "my_dag-my_task-2021-03-23T07:53:28.535261+00:00-1", "offset": 1616486069531250688}
{"asctime": "2021-03-23 09:54:29,769", "filename": "logging_mixin.py", "lineno": 104, "levelname": "INFO", "message": "Running <TaskInstance: my_dag.my_task 2021-03-23T07:53:28.535261+00:00 [running]> on host airflow.host", "created": 1616486069.7698724, "dag_id": "my_dag", "task_id": "my_task", "execution_date": "2021_03_23T07_53_28_535261", "try_number": "1", "log_id": "my_dag-my_task-2021-03-23T07:53:28.535261+00:00-1", "offset": 1616486069723135744}
{"asctime": "2021-03-23 09:54:29,869", "filename": "taskinstance.py", "lineno": 1257, "levelname": "INFO", "message": "Exporting the following env vars:\nAIRFLOW_CTX_DAG_OWNER=Teamplatform\nAIRFLOW_CTX_DAG_ID=my_dag\nAIRFLOW_CTX_TASK_ID=my_task\nAIRFLOW_CTX_EXECUTION_DATE=2021-03-23T07:53:28.535261+00:00\nAIRFLOW_CTX_DAG_RUN_ID=scheduled__2021-03-23T07:53:28.535261+00:00", "created": 1616486069.8690863, "dag_id": "my_dag", "task_id": "my_task", "execution_date": "2021_03_23T07_53_28_535261", "try_number": "1", "log_id": "my_dag-my_task-2021-03-23T07:53:28.535261+00:00-1", "offset": 1616486069723135744}
{"asctime": "2021-03-23 09:54:29,870", "filename": "bash.py", "lineno": 135, "levelname": "INFO", "message": "Tmp dir root location: \n /tmp", "created": 1616486069.8703568, "dag_id": "my_dag", "task_id": "my_task", "execution_date": "2021_03_23T07_53_28_535261", "try_number": "1", "log_id": "my_dag-my_task-2021-03-23T07:53:28.535261+00:00-1", "offset": 1616486069723135744}
{"asctime": "2021-03-23 09:54:29,871", "filename": "bash.py", "lineno": 158, "levelname": "INFO", "message": "Running command: /wrong/path/to/zabbix_sender -z zabbix.host -s airflow.host -k airflow.scheduler.delay -o 61", "created": 1616486069.871164, "dag_id": "my_dag", "task_id": "my_task", "execution_date": "2021_03_23T07_53_28_535261", "try_number": "1", "log_id": "my_dag-my_task-2021-03-23T07:53:28.535261+00:00-1", "offset": 1616486069723135744}
{"asctime": "2021-03-23 09:54:29,880", "filename": "bash.py", "lineno": 169, "levelname": "INFO", "message": "Output:", "created": 1616486069.8801942, "dag_id": "my_dag", "task_id": "my_task", "execution_date": "2021_03_23T07_53_28_535261", "try_number": "1", "log_id": "my_dag-my_task-2021-03-23T07:53:28.535261+00:00-1", "offset": 1616486069723135744}
{"asctime": "2021-03-23 09:54:29,882", "filename": "bash.py", "lineno": 173, "levelname": "INFO", "message": "bash: /wrong/path/to/zabbix_sender: No such file or directory", "created": 1616486069.8820558, "dag_id": "my_dag", "task_id": "my_task", "execution_date": "2021_03_23T07_53_28_535261", "try_number": "1", "log_id": "my_dag-my_task-2021-03-23T07:53:28.535261+00:00-1", "offset": 1616486069723135744}
{"asctime": "2021-03-23 09:54:29,882", "filename": "bash.py", "lineno": 177, "levelname": "INFO", "message": "Command exited with return code 127", "created": 1616486069.882363, "dag_id": "my_dag", "task_id": "my_task", "execution_date": "2021_03_23T07_53_28_535261", "try_number": "1", "log_id": "my_dag-my_task-2021-03-23T07:53:28.535261+00:00-1", "offset": 1616486069723135744}
{"asctime": "2021-03-23 09:54:29,899", "filename": "taskinstance.py", "lineno": 1455, "levelname": "ERROR", "message": "Bash command failed. The command returned a non-zero exit code.", "created": 1616486069.8990078, "dag_id": "my_dag", "task_id": "my_task", "execution_date": "2021_03_23T07_53_28_535261", "try_number": "1", "log_id": "my_dag-my_task-2021-03-23T07:53:28.535261+00:00-1", "offset": 1616486069723135744}
{"asctime": "2021-03-23 09:54:29,901", "filename": "taskinstance.py", "lineno": 1503, "levelname": "INFO", "message": "Marking task as FAILED. dag_id=my_dag, task_id=my_task, execution_date=20210323T075328, start_date=20210323T075429, end_date=20210323T075429", "created": 1616486069.9011128, "dag_id": "my_dag", "task_id": "my_task", "execution_date": "2021_03_23T07_53_28_535261", "try_number": "1", "log_id": "my_dag-my_task-2021-03-23T07:53:28.535261+00:00-1", "offset": 1616486069723135744}
{"asctime": "2021-03-23 09:54:29,961", "filename": "local_task_job.py", "lineno": 146, "levelname": "INFO", "message": "Task exited with return code 1", "created": 1616486069.9613628, "dag_id": "my_dag", "task_id": "my_task", "execution_date": "2021_03_23T07_53_28_535261", "try_number": "1", "log_id": "my_dag-my_task-2021-03-23T07:53:28.535261+00:00-1", "offset": 1616486069531250688}
end_of_log

In that case the bash command output is logged anyway at INFO level, so it is not as big an issue as the case described above. Either way, it would be good to have this information in the JSON logs as well.

Our logging configuration:

[logging]
base_log_folder = <path to log folder>
remote_logging = True
remote_log_conn_id =
google_key_path =
remote_base_log_folder =
encrypt_s3_logs = False
logging_level = INFO
fab_logging_level = WARN
logging_config_class =
colored_console_log = True
colored_log_format = [%(blue)s%(asctime)s%(reset)s] {{%(blue)s%(filename)s:%(reset)s%(lineno)d}} %(log_color)s%(levelname)s%(reset)s - %(log_color)s%(message)s%(reset)s
colored_formatter_class = airflow.utils.log.colored_log.CustomTTYColoredFormatter
log_format = [%(asctime)s] {{%(filename)s:%(lineno)d}} %(levelname)s - %(message)s
simple_log_format = %(asctime)s %(levelname)s - %(message)s
task_log_prefix_template =
log_filename_template = run-{{ ti.dag_id }}.log
log_processor_filename_template = {{ filename }}.log
dag_processor_manager_log_location = <path to log file>
task_log_reader = task
extra_loggers =

[elasticsearch]
host = <elasticsearch connection string>
log_id_template = {{ti.dag_id}}-{{ti.task_id}}-{{execution_date}}-{{try_number}}
end_of_log_mark = end_of_log
frontend =
write_stdout = False
json_format = True
json_fields = asctime, filename, lineno, levelname, message, created

mhaalme added the kind:bug label on Mar 23, 2021
kaxil added this to the Airflow 2.0.3 milestone on Mar 24, 2021
ashb (Member) commented May 7, 2021

Fixed by #15414

ashb closed this as completed on May 7, 2021
jhtimmins changed the milestone from Airflow 2.0.3 to Airflow 2.1 on May 12, 2021