Error traceback missing from task logs when json_format=True #14950

Closed
mhaalme opened this issue Mar 23, 2021 · 1 comment
Labels: area:logging, kind:bug

mhaalme commented Mar 23, 2021

Apache Airflow version: 2.0.1

Environment:

  • OS: Oracle Linux Server 7.9

What happened:

One of our new BashOperator tasks failed, and no useful information about the cause of the failure could be found in the task logs. This seemed odd, so I temporarily turned off JSON formatting, and the non-JSON logs contained much more information. With JSON logging turned on, only the first line of the error was logged and the entire traceback was discarded.

I've replaced the values of dag_id, task_id and the bash_command path in the following logs with similar placeholders; their actual values are not relevant for reproducing the error.

ERROR level logs if json_format=True:

{"asctime": "2021-03-19 15:33:05,620", "filename": "taskinstance.py", "lineno": 1455, "levelname": "ERROR", "message": "sh /path/to/my/script.sh", "created": 1616160785.620597, "dag_id": "my_dag", "task_id": "my_task", "execution_date": "2021_03_19T13_33_04_356460", "try_number": "1", "log_id": "my_dag-my_task-2021-03-19T13:33:04.356460+00:00-1", "offset": 1616160785567723008}

ERROR level logs if json_format=False:

[2021-03-19 15:37:19,468] {{taskinstance.py:1455}} ERROR - sh /path/to/my/script.sh
Traceback (most recent call last):
  File "/app01/venv/airflowvenv/lib64/python3.6/site-packages/airflow/models/taskinstance.py", line 1112, in _run_raw_task
    self._prepare_and_execute_task_with_callbacks(context, task)
  File "/app01/venv/airflowvenv/lib64/python3.6/site-packages/airflow/models/taskinstance.py", line 1249, in _prepare_and_execute_task_with_callbacks
    self.render_templates(context=context)
  File "/app01/venv/airflowvenv/lib64/python3.6/site-packages/airflow/models/taskinstance.py", line 1750, in render_templates
    self.task.render_template_fields(context)
  File "/app01/venv/airflowvenv/lib64/python3.6/site-packages/airflow/models/baseoperator.py", line 859, in render_template_fields
    self._do_render_template_fields(self, self.template_fields, context, jinja_env, set())
  File "/app01/venv/airflowvenv/lib64/python3.6/site-packages/airflow/models/baseoperator.py", line 872, in _do_render_template_fields
    rendered_content = self.render_template(content, context, jinja_env, seen_oids)
  File "/app01/venv/airflowvenv/lib64/python3.6/site-packages/airflow/models/baseoperator.py", line 907, in render_template
    return jinja_env.get_template(content).render(**context)
  File "/app01/venv/airflowvenv/lib64/python3.6/site-packages/jinja2/environment.py", line 883, in get_template
    return self._load_template(name, self.make_globals(globals))
  File "/app01/venv/airflowvenv/lib64/python3.6/site-packages/jinja2/environment.py", line 857, in _load_template
    template = self.loader.load(self, name, globals)
  File "/app01/venv/airflowvenv/lib64/python3.6/site-packages/jinja2/loaders.py", line 115, in load
    source, filename, uptodate = self.get_source(environment, name)
  File "/app01/venv/airflowvenv/lib64/python3.6/site-packages/jinja2/loaders.py", line 197, in get_source
    raise TemplateNotFound(template)
jinja2.exceptions.TemplateNotFound: sh /path/to/my/script.sh

The actual error itself is irrelevant here: I understand it happened because the command was interpreted as a Jinja template, and I have since fixed the command. The problem is that the cause of the error was not logged at all with JSON logging turned on, which made it much more difficult to identify.
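
For reference, the usual BashOperator workaround for the template error itself (separate from the logging issue) is to add a trailing space so that a command ending in .sh is not resolved as a Jinja template path. A minimal sketch with illustrative names:

from airflow.operators.bash import BashOperator

# Sketch only: the trailing space stops Jinja from treating the ".sh" string
# as a template file, avoiding the TemplateNotFound error shown above.
run_script = BashOperator(
    task_id="my_task",
    bash_command="sh /path/to/my/script.sh ",  # note the trailing space
)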

What you expected to happen:

Log messages should contain the same information regardless of log format. The entire traceback should be included in the JSON logs, preferably within a single JSON record with escaped line breaks.
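
For illustration only, and not Airflow's actual formatter: a standard-library sketch of the expected behaviour, where the formatted traceback is folded into the message before serialization, so json.dumps escapes the line breaks and the record stays on a single line.

import json
import logging

class JsonWithTracebackFormatter(logging.Formatter):
    # Sketch: one JSON object per record, with the traceback kept in "message".
    def format(self, record: logging.LogRecord) -> str:
        message = record.getMessage()
        if record.exc_info:
            # formatException() returns the multi-line traceback; json.dumps
            # escapes the newlines, so the record remains a single line.
            message += "\n" + self.formatException(record.exc_info)
        return json.dumps({
            "asctime": self.formatTime(record),
            "filename": record.filename,
            "lineno": record.lineno,
            "levelname": record.levelname,
            "message": message,
            "created": record.created,
        })

handler = logging.StreamHandler()
handler.setFormatter(JsonWithTracebackFormatter())
log = logging.getLogger("demo")
log.addHandler(handler)

try:
    raise RuntimeError("Bash command failed. The command returned a non-zero exit code.")
except RuntimeError:
    log.exception("Task failed")  # the traceback ends up inside the same JSON record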

How to reproduce it:

  1. Set json_format=True in airflow.cfg.
  2. Create a DAG with one BashOperator task. Set bash_command to something that cannot run successfully, e.g. the wrong path to a template (a minimal DAG sketch follows this list).
  3. Run the task and check the task logs.
  4. Repeat with json_format=False and compare.
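
A minimal sketch of such a DAG (dag_id, task_id and the path are placeholders, as above):

from datetime import datetime

from airflow import DAG
from airflow.operators.bash import BashOperator

# Reproduction sketch: the command ends in ".sh", so Jinja tries to load it
# as a template file and raises TemplateNotFound when the task runs.
with DAG(
    dag_id="my_dag",
    start_date=datetime(2021, 3, 1),
    schedule_interval=None,
    catchup=False,
) as dag:
    BashOperator(
        task_id="my_task",
        bash_command="sh /path/to/my/script.sh",
    )

Run the task once with json_format=True and once with json_format=False, then compare the resulting ERROR records.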

Anything else we need to know:

I don't know if this problem is specific to BashOperator or happens with other operators as well.

The same behaviour can be observed if another kind of invalid command is supplied, e.g. a bash_command that executes a file that does not exist. Please see the ERROR level log line in the following examples.

json_format=False

[2021-03-23 09:57:29,977] {{taskinstance.py:851}} INFO - Dependencies all met for <TaskInstance: my_dag.my_task 2021-03-23T07:56:28.535261+00:00 [queued]>
[2021-03-23 09:57:29,997] {{taskinstance.py:851}} INFO - Dependencies all met for <TaskInstance: my_dag.my_task 2021-03-23T07:56:28.535261+00:00 [queued]>
[2021-03-23 09:57:29,997] {{taskinstance.py:1042}} INFO -
--------------------------------------------------------------------------------
[2021-03-23 09:57:29,997] {{taskinstance.py:1043}} INFO - Starting attempt 1 of 1
[2021-03-23 09:57:29,997] {{taskinstance.py:1044}} INFO -
--------------------------------------------------------------------------------
[2021-03-23 09:57:30,012] {{taskinstance.py:1063}} INFO - Executing <Task(BashOperator): my_task> on 2021-03-23T07:56:28.535261+00:00
[2021-03-23 09:57:30,015] {{standard_task_runner.py:52}} INFO - Started process 3461 to run task
[2021-03-23 09:57:30,021] {{standard_task_runner.py:76}} INFO - Running: ['airflow', 'tasks', 'run', 'my_dag', 'my_task', '2021-03-23T07:56:28.535261+00:00', '--job-id', '169438', '--pool', 'default_pool', '--raw', '--subdir', 'DAGS_FOLDER/dags/my_dag.py', '--cfg-path', '/tmp/tmpcgczuh72', '--error-file', '/tmp/tmpj7gc2ajl']
[2021-03-23 09:57:30,023] {{standard_task_runner.py:77}} INFO - Job 169438: Subtask my_task
[2021-03-23 09:57:30,092] {{logging_mixin.py:104}} INFO - Running <TaskInstance: my_dag.my_task 2021-03-23T07:56:28.535261+00:00 [running]> on host airflow.host
[2021-03-23 09:57:30,183] {{taskinstance.py:1257}} INFO - Exporting the following env vars:
AIRFLOW_CTX_DAG_OWNER=Teamplatform
AIRFLOW_CTX_DAG_ID=my_dag
AIRFLOW_CTX_TASK_ID=my_task
AIRFLOW_CTX_EXECUTION_DATE=2021-03-23T07:56:28.535261+00:00
AIRFLOW_CTX_DAG_RUN_ID=scheduled__2021-03-23T07:56:28.535261+00:00
[2021-03-23 09:57:30,185] {{bash.py:135}} INFO - Tmp dir root location:
 /tmp
[2021-03-23 09:57:30,185] {{bash.py:158}} INFO - Running command: /wrong/path/to/zabbix_sender -z zabbix.host -s airflow.host -k airflow.scheduler.delay -o 61
[2021-03-23 09:57:30,192] {{bash.py:169}} INFO - Output:
[2021-03-23 09:57:30,193] {{bash.py:173}} INFO - bash: /wrong/path/to/zabbix_sender: No such file or directory
[2021-03-23 09:57:30,194] {{bash.py:177}} INFO - Command exited with return code 127
[2021-03-23 09:57:30,209] {{taskinstance.py:1455}} ERROR - Bash command failed. The command returned a non-zero exit code.
Traceback (most recent call last):
  File "/app01/venv/airflowvenv/lib64/python3.6/site-packages/airflow/models/taskinstance.py", line 1112, in _run_raw_task
    self._prepare_and_execute_task_with_callbacks(context, task)
  File "/app01/venv/airflowvenv/lib64/python3.6/site-packages/airflow/models/taskinstance.py", line 1285, in _prepare_and_execute_task_with_callbacks
    result = self._execute_task(context, task_copy)
  File "/app01/venv/airflowvenv/lib64/python3.6/site-packages/airflow/models/taskinstance.py", line 1315, in _execute_task
    result = task_copy.execute(context=context)
  File "/app01/venv/airflowvenv/lib64/python3.6/site-packages/airflow/operators/bash.py", line 180, in execute
    raise AirflowException('Bash command failed. The command returned a non-zero exit code.')
airflow.exceptions.AirflowException: Bash command failed. The command returned a non-zero exit code.
[2021-03-23 09:57:30,212] {{taskinstance.py:1503}} INFO - Marking task as FAILED. dag_id=my_dag, task_id=my_task, execution_date=20210323T075628, start_date=20210323T075729, end_date=20210323T075730
[2021-03-23 09:57:30,275] {{local_task_job.py:146}} INFO - Task exited with return code 1
end_of_log

json_format=True

{"asctime": "2021-03-23 09:54:29,572", "filename": "taskinstance.py", "lineno": 851, "levelname": "INFO", "message": "Dependencies all met for <TaskInstance: my_dag.my_task 2021-03-23T07:53:28.535261+00:00 [queued]>", "created": 1616486069.5725589, "dag_id": "my_dag", "task_id": "my_task", "execution_date": "2021_03_23T07_53_28_535261", "try_number": "1", "log_id": "my_dag-my_task-2021-03-23T07:53:28.535261+00:00-1", "offset": 1616486069531250688}
{"asctime": "2021-03-23 09:54:29,597", "filename": "taskinstance.py", "lineno": 851, "levelname": "INFO", "message": "Dependencies all met for <TaskInstance: my_dag.my_task 2021-03-23T07:53:28.535261+00:00 [queued]>", "created": 1616486069.5971286, "dag_id": "my_dag", "task_id": "my_task", "execution_date": "2021_03_23T07_53_28_535261", "try_number": "1", "log_id": "my_dag-my_task-2021-03-23T07:53:28.535261+00:00-1", "offset": 1616486069531250688}
{"asctime": "2021-03-23 09:54:29,597", "filename": "taskinstance.py", "lineno": 1042, "levelname": "INFO", "message": "\n--------------------------------------------------------------------------------", "created": 1616486069.5976064, "dag_id": "my_dag", "task_id": "my_task", "execution_date": "2021_03_23T07_53_28_535261", "try_number": "1", "log_id": "my_dag-my_task-2021-03-23T07:53:28.535261+00:00-1", "offset": 1616486069531250688}
{"asctime": "2021-03-23 09:54:29,597", "filename": "taskinstance.py", "lineno": 1043, "levelname": "INFO", "message": "Starting attempt 1 of 1", "created": 1616486069.5978124, "dag_id": "my_dag", "task_id": "my_task", "execution_date": "2021_03_23T07_53_28_535261", "try_number": "1", "log_id": "my_dag-my_task-2021-03-23T07:53:28.535261+00:00-1", "offset": 1616486069531250688}
{"asctime": "2021-03-23 09:54:29,597", "filename": "taskinstance.py", "lineno": 1044, "levelname": "INFO", "message": "\n--------------------------------------------------------------------------------", "created": 1616486069.5979714, "dag_id": "my_dag", "task_id": "my_task", "execution_date": "2021_03_23T07_53_28_535261", "try_number": "1", "log_id": "my_dag-my_task-2021-03-23T07:53:28.535261+00:00-1", "offset": 1616486069531250688}
{"asctime": "2021-03-23 09:54:29,611", "filename": "taskinstance.py", "lineno": 1063, "levelname": "INFO", "message": "Executing <Task(BashOperator): my_task> on 2021-03-23T07:53:28.535261+00:00", "created": 1616486069.611475, "dag_id": "my_dag", "task_id": "my_task", "execution_date": "2021_03_23T07_53_28_535261", "try_number": "1", "log_id": "my_dag-my_task-2021-03-23T07:53:28.535261+00:00-1", "offset": 1616486069531250688}
{"asctime": "2021-03-23 09:54:29,616", "filename": "standard_task_runner.py", "lineno": 52, "levelname": "INFO", "message": "Started process 2610 to run task", "created": 1616486069.6161618, "dag_id": "my_dag", "task_id": "my_task", "execution_date": "2021_03_23T07_53_28_535261", "try_number": "1", "log_id": "my_dag-my_task-2021-03-23T07:53:28.535261+00:00-1", "offset": 1616486069531250688}
{"asctime": "2021-03-23 09:54:29,622", "filename": "standard_task_runner.py", "lineno": 76, "levelname": "INFO", "message": "Running: ['airflow', 'tasks', 'run', 'my_dag', 'my_task', '2021-03-23T07:53:28.535261+00:00', '--job-id', '169433', '--pool', 'default_pool', '--raw', '--subdir', 'DAGS_FOLDER/dags/my_dag.py', '--cfg-path', '/tmp/tmp09133rp4', '--error-file', '/tmp/tmpyr5c5hkv']", "created": 1616486069.622188, "dag_id": "my_dag", "task_id": "my_task", "execution_date": "2021_03_23T07_53_28_535261", "try_number": "1", "log_id": "my_dag-my_task-2021-03-23T07:53:28.535261+00:00-1", "offset": 1616486069531250688}
{"asctime": "2021-03-23 09:54:29,623", "filename": "standard_task_runner.py", "lineno": 77, "levelname": "INFO", "message": "Job 169433: Subtask my_task", "created": 1616486069.6236992, "dag_id": "my_dag", "task_id": "my_task", "execution_date": "2021_03_23T07_53_28_535261", "try_number": "1", "log_id": "my_dag-my_task-2021-03-23T07:53:28.535261+00:00-1", "offset": 1616486069531250688}
{"asctime": "2021-03-23 09:54:29,769", "filename": "logging_mixin.py", "lineno": 104, "levelname": "INFO", "message": "Running <TaskInstance: my_dag.my_task 2021-03-23T07:53:28.535261+00:00 [running]> on host airflow.host", "created": 1616486069.7698724, "dag_id": "my_dag", "task_id": "my_task", "execution_date": "2021_03_23T07_53_28_535261", "try_number": "1", "log_id": "my_dag-my_task-2021-03-23T07:53:28.535261+00:00-1", "offset": 1616486069723135744}
{"asctime": "2021-03-23 09:54:29,869", "filename": "taskinstance.py", "lineno": 1257, "levelname": "INFO", "message": "Exporting the following env vars:\nAIRFLOW_CTX_DAG_OWNER=Teamplatform\nAIRFLOW_CTX_DAG_ID=my_dag\nAIRFLOW_CTX_TASK_ID=my_task\nAIRFLOW_CTX_EXECUTION_DATE=2021-03-23T07:53:28.535261+00:00\nAIRFLOW_CTX_DAG_RUN_ID=scheduled__2021-03-23T07:53:28.535261+00:00", "created": 1616486069.8690863, "dag_id": "my_dag", "task_id": "my_task", "execution_date": "2021_03_23T07_53_28_535261", "try_number": "1", "log_id": "my_dag-my_task-2021-03-23T07:53:28.535261+00:00-1", "offset": 1616486069723135744}
{"asctime": "2021-03-23 09:54:29,870", "filename": "bash.py", "lineno": 135, "levelname": "INFO", "message": "Tmp dir root location: \n /tmp", "created": 1616486069.8703568, "dag_id": "my_dag", "task_id": "my_task", "execution_date": "2021_03_23T07_53_28_535261", "try_number": "1", "log_id": "my_dag-my_task-2021-03-23T07:53:28.535261+00:00-1", "offset": 1616486069723135744}
{"asctime": "2021-03-23 09:54:29,871", "filename": "bash.py", "lineno": 158, "levelname": "INFO", "message": "Running command: /wrong/path/to/zabbix_sender -z zabbix.host -s airflow.host -k airflow.scheduler.delay -o 61", "created": 1616486069.871164, "dag_id": "my_dag", "task_id": "my_task", "execution_date": "2021_03_23T07_53_28_535261", "try_number": "1", "log_id": "my_dag-my_task-2021-03-23T07:53:28.535261+00:00-1", "offset": 1616486069723135744}
{"asctime": "2021-03-23 09:54:29,880", "filename": "bash.py", "lineno": 169, "levelname": "INFO", "message": "Output:", "created": 1616486069.8801942, "dag_id": "my_dag", "task_id": "my_task", "execution_date": "2021_03_23T07_53_28_535261", "try_number": "1", "log_id": "my_dag-my_task-2021-03-23T07:53:28.535261+00:00-1", "offset": 1616486069723135744}
{"asctime": "2021-03-23 09:54:29,882", "filename": "bash.py", "lineno": 173, "levelname": "INFO", "message": "bash: /wrong/path/to/zabbix_sender: No such file or directory", "created": 1616486069.8820558, "dag_id": "my_dag", "task_id": "my_task", "execution_date": "2021_03_23T07_53_28_535261", "try_number": "1", "log_id": "my_dag-my_task-2021-03-23T07:53:28.535261+00:00-1", "offset": 1616486069723135744}
{"asctime": "2021-03-23 09:54:29,882", "filename": "bash.py", "lineno": 177, "levelname": "INFO", "message": "Command exited with return code 127", "created": 1616486069.882363, "dag_id": "my_dag", "task_id": "my_task", "execution_date": "2021_03_23T07_53_28_535261", "try_number": "1", "log_id": "my_dag-my_task-2021-03-23T07:53:28.535261+00:00-1", "offset": 1616486069723135744}
{"asctime": "2021-03-23 09:54:29,899", "filename": "taskinstance.py", "lineno": 1455, "levelname": "ERROR", "message": "Bash command failed. The command returned a non-zero exit code.", "created": 1616486069.8990078, "dag_id": "my_dag", "task_id": "my_task", "execution_date": "2021_03_23T07_53_28_535261", "try_number": "1", "log_id": "my_dag-my_task-2021-03-23T07:53:28.535261+00:00-1", "offset": 1616486069723135744}
{"asctime": "2021-03-23 09:54:29,901", "filename": "taskinstance.py", "lineno": 1503, "levelname": "INFO", "message": "Marking task as FAILED. dag_id=my_dag, task_id=my_task, execution_date=20210323T075328, start_date=20210323T075429, end_date=20210323T075429", "created": 1616486069.9011128, "dag_id": "my_dag", "task_id": "my_task", "execution_date": "2021_03_23T07_53_28_535261", "try_number": "1", "log_id": "my_dag-my_task-2021-03-23T07:53:28.535261+00:00-1", "offset": 1616486069723135744}
{"asctime": "2021-03-23 09:54:29,961", "filename": "local_task_job.py", "lineno": 146, "levelname": "INFO", "message": "Task exited with return code 1", "created": 1616486069.9613628, "dag_id": "my_dag", "task_id": "my_task", "execution_date": "2021_03_23T07_53_28_535261", "try_number": "1", "log_id": "my_dag-my_task-2021-03-23T07:53:28.535261+00:00-1", "offset": 1616486069531250688}
end_of_log

In that case the bash command output is logged anyway at INFO level, so it is not as big an issue as the case described above. Either way, it would be good to have this information in the JSON logs as well.

Our logging configuration:

[logging]
base_log_folder = <path to log folder>
remote_logging = True
remote_log_conn_id =
google_key_path =
remote_base_log_folder =
encrypt_s3_logs = False
logging_level = INFO
fab_logging_level = WARN
logging_config_class =
colored_console_log = True
colored_log_format = [%(blue)s%(asctime)s%(reset)s] {{%(blue)s%(filename)s:%(reset)s%(lineno)d}} %(log_color)s%(levelname)s%(reset)s - %(log_color)s%(message)s%(reset)s
colored_formatter_class = airflow.utils.log.colored_log.CustomTTYColoredFormatter
log_format = [%(asctime)s] {{%(filename)s:%(lineno)d}} %(levelname)s - %(message)s
simple_log_format = %(asctime)s %(levelname)s - %(message)s
task_log_prefix_template =
log_filename_template = run-{{ ti.dag_id }}.log
log_processor_filename_template = {{ filename }}.log
dag_processor_manager_log_location = <path to log file>
task_log_reader = task
extra_loggers =

[elasticsearch]
host = <elasticsearch connection string>
log_id_template = {{ti.dag_id}}-{{ti.task_id}}-{{execution_date}}-{{try_number}}
end_of_log_mark = end_of_log
frontend =
write_stdout = False
json_format = True
json_fields = asctime, filename, lineno, levelname, message, created

mhaalme added the kind:bug label on Mar 23, 2021
kaxil added this to the Airflow 2.0.3 milestone on Mar 24, 2021
ashb (Member) commented May 7, 2021

Fixed by #15414

ashb closed this as completed on May 7, 2021
jhtimmins changed the milestone from Airflow 2.0.3 to Airflow 2.1 on May 12, 2021