Airflow stable API taskInstance call fails if a task is removed from running DAG #14331

Closed
soltanianalytics opened this issue Feb 20, 2021 · 2 comments · Fixed by #14381
Labels
affected_version:2.0 Issues Reported for 2.0 area:API Airflow's REST/HTTP API kind:bug This is a clearly a bug

@soltanianalytics
Contributor

Apache Airflow version: 2.0.1

Environment: Docker on Win 10 with WSL, image based on apache/airflow:2.0.1-python3.8

What happened:

I'm using the Airflow REST API and ran into what I believe to be a bug:

>>> import requests
>>> from requests.auth import HTTPBasicAuth
>>> r = requests.get("http://localhost:8084/api/v1/dags/~/dagRuns/~/taskInstances", auth=HTTPBasicAuth('username', 'password'))
>>> r.status_code
500
>>> print(r.text)
{
  "detail": "'removed' is not one of ['success', 'running', 'failed', 'upstream_failed', 'skipped', 'up_for_retry', 'up_for_reschedule', 'queued', 'none', 'scheduled']\n\nFailed validating 'enum' in schema['allOf'][0]['properties']['task_instances']['items']['properties']['state']:\n    {'description': 'Task state.',\n     'enum': ['success',\n              'running',\n              'failed',\n              'upstream_failed',\n              'skipped',\n              'up_for_retry',\n              'up_for_reschedule',\n              'queued',\n              'none',\n              'scheduled'],\n     'nullable': True,\n     'type': 'string',\n     'x-scope': ['',\n                 '#/components/schemas/TaskInstanceCollection',\n                 '#/components/schemas/TaskInstance']}\n\nOn instance['task_instances'][16]['state']:\n    'removed'",
  "status": 500,
  "title": "Response body does not conform to specification",
  "type": "https://airflow.apache.org/docs/2.0.1rc2/stable-rest-api-ref.html#section/Errors/Unknown"
}
>>> print(r.json()["detail"])
'removed' is not one of ['success', 'running', 'failed', 'upstream_failed', 'skipped', 'up_for_retry', 'up_for_reschedule', 'queued', 'none', 'scheduled']

Failed validating 'enum' in schema['allOf'][0]['properties']['task_instances']['items']['properties']['state']:
    {'description': 'Task state.',
     'enum': ['success',
              'running',
              'failed',
              'upstream_failed',
              'skipped',
              'up_for_retry',
              'up_for_reschedule',
              'queued',
              'none',
              'scheduled'],
     'nullable': True,
     'type': 'string',
     'x-scope': ['',
                 '#/components/schemas/TaskInstanceCollection',
                 '#/components/schemas/TaskInstance']}

On instance['task_instances'][16]['state']:
    'removed'

This happened after I changed a DAG in the corresponding instance, so a task was removed from the DAG while the DAG was still running.
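The failure can be reproduced in isolation, without a running Airflow. The following is a minimal sketch in plain Python, not Airflow's or the validation library's actual code: `ALLOWED_STATES` is copied from the enum in the error message above, `validate_states` is a hypothetical stand-in for the response validation step, and the task ids are made up.

```python
# Enum copied from the error message above; note that 'removed' is absent.
ALLOWED_STATES = [
    "success", "running", "failed", "upstream_failed", "skipped",
    "up_for_retry", "up_for_reschedule", "queued", "none", "scheduled",
]


def validate_states(task_instances):
    """Hypothetical stand-in for the response validation step.

    Returns (index, state) pairs for every state outside the enum.
    None is accepted because the schema marks the field as nullable.
    """
    return [
        (i, ti["state"])
        for i, ti in enumerate(task_instances)
        if ti["state"] is not None and ti["state"] not in ALLOWED_STATES
    ]


# Made-up response body: one task was deleted from the DAG mid-run.
response_body = [
    {"task_id": "extract", "state": "success"},
    {"task_id": "load", "state": None},
    {"task_id": "dropped_task", "state": "removed"},
]

violations = validate_states(response_body)
print(violations)  # [(2, 'removed')]
```

A single offending task instance is enough to reject the whole response, which matches the 500 seen above even though the other task instances are valid.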

What you expected to happen:

Return all task instances. Whether to include the removed ones is up to the Airflow team to decide (no preference from my side, though I'd guess it makes more sense to supply all the data that is available).

How to reproduce it:

  • Run airflow
  • Create a DAG with multiple tasks
  • While the DAG is running, remove one of the tasks (ideally one that did not yet run)
  • Make the API call as above
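To confirm that a 500 from the last step is this bug and not some other server error, the error payload can be inspected. This is only a sketch: `find_rejected_state` is a hypothetical helper, and the regular expressions assume the `detail` text keeps the wording shown in the report above.

```python
import re

# Title string taken from the error payload in the report.
SCHEMA_ERROR_TITLE = "Response body does not conform to specification"


def find_rejected_state(problem):
    """Return (index, state) for the task instance that failed enum
    validation, or None if the payload is some other kind of error."""
    if problem.get("status") != 500 or problem.get("title") != SCHEMA_ERROR_TITLE:
        return None
    detail = problem.get("detail", "")
    state = re.match(r"'([^']*)' is not one of", detail)
    index = re.search(r"On instance\['task_instances'\]\[(\d+)\]\['state'\]", detail)
    if state is None or index is None:
        return None
    return int(index.group(1)), state.group(1)


# Abbreviated copy of the payload from the report (enum list shortened).
problem = {
    "detail": "'removed' is not one of ['success', 'running']\n\n"
              "On instance['task_instances'][16]['state']:\n    'removed'",
    "status": 500,
    "title": SCHEMA_ERROR_TITLE,
}

print(find_rejected_state(problem))  # (16, 'removed')
```

With the `requests` session from the report, the same check would be `find_rejected_state(r.json())`.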
@soltanianalytics soltanianalytics added the kind:bug This is a clearly a bug label Feb 20, 2021
@kaxil kaxil added this to the Airflow 2.0.2 milestone Feb 20, 2021
@ephraimbuddy
Contributor

From here:

task_states = (
    SUCCESS,
    RUNNING,
    FAILED,
    UPSTREAM_FAILED,
    SKIPPED,
    UP_FOR_RETRY,
    UP_FOR_RESCHEDULE,
    QUEUED,
    NONE,
    SCHEDULED,
    SENSING,
)

The task states do not include `removed`; I think that's why it was omitted here:
- success
- running
- failed
- upstream_failed
- skipped
- up_for_retry
- up_for_reschedule
- queued
- none
- scheduled

Should we change this @kaxil?
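A quick way to see where the two lists quoted above disagree is to diff them as sets. This is a sketch under one loud assumption: the thread shows only the constant names, so the lowercase strings used for `task_states` here are a simplification for comparison, not the constants' confirmed values.

```python
# Constant names from the tuple quoted above, written as lowercase strings.
# ASSUMPTION: the real constant values are not shown in the thread.
task_states = (
    "success", "running", "failed", "upstream_failed", "skipped",
    "up_for_retry", "up_for_reschedule", "queued", "none", "scheduled",
    "sensing",
)

# Enum values from the list quoted above.
spec_enum = [
    "success", "running", "failed", "upstream_failed", "skipped",
    "up_for_retry", "up_for_reschedule", "queued", "none", "scheduled",
]

reported_state = "removed"

only_in_tuple = sorted(set(task_states) - set(spec_enum))
only_in_spec = sorted(set(spec_enum) - set(task_states))

print(only_in_tuple)  # ['sensing']
print(only_in_spec)   # []
print(reported_state in task_states, reported_state in spec_enum)  # False False
```

Under that assumption, `removed` is missing from both lists, and `sensing` appears in the tuple but not in the enum, so it may be worth checking whether it can hit the same validation error.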

@kaxil
Member

kaxil commented Feb 23, 2021

We just need to check that it (`task_states`) isn't used anywhere else, so that changing it doesn't cause a domino effect.
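One way to run that check is a whole-word search over the Python sources. The sketch below uses only the standard library; `find_usages` is a hypothetical helper, and the demo files are throwaway stand-ins, not real Airflow modules.

```python
import re
import tempfile
from pathlib import Path


def find_usages(root, name="task_states"):
    """Return (relative path, line number, stripped line) for every line
    in a .py file under root that mentions name as a whole word."""
    pattern = re.compile(rf"\b{re.escape(name)}\b")
    hits = []
    for path in sorted(Path(root).rglob("*.py")):
        text = path.read_text(encoding="utf-8", errors="replace")
        for lineno, line in enumerate(text.splitlines(), start=1):
            if pattern.search(line):
                hits.append((path.relative_to(root).as_posix(), lineno, line.strip()))
    return hits


# Demo on a throwaway tree; the file names and contents are made up.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    (root / "state.py").write_text("task_states = ('success',)\n")
    (root / "other.py").write_text("x = 1\nfor s in State.task_states:\n    pass\n")
    (root / "unrelated.py").write_text("my_task_states_copy = []\n")
    demo_hits = find_usages(root)

for hit in demo_hits:
    print(hit)
```

Run against a checkout, `find_usages("path/to/checkout")` lists each reference to review before the tuple is changed. The word-boundary match skips names that merely contain `task_states`, such as `my_task_states_copy` in the demo.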

@vikramkoka vikramkoka added area:API Airflow's REST/HTTP API affected_version:2.0 Issues Reported for 2.0 labels Feb 23, 2021
ashb pushed a commit that referenced this issue Mar 19, 2021
kaxil pushed a commit to astronomer/airflow that referenced this issue Apr 1, 2021
…ning DAG (apache#14381)

Closes: apache#14331

(cherry picked from commit 7418679)
(cherry picked from commit 0cb2a96)