This issue was moved to a discussion.

Airflow scheduler with Kubernetes executor trying to adopt pod from other deployment #20203

Closed
2 tasks done
bparhy opened this issue Dec 10, 2021 · 7 comments
Labels
area:core kind:bug This is a clearly a bug

Comments

@bparhy

bparhy commented Dec 10, 2021

Apache Airflow version

2.1.3

What happened

We are using Airflow 2.1.3 with the Kubernetes executor.
We have other deployments in the same k8s namespace running Airflow 1.10.10, 1.10.12, etc.
With Airflow 2.1.3, we see an error in the scheduler logs that suggests the scheduler is trying to adopt pods scheduled by the other Airflow deployments. The error is below.

Failed to adopt pod <pod_id> Reason: (422)

Reason: Unprocessable Entity
HTTP response headers:

This mainly happens for pods/tasks that have already completed.
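
To illustrate the behavior described above, here is a minimal sketch of how a namespace-wide search for completed pods can pick up pods from a neighboring deployment when the selector carries nothing deployment-specific. It does not reproduce Airflow's actual adoption code; the `Pod` class and the `executor-pod` and `deployment` labels are hypothetical names chosen for the example.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Pod:
    """Minimal stand-in for a Kubernetes pod: name, phase, and labels."""
    name: str
    phase: str
    labels: Dict[str, str] = field(default_factory=dict)


def matches(pod: Pod, selector: Dict[str, str]) -> bool:
    """Equality-based label matching: every selector pair must be on the pod."""
    return all(pod.labels.get(key) == value for key, value in selector.items())


def adoption_candidates(pods: List[Pod], selector: Dict[str, str]) -> List[str]:
    """Return names of completed pods that a selector-only search would pick up."""
    return [p.name for p in pods if p.phase == "Succeeded" and matches(p, selector)]


# Two deployments sharing one namespace. Both label names are hypothetical.
namespace_pods = [
    Pod("new-task-1", "Succeeded", {"executor-pod": "true", "deployment": "airflow-2"}),
    Pod("old-task-1", "Succeeded", {"executor-pod": "true", "deployment": "airflow-1"}),
    Pod("old-task-2", "Running", {"executor-pod": "true", "deployment": "airflow-1"}),
]

# A selector shared by every deployment also matches the other deployment's pods.
broad = adoption_candidates(namespace_pods, {"executor-pod": "true"})
# Adding a deployment-specific label narrows the search to this deployment only.
scoped = adoption_candidates(
    namespace_pods, {"executor-pod": "true", "deployment": "airflow-2"}
)
print(broad)
print(scoped)
```

In this sketch the broad selector returns a completed pod owned by the other deployment, which matches the symptom reported here, while the scoped selector does not.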

What you expected to happen

Airflow scheduler should not attempt to adopt pods from other deployments.

How to reproduce

Create two Airflow deployments in the same namespace, one running 1.10.10 and the other 2.1.3.

Set up a DAG in the 1.10.10 deployment that runs every 5 minutes, then check the scheduler logs of the 2.1.3 deployment.

You should see the 2.1.3 scheduler trying to adopt pods and failing with the error below.

Failed to adopt pod <pod_id> Reason: (422)

Reason: Unprocessable Entity
HTTP response headers:

Operating System

linux

Versions of Apache Airflow Providers

Not dependent on provider version

Deployment

Other 3rd-party Helm chart

Deployment details

We are deploying Airflow 2.1.3 on a k8s instance with 2 schedulers and 1 web server.
We deploy with Helm, and the metadata DB is PostgreSQL on AWS Aurora.

Anything else

No response

Are you willing to submit PR?

  • Yes I am willing to submit a PR!

Code of Conduct

@bparhy bparhy added area:core kind:bug This is a clearly a bug labels Dec 10, 2021
@potiuk
Member

potiuk commented Dec 10, 2021

If I am not mistaken, you should make sure each Airflow deployment is in a different namespace. @jedcunningham @kaxil, can you please confirm?

@bparhy
Author

bparhy commented Dec 10, 2021

Is this a new requirement? All our deployments prior to 2.1.3 have been running without this issue. Also, I see the fix in #14795; is that related?
Thanks for responding so quickly.

@potiuk
Member

potiuk commented Dec 10, 2021

Is this a new requirement? All our deployments prior to 2.1.3 have been running without this issue. Also, I see the fix in #14795; is that related? Thanks for responding so quickly.

I am not sure. It just seems logical to me to keep each Airflow deployment in a separate namespace (that's what I'd do in general, at least). But yeah, I might be wrong and Airflow should work this way (though it could also be accidental that it worked).

@potiuk
Member

potiuk commented Dec 10, 2021

That's why I am asking those who likely know better :)

@jedcunningham
Member

Right now, the most reliable way is to run in separate namespaces.

Both running multiple instances in a single namespace and using the multi_namespace_mode option have various edge cases that just aren't handled currently (though, it is on my radar!). This hasn't changed in recent versions and it's likely you just got lucky or didn't notice issues previously.
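
As a rough sketch of the separate-namespace setup, the snippet below checks that no two deployments share a namespace and prints the environment variable each one would set. It assumes the `[kubernetes]` section and `namespace` option as named in the 2.1.x configuration (other releases may name the section differently) and relies on Airflow's `AIRFLOW__{SECTION}__{KEY}` environment variable convention. The deployment and namespace names are hypothetical.

```python
from collections import Counter
from typing import Dict, List


def env_var(section: str, key: str) -> str:
    """Build the environment variable name Airflow reads for a config option."""
    return f"AIRFLOW__{section.upper()}__{key.upper()}"


def shared_namespaces(deployments: Dict[str, str]) -> List[str]:
    """Return the namespaces assigned to more than one deployment."""
    counts = Counter(deployments.values())
    return sorted(ns for ns, count in counts.items() if count > 1)


# Hypothetical deployment names mapped to hypothetical namespaces.
deployments = {"airflow-legacy": "airflow-legacy", "airflow-v2": "airflow-v2"}

# Collect any namespace that more than one deployment would run worker pods in.
conflicts = shared_namespaces(deployments)
if conflicts:
    raise SystemExit(f"Namespaces shared by several deployments: {conflicts}")

for name, namespace in deployments.items():
    # Each deployment would export this variable in its own scheduler environment.
    print(name, {env_var("kubernetes", "namespace"): namespace})
```

A check like this could run in a deployment pipeline before the Helm release is applied, so that a shared namespace is caught before schedulers start.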

@potiuk
Member

potiuk commented Dec 11, 2021

Yeah. My thought exactly.

@potiuk
Member

potiuk commented Dec 11, 2021

Moving that into a discussion then. I think that if we want to support multiple Airflow deployments in one namespace, that should be a separate feature. So @bparhy, if you really think this is needed and you cannot rearrange your Airflow deployments into multiple instances, opening a feature request might be a good thing.

@apache apache locked and limited conversation to collaborators Dec 11, 2021
@potiuk potiuk converted this issue into discussion #20219 Dec 11, 2021
