remove retry for now #16150

Merged 1 commit on May 29, 2021

@zachliu zachliu commented May 28, 2021

We started to use @AwsBaseHook.retry(should_retry) on _start_task of the ECSOperator in Airflow 2.1.0 (or providers-amazon >= 1.3.0).

@AwsBaseHook.retry(should_retry)

According to AWS support, this is a little tricky:

  • response['failures'] is not always reliable: an ECS task may still be provisioned successfully even when the response contains "failures". The only way to be sure is to call DescribeTasks and check whether the task was provisioned correctly.
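A minimal sketch of the DescribeTasks check, assuming a boto3-style ECS client is passed in and that the task ARNs come from the `tasks` list of the RunTask response. The helper name `confirm_provisioned` is hypothetical and is not part of the ECSOperator.

```python
from typing import Any, Dict, List


def confirm_provisioned(ecs_client: Any, cluster: str, task_arns: List[str]) -> Dict[str, str]:
    """Return {taskArn: lastStatus} for every task that ECS actually reports.

    `ecs_client` is assumed to behave like boto3.client("ecs").
    """
    if not task_arns:
        # Nothing to look up; avoid calling the API with an empty list.
        return {}
    response = ecs_client.describe_tasks(cluster=cluster, tasks=task_arns)
    # Tasks that ECS does not know about are simply absent from "tasks".
    return {task["taskArn"]: task["lastStatus"] for task in response.get("tasks", [])}
```

With a real client this could be called as `confirm_provisioned(boto3.client("ecs"), "my-cluster", arns)`, where the cluster name and `arns` are placeholders. An ARN missing from the result would suggest the task was not provisioned.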

With @AwsBaseHook.retry(should_retry) on _start_task(), I started to see Airflow send multiple RunTask requests to AWS for the same task. I ended up with multiple instances running and interfering with each other, since they require the same back-end resources. This also creates many "ghost" ECS tasks that Airflow does not track and that are quite difficult to debug 😿
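The failure mode can be illustrated with a small simulation. This is only a sketch under stated assumptions: `FakeEcs` and `run_with_retry` are hypothetical stand-ins, not the real ECS client or `AwsBaseHook.retry`, and the fake deliberately reports a failure even though the task was launched.

```python
from typing import Callable, Dict, List


class FakeEcs:
    """Hypothetical ECS stand-in: every run_task call launches a task, but the
    first two responses also carry a misleading "failures" entry."""

    def __init__(self) -> None:
        self.launched: List[str] = []

    def run_task(self) -> Dict[str, list]:
        arn = f"task-{len(self.launched) + 1}"
        self.launched.append(arn)
        failures = [{"reason": "AGENT"}] if len(self.launched) < 3 else []
        return {"tasks": [{"taskArn": arn}], "failures": failures}


def run_with_retry(call: Callable[[], Dict[str, list]], max_attempts: int = 3) -> Dict[str, list]:
    """Naive retry: repeat the call whenever the response lists failures."""
    response = call()
    for _ in range(max_attempts - 1):
        if not response["failures"]:
            break
        response = call()
    return response


ecs = FakeEcs()
final = run_with_retry(ecs.run_task)
print("launched:", ecs.launched)
print("tracked:", final["tasks"][0]["taskArn"])
```

In this simulation three tasks are launched, but the caller only keeps the response for the last one; the first two are the "ghost" tasks described above.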

I don't know whether the retry was added as a way of dealing with issue #15000, but I'd rather go back to the old way for now.

Happy to discuss further on this.


@boring-cyborg boring-cyborg bot added area:providers provider:amazon-aws AWS/Amazon - related issues labels May 28, 2021
@github-actions

The PR is likely OK to be merged with just subset of tests for default Python and Database versions without running the full matrix of tests, because it does not modify the core of Airflow. If the committers decide that the full tests matrix is needed, they will add the label 'full tests needed'. Then you should rebase to the latest master or amend the last commit of the PR, and push it with --force-with-lease.

@github-actions github-actions bot added the okay to merge It's ok to merge this PR as it does not require more tests label May 28, 2021
@potiuk potiuk merged commit 8d16638 into apache:master May 29, 2021
@zachliu zachliu deleted the remove-retry-on-ecsoperator branch March 4, 2022 19:13