AWS Batch Executor #37618

Merged: 19 commits, Mar 5, 2024

syedahsn
Contributor

Overview

This PR introduces the AWS Batch Executor. This Executor can be configured to run Airflow tasks using AWS Batch. It is based on an initial contribution from @aelzeiny.

From the README:

This is an Airflow executor powered by AWS Batch. Each task scheduled by Airflow is run inside a
separate container, scheduled by Batch. Some benefits of an executor like this include:

1. Scalability and Lower Costs: AWS Batch dynamically provisions the resources needed to execute tasks.
   Depending on the resources allocated, it can scale up or down with the workload,
   which keeps resource utilization efficient and reduces costs.
2. Job Queues and Priority: AWS Batch provides job queues, which can be used to
   prioritize and manage the execution of tasks. When multiple
   tasks are scheduled simultaneously, they are executed in the desired order of priority.
3. Flexibility: AWS Batch supports Fargate (ECS), EC2, and EKS compute environments. This range of
   compute environments, together with fine-grained control over the resources allocated to
   them, gives users a lot of flexibility in choosing the most suitable
   execution environment for their workloads.
4. Rapid Task Execution: By maintaining an active worker within AWS Batch, tasks submitted to
   the service can start quickly. With a ready-to-go worker, there is minimal startup delay,
   which is particularly useful for time-sensitive workloads or applications requiring
   near-real-time processing.
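
To make the "one task, one Batch job" idea concrete, here is a minimal sketch of how a single Airflow task command could be turned into a request for boto3's Batch `submit_job` call. It is an illustration only, not the executor's actual implementation: the helper `build_submit_job_kwargs`, the job name, queue name, job definition name, and the example DAG/task identifiers are all assumptions made up for this example.

```python
from __future__ import annotations

from typing import Any


def build_submit_job_kwargs(
    command: list[str],
    job_name: str,
    job_queue: str,
    job_definition: str,
) -> dict[str, Any]:
    """Build keyword arguments for boto3's Batch ``submit_job`` call.

    Hypothetical helper for illustration; it does not contact AWS.
    """
    return {
        "jobName": job_name,
        "jobQueue": job_queue,
        "jobDefinition": job_definition,
        # Override the container command so this job runs exactly one Airflow task.
        "containerOverrides": {"command": list(command)},
    }


# All names below are hypothetical placeholders.
kwargs = build_submit_job_kwargs(
    command=["airflow", "tasks", "run", "example_dag", "example_task", "manual__2024-03-05"],
    job_name="airflow-task",
    job_queue="airflow-job-queue",
    job_definition="airflow-job-definition",
)

# With AWS credentials and a real queue/definition in place, the request
# could then be sent with:
#   import boto3
#   boto3.client("batch").submit_job(**kwargs)
```

Keeping request construction separate from the network call makes the mapping from task to job easy to inspect and unit test without AWS access.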

This PR comes with most features now included in the ECS Executor, but we will continue to update both executors as we work to improve them.

Review Notes

Similar to the ECS Executor, this PR comes as a fully functional feature, which is great for testing, but it does mean that the PR is large. When reviewing, it isn't necessary to go through every single line. Instead, it would be more beneficial to read through the documentation and glance at how the Executor is implemented.

There are a lot of similarities in the code between the Batch Executor and the ECS Executor - this is by design. As we continue to refine the process of writing custom Executors, we are converging on an optimal framework to write efficient, reliable, and fault-tolerant custom Executors.

Testing

There is extensive unit testing, with near 100% line coverage in most cases:
[Image: BatchExecutorCodeCoverage]



@boring-cyborg bot added the area:production-image, area:providers, kind:documentation, and provider:amazon-aws labels on Feb 22, 2024
Contributor

@Taragolis Taragolis left a comment


Minor things that I've found

@syedahsn
Contributor Author

@Lee-W @eladkal I'd like to hear your thoughts on this feature. If you have some time to take a look, I'd really appreciate it!

Contributor

@o-nikolas o-nikolas left a comment


I'm halfway through, but need to go for lunch, so I'm "committing" the first batch (no pun intended) of comments :)

Looking great so far!

@syedahsn syedahsn force-pushed the syedahsn/aws_executors_batch branch from 22e5407 to a2e5d0b Compare March 1, 2024 06:16
@syedahsn syedahsn requested a review from Taragolis March 1, 2024 18:16
Contributor

@Taragolis Taragolis left a comment


Looks good to me, taking into account that the executor has experimental status.

Contributor

@o-nikolas o-nikolas left a comment


Some minor comments on old threads.

Can you add one last change to the docs to list this as an experimental executor (as we currently do for ECS)?

Remove assert statement
@syedahsn
Contributor Author

syedahsn commented Mar 4, 2024

Sorry I missed those! Added them now, thanks!

@o-nikolas o-nikolas merged commit c52ec9d into apache:main Mar 5, 2024
59 checks passed
@o-nikolas o-nikolas deleted the syedahsn/aws_executors_batch branch March 5, 2024 23:01
potiuk added a commit to potiuk/airflow that referenced this pull request Mar 6, 2024
utkarsharma2 pushed a commit to astronomer/airflow that referenced this pull request Apr 22, 2024
* Introduce AWS Batch Executor to the Amazon provider package. The BatchExecutor launches Airflow tasks on AWS ECS/Fargate/EC2 services.

* Added option to allow users to pass in config options.

* prevent any failures in the executor from killing the scheduler

* Add documentation for Batch Executor

* Health check and Token Expiration

* Retries on task failures

* Apply recursive updates to exec_config parameter

* Apply waits between each task retry attempt

---------

Co-authored-by: Ahmed Elzeiny <[email protected]>
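
The commit message above mentions applying recursive updates to the `exec_config` parameter. As a rough sketch of what a recursive (deep) update generally means, the hypothetical helper below merges nested dictionaries so that an override only replaces the keys it names, leaving sibling keys intact. The function name `recursive_update` and the sample values are assumptions for illustration and are not taken from the PR's code.

```python
from __future__ import annotations

import copy
from typing import Any


def recursive_update(base: dict[str, Any], overrides: dict[str, Any]) -> dict[str, Any]:
    """Return a new dict with ``overrides`` merged into ``base``, level by level.

    Nested dicts are merged key by key; any other value (including a list)
    replaces the value in ``base`` outright. Neither input is mutated.
    """
    merged = copy.deepcopy(base)
    for key, value in overrides.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = recursive_update(merged[key], value)
        else:
            merged[key] = copy.deepcopy(value)
    return merged


# Hypothetical defaults and a per-task override.
defaults = {
    "jobQueue": "default-queue",
    "containerOverrides": {
        "command": ["airflow", "version"],
        "environment": [{"name": "STAGE", "value": "dev"}],
    },
}
exec_config = {
    "containerOverrides": {"environment": [{"name": "STAGE", "value": "prod"}]},
}

merged = recursive_update(defaults, exec_config)
```

A shallow `dict.update` would have replaced the whole `containerOverrides` entry and dropped `command`; the recursive version keeps it and swaps only `environment`.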
utkarsharma2 pushed a commit to astronomer/airflow that referenced this pull request Apr 22, 2024