This section provides examples of how to author Spark tasks and workflows using FlyteKit, as well as the additional setup required to run Spark jobs via Flyte.
- For Spark, the image must contain the Spark dependencies as well as the correct entrypoint for the Spark driver/executors. This can be achieved by using the provided `flytekit_install_spark.sh` script, as referenced in the Dockerfile included here.
- In addition, Flyte uses the SparkOperator to run Spark jobs, as well as a separate K8s Service Account/Role per namespace. All of these are created as part of the standard Flyte deploy; please refer to the Getting Started guide for more details on how to deploy Flyte.
- Based on the resources required for your Spark job (across driver and executors), you might have to tweak the resource quotas for the namespace, as sketched after this list.
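
For illustration, a namespace quota sized for a job's combined driver and executor resources could be created with the official Kubernetes Python client (a minimal sketch; the namespace, quota name, and limit values are placeholders, and editing the quota via `kubectl` works just as well):

```python
from kubernetes import client, config

# Load credentials from the local kubeconfig; use
# config.load_incluster_config() when running inside the cluster.
config.load_kube_config()

# Placeholder values: target the Flyte project-domain namespace and set
# limits large enough for the driver plus all executors of your job.
namespace = "flytesnacks-development"
quota = client.V1ResourceQuota(
    metadata=client.V1ObjectMeta(name="spark-quota"),
    spec=client.V1ResourceQuotaSpec(
        hard={"limits.cpu": "64", "limits.memory": "128Gi"}
    ),
)

client.CoreV1Api().create_namespaced_resource_quota(namespace=namespace, body=quota)
```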
Flyte supports both Python and Scala/Java Spark tasks:
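
For example, a Python Spark task can be declared by passing a Spark task config from the `flytekitplugins-spark` plugin (a minimal sketch; the task name, workflow, and pi-estimation logic are illustrative):

```python
import random

import flytekit
from flytekit import task, workflow
from flytekitplugins.spark import Spark


@task(
    task_config=Spark(
        # Spark settings applied to this task's driver/executors;
        # these sizes are placeholders, tune them to your workload.
        spark_conf={
            "spark.driver.memory": "1000M",
            "spark.executor.memory": "1000M",
            "spark.executor.cores": "1",
            "spark.executor.instances": "2",
        }
    ),
)
def estimate_pi(partitions: int) -> float:
    # The plugin injects a Spark session into the task's execution context.
    sess = flytekit.current_context().spark_session
    n = 100_000 * partitions

    def inside(_: int) -> int:
        x, y = random.random(), random.random()
        return 1 if x * x + y * y <= 1.0 else 0

    count = sess.sparkContext.parallelize(range(n), partitions).map(inside).sum()
    return 4.0 * count / n


@workflow
def spark_workflow(partitions: int = 10) -> float:
    return estimate_pi(partitions=partitions)
```

When executed locally, flytekit typically starts a local Spark session, so the same task can be iterated on without a cluster before being registered and run via the SparkOperator.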