[MINOR][DOCS] Clarify that Spark apps should mark Spark as a 'provided' dependency, not package it

## What changes were proposed in this pull request?

Spark apps do not need to package Spark; in fact, doing so can cause problems in some cases. Our examples should declare Spark as a 'provided' dependency.

Packaging Spark makes the app tens of megabytes larger. It can also bring in conflicting dependencies that wouldn't otherwise be a problem. https://issues.apache.org/jira/browse/SPARK-26146 was what reminded me of this.

## How was this patch tested?

Doc build

Closes apache#23938 from srowen/Provided.

Authored-by: Sean Owen <[email protected]>
Signed-off-by: Sean Owen <[email protected]>
(cherry picked from commit 3909223)
Signed-off-by: Sean Owen <[email protected]>
srowen authored and kai-chi committed Aug 1, 2019
1 parent 9c152b2 commit db6c470
Showing 3 changed files with 4 additions and 1 deletion.
1 change: 1 addition & 0 deletions docs/cloud-integration.md
@@ -87,6 +87,7 @@ is set to the chosen version of Spark:
<groupId>org.apache.spark</groupId>
<artifactId>hadoop-cloud_2.11</artifactId>
<version>${spark.version}</version>
<scope>provided</scope>
</dependency>
...
</dependencyManagement>
1 change: 1 addition & 0 deletions docs/quick-start.md
@@ -341,6 +341,7 @@ Note that Spark artifacts are tagged with a Scala version.
<groupId>org.apache.spark</groupId>
<artifactId>spark-sql_{{site.SCALA_BINARY_VERSION}}</artifactId>
<version>{{site.SPARK_VERSION}}</version>
<scope>provided</scope>
</dependency>
</dependencies>
</project>
3 changes: 2 additions & 1 deletion docs/streaming-programming-guide.md
@@ -385,11 +385,12 @@ Similar to Spark, Spark Streaming is available through Maven Central. To write y
<groupId>org.apache.spark</groupId>
<artifactId>spark-streaming_{{site.SCALA_BINARY_VERSION}}</artifactId>
<version>{{site.SPARK_VERSION}}</version>
<scope>provided</scope>
</dependency>
</div>
<div data-lang="SBT" markdown="1">

-    libraryDependencies += "org.apache.spark" % "spark-streaming_{{site.SCALA_BINARY_VERSION}}" % "{{site.SPARK_VERSION}}"
+    libraryDependencies += "org.apache.spark" % "spark-streaming_{{site.SCALA_BINARY_VERSION}}" % "{{site.SPARK_VERSION}}" % "provided"
</div>
</div>
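
For reference, the SBT change above might look as follows in a complete `build.sbt`. This is only a sketch: the project name and the Scala and Spark versions are illustrative assumptions, not values taken from this commit, and the docs themselves fill them in through the `{{site.…}}` template variables.

```scala
// build.sbt: minimal sketch; the name and version numbers below are assumptions.
name := "my-streaming-app"            // hypothetical project name

scalaVersion := "2.11.12"             // assumed; must match the Scala version of the cluster's Spark build

val sparkVersion = "2.4.3"            // assumed; use the version deployed on your cluster

// `%%` appends the Scala binary version (here "_2.11") to the artifact name.
// The "provided" configuration keeps Spark on the compile classpath but marks
// it as supplied by the runtime environment, so packaging tools that honor
// the scope leave it out of the application jar.
libraryDependencies += "org.apache.spark" %% "spark-streaming" % sparkVersion % "provided"
```

One consequence of this choice is that the Spark classes are expected to come from the cluster at run time, typically by launching the application with `spark-submit`.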
