[MINOR][DOCS] Clarify that Spark apps should mark Spark as a 'provided' dependency, not package it

## What changes were proposed in this pull request?

Spark apps do not need to package Spark; in fact, doing so can cause problems in some cases. Our examples should declare Spark as a 'provided' dependency.

Packaging Spark makes the app bigger by tens of megabytes. It can also bring in conflicting dependencies that wouldn't otherwise be a problem. https://issues.apache.org/jira/browse/SPARK-26146 is what reminded me of this.
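
As a rough sketch of what this means for an sbt build (the project name and the version numbers here are placeholder assumptions, not taken from this patch):

```scala
// build.sbt -- illustrative sketch only; the name and versions are hypothetical placeholders.
name := "my-spark-app"
scalaVersion := "2.12.8"

// "provided" keeps Spark on the compile classpath but leaves it out of the
// packaged application; the Spark installation that runs the app is expected
// to supply these classes at runtime.
libraryDependencies += "org.apache.spark" %% "spark-sql" % "2.4.0" % "provided"
```

The `%%` operator appends the Scala binary version to the artifact name, so it stands in for writing the `_{{site.SCALA_BINARY_VERSION}}` suffix by hand as the docs do.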

## How was this patch tested?

Doc build

Closes #23938 from srowen/Provided.

Authored-by: Sean Owen <[email protected]>
Signed-off-by: Sean Owen <[email protected]>
srowen committed Mar 5, 2019
1 parent 940626b commit 3909223
Showing 3 changed files with 4 additions and 1 deletion.
1 change: 1 addition & 0 deletions docs/cloud-integration.md
@@ -87,6 +87,7 @@ is set to the chosen version of Spark:
<groupId>org.apache.spark</groupId>
<artifactId>hadoop-cloud_{{site.SCALA_BINARY_VERSION}}</artifactId>
<version>${spark.version}</version>
+<scope>provided</scope>
</dependency>
...
</dependencyManagement>
1 change: 1 addition & 0 deletions docs/quick-start.md
@@ -341,6 +341,7 @@ Note that Spark artifacts are tagged with a Scala version.
<groupId>org.apache.spark</groupId>
<artifactId>spark-sql_{{site.SCALA_BINARY_VERSION}}</artifactId>
<version>{{site.SPARK_VERSION}}</version>
+<scope>provided</scope>
</dependency>
</dependencies>
</project>
3 changes: 2 additions & 1 deletion docs/streaming-programming-guide.md
@@ -385,11 +385,12 @@ Similar to Spark, Spark Streaming is available through Maven Central. To write y
<groupId>org.apache.spark</groupId>
<artifactId>spark-streaming_{{site.SCALA_BINARY_VERSION}}</artifactId>
<version>{{site.SPARK_VERSION}}</version>
+<scope>provided</scope>
</dependency>
</div>
<div data-lang="SBT" markdown="1">

-libraryDependencies += "org.apache.spark" % "spark-streaming_{{site.SCALA_BINARY_VERSION}}" % "{{site.SPARK_VERSION}}"
+libraryDependencies += "org.apache.spark" % "spark-streaming_{{site.SCALA_BINARY_VERSION}}" % "{{site.SPARK_VERSION}}" % "provided"
</div>
</div>
