
[SPARK-7481] [build] Add spark-hadoop-cloud module to pull in object store access. #17834

Closed

Conversation

@steveloughran (Contributor) commented May 2, 2017

What changes were proposed in this pull request?

Add a new `spark-hadoop-cloud` module and Maven profile to pull in object store support from the `hadoop-openstack`, `hadoop-aws` and `hadoop-azure` (Hadoop 2.7+) JARs, along with their dependencies, fixing up the dependencies so that everything works together, in particular Jackson.

It restores `s3n://` access to S3, adds its `s3a://` replacement, and adds OpenStack `swift://` and Azure `wasb://`.

There's a documentation page, `cloud_integration.md`, which covers the basic details of using Spark with object stores, referring the reader to the suppliers' own documentation, with specific warnings on security and on the possible mismatch between a store's behavior and that of a filesystem. In particular, users are advised to be very cautious when trying to use an object store as the destination of data, and to consult the documentation of the storage supplier and the connector.
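
To make that concrete, a minimal sketch of what the module enables once its JARs are on the classpath. The bucket name and paths are hypothetical; the `fs.s3a.*` keys are the standard Hadoop S3A credential options:

```scala
import org.apache.spark.sql.SparkSession

// Hypothetical bucket and paths, for illustration only. Any s3a://, wasb:// or
// swift:// URI works once the hadoop-cloud module's JARs are on the classpath.
val spark = SparkSession.builder()
  .appName("object-store-demo")
  // Standard S3A credential options, passed through via the spark.hadoop. prefix.
  .config("spark.hadoop.fs.s3a.access.key", sys.env("AWS_ACCESS_KEY_ID"))
  .config("spark.hadoop.fs.s3a.secret.key", sys.env("AWS_SECRET_ACCESS_KEY"))
  .getOrCreate()

// Basic read/write against the store; see cloud_integration.md for the caveats
// on using an object store as a direct destination of work.
val df = spark.read.parquet("s3a://example-bucket/input/")
df.write.parquet("s3a://example-bucket/output/")
```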

(this is the successor to #12004; I can't re-open it)

How was this patch tested?

Downstream tests exist in https://github.com/steveloughran/spark-cloud-examples/tree/master/cloud-examples

Those verify that the dependencies are sufficient to allow downstream applications to work with s3a, azure wasb and swift storage connectors, and perform basic IO & dataframe operations thereon. All seems well.

Manual clean builds to verify that the assembly contains the relevant aws-* and hadoop-* artifacts on Hadoop 2.6, and the azure artifacts with a hadoop-2.7 profile.

SBT build: `build/sbt -Phadoop-cloud -Phadoop-2.7 package`
Maven build: `mvn install -Phadoop-cloud -Phadoop-2.7`

This PR *does not* update `dev/deps/spark-deps-hadoop-2.7` or `dev/deps/spark-deps-hadoop-2.6`, because unless the hadoop-cloud profile is enabled, no extra JARs show up in the dependency list. The dependency check in Jenkins isn't setting the property, so the new JARs aren't visible.

steveloughran and others added 25 commits April 25, 2017 10:57
…azure, hadoop-aws, hadoop-openstack and implicitly their transitive dependencies are resolved. They don't verify all dependency setup, specifically that Jackson versions are consistent; that needs integration testing.
…eam tests which pulled in spark-cloud but not spark-hive were ending up with inconsistent versions. Add a test for the missing class being there too.
…CP is set up consistently for the later versions of the AWS SDK
…ct references to AWS classes failing. Cut them and rely on transitive load through FS class instantiation to force the load. All that happens is that failures to link will be slightly less easy to debug.
…ack JARs, transitive dependencies on aws and azure SDKs
Change-Id: I3dea2544f089615493163f0fae482992873f9c35
Change-Id: Ibd6d40df2bc8a2edf19a058c458bea233ba414fd
… as well as 2.7, keeping azure the 2.7.x dependency. All dependencies are scoped @ hadoop.scope

Change-Id: I80bd95fd48e21cf2eb4d94907ac99081cd3bd375
Change-Id: I91f764aeed7d832df1538453d869a7fd83964d65
Change-Id: I12ea6ed72ffa9edee964c90c862ff4c45bc4f47f
…urity, spark-specific options + links elsewhere

Change-Id: I7e9efe20d116802a403af875b241b91178078d78
Change-Id: I1923a4b6a959d86aa2c5b3d71faaaf2541d3ba85
Change-Id: I6a0b0b9f06a4adcdf55ef75161dc1039961bc7a1
Change-Id: Ic4804667af8e52b7be11fb00621ad8b69a1d2569
Change-Id: I2b75a2722f0082b916b9be20bd23a0bdc2d36615
@srowen (Member) commented May 2, 2017

OK, but does this address the comments from the other PR? It didn't look like it (yet) from a quick look.

@SparkQA commented May 2, 2017

Test build #76390 has finished for PR 17834 at commit 72a03ed.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@steveloughran (Contributor, Author)

The last round was on the doc comments, and I believe I've addressed them, both by fixing the little typos and by focusing the docs on the main points for Spark users: how stores differ from filesystems, and what that means in practice.

The big issue for Spark users is "the commit problem", where I've listed the object store behaviours and said "this means things may not work; consult the docs". I'm not being explicit about what works and what doesn't, as that's a moving target.

Right now, I don't trust commits to S3a using the V1 or V2 FileOutputCommitter algorithms to work 100% of the time, because they rely on list consistency, which Amazon S3 doesn't guarantee. I could make that a lot clearer, something like:

You cannot reliably use the FileOutputCommitter to commit work to Amazon S3 or Openstack Swift unless there is some form of consistency layer on top.

That probably is the core concept people need to know: it's not safe without something (EMR, S3Guard, Databricks Commit Service) to give you that consistent view.

Then add pointers to the talks by Eric Liang and myself on the topic.
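
For concreteness, a sketch of the knob under discussion. `mapreduce.fileoutputcommitter.algorithm.version` is the real Hadoop option; the output path is hypothetical:

```scala
import org.apache.spark.sql.SparkSession

// Select the "version 2" FileOutputCommitter algorithm: fewer renames, so faster
// on object stores, but only as safe as the store's listings are consistent.
val spark = SparkSession.builder()
  .appName("committer-demo")
  .config("spark.hadoop.mapreduce.fileoutputcommitter.algorithm.version", "2")
  .getOrCreate()

// Without a consistency layer (EMR consistent view, S3Guard, ...), neither v1 nor
// v2 can be relied on when writing directly to Amazon S3.
spark.range(1000).write.parquet("s3a://example-bucket/table/")
```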

@srowen (Member) left a comment


OK, still a few comments, but yes, you're right, most were addressed.

assembly/pom.xml Outdated
Pull in spark-hadoop-cloud and its associated JARs,
-->
<profile>
<id>cloud</id>
Member:

Continuing from #12004 (comment) : I still think this should be hadoop-cloud just like the artifact name is spark-hadoop-cloud. The profile name itself doesn't cause an artifact to be generated called hadoop-cloud.

pom.xml Outdated
the AWS module pulls in jackson; its transitive dependencies can create
intra-jackson-module version problems.
-->
<dependency>
Member:

Continuing #12004 (comment) - I think it's useful to put things in the parent when they're used across multiple modules. For something only referenced in one child POM, it is probably less overhead to just put the dependency in the child rather than split and repeat part of it.

Contributor (Author):

OK, I'll do that. It's easy enough to revisit in future.

The one thing that will probably need revisiting is the jackson dependency, as it has the highest risk that someone will add it to another module. Though as it is driven by the same jackson.version variable as everywhere else, a duplicate declaration elsewhere isn't going to cause any harm.

cloud/pom.xml Outdated
@@ -0,0 +1,106 @@
<?xml version="1.0" encoding="UTF-8"?>
Member:

Oh, just noticed this -- should this perhaps be in a directory hadoop-cloud to match the profile and module name?

Contributor (Author):

OK


This uses the "version 2" algorithm for committing files, which does less
renaming than the "version 1" algorithm, though as it still uses `rename()`
to commit files, it may be unsafe to use.
Member:

It's hard to give advice here, as we discussed. It's possible to link to external documentation. Here, however, it sounds like you're saying version 2 is recommended, then immediately saying it could be unsafe to use. What does the reader make of that? What's the upside, and what's the nature and likelihood of failure?

If there's no clear answer, is this really a reliable recommendation?
Should this perhaps call out the issue and link to a larger discussion instead?

Contributor (Author):

I'll try to think of a better phrasing, saying "if your object store is consistent enough, use v2 for speed".

*  module is renamed hadoop-cloud in POMs, sbt, docs
*  hadoop-aws/azure/openstack declarations pushed down to hadoop-cloud pom, along with jackson-cbor
* docs around the commit algorithm option make clear that you should only worry about v1 vs v2 if the blobstore is consistent

Change-Id: Ia114bc8cd2ef731d54a83774d9dc2cf9e4c6e7d4
@steveloughran (Contributor, Author)

I've just pushed up an update which changes the module name; tested in Maven and SBT; hadoop cloud JAR dependencies pulled down.

A JAR is created, but it's just a stub one. As a result, when you build the assembly with -Phadoop-cloud, you get something from Spark mixed in with the Hadoop modules:

assembly/target/scala-2.11/jars/hadoop-aws-2.7.3.jar
assembly/target/scala-2.11/jars/hadoop-azure-2.7.3.jar
assembly/target/scala-2.11/jars/hadoop-client-2.7.3.jar
assembly/target/scala-2.11/jars/hadoop-cloud_2.11-2.3.0-SNAPSHOT.jar
assembly/target/scala-2.11/jars/hadoop-common-2.7.3.jar
assembly/target/scala-2.11/jars/hadoop-hdfs-2.7.3.jar
assembly/target/scala-2.11/jars/hadoop-mapreduce-client-app-2.7

Like I've said before, that's a bit dangerous, especially with a hadoop-cloud-projects POM module now upstream. We won't ever produce a hadoop-cloud-projects JAR, so there won't be a direct conflict; it's more the potential confusion if people see a JAR beginning hadoop-* with different version info.

@SparkQA commented May 3, 2017

Test build #76426 has finished for PR 17834 at commit b788494.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@srowen (Member) commented May 4, 2017

That's because you also changed the artifact name to hadoop-cloud-.... That should remain spark-hadoop-cloud-.... It's the directory, module name, and profile ID that should be hadoop-cloud. It is at least consistent with how other modules and profiles are named.

@steveloughran (Contributor, Author)

OK, now I understand. Let me revert that bit of the patch.

…oud. Added an extra callout to the docs: "without some form of consistency layer, Amazon S3 cannot be safely used as the direct destination of work with the normal rename-based committer"

That holds for all Spark-in-EC2 deployments; EMR has a consistency option, as do/will others; different committers can work reliably.

Change-Id: Ibbf8b1b9de10b5bb83b647cc663ceb970f70ff2d
Testing: SBT and maven, with & without the hadoop-2.7 option.
@SparkQA commented May 4, 2017

Test build #76457 has finished for PR 17834 at commit e173e3f.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@srowen (Member) left a comment

Just some minor changes now; this looks fine.

* Rename operations may be very slow and, on failure, leave the store in an unknown state.
* Seeking within a file may require new HTTP calls, hurting performance.

How does affect Spark?
Member:

Just noticed a small typo: How does this affect Spark?

Contributor (Author):

fixed, "how does this affect Spark?"

They cannot be used as a direct replacement for a cluster filesystem such as HDFS
*except where this is explicitly stated*.

Key differences are
Member:

Nit: I'd end the line with a colon to make it clear it's not dangling

### Important: Cloud Object Stores are Not Real Filesystems

While the stores appear to be filesystems, underneath
they are still object stores, [and the difference is significant](http://hadoop.apache.org/docs/current/hadoop-project-dist/hadoop-common/filesystem/introduction.html)
Member:

Now that we're down to tiny nits, I'd link everywhere to https:// URLs

Contributor (Author):

Good catch. Done throughout the file except in the Apache license header. Rendered the HTML and clicked through the links.


<artifactId>spark-hadoop-cloud_2.11</artifactId>
<packaging>jar</packaging>
<name>Spark Project Cloud Integration</name>
Member:

Maybe Spark Project Cloud Object Store Integration for Hadoop? I feel like getting "Hadoop" in there is useful to somehow emphasize this is adding connectivity from HDFS APIs to object stores in particular

Contributor (Author):

As it works just as well standalone and on Mesos, I'll try "through Hadoop Libraries" and see how that reads.

…uple of superfluous blank lines

Change-Id: Iee9f0e0527de7bb875d1c2a805a0847702bb4e11
@SparkQA commented May 5, 2017

Test build #76492 has finished for PR 17834 at commit 32ebc8c.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@srowen (Member) commented May 7, 2017

Merged to master

@asfgit asfgit closed this in 2cf83c4 May 7, 2017
liyichao pushed a commit to liyichao/spark that referenced this pull request May 24, 2017
…tore access.
jisookim0513 pushed a commit to metamx/spark that referenced this pull request Aug 28, 2017
…tore access.
isimonenko pushed a commit to metamx/spark that referenced this pull request Oct 17, 2017
…tore access.

(cherry picked from commit 2cf83c4)
@steveloughran steveloughran deleted the cloud/SPARK-7481-current branch March 18, 2019 17:58
dongjoon-hyun added a commit that referenced this pull request May 4, 2024
… `jvm-profiler` modules

### What changes were proposed in this pull request?

This PR aims to fix `dev/scalastyle` to check the `hadoop-cloud` and `jvm-profiler` modules.
Also, the detected scalastyle issues are fixed.

### Why are the changes needed?

To prevent future scalastyle issues.

A Scala style violation was introduced here, but we missed it because we didn't check all optional modules.
- #46022

The `jvm-profiler` module was newly added in Apache Spark 4.0.0, but we missed adding it to `dev/scalastyle`. Note that there were no Scala style issues in that module at the time.
- #44021

The `hadoop-cloud` module was added in Apache Spark 2.3.0.
- #17834

### Does this PR introduce _any_ user-facing change?

No.

### How was this patch tested?

Pass the CIs with newly revised `dev/scalastyle`.

### Was this patch authored or co-authored using generative AI tooling?

No.

Closes #46376 from dongjoon-hyun/SPARK-48127.

Authored-by: Dongjoon Hyun <[email protected]>
Signed-off-by: Dongjoon Hyun <[email protected]>
sinaiamonkar-sai pushed a commit to sinaiamonkar-sai/spark that referenced this pull request May 5, 2024
… `jvm-profiler` modules
JacobZheng0927 pushed a commit to JacobZheng0927/spark that referenced this pull request May 11, 2024
… `jvm-profiler` modules