
[DOC] Adjust coverage for partitionBy() #10499

Closed
wants to merge 4 commits into from

Conversation

@ted-yu ted-yu commented Dec 28, 2015

This is the related thread: http://search-hadoop.com/m/q3RTtO3ReeJ1iF02&subj=Re+partitioning+json+data+in+spark

Michael suggested fixing the doc.

Please review.

@@ -119,7 +119,7 @@ final class DataFrameWriter private[sql](df: DataFrame) {
  * Partitions the output by the given columns on the file system. If specified, the output is
  * laid out on the file system similar to Hive's partitioning scheme.
  *
- * This is only applicable for Parquet at the moment.
+ * This was initally applicable for Parquet but in 1.5.x covers JSON as well.
Contributor

1.5.x or 1.5+?
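The Hive-style partitioning scheme mentioned in the doc comment encodes each partition column as a `col=value` directory segment under the output path. A minimal sketch of that layout in plain Python (illustrative only; `partition_path` and the paths are hypothetical, not Spark API):

```python
def partition_path(base, partition_cols, row):
    """Build a Hive-style output path: base/col1=val1/col2=val2."""
    segments = ["{}={}".format(c, row[c]) for c in partition_cols]
    return "/".join([base] + segments)

# A row written with partitionBy("year", "month") would land under:
print(partition_path("/data/events", ["year", "month"],
                     {"year": 2015, "month": 12, "id": 7}))
# -> /data/events/year=2015/month=12
```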

@SparkQA

SparkQA commented Dec 28, 2015

Test build #48373 has finished for PR 10499 at commit 7884e87.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA

SparkQA commented Dec 29, 2015

Test build #48381 has finished for PR 10499 at commit f655bbe.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@@ -119,7 +119,7 @@ final class DataFrameWriter private[sql](df: DataFrame) {
  * Partitions the output by the given columns on the file system. If specified, the output is
  * laid out on the file system similar to Hive's partitioning scheme.
  *
- * This is only applicable for Parquet at the moment.
+ * This was initally applicable for Parquet but in 1.5+ covers JSON as well.
Contributor

also "text"

@SparkQA

SparkQA commented Dec 29, 2015

Test build #48391 has finished for PR 10499 at commit dff3935.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@@ -119,7 +119,7 @@ final class DataFrameWriter private[sql](df: DataFrame) {
  * Partitions the output by the given columns on the file system. If specified, the output is
  * laid out on the file system similar to Hive's partitioning scheme.
  *
- * This is only applicable for Parquet at the moment.
+ * This was initially applicable for Parquet but in 1.5+ covers JSON as well.
Contributor

And text, ORC, and avro
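For any of these sources, a reader recovers the partition columns by parsing the `col=value` directory names back out of the file path (Spark's partition discovery). A toy sketch of that step in plain Python (`parse_partitions` is a hypothetical helper, not Spark's implementation):

```python
def parse_partitions(path):
    """Extract Hive-style col=value segments from an output path."""
    parts = {}
    for segment in path.strip("/").split("/"):
        if "=" in segment:
            key, value = segment.split("=", 1)
            parts[key] = value
    return parts

# Data-file names without "=" (e.g. part-00000.json) are skipped:
print(parse_partitions("/data/events/year=2015/month=12/part-00000.json"))
# -> {'year': '2015', 'month': '12'}
```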

@SparkQA

SparkQA commented Dec 30, 2015

Test build #48443 has finished for PR 10499 at commit a021725.

  • This patch fails from timeout after a configured wait of 250m.
  • This patch merges cleanly.
  • This patch adds no public classes.

@tedyu
Contributor

tedyu commented Dec 30, 2015

The test failure was not related to the patch.
Looks like HiveThriftBinaryServerSuite timed out:

[info] HiveThriftBinaryServerSuite:
[info] - GetInfo Thrift API (431 milliseconds)
[info] - JDBC query execution (9 seconds, 740 milliseconds)
[info] - Checks Hive version (1 second, 386 milliseconds)
[info] - SPARK-3004 regression: result set containing NULL (1 second, 685 milliseconds)
[info] - SPARK-4292 regression: result set iterator issue (5 seconds, 77 milliseconds)
[info] - SPARK-4309 regression: Date type support (1 second, 328 milliseconds)
[info] - SPARK-4407 regression: Complex type support (2 seconds, 165 milliseconds)
[info] - test multiple session (5 seconds, 857 milliseconds)
Attempting to post to Github...
 > Post successful.
Build step 'Execute shell' marked build as failure

@tedyu
Contributor

tedyu commented Dec 30, 2015

@marmbrus:
Is there anything I need to do?

@tedyu
Contributor

tedyu commented Jan 4, 2016

@marmbrus
Gentle ping

@marmbrus
Contributor

marmbrus commented Jan 4, 2016

Thanks, merged to master and 1.6

asfgit pushed a commit that referenced this pull request Jan 4, 2016
This is the related thread: http://search-hadoop.com/m/q3RTtO3ReeJ1iF02&subj=Re+partitioning+json+data+in+spark

Michael suggested fixing the doc.

Please review.

Author: tedyu <[email protected]>

Closes #10499 from ted-yu/master.

(cherry picked from commit 40d0396)
Signed-off-by: Michael Armbrust <[email protected]>
@asfgit asfgit closed this in 40d0396 Jan 4, 2016
5 participants