
[SPARK-18692][BUILD][DOCS] Test Java 8 unidoc build on Jenkins #17477

Closed · wants to merge 3 commits

Conversation

@HyukjinKwon (Member) commented Mar 30, 2017

What changes were proposed in this pull request?

This PR proposes to run Spark unidoc to test the Javadoc 8 build, since the Javadoc 8 build is easily broken again.

There are several problems with it:

  • It adds a little extra time to the test run. In my case, it took about 1.5 minutes more (`Elapsed :[94.8746569157]`). How this was measured is described in "How was this patch tested?".

  • One problem that I noticed was that Unidoc appeared to be processing test sources: if we can find a way to exclude those from being processed in the first place then that might significantly speed things up.

    (see @JoshRosen's comment)

To make this automated build pass, this PR also fixes the existing Javadoc breaks, including ones introduced by test code as described above.

These fixes are similar to instances fixed previously. Please refer to #15999 and #16013.

Note that this only fixes errors, not warnings. Please see my observation in #17389 (comment) on spurious errors caused by warnings.

How was this patch tested?

Manually, via `jekyll build` for the documentation build. Also tested by running `./dev/run-tests`.

The elapsed time was measured by manually adding `time.time()` as below:

```diff
     profiles_and_goals = build_profiles + sbt_goals

     print("[info] Building Spark unidoc (w/Hive 1.2.1) using SBT with these arguments: ",
           " ".join(profiles_and_goals))

+    import time
+    st = time.time()
     exec_sbt(profiles_and_goals)
+    print("Elapsed :[%s]" % str(time.time() - st))
```

produces

```
...
========================================================================
Building Unidoc API Documentation
========================================================================
...
[info] Main Java API documentation successful.
...
Elapsed :[94.8746569157]
...
```
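For reference, a minimal sketch (in the style of `dev/run-tests.py`) of the kind of build step this PR adds; the helper names (`set_title_and_block`, `get_hadoop_profiles`, `exec_sbt`) are assumed from that script, and the exact wiring may differ:

```python
def build_spark_unidoc_sbt(hadoop_version):
    # Announce the block in the Jenkins log, mirroring the other build steps.
    set_title_and_block("Building Unidoc API Documentation", "BLOCK_DOCUMENTATION")
    # Reuse this run's Hadoop profiles and add the `unidoc` goal, so any
    # Javadoc 8 break fails the test run instead of slipping into master.
    profiles_and_goals = get_hadoop_profiles(hadoop_version) + ["unidoc"]
    print("[info] Building Spark unidoc (w/Hive 1.2.1) using SBT with these arguments: ",
          " ".join(profiles_and_goals))
    exec_sbt(profiles_and_goals)
```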

```diff
@@ -230,7 +230,9 @@ class PipelineSuite extends SparkFunSuite with MLlibTestSparkContext with Defaul
 }

-/** Used to test [[Pipeline]] with [[MLWritable]] stages */
+/**
+ * Used to test [[Pipeline]] with `MLWritable` stages
```
@HyukjinKwon (Member Author):
We should avoid single-line doc comments when they contain code spans (` ... `). See #16050.

```diff
@@ -74,7 +74,7 @@ abstract class Classifier[
  * and features (`Vector`).
  * @param numClasses Number of classes label can take. Labels must be integers in the range
  * [0, numClasses).
- * @throws SparkException if any label is not an integer >= 0
+ * @note Throws `SparkException` if any label is not an integer is greater than or equal to 0
```
@HyukjinKwon (Member Author):

This case throws an error as below:

```
[error] .../spark/mllib/target/java/org/apache/spark/ml/classification/Classifier.java:28: error: reference not found
[error]    * @throws SparkException  if any label is not an integer >= 0
[error]      ^
```

Contributor:

Or is a non-integer or is negative?

@HyukjinKwon (Member Author) commented Mar 30, 2017

FYI, unless I have missed something, all the cases are the same kinds of instances as the ones previously fixed. cc @JoshRosen, @srowen and @jkbradley. Could you take a look and see if it makes sense?

```diff
- * For example, given <h1, [o1, o2, o3]>, <h2, [o4]>, <h1, [o5, o6]>, returns
- * [o1, o5, o4, 02, o6, o3]
+ * For example, given a map consisting of h1 to [o1, o2, o3], h2 to [o4] and h3 to [o5, o6],
+ * returns a list, [o1, o5, o4, o2, o6, o3].
```
@HyukjinKwon (Member Author):

There are a few typos here: `02` -> `o2` and `h1` -> `h3`.

Contributor:

Can we also wrap this in code or otherwise escape it, or use a different symbol? `{h1: [o1, o2, o3], h2: [o4], ...}` is clearer.

@SparkQA commented Mar 30, 2017

Test build #75381 has finished for PR 17477 at commit 7ddb6eb.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA commented Mar 30, 2017

Test build #75385 has finished for PR 17477 at commit 7a7cf04.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA commented Mar 31, 2017

Test build #75416 has finished for PR 17477 at commit 4d39544.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@HyukjinKwon (Member Author):

(gentle ping @JoshRosen).

@HyukjinKwon (Member Author):

@JoshRosen, I am fine with closing it for now if you are currently not sure of it.

@JoshRosen (Contributor):

jenkins retest this please

```diff
@@ -33,9 +33,9 @@ private[spark] trait RpcEnvFactory {
 *
 * It is guaranteed that `onStart`, `receive` and `onStop` will be called in sequence.
 *
- * The life-cycle of an endpoint is:
+ * The life-cycle of an endpoint is as below in an order:
```
Contributor:

Can we just wrap this block as code? The rewording is confusing and doesn't read as clearly to me.

@JoshRosen (Contributor) commented Apr 12, 2017

Left a few comments.

@srowen @jkbradley, could you take a look and merge it after changes if it looks okay to you? The overall build change structure looks okay to me, if we're fine with failing the PR build on doc build breaks. I did a somewhat cursory examination of the actual doc changes, so additional review there is welcome if you have time.

@SparkQA commented Apr 12, 2017

Test build #75722 has finished for PR 17477 at commit 4d39544.

  • This patch fails to generate documentation.
  • This patch merges cleanly.
  • This patch adds no public classes.

@HyukjinKwon (Member Author):

Looks like there is another break.

```
[error] /home/jenkins/workspace/SparkPullRequestBuilder/sql/core/target/java/org/apache/spark/sql/catalog/Catalog.java:453: error: reference not found
[error]    * Invalidates and refreshes all the cached data (and the associated metadata) for any {@link Dataset}
[error]
```

Let me clean this up and address the comments. Thank you @JoshRosen.

```diff
- * For example, given <h1, [o1, o2, o3]>, <h2, [o4]>, <h1, [o5, o6]>, returns
- * [o1, o5, o4, 02, o6, o3]
+ * For example, given {@literal <h1, [o1, o2, o3]>}, {@literal <h2, [o4]>} and
+ * {@literal <h3, [o5, o6]>}, returns {@literal [o1, o5, o4, o2, o6, o3]}.
```
@HyukjinKwon (Member Author) commented Apr 12, 2017:

It seems we can't use `@code` here if it contains something like `<A ...>` (the `< A ...>` case, with a space after `<`, seems fine). I ran some tests with the comments below:

```
 * For example, given {@code < h1, [o1, o2, o3] >}, {@code < h2, [o4]>} and {@code <h3, [o5, o6]>},
 * returns {@code [o1, o5, o4, o2, o6, o3]}.
 *
 * For example, given
 *
 * {@code <h1, [o1, o2, o3]>},
 *
 * {@code <h2, [o4]} and
 *
 * {@code h3, [o5, o6]>},
 *
 * returns {@code [o1, o5, o4, o2, o6, o3]}.
```

(Screenshots: Scaladoc and Javadoc rendering with `{@code}`.)

If we use `@literal`, it seems fine.

(Screenshots: Scaladoc and Javadoc rendering with `{@literal}`.)

This doesn't seem to be exposed in the API documentation anyway.

```diff
@@ -296,7 +296,7 @@ trait MesosSchedulerUtils extends Logging {

 /**
  * Parses the attributes constraints provided to spark and build a matching data struct:
- *  Map[<attribute-name>, Set[values-to-match]]
+ *  {@literal Map[<attribute-name>, Set[values-to-match]}
```

```diff
@@ -89,7 +89,7 @@ public static String getKerberosServiceTicket(String principal, String host,
  * @param clientUserName Client User name.
  * @return An unsigned cookie token generated from input parameters.
  * The final cookie generated is of the following format :
- * cu=<username>&rn=<randomNumber>&s=<cookieSignature>
+ * {@code cu=<username>&rn=<randomNumber>&s=<cookieSignature>}
```
@HyukjinKwon (Member Author):

This is Java code, so `@code` should be fine. This also doesn't seem to be exposed in the documentation anyway.

```diff
@@ -35,7 +35,7 @@ private[spark] trait RpcEnvFactory {
 *
 * The life-cycle of an endpoint is:
 *
- * constructor -> onStart -> receive* -> onStop
+ * {@code constructor -> onStart -> receive* -> onStop}
```
@HyukjinKwon (Member Author):

After this, it produces the documentation as below (manually tested):

(Screenshots: Scaladoc and Javadoc rendering.)

This also doesn't seem to be exposed in the API documentation anyway.

ghost pushed a commit to dbtsai/spark that referenced this pull request Apr 12, 2017
## What changes were proposed in this pull request?

This PR proposes corrections related to JSON APIs as below:

- Rendering links in Python documentation
- Replacing `RDD` with `Dataset` in the programming guide
- Adding missing description about JSON Lines consistently in `DataFrameReader.json` in the Python API
- De-duplicating a little bit of `DataFrameReader.json` in the Scala/Java API

## How was this patch tested?

Manually built the documentation via `jekyll build`. Corresponding snapshots will be left on the code.

Note that there are currently Javadoc 8 breaks in several places. These are proposed to be handled in apache#17477, so this PR does not fix them.

Author: hyukjinkwon <[email protected]>

Closes apache#17602 from HyukjinKwon/minor-json-documentation.
@SparkQA commented Apr 12, 2017

Test build #75739 has finished for PR 17477 at commit aefae0f.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@srowen (Member) commented Apr 12, 2017

Merged to master

@asfgit closed this in ceaf77a Apr 12, 2017
@HyukjinKwon (Member Author):

Thank you!

@srowen (Member) commented Apr 13, 2017

Hm, really weird @HyukjinKwon, but this fails on the SBT master build, and only for Hadoop 2.6. Maven and 2.7 are fine.

```
[error] /home/jenkins/workspace/spark-master-test-sbt-hadoop-2.6/core/src/main/scala/org/apache/spark/serializer/GenericAvroSerializer.scala:123: value createDatumWriter is not a member of org.apache.avro.generic.GenericData
[error]     writerCache.getOrElseUpdate(schema, GenericData.get.createDatumWriter(schema))
[error]
```

https://amplab.cs.berkeley.edu/jenkins/view/Spark%20QA%20Test%20(Dashboard)/job/spark-master-test-sbt-hadoop-2.6/2770/consoleFull

I can only assume it's some classpath-related problem, such that the classpath resolution for the Hadoop 2.6 libs under SBT, and only for the scaladoc plugin, doesn't see the right version of Avro.

This might be tricky to resolve. Have you seen anything like this before? I wonder if there is any clear difference between the classpath scaladoc would use vs scalac?

@HyukjinKwon (Member Author) commented Apr 14, 2017

Thank you for your pointer and the details. No, I haven't seen anything like this before. So, Maven + Hadoop 2.7, Maven + Hadoop 2.6 and SBT + Hadoop 2.7 are fine, but SBT + Hadoop 2.6 is failing. Let me try to reproduce this locally and look into it. It sounds tricky.

@HyukjinKwon (Member Author) commented Apr 14, 2017

It seems we only build the documentation with SBT. I ran `./build/sbt clean` (and also removed all caches) first. Then, I ran a build with

```
./build/sbt  -Phadoop-2.6 -Pmesos -Pkinesis-asl -Pyarn -Phive-thriftserver -Phive test:package streaming-kafka-0-8-assembly/assembly streaming-flume-assembly/assembly streaming-kinesis-asl-assembly/assembly
```

per

```
========================================================================
Building Spark
========================================================================
[info] Building Spark (w/Hive 1.2.1) using SBT with these arguments:  -Phadoop-2.6 -Pmesos -Pkinesis-asl -Pyarn -Phive-thriftserver -Phive test:package streaming-kafka-0-8-assembly/assembly streaming-flume-assembly/assembly streaming-kinesis-asl-assembly/assembly
```

and then,

```
./build/sbt  -Phadoop-2.6 -Pmesos -Pkinesis-asl -Pyarn -Phive-thriftserver -Phive unidoc
```

per


```
========================================================================
Building Unidoc API Documentation
========================================================================
[info] Building Spark unidoc (w/Hive 1.2.1) using SBT with these arguments:  -Phadoop-2.6 -Pmesos -Pkinesis-asl -Pyarn -Phive-thriftserver -Phive unidoc
```

I also ran `run-tests.py` with the identical profiles to resemble these steps. However, I am unable to reproduce this. @srowen, are you able to reproduce this locally, maybe?
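For completeness, the two invocations above can be replayed with a small driver script (a sketch; profiles and goals copied from the logs above, assuming it runs from a Spark checkout):

```python
import subprocess

# Same profiles as the Jenkins job above.
PROFILES = ["-Phadoop-2.6", "-Pmesos", "-Pkinesis-asl", "-Pyarn",
            "-Phive-thriftserver", "-Phive"]

def sbt(*goals):
    """Run ./build/sbt with the shared profiles plus the given goals."""
    subprocess.check_call(["./build/sbt"] + PROFILES + list(goals))

# Step 1: package everything (the first command above).
sbt("test:package",
    "streaming-kafka-0-8-assembly/assembly",
    "streaming-flume-assembly/assembly",
    "streaming-kinesis-asl-assembly/assembly")

# Step 2: build the unified API docs (the second command above).
sbt("unidoc")
```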

@HyukjinKwon (Member Author):

What I don't get is that the last test against this PR, https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/75739/console, seems to have used the same profiles, as below:

```
========================================================================
Building Spark
========================================================================
[info] Building Spark (w/Hive 1.2.1) using SBT with these arguments:  -Phadoop-2.6 -Pmesos -Pkinesis-asl -Pyarn -Phive-thriftserver -Phive test:package streaming-kafka-0-8-assembly/assembly streaming-flume-assembly/assembly streaming-kinesis-asl-assembly/assembly
========================================================================
Building Unidoc API Documentation
========================================================================
[info] Building Spark unidoc (w/Hive 1.2.1) using SBT with these arguments:  -Phadoop-2.6 -Pmesos -Pkinesis-asl -Pyarn -Phive-thriftserver -Phive unidoc
```

@HyukjinKwon (Member Author):

Is it okay to just explicitly override the dependency to Avro 1.7.7 in sbt?

@srowen (Member) commented Apr 14, 2017

The build should use 1.7.7, yes. Hadoop pulls in 1.7.4, but it does so in both 2.6 and 2.7. And the SBT and Maven builds seem to get that right as intended, because the POM directly overrides this version. (The only component on a different Avro is the Flume module, but that's not the problem here.)

I also can't reproduce this locally. It builds fine for me too with the same commands.

I am open to workarounds, though I also don't know what will be sufficient because we can't reproduce it. I am pretty sure the Avro 1.7.4 dependency is coming from hadoop-common, but I have no idea why this happens only in 2.6.

sbt-unidoc has a newer version, 0.4.0, but updating it requires other changes I don't know how to make and I don't see a reason to think it's the problem.

I wonder if the problem is that core does not directly declare a dependency on org.apache.avro:avro but uses it. If so, then adding this might do the trick in the core POM:

```
      <dependency>
        <groupId>org.apache.avro</groupId>
        <artifactId>avro</artifactId>
      </dependency>
```
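As a quick check (a sketch, not from the thread; assumes a Spark checkout and SBT 0.13-style task syntax), one could inspect which Avro version actually wins on core's compile classpath:

```python
import subprocess

# Ask SBT for core's resolved compile classpath and show only the Avro jars,
# to see whether the pinned 1.7.7 or Hadoop's transitive 1.7.4 is used.
out = subprocess.run(
    ["./build/sbt", "-Phadoop-2.6", "show core/compile:dependencyClasspath"],
    capture_output=True, text=True,
).stdout
for line in out.splitlines():
    if "avro" in line:
        print(line)
```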

asfgit pushed a commit that referenced this pull request Apr 16, 2017
… failure in SBT Hadoop 2.6 master on Jenkins

## What changes were proposed in this pull request?

This PR proposes to add

```
      <dependency>
        <groupId>org.apache.avro</groupId>
        <artifactId>avro</artifactId>
      </dependency>
```

in the core POM to see if it resolves the build failure below:

```
[error] /home/jenkins/workspace/spark-master-test-sbt-hadoop-2.6/core/src/main/scala/org/apache/spark/serializer/GenericAvroSerializer.scala:123: value createDatumWriter is not a member of org.apache.avro.generic.GenericData
[error]     writerCache.getOrElseUpdate(schema, GenericData.get.createDatumWriter(schema))
[error]
```

https://amplab.cs.berkeley.edu/jenkins/view/Spark%20QA%20Test%20(Dashboard)/job/spark-master-test-sbt-hadoop-2.6/2770/consoleFull

## How was this patch tested?

I tried many ways but I was unable to reproduce this locally. Sean also tried the way I did, but he was also unable to reproduce it.

Please refer to the comments in #17477 (comment)

Author: hyukjinkwon <[email protected]>

Closes #17642 from HyukjinKwon/SPARK-20343.
ghost pushed a commit to dbtsai/spark that referenced this pull request Apr 18, 2017
…ailure in SBT Hadoop 2.6 master on Jenkins

## What changes were proposed in this pull request?

This PR proposes to force Avro's version to 1.7.7 in core to resolve the build failure as below:

```
[error] /home/jenkins/workspace/spark-master-test-sbt-hadoop-2.6/core/src/main/scala/org/apache/spark/serializer/GenericAvroSerializer.scala:123: value createDatumWriter is not a member of org.apache.avro.generic.GenericData
[error]     writerCache.getOrElseUpdate(schema, GenericData.get.createDatumWriter(schema))
[error]
```

https://amplab.cs.berkeley.edu/jenkins/view/Spark%20QA%20Test%20(Dashboard)/job/spark-master-test-sbt-hadoop-2.6/2770/consoleFull

Note that this is a hack and should be removed in the future.

## How was this patch tested?

I only tested that this actually overrides the dependency.

I tried many ways but I was unable to reproduce this locally. Sean also tried the way I did, but he was also unable to reproduce it.

Please refer to the comments in apache#17477 (comment)

Author: hyukjinkwon <[email protected]>

Closes apache#17651 from HyukjinKwon/SPARK-20343-sbt.
asfgit pushed a commit that referenced this pull request Apr 19, 2017
…tly set in SBT build

## What changes were proposed in this pull request?

This PR proposes two things as below:

- Avoid Unidoc build only if Hadoop 2.6 is explicitly set in SBT build

  Due to a difference in dependency resolution between SBT & Unidoc, for an unknown reason, the documentation build fails on a specific machine & environment in Jenkins, but it could not be reproduced.

  So, this PR just checks an environment variable, `AMPLAB_JENKINS_BUILD_PROFILE`, that is set in the Hadoop 2.6 SBT builds against branches on Jenkins, and then disables the Unidoc build. **Note that the PR builder will still build it with Hadoop 2.6 & SBT.**

  ```
  ========================================================================
  Building Unidoc API Documentation
  ========================================================================
  [info] Building Spark unidoc (w/Hive 1.2.1) using SBT with these arguments:  -Phadoop-2.6 -Pmesos -Pkinesis-asl -Pyarn -Phive-thriftserver -Phive unidoc
  Using /usr/java/jdk1.8.0_60 as default JAVA_HOME.
  ...
  ```

  I checked the environment variables from the logs (first bit) as below:

  - **spark-master-test-sbt-hadoop-2.6** (this one is being failed) - https://amplab.cs.berkeley.edu/jenkins/view/Spark%20QA%20Test%20(Dashboard)/job/spark-master-test-sbt-hadoop-2.6/lastBuild/consoleFull

  ```
  JAVA_HOME=/usr/java/jdk1.8.0_60
  JAVA_7_HOME=/usr/java/jdk1.7.0_79
  SPARK_BRANCH=master
  AMPLAB_JENKINS_BUILD_PROFILE=hadoop2.6   <- I use this variable
  AMPLAB_JENKINS="true"
  ```
  - spark-master-test-sbt-hadoop-2.7 - https://amplab.cs.berkeley.edu/jenkins/view/Spark%20QA%20Test%20(Dashboard)/job/spark-master-test-sbt-hadoop-2.7/lastBuild/consoleFull

  ```
  JAVA_HOME=/usr/java/jdk1.8.0_60
  JAVA_7_HOME=/usr/java/jdk1.7.0_79
  SPARK_BRANCH=master
  AMPLAB_JENKINS_BUILD_PROFILE=hadoop2.7
  AMPLAB_JENKINS="true"
  ```

  - spark-master-test-maven-hadoop-2.6 - https://amplab.cs.berkeley.edu/jenkins/view/Spark%20QA%20Test%20(Dashboard)/job/spark-master-test-maven-hadoop-2.6/lastBuild/consoleFull

  ```
  JAVA_HOME=/usr/java/jdk1.8.0_60
  JAVA_7_HOME=/usr/java/jdk1.7.0_79
  HADOOP_PROFILE=hadoop-2.6
  HADOOP_VERSION=
  SPARK_BRANCH=master
  AMPLAB_JENKINS="true"
  ```

  - spark-master-test-maven-hadoop-2.7 - https://amplab.cs.berkeley.edu/jenkins/view/Spark%20QA%20Test%20(Dashboard)/job/spark-master-test-maven-hadoop-2.7/lastBuild/consoleFull

  ```
  JAVA_HOME=/usr/java/jdk1.8.0_60
  JAVA_7_HOME=/usr/java/jdk1.7.0_79
  HADOOP_PROFILE=hadoop-2.7
  HADOOP_VERSION=
  SPARK_BRANCH=master
  AMPLAB_JENKINS="true"
  ```

  - PR builder - https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/75843/consoleFull

  ```
  JENKINS_MASTER_HOSTNAME=amp-jenkins-master
  JAVA_HOME=/usr/java/jdk1.8.0_60
  JAVA_7_HOME=/usr/java/jdk1.7.0_79
  ```

  Judging from other logs in branch-2.1:

    - SBT & Hadoop 2.6 against branch-2.1 https://amplab.cs.berkeley.edu/jenkins/view/Spark%20QA%20Test%20(Dashboard)/job/spark-branch-2.1-test-sbt-hadoop-2.6/lastBuild/consoleFull

      ```
      JAVA_HOME=/usr/java/jdk1.8.0_60
      JAVA_7_HOME=/usr/java/jdk1.7.0_79
      SPARK_BRANCH=branch-2.1
      AMPLAB_JENKINS_BUILD_PROFILE=hadoop2.6
      AMPLAB_JENKINS="true"
      ```

    - Maven & Hadoop 2.6 against branch-2.1 https://amplab.cs.berkeley.edu/jenkins/view/Spark%20QA%20Test%20(Dashboard)/job/spark-branch-2.1-test-maven-hadoop-2.6/lastBuild/consoleFull

      ```
      JAVA_HOME=/usr/java/jdk1.8.0_60
      JAVA_7_HOME=/usr/java/jdk1.7.0_79
      HADOOP_PROFILE=hadoop-2.6
      HADOOP_VERSION=
      SPARK_BRANCH=branch-2.1
      AMPLAB_JENKINS="true"
      ```

  We have been using the same convention for those variables. These are actually used in the `run-tests.py` script, here: https://github.com/apache/spark/blob/master/dev/run-tests.py#L519-L520

- Revert the previous try

  After #17651, it seems the build still fails on SBT Hadoop 2.6 master.

  I am unable to reproduce this - #17477 (comment) - and the reviewer was too. So, this got merged, as it seems the only way to verify this currently is to merge it (since no one seems able to reproduce it).

## How was this patch tested?

I only checked that `is_hadoop_version_2_6 = os.environ.get("AMPLAB_JENKINS_BUILD_PROFILE") == "hadoop2.6"` works as expected, as below:

```python
>>> import collections
>>> os = collections.namedtuple('os', 'environ')(environ={"AMPLAB_JENKINS_BUILD_PROFILE": "hadoop2.6"})
>>> print(not os.environ.get("AMPLAB_JENKINS_BUILD_PROFILE") == "hadoop2.6")
False
>>> os = collections.namedtuple('os', 'environ')(environ={"AMPLAB_JENKINS_BUILD_PROFILE": "hadoop2.7"})
>>> print(not os.environ.get("AMPLAB_JENKINS_BUILD_PROFILE") == "hadoop2.6")
True
>>> os = collections.namedtuple('os', 'environ')(environ={})
>>> print(not os.environ.get("AMPLAB_JENKINS_BUILD_PROFILE") == "hadoop2.6")
True
```

I tried many ways but I was unable to reproduce this locally. Sean also tried the way I did, but he was also unable to reproduce it.

Please refer to the comments in #17477 (comment)

Author: hyukjinkwon <[email protected]>

Closes #17669 from HyukjinKwon/revert-SPARK-20343.
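For context, a minimal sketch of how such a guard gates the unidoc step in `dev/run-tests.py` (`build_spark_unidoc_sbt` and `hadoop_version` are assumed from the surrounding script):

```python
import os

# Jenkins' SBT Hadoop 2.6 branch builds export this variable (see the logs
# quoted above); the PR builder does not, so it still runs unidoc there.
is_hadoop_version_2_6 = os.environ.get("AMPLAB_JENKINS_BUILD_PROFILE") == "hadoop2.6"

if not is_hadoop_version_2_6:
    # Skip unidoc only in the one environment that fails for unknown reasons.
    build_spark_unidoc_sbt(hadoop_version)
```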
asfgit pushed a commit that referenced this pull request Apr 19, 2017
…tly set in SBT build


(cherry picked from commit 3537876)
Signed-off-by: Sean Owen <[email protected]>
@HyukjinKwon deleted the SPARK-18692 branch January 2, 2018 03:38
peter-toth pushed a commit to peter-toth/spark that referenced this pull request Oct 6, 2018
Author: hyukjinkwon <[email protected]>

Closes apache#17477 from HyukjinKwon/SPARK-18692.