
[SPARK-18692][BUILD][DOCS] Test Java 8 unidoc build on Jenkins #17477

Closed
wants to merge 3 commits
10 changes: 5 additions & 5 deletions core/src/main/scala/org/apache/spark/rpc/RpcEndpoint.scala
@@ -35,7 +35,7 @@ private[spark] trait RpcEnvFactory {
*
* The life-cycle of an endpoint is:
*
* constructor -> onStart -> receive* -> onStop
* {@code constructor -> onStart -> receive* -> onStop}
@HyukjinKwon (Member Author):
After this, it produces the documentation as below (manually tested)

Scaladoc: (screenshot of the rendered output)

Javadoc: (screenshot of the rendered output)

This also does not seem to be exposed in the API documentation anyway.
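For reference, here is a minimal sketch (not part of this PR; the trait and method names are made up) of the comment style being adopted: the arrow notation is wrapped in a Javadoc inline tag so that Javadoc 8's stricter doc-comment checking does not flag the bare `>` characters, while Scaladoc shows the tag text as in the screenshot above.

```scala
/**
 * A hypothetical endpoint, used only to illustrate the comment style.
 *
 * The life-cycle of an endpoint is:
 *
 * {@code constructor -> onStart -> receive* -> onStop}
 */
trait ExampleEndpoint {
  /** Invoked before the endpoint starts handling messages. */
  def onStart(): Unit = {}

  /** Invoked when the endpoint is stopped. */
  def onStop(): Unit = {}
}
```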

*
* Note: `receive` can be called concurrently. If you want `receive` to be thread-safe, please use
* [[ThreadSafeRpcEndpoint]]
@@ -63,16 +63,16 @@ private[spark] trait RpcEndpoint {
}

/**
* Process messages from [[RpcEndpointRef.send]] or [[RpcCallContext.reply)]]. If receiving a
* unmatched message, [[SparkException]] will be thrown and sent to `onError`.
* Process messages from `RpcEndpointRef.send` or `RpcCallContext.reply`. If receiving a
* unmatched message, `SparkException` will be thrown and sent to `onError`.
*/
def receive: PartialFunction[Any, Unit] = {
case _ => throw new SparkException(self + " does not implement 'receive'")
}

/**
* Process messages from [[RpcEndpointRef.ask]]. If receiving a unmatched message,
* [[SparkException]] will be thrown and sent to `onError`.
* Process messages from `RpcEndpointRef.ask`. If receiving a unmatched message,
* `SparkException` will be thrown and sent to `onError`.
*/
def receiveAndReply(context: RpcCallContext): PartialFunction[Any, Unit] = {
case _ => context.sendFailure(new SparkException(self + " won't reply anything"))
2 changes: 1 addition & 1 deletion core/src/main/scala/org/apache/spark/rpc/RpcTimeout.scala
@@ -26,7 +26,7 @@ import org.apache.spark.SparkConf
import org.apache.spark.util.{ThreadUtils, Utils}

/**
* An exception thrown if RpcTimeout modifies a [[TimeoutException]].
* An exception thrown if RpcTimeout modifies a `TimeoutException`.
*/
private[rpc] class RpcTimeoutException(message: String, cause: TimeoutException)
extends TimeoutException(message) { initCause(cause) }
@@ -607,7 +607,7 @@ class DAGScheduler(
* @param resultHandler callback to pass each result to
* @param properties scheduler properties to attach to this job, e.g. fair scheduler pool name
*
* @throws Exception when the job fails
* @note Throws `Exception` when the job fails
*/
def runJob[T, U](
rdd: RDD[T],
@@ -644,7 +644,7 @@ class DAGScheduler(
*
* @param rdd target RDD to run tasks on
* @param func a function to run on each partition of the RDD
* @param evaluator [[ApproximateEvaluator]] to receive the partial results
* @param evaluator `ApproximateEvaluator` to receive the partial results
* @param callSite where in the user program this job was called
* @param timeout maximum time to wait for the job, in milliseconds
* @param properties scheduler properties to attach to this job, e.g. fair scheduler pool name
@@ -42,7 +42,7 @@ private[spark] trait ExternalClusterManager {

/**
* Create a scheduler backend for the given SparkContext and scheduler. This is
* called after task scheduler is created using [[ExternalClusterManager.createTaskScheduler()]].
* called after task scheduler is created using `ExternalClusterManager.createTaskScheduler()`.
* @param sc SparkContext
* @param masterURL the master URL
* @param scheduler TaskScheduler that will be used with the scheduler backend.
@@ -38,7 +38,7 @@ import org.apache.spark.util.{AccumulatorV2, ThreadUtils, Utils}

/**
* Schedules tasks for multiple types of clusters by acting through a SchedulerBackend.
* It can also work with a local setup by using a [[LocalSchedulerBackend]] and setting
* It can also work with a local setup by using a `LocalSchedulerBackend` and setting
* isLocal to true. It handles common logic, like determining a scheduling order across jobs, waking
* up to launch speculative tasks, etc.
*
@@ -704,12 +704,12 @@ private[spark] object TaskSchedulerImpl {
* Used to balance containers across hosts.
*
* Accepts a map of hosts to resource offers for that host, and returns a prioritized list of
* resource offers representing the order in which the offers should be used. The resource
* resource offers representing the order in which the offers should be used. The resource
* offers are ordered such that we'll allocate one container on each host before allocating a
* second container on any host, and so on, in order to reduce the damage if a host fails.
*
* For example, given <h1, [o1, o2, o3]>, <h2, [o4]>, <h1, [o5, o6]>, returns
* [o1, o5, o4, 02, o6, o3]
* For example, given {@literal <h1, [o1, o2, o3]>}, {@literal <h2, [o4]>} and
* {@literal <h3, [o5, o6]>}, returns {@literal [o1, o5, o4, o2, o6, o3]}.
@HyukjinKwon (Member Author), Apr 12, 2017:

It seems we can't use @code here when the content includes something like <A ...> (the < A ...> case seems fine). I ran some tests with the comments below:

 * For example, given {@code < h1, [o1, o2, o3] >}, {@code < h2, [o4]>} and {@code <h3, [o5, o6]>},
 * returns {@code [o1, o5, o4, o2, o6, o3]}.
 *
 * For example, given
 *
 * {@code <h1, [o1, o2, o3]>},
 *
 * {@code <h2, [o4]} and
 *
 * {@code h3, [o5, o6]>},
 *
 * returns {@code [o1, o5, o4, o2, o6, o3]}.

Scaladoc: (screenshot of the rendered output)

Javadoc: (screenshot of the rendered output)

If we use @literal, it seems fine.

Scaladoc: (screenshot of the rendered output)

Javadoc: (screenshot of the rendered output)

This does not seem to be exposed in the API documentation anyway.

*/
def prioritizeContainers[K, T] (map: HashMap[K, ArrayBuffer[T]]): List[T] = {
val _keyList = new ArrayBuffer[K](map.size)
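To make the round-robin ordering described in the doc comment above concrete, here is a self-contained sketch (illustrative only, not Spark's actual `TaskSchedulerImpl.prioritizeContainers`; the object and method names are made up), with its doc comment written in the `{@literal}` style this change adopts. The order across hosts follows the map's iteration order.

```scala
import scala.collection.mutable.{ArrayBuffer, HashMap}

object OfferInterleaving {
  /**
   * Takes one offer from each host before taking a second offer from any host.
   *
   * For example, given {@literal <h1, [o1, o2, o3]>}, {@literal <h2, [o4]>} and
   * {@literal <h3, [o5, o6]>}, one possible result is {@literal [o1, o4, o5, o2, o6, o3]}.
   */
  def interleaveOffers[K, T](offers: HashMap[K, ArrayBuffer[T]]): List[T] = {
    // One iterator per host; each round takes at most one offer from each.
    val iterators = offers.values.toList.map(_.iterator)
    val result = new ArrayBuffer[T]
    var tookOne = true
    while (tookOne) {
      tookOne = false
      iterators.foreach { it =>
        if (it.hasNext) {
          result += it.next()
          tookOne = true
        }
      }
    }
    result.toList
  }
}
```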
@@ -66,7 +66,7 @@ private[spark] trait BlockData {
/**
* Returns a Netty-friendly wrapper for the block's data.
*
* @see [[ManagedBuffer#convertToNetty()]]
* Please see `ManagedBuffer.convertToNetty()` for more details.
*/
def toNetty(): Object

4 changes: 2 additions & 2 deletions core/src/test/scala/org/apache/spark/AccumulatorSuite.scala
@@ -243,7 +243,7 @@ private[spark] object AccumulatorSuite {
import InternalAccumulator._

/**
* Create a long accumulator and register it to [[AccumulatorContext]].
* Create a long accumulator and register it to `AccumulatorContext`.
*/
def createLongAccum(
name: String,
@@ -258,7 +258,7 @@ private[spark] object AccumulatorSuite {
}

/**
* Make an [[AccumulableInfo]] out of an [[Accumulable]] with the intent to use the
* Make an `AccumulableInfo` out of an [[Accumulable]] with the intent to use the
* info as an accumulator update.
*/
def makeInfo(a: AccumulatorV2[_, _]): AccumulableInfo = a.toInfo(Some(a.value), None)
@@ -27,7 +27,7 @@ import org.apache.spark.network.shuffle.{ExternalShuffleBlockHandler, ExternalSh
/**
* This suite creates an external shuffle server and routes all shuffle fetches through it.
* Note that failures in this suite may arise due to changes in Spark that invalidate expectations
* set up in [[ExternalShuffleBlockHandler]], such as changing the format of shuffle files or how
* set up in `ExternalShuffleBlockHandler`, such as changing the format of shuffle files or how
* we hash files into folders.
*/
class ExternalShuffleServiceSuite extends ShuffleSuite with BeforeAndAfterAll {
@@ -22,7 +22,7 @@ import org.scalatest.BeforeAndAfterAll
import org.scalatest.BeforeAndAfterEach
import org.scalatest.Suite

/** Manages a local `sc` {@link SparkContext} variable, correctly stopping it after each test. */
/** Manages a local `sc` `SparkContext` variable, correctly stopping it after each test. */
trait LocalSparkContext extends BeforeAndAfterEach with BeforeAndAfterAll { self: Suite =>

@transient var sc: SparkContext = _
@@ -95,12 +95,12 @@ abstract class SchedulerIntegrationSuite[T <: MockBackend: ClassTag] extends Spa
}

/**
* A map from partition -> results for all tasks of a job when you call this test framework's
* A map from partition to results for all tasks of a job when you call this test framework's
* [[submit]] method. Two important considerations:
*
* 1. If there is a job failure, results may or may not be empty. If any tasks succeed before
* the job has failed, they will get included in `results`. Instead, check for job failure by
* checking [[failure]]. (Also see [[assertDataStructuresEmpty()]])
* checking [[failure]]. (Also see `assertDataStructuresEmpty()`)
*
* 2. This only gets cleared between tests. So you'll need to do special handling if you submit
* more than one job in one test.
@@ -29,7 +29,7 @@ import org.apache.spark.serializer.KryoTest.RegistratorWithoutAutoReset
/**
* Tests to ensure that [[Serializer]] implementations obey the API contracts for methods that
* describe properties of the serialized stream, such as
* [[Serializer.supportsRelocationOfSerializedObjects]].
* `Serializer.supportsRelocationOfSerializedObjects`.
*/
class SerializerPropertiesSuite extends SparkFunSuite {

15 changes: 15 additions & 0 deletions dev/run-tests.py
@@ -344,6 +344,19 @@ def build_spark_sbt(hadoop_version):
exec_sbt(profiles_and_goals)


def build_spark_unidoc_sbt(hadoop_version):
set_title_and_block("Building Unidoc API Documentation", "BLOCK_DOCUMENTATION")
# Enable all of the profiles for the build:
build_profiles = get_hadoop_profiles(hadoop_version) + modules.root.build_profile_flags
sbt_goals = ["unidoc"]
profiles_and_goals = build_profiles + sbt_goals

print("[info] Building Spark unidoc (w/Hive 1.2.1) using SBT with these arguments: ",
" ".join(profiles_and_goals))

exec_sbt(profiles_and_goals)


def build_spark_assembly_sbt(hadoop_version):
# Enable all of the profiles for the build:
build_profiles = get_hadoop_profiles(hadoop_version) + modules.root.build_profile_flags
@@ -352,6 +365,8 @@ def build_spark_assembly_sbt(hadoop_version):
print("[info] Building Spark assembly (w/Hive 1.2.1) using SBT with these arguments: ",
" ".join(profiles_and_goals))
exec_sbt(profiles_and_goals)
# Make sure that Java and Scala API documentation can be generated
build_spark_unidoc_sbt(hadoop_version)


def build_apache_spark(build_tool, hadoop_version):
@@ -21,7 +21,7 @@ import org.apache.spark.SparkConf
import org.apache.spark.SparkContext

/**
* Provides a method to run tests against a {@link SparkContext} variable that is correctly stopped
* Provides a method to run tests against a `SparkContext` variable that is correctly stopped
* after each test.
*/
trait LocalSparkContext {
@@ -74,7 +74,7 @@ abstract class Classifier[
* and features (`Vector`).
* @param numClasses Number of classes label can take. Labels must be integers in the range
* [0, numClasses).
* @throws SparkException if any label is not an integer >= 0
* @note Throws `SparkException` if any label is a non-integer or is negative
*/
protected def extractLabeledPoints(dataset: Dataset[_], numClasses: Int): RDD[LabeledPoint] = {
require(numClasses > 0, s"Classifier (in extractLabeledPoints) found numClasses =" +
8 changes: 6 additions & 2 deletions mllib/src/test/scala/org/apache/spark/ml/PipelineSuite.scala
@@ -230,7 +230,9 @@ class PipelineSuite extends SparkFunSuite with MLlibTestSparkContext with Defaul
}


/** Used to test [[Pipeline]] with [[MLWritable]] stages */
/**
* Used to test [[Pipeline]] with `MLWritable` stages
@HyukjinKwon (Member Author):

We should avoid inlined doc comments when they contain code blocks (` ... `). See #16050.
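A minimal sketch of the convention referenced here (illustrative only, not taken from #16050; the class name is made up): when a doc comment contains backticked code, write it in the multi-line form rather than on a single line.

```scala
// Single-line form to avoid when the text contains backticked code:
//
//   /** Used to test [[Pipeline]] with `MLWritable` stages */
//
// Multi-line form used instead:

/**
 * Used to test a hypothetical pipeline with `MLWritable` stages.
 */
class ExampleWritableStage
```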

*/
class WritableStage(override val uid: String) extends Transformer with MLWritable {

final val intParam: IntParam = new IntParam(this, "intParam", "doc")
@@ -257,7 +259,9 @@ object WritableStage extends MLReadable[WritableStage] {
override def load(path: String): WritableStage = super.load(path)
}

/** Used to test [[Pipeline]] with non-[[MLWritable]] stages */
/**
* Used to test [[Pipeline]] with non-`MLWritable` stages
*/
class UnWritableStage(override val uid: String) extends Transformer {

final val intParam: IntParam = new IntParam(this, "intParam", "doc")
12 changes: 8 additions & 4 deletions mllib/src/test/scala/org/apache/spark/ml/feature/LSHTest.scala
@@ -29,17 +29,21 @@ private[ml] object LSHTest {
* the following property is satisfied.
*
* There exist dist1, dist2, p1, p2, so that for any two elements e1 and e2,
* If dist(e1, e2) <= dist1, then Pr{h(x) == h(y)} >= p1
* If dist(e1, e2) >= dist2, then Pr{h(x) == h(y)} <= p2
* If dist(e1, e2) is less than or equal to dist1, then Pr{h(x) == h(y)} is greater than
* or equal to p1
* If dist(e1, e2) is greater than or equal to dist2, then Pr{h(x) == h(y)} is less than
* or equal to p2
*
* This is called locality sensitive property. This method checks the property on an
* existing dataset and calculate the probabilities.
* (https://en.wikipedia.org/wiki/Locality-sensitive_hashing#Definition)
*
* This method hashes each elements to hash buckets using LSH, and calculate the false positive
* and false negative:
* False positive: Of all (e1, e2) sharing any bucket, the probability of dist(e1, e2) > distFP
* False negative: Of all (e1, e2) not sharing buckets, the probability of dist(e1, e2) < distFN
* False positive: Of all (e1, e2) sharing any bucket, the probability of dist(e1, e2) is greater
* than distFP
* False negative: Of all (e1, e2) not sharing buckets, the probability of dist(e1, e2) is less
* than distFN
*
* @param dataset The dataset to verify the locality sensitive hashing property.
* @param lsh The lsh instance to perform the hashing
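Restating the locality-sensitive property spelled out in the comment above in standard notation (a paraphrase of that comment, with dist as the distance measure and h as the LSH hash function):

```latex
\begin{aligned}
&\exists\ \mathit{dist}_1,\ \mathit{dist}_2,\ p_1,\ p_2 \ \text{such that for any two elements } e_1, e_2: \\
&\quad \mathit{dist}(e_1, e_2) \le \mathit{dist}_1 \;\Rightarrow\; \Pr[h(e_1) = h(e_2)] \ge p_1, \\
&\quad \mathit{dist}(e_1, e_2) \ge \mathit{dist}_2 \;\Rightarrow\; \Pr[h(e_1) = h(e_2)] \le p_2.
\end{aligned}
```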
@@ -377,7 +377,7 @@ class ParamsSuite extends SparkFunSuite {
object ParamsSuite extends SparkFunSuite {

/**
* Checks common requirements for [[Params.params]]:
* Checks common requirements for `Params.params`:
* - params are ordered by names
* - param parent has the same UID as the object's UID
* - param name is the same as the param method name
@@ -34,7 +34,7 @@ private[ml] object TreeTests extends SparkFunSuite {
* Convert the given data to a DataFrame, and set the features and label metadata.
* @param data Dataset. Categorical features and labels must already have 0-based indices.
* This must be non-empty.
* @param categoricalFeatures Map: categorical feature index -> number of distinct values
* @param categoricalFeatures Map: categorical feature index to number of distinct values
* @param numClasses Number of classes label can take. If 0, mark as continuous.
* @return DataFrame with metadata
*/
@@ -69,7 +69,9 @@ private[ml] object TreeTests extends SparkFunSuite {
df("label").as("label", labelMetadata))
}

/** Java-friendly version of [[setMetadata()]] */
/**
* Java-friendly version of `setMetadata()`
*/
def setMetadata(
data: JavaRDD[LabeledPoint],
categoricalFeatures: java.util.Map[java.lang.Integer, java.lang.Integer],
@@ -81,20 +81,20 @@ trait DefaultReadWriteTest extends TempDirectory { self: Suite =>
/**
* Default test for Estimator, Model pairs:
* - Explicitly set Params, and train model
* - Test save/load using [[testDefaultReadWrite()]] on Estimator and Model
* - Test save/load using `testDefaultReadWrite` on Estimator and Model
* - Check Params on Estimator and Model
* - Compare model data
*
* This requires that [[Model]]'s [[Param]]s should be a subset of [[Estimator]]'s [[Param]]s.
* This requires that `Model`'s `Param`s should be a subset of `Estimator`'s `Param`s.
*
* @param estimator Estimator to test
* @param dataset Dataset to pass to [[Estimator.fit()]]
* @param testEstimatorParams Set of [[Param]] values to set in estimator
* @param testModelParams Set of [[Param]] values to set in model
* @param checkModelData Method which takes the original and loaded [[Model]] and compares their
* data. This method does not need to check [[Param]] values.
* @tparam E Type of [[Estimator]]
* @tparam M Type of [[Model]] produced by estimator
* @param dataset Dataset to pass to `Estimator.fit()`
* @param testEstimatorParams Set of `Param` values to set in estimator
* @param testModelParams Set of `Param` values to set in model
* @param checkModelData Method which takes the original and loaded `Model` and compares their
* data. This method does not need to check `Param` values.
* @tparam E Type of `Estimator`
* @tparam M Type of `Model` produced by estimator
*/
def testEstimatorAndModelReadWrite[
E <: Estimator[M] with MLWritable, M <: Model[M] with MLWritable](
@@ -105,8 +105,8 @@ class StopwatchSuite extends SparkFunSuite with MLlibTestSparkContext {
private object StopwatchSuite extends SparkFunSuite {

/**
* Checks the input stopwatch on a task that takes a random time (<10ms) to finish. Validates and
* returns the duration reported by the stopwatch.
* Checks the input stopwatch on a task that takes a random time (less than 10ms) to finish.
* Validates and returns the duration reported by the stopwatch.
*/
def checkStopwatch(sw: Stopwatch): Long = {
val ubStart = now
@@ -30,7 +30,9 @@ trait TempDirectory extends BeforeAndAfterAll { self: Suite =>

private var _tempDir: File = _

/** Returns the temporary directory as a [[File]] instance. */
/**
* Returns the temporary directory as a `File` instance.
*/
protected def tempDir: File = _tempDir

override def beforeAll(): Unit = {
@@ -21,7 +21,7 @@ import org.apache.spark.SparkFunSuite
import org.apache.spark.mllib.tree.impurity.{EntropyAggregator, GiniAggregator}

/**
* Test suites for [[GiniAggregator]] and [[EntropyAggregator]].
* Test suites for `GiniAggregator` and `EntropyAggregator`.
*/
class ImpuritySuite extends SparkFunSuite {
test("Gini impurity does not support negative labels") {
@@ -60,7 +60,7 @@ trait MLlibTestSparkContext extends TempDirectory { self: Suite =>
* A helper object for importing SQL implicits.
*
* Note that the alternative of importing `spark.implicits._` is not possible here.
* This is because we create the [[SQLContext]] immediately before the first test is run,
* This is because we create the `SQLContext` immediately before the first test is run,
* but the implicits import is needed in the constructor.
*/
protected object testImplicits extends SQLImplicits {
@@ -239,7 +239,7 @@ trait MesosSchedulerUtils extends Logging {
}

/**
* Converts the attributes from the resource offer into a Map of name -> Attribute Value
* Converts the attributes from the resource offer into a Map of name to Attribute Value
* The attribute values are the mesos attribute types and they are
*
* @param offerAttributes the attributes offered
Expand Down Expand Up @@ -296,7 +296,7 @@ trait MesosSchedulerUtils extends Logging {

/**
* Parses the attributes constraints provided to spark and build a matching data struct:
* Map[<attribute-name>, Set[values-to-match]]
* {@literal Map[<attribute-name>, Set[values-to-match]}
* The constraints are specified as ';' separated key-value pairs where keys and values
* are separated by ':'. The ':' implies equality (for singular values) and "is one of" for
* multiple values (comma separated). For example:
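The example in the original comment is cut off by the collapsed diff above. Purely as an illustration of the format just described (a hypothetical helper, not Spark's actual parsing code), a constraint string could be turned into the Map[<attribute-name>, Set[values-to-match]] structure like this:

```scala
object ConstraintParsing {
  /**
   * Parses a constraint string such as {@literal "os:centos7;zone:us-east-1a,us-east-1b"}
   * into a map from attribute name to the set of acceptable values.
   * Assumes every ';'-separated pair contains a ':' separator.
   */
  def parseConstraints(constraints: String): Map[String, Set[String]] = {
    if (constraints == null || constraints.isEmpty) {
      Map.empty
    } else {
      constraints.split(";").map { pair =>
        val Array(key, values) = pair.split(":", 2)
        key -> values.split(",").toSet
      }.toMap
    }
  }
}
```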
@@ -354,7 +354,7 @@ trait MesosSchedulerUtils extends Logging {
* container overheads.
*
* @param sc SparkContext to use to get `spark.mesos.executor.memoryOverhead` value
* @return memory requirement as (0.1 * <memoryOverhead>) or MEMORY_OVERHEAD_MINIMUM
* @return memory requirement as (0.1 * memoryOverhead) or MEMORY_OVERHEAD_MINIMUM
* (whichever is larger)
*/
def executorMemory(sc: SparkContext): Int = {
@@ -117,11 +117,11 @@ object RandomDataGenerator {
}

/**
* Returns a function which generates random values for the given [[DataType]], or `None` if no
* Returns a function which generates random values for the given `DataType`, or `None` if no
* random data generator is defined for that data type. The generated values will use an external
* representation of the data type; for example, the random generator for [[DateType]] will return
* instances of [[java.sql.Date]] and the generator for [[StructType]] will return a [[Row]].
* For a [[UserDefinedType]] for a class X, an instance of class X is returned.
* representation of the data type; for example, the random generator for `DateType` will return
* instances of [[java.sql.Date]] and the generator for `StructType` will return a [[Row]].
* For a `UserDefinedType` for a class X, an instance of class X is returned.
*
* @param dataType the type to generate values for
* @param nullable whether null values should be generated