[SPARK-3359][BUILD][DOCS] More changes to resolve javadoc 8 errors that will help unidoc/genjavadoc compatibility #15999

Closed · wants to merge 11 commits
4 changes: 2 additions & 2 deletions core/src/main/scala/org/apache/spark/SSLOptions.scala
@@ -150,8 +150,8 @@ private[spark] object SSLOptions extends Logging {
* $ - `[ns].enabledAlgorithms` - a comma separated list of ciphers
*
* For a list of protocols and ciphers supported by particular Java versions, you may go to
* [[https://blogs.oracle.com/java-platform-group/entry/diagnosing_tls_ssl_and_https Oracle
* blog page]].
* <a href="https://blogs.oracle.com/java-platform-group/entry/diagnosing_tls_ssl_and_https">
* Oracle blog page</a>.
Comment from the PR author:
For hyperlinks, it seems these forms are fine (try it):

<a href="https://.../...">link ABC</a>
<a href="https://.../...">
link ABC</a>
<a href="https://.../...">link
ABC</a>
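
For illustration only, a minimal Scaladoc sketch of the pattern this PR switches to; the object name and URL below are placeholders, not code from the diff:

/**
 * Options parsed from a config namespace. For supported values, see the
 * <a href="https://example.org/docs/settings">upstream settings guide</a>.
 */
object ExampleOptions {
  val enabled: Boolean = true
}

Unlike the Scaladoc-only [[url text]] form, a plain HTML anchor survives genjavadoc unchanged and keeps javadoc 8 happy, which is why the hunk above rewrites the Oracle blog link this way.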

*
* You can optionally specify the default configuration. If you do, for each setting which is
* missing in SparkConf, the corresponding setting is used from the default configuration.
@@ -405,7 +405,7 @@ class JavaPairRDD[K, V](val rdd: RDD[(K, V)])
* partitioning of the resulting key-value pair RDD by passing a Partitioner.
*
* @note If you are grouping in order to perform an aggregation (such as a sum or average) over
* each key, using [[JavaPairRDD.reduceByKey]] or [[JavaPairRDD.combineByKey]]
* each key, using `JavaPairRDD.reduceByKey` or `JavaPairRDD.combineByKey`
* will provide much better performance.
*/
def groupByKey(partitioner: Partitioner): JavaPairRDD[K, JIterable[V]] =
@@ -416,7 +416,7 @@ class JavaPairRDD[K, V](val rdd: RDD[(K, V)])
* resulting RDD with into `numPartitions` partitions.
*
* @note If you are grouping in order to perform an aggregation (such as a sum or average) over
* each key, using [[JavaPairRDD.reduceByKey]] or [[JavaPairRDD.combineByKey]]
* each key, using `JavaPairRDD.reduceByKey` or `JavaPairRDD.combineByKey`
* will provide much better performance.
*/
def groupByKey(numPartitions: Int): JavaPairRDD[K, JIterable[V]] =
@@ -546,7 +546,7 @@ class JavaPairRDD[K, V](val rdd: RDD[(K, V)])
* resulting RDD with the existing partitioner/parallelism level.
*
* @note If you are grouping in order to perform an aggregation (such as a sum or average) over
* each key, using [[JavaPairRDD.reduceByKey]] or [[JavaPairRDD.combineByKey]]
* each key, using `JavaPairRDD.reduceByKey` or `JavaPairRDD.combineByKey`
* will provide much better performance.
*/
def groupByKey(): JavaPairRDD[K, JIterable[V]] =
10 changes: 5 additions & 5 deletions core/src/main/scala/org/apache/spark/api/java/JavaRDD.scala
@@ -103,10 +103,10 @@ class JavaRDD[T](val rdd: RDD[T])(implicit val classTag: ClassTag[T])
* @param withReplacement can elements be sampled multiple times (replaced when sampled out)
* @param fraction expected size of the sample as a fraction of this RDD's size
* without replacement: probability that each element is chosen; fraction must be [0, 1]
* with replacement: expected number of times each element is chosen; fraction must be >= 0
* with replacement: expected number of times each element is chosen; fraction must be &gt;= 0
*
* @note This is NOT guaranteed to provide exactly the fraction of the count
* of the given [[RDD]].
* of the given `RDD`.
*/
def sample(withReplacement: Boolean, fraction: Double): JavaRDD[T] =
sample(withReplacement, fraction, Utils.random.nextLong)
@@ -117,11 +117,11 @@ class JavaRDD[T](val rdd: RDD[T])(implicit val classTag: ClassTag[T])
* @param withReplacement can elements be sampled multiple times (replaced when sampled out)
* @param fraction expected size of the sample as a fraction of this RDD's size
* without replacement: probability that each element is chosen; fraction must be [0, 1]
* with replacement: expected number of times each element is chosen; fraction must be >= 0
* with replacement: expected number of times each element is chosen; fraction must be &gt;= 0
* @param seed seed for the random number generator
*
* @note This is NOT guaranteed to provide exactly the fraction of the count
* of the given [[RDD]].
* of the given `RDD`.
*/
def sample(withReplacement: Boolean, fraction: Double, seed: Long): JavaRDD[T] =
wrapRDD(rdd.sample(withReplacement, fraction, seed))
@@ -167,7 +167,7 @@ class JavaRDD[T](val rdd: RDD[T])(implicit val classTag: ClassTag[T])
* Return an RDD with the elements from `this` that are not in `other`.
*
* Uses `this` partitioner/partition size, because even if `other` is huge, the resulting
* RDD will be <= us.
* RDD will be &lt;= us.
*/
def subtract(other: JavaRDD[T]): JavaRDD[T] = wrapRDD(rdd.subtract(other))

@@ -238,7 +238,9 @@ class JavaSparkContext(val sc: SparkContext)
* }}}
*
* Do
* `JavaPairRDD<String, byte[]> rdd = sparkContext.dataStreamFiles("hdfs://a-hdfs-path")`,
* <code>
* JavaPairRDD&lt;String, byte[]&gt; rdd = sparkContext.dataStreamFiles("hdfs://a-hdfs-path")
* </code>,
Comment from the PR author:
`JavaPairRDD&lt;String, byte[]&gt; rdd = sparkContext.dataStreamFiles("hdfs://a-hdfs-path")` prints

JavaPairRDD&lt;String, byte[]&gt; rdd = ... in scaladoc

JavaPairRDD<String, byte[]> rdd = ... in javadoc

So, I had to use <code>...</code> instead.

Follow-up comment from the PR author:
Note to myself: it still prints the code as above.

If we want to use < or >, we might be better off always wrapping the snippet with

{{{
...
}}}

rather than backticks or <code>...</code>.
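
To make that concrete, a hedged sketch on a hypothetical method (not the real dataStreamFiles signature); the expectation is that a {{{ }}} block comes through genjavadoc as a code block, so < and > need no manual escaping in either scaladoc or javadoc:

object DocSketch {
  /**
   * Illustrative only. For example:
   * {{{
   * JavaPairRDD<String, byte[]> rdd = sparkContext.dataStreamFiles("hdfs://a-hdfs-path");
   * }}}
   */
  def dataStreamFiles(path: String): Unit = ???
}

The trade-off is that {{{ }}} always renders as a display block rather than inline code; whether to standardize on it over backticks or <code> is the open question in the comment above.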

*
* then `rdd` contains
* {{{
@@ -270,7 +272,9 @@ class JavaSparkContext(val sc: SparkContext)
* }}}
*
* Do
* `JavaPairRDD<String, byte[]> rdd = sparkContext.dataStreamFiles("hdfs://a-hdfs-path")`,
* <code>
* JavaPairRDD&lt;String, byte[]&gt; rdd = sparkContext.dataStreamFiles("hdfs://a-hdfs-path")
* </code>,
*
* then `rdd` contains
* {{{
@@ -749,7 +753,7 @@ class JavaSparkContext(val sc: SparkContext)

/**
* Get a local property set in this thread, or null if it is missing. See
* [[org.apache.spark.api.java.JavaSparkContext.setLocalProperty]].
* `org.apache.spark.api.java.JavaSparkContext.setLocalProperty`.
*/
def getLocalProperty(key: String): String = sc.getLocalProperty(key)

@@ -769,7 +773,7 @@ class JavaSparkContext(val sc: SparkContext)
* Application programmers can use this method to group all those jobs together and give a
* group description. Once set, the Spark web UI will associate such jobs with this group.
*
* The application can also use [[org.apache.spark.api.java.JavaSparkContext.cancelJobGroup]]
* The application can also use `org.apache.spark.api.java.JavaSparkContext.cancelJobGroup`
* to cancel all running jobs in this group. For example,
* {{{
* // In the main thread:
@@ -802,7 +806,7 @@ class JavaSparkContext(val sc: SparkContext)

/**
* Cancel active jobs for the specified group. See
* [[org.apache.spark.api.java.JavaSparkContext.setJobGroup]] for more information.
* `org.apache.spark.api.java.JavaSparkContext.setJobGroup` for more information.
*/
def cancelJobGroup(groupId: String): Unit = sc.cancelJobGroup(groupId)

@@ -172,7 +172,7 @@ private final object SnappyCompressionCodec {
}

/**
* Wrapper over [[SnappyOutputStream]] which guards against write-after-close and double-close
* Wrapper over `SnappyOutputStream` which guards against write-after-close and double-close
* issues. See SPARK-7660 for more details. This wrapping can be removed if we upgrade to a version
* of snappy-java that contains the fix for https://github.com/xerial/snappy-java/issues/107.
*/
18 changes: 9 additions & 9 deletions core/src/main/scala/org/apache/spark/rdd/RDD.scala
@@ -70,8 +70,8 @@ import org.apache.spark.util.random.{BernoulliCellSampler, BernoulliSampler, Poi
* All of the scheduling and execution in Spark is done based on these methods, allowing each RDD
* to implement its own way of computing itself. Indeed, users can implement custom RDDs (e.g. for
* reading data from a new storage system) by overriding these functions. Please refer to the
* [[http://people.csail.mit.edu/matei/papers/2012/nsdi_spark.pdf Spark paper]] for more details
* on RDD internals.
* <a href="http://people.csail.mit.edu/matei/papers/2012/nsdi_spark.pdf">Spark paper</a>
* for more details on RDD internals.
*/
abstract class RDD[T: ClassTag](
@transient private var _sc: SparkContext,
@@ -469,7 +469,7 @@ abstract class RDD[T: ClassTag](
* @param withReplacement can elements be sampled multiple times (replaced when sampled out)
* @param fraction expected size of the sample as a fraction of this RDD's size
* without replacement: probability that each element is chosen; fraction must be [0, 1]
* with replacement: expected number of times each element is chosen; fraction must be >= 0
* with replacement: expected number of times each element is chosen; fraction must be &gt;= 0
* @param seed seed for the random number generator
*
* @note This is NOT guaranteed to provide exactly the fraction of the count
@@ -675,8 +675,8 @@ abstract class RDD[T: ClassTag](
* may even differ each time the resulting RDD is evaluated.
*
* @note This operation may be very expensive. If you are grouping in order to perform an
* aggregation (such as a sum or average) over each key, using [[PairRDDFunctions.aggregateByKey]]
* or [[PairRDDFunctions.reduceByKey]] will provide much better performance.
* aggregation (such as a sum or average) over each key, using `PairRDDFunctions.aggregateByKey`
* or `PairRDDFunctions.reduceByKey` will provide much better performance.
*/
def groupBy[K](f: T => K)(implicit kt: ClassTag[K]): RDD[(K, Iterable[T])] = withScope {
groupBy[K](f, defaultPartitioner(this))
@@ -688,8 +688,8 @@ abstract class RDD[T: ClassTag](
* may even differ each time the resulting RDD is evaluated.
*
* @note This operation may be very expensive. If you are grouping in order to perform an
* aggregation (such as a sum or average) over each key, using [[PairRDDFunctions.aggregateByKey]]
* or [[PairRDDFunctions.reduceByKey]] will provide much better performance.
* aggregation (such as a sum or average) over each key, using `PairRDDFunctions.aggregateByKey`
* or `PairRDDFunctions.reduceByKey` will provide much better performance.
*/
def groupBy[K](
f: T => K,
@@ -703,8 +703,8 @@ abstract class RDD[T: ClassTag](
* may even differ each time the resulting RDD is evaluated.
*
* @note This operation may be very expensive. If you are grouping in order to perform an
* aggregation (such as a sum or average) over each key, using [[PairRDDFunctions.aggregateByKey]]
* or [[PairRDDFunctions.reduceByKey]] will provide much better performance.
* aggregation (such as a sum or average) over each key, using `PairRDDFunctions.aggregateByKey`
* or `PairRDDFunctions.reduceByKey` will provide much better performance.
*/
def groupBy[K](f: T => K, p: Partitioner)(implicit kt: ClassTag[K], ord: Ordering[K] = null)
: RDD[(K, Iterable[T])] = withScope {
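
As an aside on the note being reworded here, a small Scala sketch (assuming an existing words: RDD[String]; illustrative only) of why reduceByKey or aggregateByKey is preferred for per-key aggregation:

// Per-key count with reduceByKey: partial sums are combined on the map side,
// so only one (word, count) pair per key crosses the shuffle.
val counts = words.map(w => (w, 1)).reduceByKey(_ + _)

// Same result via groupBy: every individual element is shuffled before the
// sizes are taken, which is the expensive pattern the @note warns about.
val countsViaGroupBy = words.groupBy(identity).mapValues(_.size)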
@@ -46,7 +46,7 @@ private[spark] object CryptoStreamUtils extends Logging {
val COMMONS_CRYPTO_CONF_PREFIX = "commons.crypto."

/**
* Helper method to wrap [[OutputStream]] with [[CryptoOutputStream]] for encryption.
* Helper method to wrap `OutputStream` with `CryptoOutputStream` for encryption.
*/
def createCryptoOutputStream(
os: OutputStream,
@@ -62,7 +62,7 @@ private[spark] object CryptoStreamUtils extends Logging {
}

/**
* Helper method to wrap [[InputStream]] with [[CryptoInputStream]] for decryption.
* Helper method to wrap `InputStream` with `CryptoInputStream` for decryption.
*/
def createCryptoInputStream(
is: InputStream,
@@ -43,7 +43,8 @@ import org.apache.spark.util.{BoundedPriorityQueue, SerializableConfiguration, S
import org.apache.spark.util.collection.CompactBuffer

/**
* A Spark serializer that uses the [[https://code.google.com/p/kryo/ Kryo serialization library]].
* A Spark serializer that uses the <a href="https://code.google.com/p/kryo/">
* Kryo serialization library</a>.
*
* @note This serializer is not guaranteed to be wire-compatible across different versions of
* Spark. It is intended to be used to serialize/de-serialize data within a single
@@ -89,17 +89,18 @@ class RandomBlockReplicationPolicy
prioritizedPeers
}

// scalastyle:off line.size.limit
/**
* Uses sampling algorithm by Robert Floyd. Finds a random sample in O(n) while
* minimizing space usage
* [[http://math.stackexchange.com/questions/178690/
* whats-the-proof-of-correctness-for-robert-floyds-algorithm-for-selecting-a-sin]]
* minimizing space usage. Please see <a href="http://math.stackexchange.com/questions/178690/whats-the-proof-of-correctness-for-robert-floyds-algorithm-for-selecting-a-sin">
* here</a>.
*
* @param n total number of indices
* @param m number of samples needed
* @param r random number generator
* @return list of m random unique indices
*/
// scalastyle:on line.size.limit
private def getSampleIds(n: Int, m: Int, r: Random): List[Int] = {
val indices = (n - m + 1 to n).foldLeft(Set.empty[Int]) {case (set, i) =>
val t = r.nextInt(i) + 1
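
Since the rewritten comment only links to the proof, here is a hedged standalone sketch of Robert Floyd's sampling as described there; the real method above is truncated by the diff, and the names below are illustrative:

import scala.util.Random

object FloydSampling {
  // Draw m distinct values from 1..n using O(m) extra space: for each i in
  // (n - m + 1) to n, pick t uniformly from 1..i and keep t unless it was
  // already chosen, in which case keep i instead.
  def sample(n: Int, m: Int, r: Random): Set[Int] =
    (n - m + 1 to n).foldLeft(Set.empty[Int]) { (chosen, i) =>
      val t = r.nextInt(i) + 1
      if (chosen.contains(t)) chosen + i else chosen + t
    }
}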
4 changes: 2 additions & 2 deletions core/src/main/scala/org/apache/spark/ui/UIUtils.scala
@@ -422,8 +422,8 @@ private[spark] object UIUtils extends Logging {
* the whole string will rendered as a simple escaped text.
*
* Note: In terms of security, only anchor tags with root relative links are supported. So any
* attempts to embed links outside Spark UI, or other tags like <script> will cause in the whole
* description to be treated as plain text.
* attempts to embed links outside Spark UI, or other tags like &lt;script&gt; will cause in
* the whole description to be treated as plain text.
*
* @param desc the original job or stage description string, which may contain html tags.
* @param basePathUri with which to prepend the relative links; this is used when plainText is
@@ -224,7 +224,7 @@ private[spark] object AccumulatorContext {
* Registers an [[AccumulatorV2]] created on the driver such that it can be used on the executors.
*
* All accumulators registered here can later be used as a container for accumulating partial
* values across multiple tasks. This is what [[org.apache.spark.scheduler.DAGScheduler]] does.
* values across multiple tasks. This is what `org.apache.spark.scheduler.DAGScheduler` does.
* Note: if an accumulator is registered here, it should also be registered with the active
* context cleaner for cleanup so as to avoid memory leaks.
*
2 changes: 1 addition & 1 deletion core/src/main/scala/org/apache/spark/util/RpcUtils.scala
@@ -23,7 +23,7 @@ import org.apache.spark.rpc.{RpcAddress, RpcEndpointRef, RpcEnv, RpcTimeout}
private[spark] object RpcUtils {

/**
* Retrieve a [[RpcEndpointRef]] which is located in the driver via its name.
* Retrieve a `RpcEndpointRef` which is located in the driver via its name.
*/
def makeDriverRef(name: String, conf: SparkConf, rpcEnv: RpcEnv): RpcEndpointRef = {
val driverHost: String = conf.get("spark.driver.host", "localhost")
4 changes: 2 additions & 2 deletions core/src/main/scala/org/apache/spark/util/StatCounter.scala
@@ -22,8 +22,8 @@ import org.apache.spark.annotation.Since
/**
* A class for tracking the statistics of a set of numbers (count, mean and variance) in a
* numerically robust way. Includes support for merging two StatCounters. Based on Welford
* and Chan's [[http://en.wikipedia.org/wiki/Algorithms_for_calculating_variance algorithms]]
* for running variance.
* and Chan's <a href="http://en.wikipedia.org/wiki/Algorithms_for_calculating_variance">
* algorithms</a> for running variance.
*
* @constructor Initialize the StatCounter with the given values.
*/
6 changes: 3 additions & 3 deletions core/src/main/scala/org/apache/spark/util/ThreadUtils.scala
@@ -180,8 +180,8 @@ private[spark] object ThreadUtils {

// scalastyle:off awaitresult
/**
* Preferred alternative to [[Await.result()]]. This method wraps and re-throws any exceptions
* thrown by the underlying [[Await]] call, ensuring that this thread's stack trace appears in
* Preferred alternative to `Await.result()`. This method wraps and re-throws any exceptions
* thrown by the underlying `Await` call, ensuring that this thread's stack trace appears in
* logs.
*/
@throws(classOf[SparkException])
@@ -196,7 +196,7 @@ private[spark] object ThreadUtils {
}

/**
* Calls [[Awaitable.result]] directly to avoid using `ForkJoinPool`'s `BlockingContext`, wraps
* Calls `Awaitable.result` directly to avoid using `ForkJoinPool`'s `BlockingContext`, wraps
* and re-throws any exceptions with nice stack track.
*
* Codes running in the user's thread may be in a thread of Scala ForkJoinPool. As concurrent
10 changes: 5 additions & 5 deletions core/src/main/scala/org/apache/spark/util/Utils.scala
@@ -1673,8 +1673,8 @@ private[spark] object Utils extends Logging {
}

/**
* NaN-safe version of [[java.lang.Double.compare()]] which allows NaN values to be compared
* according to semantics where NaN == NaN and NaN > any non-NaN double.
* NaN-safe version of `java.lang.Double.compare()` which allows NaN values to be compared
* according to semantics where NaN == NaN and NaN &gt; any non-NaN double.
*/
def nanSafeCompareDoubles(x: Double, y: Double): Int = {
val xIsNan: Boolean = java.lang.Double.isNaN(x)
Expand All @@ -1687,8 +1687,8 @@ private[spark] object Utils extends Logging {
}

/**
* NaN-safe version of [[java.lang.Float.compare()]] which allows NaN values to be compared
* according to semantics where NaN == NaN and NaN > any non-NaN float.
* NaN-safe version of `java.lang.Float.compare()` which allows NaN values to be compared
* according to semantics where NaN == NaN and NaN &gt; any non-NaN float.
*/
def nanSafeCompareFloats(x: Float, y: Float): Int = {
val xIsNan: Boolean = java.lang.Float.isNaN(x)
@@ -2354,7 +2354,7 @@ private[spark] object Utils extends Logging {
* A spark url (`spark://host:port`) is a special URI that its scheme is `spark` and only contains
* host and port.
*
* @throws SparkException if `sparkUrl` is invalid.
* @note Throws `SparkException` if sparkUrl is invalid.
Comment from the PR author:
The SparkException reference is not found in javadoc:

[error] .../java/org/apache/spark/util/Utils.java:841: error: reference not found
[error]    * @throws SparkException if sparkUrl is invalid.
[error]      ^

I am not too sure using @note instead is right.
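
For illustration, a hedged before/after sketch on a hypothetical method showing why the tag was swapped; only the second form gets through javadoc 8 when SparkException is not resolvable from the generated Java sources:

object SparkUrlDocSketch {
  // Rejected by javadoc 8 via genjavadoc: "error: reference not found" for SparkException.
  /**
   * Parses a `spark://host:port` URL (illustrative method only).
   * @throws SparkException if `sparkUrl` is invalid.
   */
  def parseStrict(sparkUrl: String): (String, Int) = ???

  // Workaround used in this PR: state it as prose, so there is no reference to resolve.
  /**
   * Parses a `spark://host:port` URL (illustrative method only).
   * @note Throws `SparkException` if `sparkUrl` is invalid.
   */
  def parseSafe(sparkUrl: String): (String, Int) = ???
}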

*/
def extractHostPortFromSparkUrl(sparkUrl: String): (String, Int) = {
try {
@@ -148,7 +148,7 @@ private[spark] class ChunkedByteBuffer(var chunks: Array[ByteBuffer]) {
/**
* Reads data from a ChunkedByteBuffer.
*
* @param dispose if true, [[ChunkedByteBuffer.dispose()]] will be called at the end of the stream
* @param dispose if true, `ChunkedByteBuffer.dispose()` will be called at the end of the stream
* in order to close any memory-mapped files which back the buffer.
*/
private class ChunkedByteBufferInputStream(
4 changes: 2 additions & 2 deletions graphx/src/main/scala/org/apache/spark/graphx/Graph.scala
@@ -54,8 +54,8 @@ abstract class Graph[VD: ClassTag, ED: ClassTag] protected () extends Serializab
*
* @return an RDD containing the edges in this graph
*
* @see [[Edge]] for the edge type.
* @see [[Graph#triplets]] to get an RDD which contains all the edges
* @see `Edge` for the edge type.
* @see `Graph#triplets` to get an RDD which contains all the edges
* along with their vertex data.
*
*/
@@ -32,7 +32,7 @@ object GraphLoader extends Logging {
* id and a target id. Skips lines that begin with `#`.
*
* If desired the edges can be automatically oriented in the positive
* direction (source Id < target Id) by setting `canonicalOrientation` to
* direction (source Id &lt; target Id) by setting `canonicalOrientation` to
* true.
*
* @example Loads a file in the following format:
@@ -41,7 +41,7 @@ class EdgeRDDImpl[ED: ClassTag, VD: ClassTag] private[graphx] (

/**
* If `partitionsRDD` already has a partitioner, use it. Otherwise assume that the
* [[PartitionID]]s in `partitionsRDD` correspond to the actual partitions and create a new
* `PartitionID`s in `partitionsRDD` correspond to the actual partitions and create a new
* partitioner that allows co-partitioning with `partitionsRDD`.
*/
override val partitioner =
@@ -28,7 +28,7 @@ import org.apache.spark.ml.linalg.{Vector, Vectors}
/**
* PageRank algorithm implementation. There are two implementations of PageRank implemented.
*
* The first implementation uses the standalone [[Graph]] interface and runs PageRank
* The first implementation uses the standalone `Graph` interface and runs PageRank
* for a fixed number of iterations:
* {{{
* var PR = Array.fill(n)( 1.0 )
@@ -41,7 +41,7 @@ import org.apache.spark.ml.linalg.{Vector, Vectors}
* }
* }}}
*
* The second implementation uses the [[Pregel]] interface and runs PageRank until
* The second implementation uses the `Pregel` interface and runs PageRank until
* convergence:
*
* {{{
@@ -42,7 +42,8 @@ object SVDPlusPlus {
/**
* Implement SVD++ based on "Factorization Meets the Neighborhood:
* a Multifaceted Collaborative Filtering Model",
* available at [[http://public.research.att.com/~volinsky/netflix/kdd08koren.pdf]].
* available at <a href="http://public.research.att.com/~volinsky/netflix/kdd08koren.pdf">
* here</a>.
*
* The prediction rule is rui = u + bu + bi + qi*(pu + |N(u)|^^-0.5^^*sum(y)),
* see the details on page 6.
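
As a reading aid only (not part of the diff), the prediction rule quoted in this scaladoc written out in standard notation, where \mu is the global mean, b_u and b_i the user and item biases, and N(u) the set of items rated by user u:

\hat{r}_{ui} = \mu + b_u + b_i + q_i^{\top}\Big(p_u + |N(u)|^{-1/2} \sum_{j \in N(u)} y_j\Big)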