Java API documentation

srowen Sat, 19 Nov 2016 03:26:05 -0800

[SPARK-18445][BUILD][DOCS] Fix the markdown for `Note:`/`NOTE:`/`Note that`/`'''Note:'''` across Scala/Java API documentation


In the Scala/Java sources, notes are currently written in several different ways:

- `Note:`
- `NOTE:`
- `Note that`
- `'''Note:'''`
- `@note`

This PR proposes to change all of them to `@note` for consistency.

**Before**

- Scala
  ![2016-11-17 6 16 39](https://cloud.githubusercontent.com/assets/6477701/20383180/1a7aed8c-acf2-11e6-9611-5eaf6d52c2e0.png)

- Java
  ![2016-11-17 6 14 41](https://cloud.githubusercontent.com/assets/6477701/20383096/c8ffc680-acf1-11e6-914a-33460bf1401d.png)

**After**

- Scala
  ![2016-11-17 6 16 44](https://cloud.githubusercontent.com/assets/6477701/20383167/09940490-acf2-11e6-937a-0d5e1dc2cadf.png)

- Java
  ![2016-11-17 6 13 39](https://cloud.githubusercontent.com/assets/6477701/20383132/e7c2a57e-acf1-11e6-9c47-b849674d4d88.png)

The notes were found via

```bash
grep -r "NOTE: " . |  # Note:|NOTE:|Note that|'''Note:'''
grep -v "// NOTE: " |  # lines starting with // do not appear in API documentation
grep -E '.scala|.java' |  # java/scala files
grep -v Suite |  # exclude tests
grep -v Test |  # exclude tests
# packages that appear in API documentation; note that each pattern is a regular
# expression, so actual matches were mostly `org/apache/spark/api/java/functions ...`
grep -e 'org.apache.spark.api.java' \
-e 'org.apache.spark.api.java.function' \
-e 'org.apache.spark.api.r' \
...
```

```bash
grep -r "Note that " . |  # Note:|NOTE:|Note that|'''Note:'''
grep -v "// Note that " |  # lines starting with // do not appear in API documentation
grep -E '.scala|.java' |  # java/scala files
grep -v Suite |  # exclude tests
grep -v Test |  # exclude tests
# packages that appear in API documentation
grep -e 'org.apache.spark.api.java' \
-e 'org.apache.spark.api.java.function' \
-e 'org.apache.spark.api.r' \
...
```

```bash
grep -r "Note: " . |  # Note:|NOTE:|Note that|'''Note:'''
grep -v "// Note: " |  # lines starting with // do not appear in API documentation
grep -E '.scala|.java' |  # java/scala files
grep -v Suite |  # exclude tests
grep -v Test |  # exclude tests
# packages that appear in API documentation
grep -e 'org.apache.spark.api.java' \
-e 'org.apache.spark.api.java.function' \
-e 'org.apache.spark.api.r' \
...
```

```bash
grep -r "'''Note:'''" . |  # Note:|NOTE:|Note that|'''Note:'''
grep -v "// '''Note:''' " |  # lines starting with // do not appear in API documentation
grep -E '.scala|.java' |  # java/scala files
grep -v Suite |  # exclude tests
grep -v Test |  # exclude tests
# packages that appear in API documentation
grep -e 'org.apache.spark.api.java' \
-e 'org.apache.spark.api.java.function' \
-e 'org.apache.spark.api.r' \
...
```
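
The four searches above differ only in the marker they look for, so they could be folded into a single pass. The following is a minimal sketch under stated assumptions, not the command actually used for this PR: the function name `find_doc_notes` and the sample files are hypothetical, it assumes a grep that supports `--include` (GNU grep does), and it leaves out the package filter because that list is elided in the original commands.

```bash
#!/usr/bin/env bash

# Hypothetical helper: list candidate note markers in Scala/Java sources under a directory.
find_doc_notes() {
  local root="$1"
  grep -rnE "NOTE: |Note: |Note that |'''Note:'''" \
      --include='*.scala' --include='*.java' "$root" |
    grep -vE "// *(NOTE: |Note: |Note that |'''Note:''')" |  # plain // comments are not in API docs
    grep -v Suite |  # exclude tests
    grep -v Test     # exclude tests
}

# Small demonstration on throwaway files (hypothetical content).
demo_dir="$(mktemp -d)"
mkdir -p "$demo_dir/src"
printf '%s\n' '/**' ' * Note: this method performs a shuffle.' ' */' > "$demo_dir/src/Example.scala"
printf '%s\n' '// Note: internal detail' > "$demo_dir/src/Other.scala"

# Run from inside the directory so the random temp path cannot affect the filters.
(cd "$demo_dir" && find_doc_notes .)
```

On the sample files this should report only the doc-comment line in `Example.scala`, since the `//` comment in `Other.scala` is filtered out. Like the original commands, the `Suite` and `Test` filters match anywhere in the output line, so they also drop non-test files whose matched text happens to contain those words.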

Each match was then fixed one by one, after comparing it with the API documentation and access modifiers.

After that, the change was tested manually via `jekyll build`.

Author: hyukjinkwon <[email protected]>

Closes #15889 from HyukjinKwon/SPARK-18437.

(cherry picked from commit d5b1d5fc80153571c308130833d0c0774de62c92)
Signed-off-by: Sean Owen <[email protected]>


Project: http://git-wip-us.apache.org/repos/asf/spark/repo
Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/4b396a65
Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/4b396a65
Diff: http://git-wip-us.apache.org/repos/asf/spark/diff/4b396a65

Branch: refs/heads/branch-2.1
Commit: 4b396a6545ec0f1e31b0e211228f04bdc5660300
Parents: 693401b
Author: hyukjinkwon <[email protected]>
Authored: Sat Nov 19 11:24:15 2016 +0000
Committer: Sean Owen <[email protected]>
Committed: Sat Nov 19 11:25:07 2016 +0000

----------------------------------------------------------------------
 .../scala/org/apache/spark/ContextCleaner.scala |  2 +-
 .../scala/org/apache/spark/Partitioner.scala    |  2 +-
 .../main/scala/org/apache/spark/SparkConf.scala |  6 +-
 .../scala/org/apache/spark/SparkContext.scala   | 47 ++++++++-------
 .../apache/spark/api/java/JavaDoubleRDD.scala   |  4 +-
 .../org/apache/spark/api/java/JavaPairRDD.scala | 26 +++++----
 .../org/apache/spark/api/java/JavaRDD.scala     | 12 ++--
 .../org/apache/spark/api/java/JavaRDDLike.scala |  3 +-
 .../spark/api/java/JavaSparkContext.scala       | 21 +++----
 .../spark/api/java/JavaSparkStatusTracker.scala |  2 +-
 .../org/apache/spark/io/CompressionCodec.scala  | 23 ++++----
 .../apache/spark/partial/BoundedDouble.scala    |  2 +-
 .../org/apache/spark/rdd/CoGroupedRDD.scala     |  8 +--
 .../apache/spark/rdd/DoubleRDDFunctions.scala   |  2 +-
 .../scala/org/apache/spark/rdd/HadoopRDD.scala  |  6 +-
 .../org/apache/spark/rdd/NewHadoopRDD.scala     |  6 +-
 .../org/apache/spark/rdd/PairRDDFunctions.scala | 23 ++++----
 .../apache/spark/rdd/PartitionPruningRDD.scala  |  2 +-
 .../spark/rdd/PartitionwiseSampledRDD.scala     |  2 +-
 .../main/scala/org/apache/spark/rdd/RDD.scala   | 46 +++++++--------
 .../apache/spark/rdd/RDDCheckpointData.scala    |  2 +-
 .../spark/rdd/ReliableCheckpointRDD.scala       |  2 +-
 .../spark/rdd/SequenceFileRDDFunctions.scala    |  5 +-
 .../apache/spark/rdd/ZippedWithIndexRDD.scala   |  2 +-
 .../spark/scheduler/AccumulableInfo.scala       | 10 ++--
 .../spark/serializer/JavaSerializer.scala       |  2 +-
 .../spark/serializer/KryoSerializer.scala       |  2 +-
 .../apache/spark/serializer/Serializer.scala    |  2 +-
 .../org/apache/spark/storage/StorageUtils.scala | 19 ++++---
 .../org/apache/spark/util/AccumulatorV2.scala   |  5 +-
 .../spark/scheduler/DAGSchedulerSuite.scala     |  2 +-
 docs/mllib-isotonic-regression.md               |  2 +-
 docs/streaming-programming-guide.md             |  2 +-
 .../apache/spark/sql/kafka010/KafkaSource.scala |  2 +-
 .../spark/streaming/kafka/KafkaUtils.scala      |  8 +--
 .../spark/streaming/kinesis/KinesisUtils.scala  | 60 +++++++++-----------
 .../kinesis/KinesisBackedBlockRDDSuite.scala    |  2 +-
 .../apache/spark/graphx/impl/GraphImpl.scala    |  2 +-
 .../org/apache/spark/graphx/lib/PageRank.scala  |  2 +-
 .../org/apache/spark/ml/linalg/Vectors.scala    |  2 +-
 .../main/scala/org/apache/spark/ml/Model.scala  |  2 +-
 .../classification/DecisionTreeClassifier.scala |  6 +-
 .../spark/ml/classification/GBTClassifier.scala |  6 +-
 .../ml/classification/LogisticRegression.scala  | 36 ++++++------
 .../spark/ml/clustering/GaussianMixture.scala   |  6 +-
 .../apache/spark/ml/feature/MinMaxScaler.scala  |  3 +-
 .../apache/spark/ml/feature/OneHotEncoder.scala |  3 +-
 .../scala/org/apache/spark/ml/feature/PCA.scala |  5 +-
 .../spark/ml/feature/StopWordsRemover.scala     |  5 +-
 .../apache/spark/ml/feature/StringIndexer.scala |  6 +-
 .../org/apache/spark/ml/param/params.scala      |  2 +-
 .../ml/regression/DecisionTreeRegressor.scala   |  6 +-
 .../GeneralizedLinearRegression.scala           |  4 +-
 .../spark/ml/regression/LinearRegression.scala  | 28 ++++-----
 .../ml/source/libsvm/LibSVMDataSource.scala     |  2 +-
 .../ml/tree/impl/GradientBoostedTrees.scala     |  4 +-
 .../org/apache/spark/ml/util/ReadWrite.scala    |  2 +-
 .../classification/LogisticRegression.scala     | 28 +++++----
 .../apache/spark/mllib/classification/SVM.scala | 20 ++++---
 .../mllib/clustering/GaussianMixture.scala      |  8 +--
 .../apache/spark/mllib/clustering/KMeans.scala  |  8 ++-
 .../org/apache/spark/mllib/clustering/LDA.scala |  4 +-
 .../spark/mllib/clustering/LDAModel.scala       |  2 +-
 .../spark/mllib/clustering/LDAOptimizer.scala   |  6 +-
 .../spark/mllib/evaluation/AreaUnderCurve.scala |  2 +-
 .../org/apache/spark/mllib/linalg/Vectors.scala |  6 +-
 .../mllib/linalg/distributed/BlockMatrix.scala  |  2 +-
 .../linalg/distributed/IndexedRowMatrix.scala   |  5 +-
 .../mllib/linalg/distributed/RowMatrix.scala    | 21 ++++---
 .../spark/mllib/optimization/Gradient.scala     |  3 +-
 .../apache/spark/mllib/rdd/RDDFunctions.scala   |  2 +-
 .../MatrixFactorizationModel.scala              |  6 +-
 .../apache/spark/mllib/stat/Statistics.scala    | 34 +++++------
 .../apache/spark/mllib/tree/DecisionTree.scala  | 32 +++++------
 .../org/apache/spark/mllib/tree/loss/Loss.scala | 12 ++--
 .../mllib/tree/model/treeEnsembleModels.scala   |  4 +-
 pom.xml                                         |  7 +++
 project/SparkBuild.scala                        |  3 +-
 python/pyspark/mllib/stat/KernelDensity.py      |  2 +-
 python/pyspark/mllib/util.py                    |  2 +-
 python/pyspark/rdd.py                           |  4 +-
 python/pyspark/streaming/kafka.py               |  4 +-
 .../scala/org/apache/spark/sql/Encoders.scala   |  8 +--
 .../spark/sql/types/CalendarIntervalType.scala  |  4 +-
 .../scala/org/apache/spark/sql/Column.scala     |  2 +-
 .../spark/sql/DataFrameStatFunctions.scala      |  3 +-
 .../org/apache/spark/sql/DataFrameWriter.scala  |  2 +-
 .../scala/org/apache/spark/sql/Dataset.scala    | 56 +++++++++---------
 .../scala/org/apache/spark/sql/SQLContext.scala |  7 ++-
 .../org/apache/spark/sql/SparkSession.scala     |  9 +--
 .../org/apache/spark/sql/UDFRegistration.scala  |  3 +-
 .../sql/execution/streaming/state/package.scala |  4 +-
 .../sql/expressions/UserDefinedFunction.scala   |  8 ++-
 .../scala/org/apache/spark/sql/functions.scala  | 22 +++----
 .../apache/spark/sql/jdbc/JdbcDialects.scala    |  2 +-
 .../apache/spark/sql/sources/interfaces.scala   | 10 ++--
 .../spark/sql/util/QueryExecutionListener.scala |  8 ++-
 .../columnar/InMemoryColumnarQuerySuite.scala   |  2 +-
 .../spark/streaming/StreamingContext.scala      | 18 +++---
 .../streaming/api/java/JavaPairDStream.scala    |  2 +-
 .../api/java/JavaStreamingContext.scala         | 40 ++++++-------
 .../spark/streaming/dstream/DStream.scala       |  4 +-
 .../streaming/dstream/MapWithStateDStream.scala |  2 +-
 .../rdd/WriteAheadLogBackedBlockRDDSuite.scala  |  2 +-
 104 files changed, 516 insertions(+), 435 deletions(-)
----------------------------------------------------------------------


http://git-wip-us.apache.org/repos/asf/spark/blob/4b396a65/core/src/main/scala/org/apache/spark/ContextCleaner.scala
----------------------------------------------------------------------
diff --git a/core/src/main/scala/org/apache/spark/ContextCleaner.scala 
b/core/src/main/scala/org/apache/spark/ContextCleaner.scala
index 5678d79..af91345 100644
--- a/core/src/main/scala/org/apache/spark/ContextCleaner.scala
+++ b/core/src/main/scala/org/apache/spark/ContextCleaner.scala
@@ -139,7 +139,7 @@ private[spark] class ContextCleaner(sc: SparkContext) 
extends Logging {
     periodicGCService.shutdown()
   }
 
-  /** Register a RDD for cleanup when it is garbage collected. */
+  /** Register an RDD for cleanup when it is garbage collected. */
   def registerRDDForCleanup(rdd: RDD[_]): Unit = {
     registerForCleanup(rdd, CleanRDD(rdd.id))
   }

http://git-wip-us.apache.org/repos/asf/spark/blob/4b396a65/core/src/main/scala/org/apache/spark/Partitioner.scala
----------------------------------------------------------------------
diff --git a/core/src/main/scala/org/apache/spark/Partitioner.scala 
b/core/src/main/scala/org/apache/spark/Partitioner.scala
index 93dfbc0..f83f527 100644
--- a/core/src/main/scala/org/apache/spark/Partitioner.scala
+++ b/core/src/main/scala/org/apache/spark/Partitioner.scala
@@ -101,7 +101,7 @@ class HashPartitioner(partitions: Int) extends Partitioner {
  * A [[org.apache.spark.Partitioner]] that partitions sortable records by 
range into roughly
  * equal ranges. The ranges are determined by sampling the content of the RDD 
passed in.
  *
- * Note that the actual number of partitions created by the RangePartitioner 
might not be the same
+ * @note The actual number of partitions created by the RangePartitioner might 
not be the same
  * as the `partitions` parameter, in the case where the number of sampled 
records is less than
  * the value of `partitions`.
  */

http://git-wip-us.apache.org/repos/asf/spark/blob/4b396a65/core/src/main/scala/org/apache/spark/SparkConf.scala
----------------------------------------------------------------------
diff --git a/core/src/main/scala/org/apache/spark/SparkConf.scala 
b/core/src/main/scala/org/apache/spark/SparkConf.scala
index c9c342d..04d657c 100644
--- a/core/src/main/scala/org/apache/spark/SparkConf.scala
+++ b/core/src/main/scala/org/apache/spark/SparkConf.scala
@@ -42,10 +42,10 @@ import org.apache.spark.util.Utils
  * All setter methods in this class support chaining. For example, you can 
write
  * `new SparkConf().setMaster("local").setAppName("My app")`.
  *
- * Note that once a SparkConf object is passed to Spark, it is cloned and can 
no longer be modified
- * by the user. Spark does not support modifying the configuration at runtime.
- *
  * @param loadDefaults whether to also load values from Java system properties
+ *
+ * @note Once a SparkConf object is passed to Spark, it is cloned and can no 
longer be modified
+ * by the user. Spark does not support modifying the configuration at runtime.
  */
 class SparkConf(loadDefaults: Boolean) extends Cloneable with Logging with 
Serializable {
 

http://git-wip-us.apache.org/repos/asf/spark/blob/4b396a65/core/src/main/scala/org/apache/spark/SparkContext.scala
----------------------------------------------------------------------
diff --git a/core/src/main/scala/org/apache/spark/SparkContext.scala 
b/core/src/main/scala/org/apache/spark/SparkContext.scala
index 25a3d60..1261e3e 100644
--- a/core/src/main/scala/org/apache/spark/SparkContext.scala
+++ b/core/src/main/scala/org/apache/spark/SparkContext.scala
@@ -281,7 +281,7 @@ class SparkContext(config: SparkConf) extends Logging {
   /**
    * A default Hadoop Configuration for the Hadoop code (e.g. file systems) 
that we reuse.
    *
-   * '''Note:''' As it will be reused in all Hadoop RDDs, it's better not to 
modify it unless you
+   * @note As it will be reused in all Hadoop RDDs, it's better not to modify 
it unless you
    * plan to set some global configurations for all Hadoop RDDs.
    */
   def hadoopConfiguration: Configuration = _hadoopConfiguration
@@ -700,7 +700,7 @@ class SparkContext(config: SparkConf) extends Logging {
    * Execute a block of code in a scope such that all new RDDs created in this 
body will
    * be part of the same scope. For more detail, see 
{{org.apache.spark.rdd.RDDOperationScope}}.
    *
-   * Note: Return statements are NOT allowed in the given body.
+   * @note Return statements are NOT allowed in the given body.
    */
   private[spark] def withScope[U](body: => U): U = 
RDDOperationScope.withScope[U](this)(body)
 
@@ -927,7 +927,7 @@ class SparkContext(config: SparkConf) extends Logging {
   /**
    * Load data from a flat binary file, assuming the length of each record is 
constant.
    *
-   * '''Note:''' We ensure that the byte array for each record in the 
resulting RDD
+   * @note We ensure that the byte array for each record in the resulting RDD
    * has the provided record length.
    *
    * @param path Directory to the input data files, the path can be comma 
separated paths as the
@@ -970,7 +970,7 @@ class SparkContext(config: SparkConf) extends Logging {
    * @param valueClass Class of the values
    * @param minPartitions Minimum number of Hadoop Splits to generate.
    *
-   * '''Note:''' Because Hadoop's RecordReader class re-uses the same Writable 
object for each
+   * @note Because Hadoop's RecordReader class re-uses the same Writable 
object for each
    * record, directly caching the returned RDD or directly passing it to an 
aggregation or shuffle
    * operation will create many references to the same object.
    * If you plan to directly cache, sort, or aggregate Hadoop writable 
objects, you should first
@@ -995,7 +995,7 @@ class SparkContext(config: SparkConf) extends Logging {
 
   /** Get an RDD for a Hadoop file with an arbitrary InputFormat
    *
-   * '''Note:''' Because Hadoop's RecordReader class re-uses the same Writable 
object for each
+   * @note Because Hadoop's RecordReader class re-uses the same Writable 
object for each
    * record, directly caching the returned RDD or directly passing it to an 
aggregation or shuffle
    * operation will create many references to the same object.
    * If you plan to directly cache, sort, or aggregate Hadoop writable 
objects, you should first
@@ -1034,7 +1034,7 @@ class SparkContext(config: SparkConf) extends Logging {
    * val file = sparkContext.hadoopFile[LongWritable, Text, 
TextInputFormat](path, minPartitions)
    * }}}
    *
-   * '''Note:''' Because Hadoop's RecordReader class re-uses the same Writable 
object for each
+   * @note Because Hadoop's RecordReader class re-uses the same Writable 
object for each
    * record, directly caching the returned RDD or directly passing it to an 
aggregation or shuffle
    * operation will create many references to the same object.
    * If you plan to directly cache, sort, or aggregate Hadoop writable 
objects, you should first
@@ -1058,7 +1058,7 @@ class SparkContext(config: SparkConf) extends Logging {
    * val file = sparkContext.hadoopFile[LongWritable, Text, 
TextInputFormat](path)
    * }}}
    *
-   * '''Note:''' Because Hadoop's RecordReader class re-uses the same Writable 
object for each
+   * @note Because Hadoop's RecordReader class re-uses the same Writable 
object for each
    * record, directly caching the returned RDD or directly passing it to an 
aggregation or shuffle
    * operation will create many references to the same object.
    * If you plan to directly cache, sort, or aggregate Hadoop writable 
objects, you should first
@@ -1084,7 +1084,7 @@ class SparkContext(config: SparkConf) extends Logging {
    * Get an RDD for a given Hadoop file with an arbitrary new API InputFormat
    * and extra configuration options to pass to the input format.
    *
-   * '''Note:''' Because Hadoop's RecordReader class re-uses the same Writable 
object for each
+   * @note Because Hadoop's RecordReader class re-uses the same Writable 
object for each
    * record, directly caching the returned RDD or directly passing it to an 
aggregation or shuffle
    * operation will create many references to the same object.
    * If you plan to directly cache, sort, or aggregate Hadoop writable 
objects, you should first
@@ -1124,7 +1124,7 @@ class SparkContext(config: SparkConf) extends Logging {
    * @param kClass Class of the keys
    * @param vClass Class of the values
    *
-   * '''Note:''' Because Hadoop's RecordReader class re-uses the same Writable 
object for each
+   * @note Because Hadoop's RecordReader class re-uses the same Writable 
object for each
    * record, directly caching the returned RDD or directly passing it to an 
aggregation or shuffle
    * operation will create many references to the same object.
    * If you plan to directly cache, sort, or aggregate Hadoop writable 
objects, you should first
@@ -1150,7 +1150,7 @@ class SparkContext(config: SparkConf) extends Logging {
   /**
    * Get an RDD for a Hadoop SequenceFile with given key and value types.
    *
-   * '''Note:''' Because Hadoop's RecordReader class re-uses the same Writable 
object for each
+   * @note Because Hadoop's RecordReader class re-uses the same Writable 
object for each
    * record, directly caching the returned RDD or directly passing it to an 
aggregation or shuffle
    * operation will create many references to the same object.
    * If you plan to directly cache, sort, or aggregate Hadoop writable 
objects, you should first
@@ -1169,7 +1169,7 @@ class SparkContext(config: SparkConf) extends Logging {
   /**
    * Get an RDD for a Hadoop SequenceFile with given key and value types.
    *
-   * '''Note:''' Because Hadoop's RecordReader class re-uses the same Writable 
object for each
+   * @note Because Hadoop's RecordReader class re-uses the same Writable 
object for each
    * record, directly caching the returned RDD or directly passing it to an 
aggregation or shuffle
    * operation will create many references to the same object.
    * If you plan to directly cache, sort, or aggregate Hadoop writable 
objects, you should first
@@ -1199,7 +1199,7 @@ class SparkContext(config: SparkConf) extends Logging {
    * for the appropriate type. In addition, we pass the converter a ClassTag 
of its type to
    * allow it to figure out the Writable class to use in the subclass case.
    *
-   * '''Note:''' Because Hadoop's RecordReader class re-uses the same Writable 
object for each
+   * @note Because Hadoop's RecordReader class re-uses the same Writable 
object for each
    * record, directly caching the returned RDD or directly passing it to an 
aggregation or shuffle
    * operation will create many references to the same object.
    * If you plan to directly cache, sort, or aggregate Hadoop writable 
objects, you should first
@@ -1330,16 +1330,18 @@ class SparkContext(config: SparkConf) extends Logging {
   }
 
   /**
-   * Register the given accumulator.  Note that accumulators must be 
registered before use, or it
-   * will throw exception.
+   * Register the given accumulator.
+   *
+   * @note Accumulators must be registered before use, or it will throw 
exception.
    */
   def register(acc: AccumulatorV2[_, _]): Unit = {
     acc.register(this)
   }
 
   /**
-   * Register the given accumulator with given name.  Note that accumulators 
must be registered
-   * before use, or it will throw exception.
+   * Register the given accumulator with given name.
+   *
+   * @note Accumulators must be registered before use, or it will throw 
exception.
    */
   def register(acc: AccumulatorV2[_, _], name: String): Unit = {
     acc.register(this, name = Some(name))
@@ -1550,7 +1552,7 @@ class SparkContext(config: SparkConf) extends Logging {
    * :: DeveloperApi ::
    * Request that the cluster manager kill the specified executors.
    *
-   * Note: This is an indication to the cluster manager that the application 
wishes to adjust
+   * @note This is an indication to the cluster manager that the application 
wishes to adjust
    * its resource usage downwards. If the application wishes to replace the 
executors it kills
    * through this method with new ones, it should follow up explicitly with a 
call to
    * {{SparkContext#requestExecutors}}.
@@ -1572,7 +1574,7 @@ class SparkContext(config: SparkConf) extends Logging {
    * :: DeveloperApi ::
    * Request that the cluster manager kill the specified executor.
    *
-   * Note: This is an indication to the cluster manager that the application 
wishes to adjust
+   * @note This is an indication to the cluster manager that the application 
wishes to adjust
    * its resource usage downwards. If the application wishes to replace the 
executor it kills
    * through this method with a new one, it should follow up explicitly with a 
call to
    * {{SparkContext#requestExecutors}}.
@@ -1590,7 +1592,7 @@ class SparkContext(config: SparkConf) extends Logging {
    * this request. This assumes the cluster manager will automatically and 
eventually
    * fulfill all missing application resource requests.
    *
-   * Note: The replace is by no means guaranteed; another application on the 
same cluster
+   * @note The replace is by no means guaranteed; another application on the 
same cluster
    * can steal the window of opportunity and acquire this application's 
resources in the
    * mean time.
    *
@@ -1639,7 +1641,8 @@ class SparkContext(config: SparkConf) extends Logging {
 
   /**
    * Returns an immutable map of RDDs that have marked themselves as 
persistent via cache() call.
-   * Note that this does not necessarily mean the caching or computation was 
successful.
+   *
+   * @note This does not necessarily mean the caching or computation was 
successful.
    */
   def getPersistentRDDs: Map[Int, RDD[_]] = persistentRdds.toMap
 
@@ -2298,7 +2301,7 @@ object SparkContext extends Logging {
    * singleton object. Because we can only have one active SparkContext per 
JVM,
    * this is useful when applications may wish to share a SparkContext.
    *
-   * Note: This function cannot be used to create multiple SparkContext 
instances
+   * @note This function cannot be used to create multiple SparkContext 
instances
    * even if multiple contexts are allowed.
    */
   def getOrCreate(config: SparkConf): SparkContext = {
@@ -2323,7 +2326,7 @@ object SparkContext extends Logging {
    *
    * This method allows not passing a SparkConf (useful if just retrieving).
    *
-   * Note: This function cannot be used to create multiple SparkContext 
instances
+   * @note This function cannot be used to create multiple SparkContext 
instances
    * even if multiple contexts are allowed.
    */
   def getOrCreate(): SparkContext = {

http://git-wip-us.apache.org/repos/asf/spark/blob/4b396a65/core/src/main/scala/org/apache/spark/api/java/JavaDoubleRDD.scala
----------------------------------------------------------------------
diff --git a/core/src/main/scala/org/apache/spark/api/java/JavaDoubleRDD.scala 
b/core/src/main/scala/org/apache/spark/api/java/JavaDoubleRDD.scala
index 0026fc9..a32a4b2 100644
--- a/core/src/main/scala/org/apache/spark/api/java/JavaDoubleRDD.scala
+++ b/core/src/main/scala/org/apache/spark/api/java/JavaDoubleRDD.scala
@@ -153,7 +153,7 @@ class JavaDoubleRDD(val srdd: RDD[scala.Double])
    * Return the intersection of this RDD and another one. The output will not 
contain any duplicate
    * elements, even if the input RDDs did.
    *
-   * Note that this method performs a shuffle internally.
+   * @note This method performs a shuffle internally.
    */
   def intersection(other: JavaDoubleRDD): JavaDoubleRDD = 
fromRDD(srdd.intersection(other.srdd))
 
@@ -256,7 +256,7 @@ class JavaDoubleRDD(val srdd: RDD[scala.Double])
    *  e.g 1&lt;=x&lt;10 , 10&lt;=x&lt;20, 20&lt;=x&lt;50
    *  And on the input of 1 and 50 we would have a histogram of 1,0,0
    *
-   * Note: if your histogram is evenly spaced (e.g. [0, 10, 20, 30]) this can 
be switched
+   * @note If your histogram is evenly spaced (e.g. [0, 10, 20, 30]) this can 
be switched
    * from an O(log n) insertion to O(1) per element. (where n = # buckets) if 
you set evenBuckets
    * to true.
    * buckets must be sorted and not contain any duplicates.

http://git-wip-us.apache.org/repos/asf/spark/blob/4b396a65/core/src/main/scala/org/apache/spark/api/java/JavaPairRDD.scala
----------------------------------------------------------------------
diff --git a/core/src/main/scala/org/apache/spark/api/java/JavaPairRDD.scala 
b/core/src/main/scala/org/apache/spark/api/java/JavaPairRDD.scala
index 1c95bc4..bff5a29 100644
--- a/core/src/main/scala/org/apache/spark/api/java/JavaPairRDD.scala
+++ b/core/src/main/scala/org/apache/spark/api/java/JavaPairRDD.scala
@@ -206,7 +206,7 @@ class JavaPairRDD[K, V](val rdd: RDD[(K, V)])
    * Return the intersection of this RDD and another one. The output will not 
contain any duplicate
    * elements, even if the input RDDs did.
    *
-   * Note that this method performs a shuffle internally.
+   * @note This method performs a shuffle internally.
    */
   def intersection(other: JavaPairRDD[K, V]): JavaPairRDD[K, V] =
     new JavaPairRDD[K, V](rdd.intersection(other.rdd))
@@ -223,9 +223,9 @@ class JavaPairRDD[K, V](val rdd: RDD[(K, V)])
   /**
    * Generic function to combine the elements for each key using a custom set 
of aggregation
    * functions. Turns a JavaPairRDD[(K, V)] into a result of type 
JavaPairRDD[(K, C)], for a
-   * "combined type" C. Note that V and C can be different -- for example, one 
might group an
-   * RDD of type (Int, Int) into an RDD of type (Int, List[Int]). Users 
provide three
-   * functions:
+   * "combined type" C.
+   *
+   * Users provide three functions:
    *
    *  - `createCombiner`, which turns a V into a C (e.g., creates a 
one-element list)
    *  - `mergeValue`, to merge a V into a C (e.g., adds it to the end of a 
list)
@@ -234,6 +234,9 @@ class JavaPairRDD[K, V](val rdd: RDD[(K, V)])
    * In addition, users can control the partitioning of the output RDD, the 
serializer that is use
    * for the shuffle, and whether to perform map-side aggregation (if a mapper 
can produce multiple
    * items with the same key).
+   *
+   * @note V and C can be different -- for example, one might group an RDD of 
type (Int, Int) into
+   * an RDD of type (Int, List[Int]).
    */
   def combineByKey[C](createCombiner: JFunction[V, C],
       mergeValue: JFunction2[C, V, C],
@@ -255,9 +258,9 @@ class JavaPairRDD[K, V](val rdd: RDD[(K, V)])
   /**
    * Generic function to combine the elements for each key using a custom set 
of aggregation
    * functions. Turns a JavaPairRDD[(K, V)] into a result of type 
JavaPairRDD[(K, C)], for a
-   * "combined type" C. Note that V and C can be different -- for example, one 
might group an
-   * RDD of type (Int, Int) into an RDD of type (Int, List[Int]). Users 
provide three
-   * functions:
+   * "combined type" C.
+   *
+   * Users provide three functions:
    *
    *  - `createCombiner`, which turns a V into a C (e.g., creates a 
one-element list)
    *  - `mergeValue`, to merge a V into a C (e.g., adds it to the end of a 
list)
@@ -265,6 +268,9 @@ class JavaPairRDD[K, V](val rdd: RDD[(K, V)])
    *
    * In addition, users can control the partitioning of the output RDD. This 
method automatically
    * uses map-side aggregation in shuffling the RDD.
+   *
+   * @note V and C can be different -- for example, one might group an RDD of 
type (Int, Int) into
+   * an RDD of type (Int, List[Int]).
    */
   def combineByKey[C](createCombiner: JFunction[V, C],
       mergeValue: JFunction2[C, V, C],
@@ -398,7 +404,7 @@ class JavaPairRDD[K, V](val rdd: RDD[(K, V)])
    * Group the values for each key in the RDD into a single sequence. Allows 
controlling the
    * partitioning of the resulting key-value pair RDD by passing a Partitioner.
    *
-   * Note: If you are grouping in order to perform an aggregation (such as a 
sum or average) over
+   * @note If you are grouping in order to perform an aggregation (such as a 
sum or average) over
    * each key, using [[JavaPairRDD.reduceByKey]] or 
[[JavaPairRDD.combineByKey]]
    * will provide much better performance.
    */
@@ -409,7 +415,7 @@ class JavaPairRDD[K, V](val rdd: RDD[(K, V)])
    * Group the values for each key in the RDD into a single sequence. 
Hash-partitions the
    * resulting RDD with into `numPartitions` partitions.
    *
-   * Note: If you are grouping in order to perform an aggregation (such as a 
sum or average) over
+   * @note If you are grouping in order to perform an aggregation (such as a 
sum or average) over
    * each key, using [[JavaPairRDD.reduceByKey]] or 
[[JavaPairRDD.combineByKey]]
    * will provide much better performance.
    */
@@ -539,7 +545,7 @@ class JavaPairRDD[K, V](val rdd: RDD[(K, V)])
    * Group the values for each key in the RDD into a single sequence. 
Hash-partitions the
    * resulting RDD with the existing partitioner/parallelism level.
    *
-   * Note: If you are grouping in order to perform an aggregation (such as a 
sum or average) over
+   * @note If you are grouping in order to perform an aggregation (such as a 
sum or average) over
    * each key, using [[JavaPairRDD.reduceByKey]] or 
[[JavaPairRDD.combineByKey]]
    * will provide much better performance.
    */

http://git-wip-us.apache.org/repos/asf/spark/blob/4b396a65/core/src/main/scala/org/apache/spark/api/java/JavaRDD.scala
----------------------------------------------------------------------
diff --git a/core/src/main/scala/org/apache/spark/api/java/JavaRDD.scala 
b/core/src/main/scala/org/apache/spark/api/java/JavaRDD.scala
index d67cff6..ccd94f8 100644
--- a/core/src/main/scala/org/apache/spark/api/java/JavaRDD.scala
+++ b/core/src/main/scala/org/apache/spark/api/java/JavaRDD.scala
@@ -99,27 +99,29 @@ class JavaRDD[T](val rdd: RDD[T])(implicit val classTag: 
ClassTag[T])
 
   /**
    * Return a sampled subset of this RDD with a random seed.
-   * Note: this is NOT guaranteed to provide exactly the fraction of the count
-   * of the given [[RDD]].
    *
    * @param withReplacement can elements be sampled multiple times (replaced when sampled out)
    * @param fraction expected size of the sample as a fraction of this RDD's size
    *  without replacement: probability that each element is chosen; fraction must be [0, 1]
    *  with replacement: expected number of times each element is chosen; fraction must be >= 0
+   *
+   * @note This is NOT guaranteed to provide exactly the fraction of the count
+   * of the given [[RDD]].
    */
   def sample(withReplacement: Boolean, fraction: Double): JavaRDD[T] =
     sample(withReplacement, fraction, Utils.random.nextLong)
 
   /**
    * Return a sampled subset of this RDD, with a user-supplied seed.
-   * Note: this is NOT guaranteed to provide exactly the fraction of the count
-   * of the given [[RDD]].
    *
    * @param withReplacement can elements be sampled multiple times (replaced when sampled out)
    * @param fraction expected size of the sample as a fraction of this RDD's size
    *  without replacement: probability that each element is chosen; fraction must be [0, 1]
    *  with replacement: expected number of times each element is chosen; fraction must be >= 0
    * @param seed seed for the random number generator
+   *
+   * @note This is NOT guaranteed to provide exactly the fraction of the count
+   * of the given [[RDD]].
    */
   def sample(withReplacement: Boolean, fraction: Double, seed: Long): JavaRDD[T] =
     wrapRDD(rdd.sample(withReplacement, fraction, seed))
@@ -157,7 +159,7 @@ class JavaRDD[T](val rdd: RDD[T])(implicit val classTag: ClassTag[T])
    * Return the intersection of this RDD and another one. The output will not contain any duplicate
    * elements, even if the input RDDs did.
    *
-   * Note that this method performs a shuffle internally.
+   * @note This method performs a shuffle internally.
    */
   def intersection(other: JavaRDD[T]): JavaRDD[T] = wrapRDD(rdd.intersection(other.rdd))
 

http://git-wip-us.apache.org/repos/asf/spark/blob/4b396a65/core/src/main/scala/org/apache/spark/api/java/JavaRDDLike.scala
----------------------------------------------------------------------
diff --git a/core/src/main/scala/org/apache/spark/api/java/JavaRDDLike.scala b/core/src/main/scala/org/apache/spark/api/java/JavaRDDLike.scala
index a37c52c..eda16d9 100644
--- a/core/src/main/scala/org/apache/spark/api/java/JavaRDDLike.scala
+++ b/core/src/main/scala/org/apache/spark/api/java/JavaRDDLike.scala
@@ -47,7 +47,8 @@ private[spark] abstract class AbstractJavaRDDLike[T, This <: JavaRDDLike[T, This
 
 /**
  * Defines operations common to several Java RDD implementations.
- * Note that this trait is not intended to be implemented by user code.
+ *
+ * @note This trait is not intended to be implemented by user code.
  */
 trait JavaRDDLike[T, This <: JavaRDDLike[T, This]] extends Serializable {
   def wrapRDD(rdd: RDD[T]): This

http://git-wip-us.apache.org/repos/asf/spark/blob/4b396a65/core/src/main/scala/org/apache/spark/api/java/JavaSparkContext.scala
----------------------------------------------------------------------
diff --git a/core/src/main/scala/org/apache/spark/api/java/JavaSparkContext.scala b/core/src/main/scala/org/apache/spark/api/java/JavaSparkContext.scala
index 4e50c26..38d347a 100644
--- a/core/src/main/scala/org/apache/spark/api/java/JavaSparkContext.scala
+++ b/core/src/main/scala/org/apache/spark/api/java/JavaSparkContext.scala
@@ -298,7 +298,7 @@ class JavaSparkContext(val sc: SparkContext)
   /**
    * Get an RDD for a Hadoop SequenceFile with given key and value types.
    *
-   * '''Note:''' Because Hadoop's RecordReader class re-uses the same Writable object for each
+   * @note Because Hadoop's RecordReader class re-uses the same Writable object for each
    * record, directly caching the returned RDD will create many references to the same object.
    * If you plan to directly cache Hadoop writable objects, you should first copy them using
    * a `map` function.
@@ -316,7 +316,7 @@ class JavaSparkContext(val sc: SparkContext)
   /**
    * Get an RDD for a Hadoop SequenceFile.
    *
-   * '''Note:''' Because Hadoop's RecordReader class re-uses the same Writable object for each
+   * @note Because Hadoop's RecordReader class re-uses the same Writable object for each
    * record, directly caching the returned RDD will create many references to the same object.
    * If you plan to directly cache Hadoop writable objects, you should first copy them using
    * a `map` function.
@@ -366,7 +366,7 @@ class JavaSparkContext(val sc: SparkContext)
    * @param valueClass Class of the values
    * @param minPartitions Minimum number of Hadoop Splits to generate.
    *
-   * '''Note:''' Because Hadoop's RecordReader class re-uses the same Writable object for each
+   * @note Because Hadoop's RecordReader class re-uses the same Writable object for each
    * record, directly caching the returned RDD will create many references to the same object.
    * If you plan to directly cache Hadoop writable objects, you should first copy them using
    * a `map` function.
@@ -396,7 +396,7 @@ class JavaSparkContext(val sc: SparkContext)
    * @param keyClass Class of the keys
    * @param valueClass Class of the values
    *
-   * '''Note:''' Because Hadoop's RecordReader class re-uses the same Writable object for each
+   * @note Because Hadoop's RecordReader class re-uses the same Writable object for each
    * record, directly caching the returned RDD will create many references to the same object.
    * If you plan to directly cache Hadoop writable objects, you should first copy them using
    * a `map` function.
@@ -416,7 +416,7 @@ class JavaSparkContext(val sc: SparkContext)
   /**
    * Get an RDD for a Hadoop file with an arbitrary InputFormat.
    *
-   * '''Note:''' Because Hadoop's RecordReader class re-uses the same Writable object for each
+   * @note Because Hadoop's RecordReader class re-uses the same Writable object for each
    * record, directly caching the returned RDD will create many references to the same object.
    * If you plan to directly cache Hadoop writable objects, you should first copy them using
    * a `map` function.
@@ -437,7 +437,7 @@ class JavaSparkContext(val sc: SparkContext)
   /**
    * Get an RDD for a Hadoop file with an arbitrary InputFormat
    *
-   * '''Note:''' Because Hadoop's RecordReader class re-uses the same Writable object for each
+   * @note Because Hadoop's RecordReader class re-uses the same Writable object for each
    * record, directly caching the returned RDD will create many references to the same object.
    * If you plan to directly cache Hadoop writable objects, you should first copy them using
    * a `map` function.
@@ -458,7 +458,7 @@ class JavaSparkContext(val sc: SparkContext)
    * Get an RDD for a given Hadoop file with an arbitrary new API InputFormat
    * and extra configuration options to pass to the input format.
    *
-   * '''Note:''' Because Hadoop's RecordReader class re-uses the same Writable object for each
+   * @note Because Hadoop's RecordReader class re-uses the same Writable object for each
    * record, directly caching the returned RDD will create many references to the same object.
    * If you plan to directly cache Hadoop writable objects, you should first copy them using
    * a `map` function.
@@ -487,7 +487,7 @@ class JavaSparkContext(val sc: SparkContext)
    * @param kClass Class of the keys
    * @param vClass Class of the values
    *
-   * '''Note:''' Because Hadoop's RecordReader class re-uses the same Writable object for each
+   * @note Because Hadoop's RecordReader class re-uses the same Writable object for each
    * record, directly caching the returned RDD will create many references to the same object.
    * If you plan to directly cache Hadoop writable objects, you should first copy them using
    * a `map` function.
@@ -694,7 +694,7 @@ class JavaSparkContext(val sc: SparkContext)
   /**
    * Returns the Hadoop configuration used for the Hadoop code (e.g. file systems) we reuse.
    *
-   * '''Note:''' As it will be reused in all Hadoop RDDs, it's better not to modify it unless you
+   * @note As it will be reused in all Hadoop RDDs, it's better not to modify it unless you
    * plan to set some global configurations for all Hadoop RDDs.
    */
   def hadoopConfiguration(): Configuration = {
@@ -811,7 +811,8 @@ class JavaSparkContext(val sc: SparkContext)
 
   /**
    * Returns a Java map of JavaRDDs that have marked themselves as persistent via cache() call.
-   * Note that this does not necessarily mean the caching or computation was successful.
+   *
+   * @note This does not necessarily mean the caching or computation was successful.
    */
   def getPersistentRDDs: JMap[java.lang.Integer, JavaRDD[_]] = {
     sc.getPersistentRDDs.mapValues(s => JavaRDD.fromRDD(s))

http://git-wip-us.apache.org/repos/asf/spark/blob/4b396a65/core/src/main/scala/org/apache/spark/api/java/JavaSparkStatusTracker.scala
----------------------------------------------------------------------
diff --git a/core/src/main/scala/org/apache/spark/api/java/JavaSparkStatusTracker.scala b/core/src/main/scala/org/apache/spark/api/java/JavaSparkStatusTracker.scala
index 99ca3c7..6aa290e 100644
--- a/core/src/main/scala/org/apache/spark/api/java/JavaSparkStatusTracker.scala
+++ b/core/src/main/scala/org/apache/spark/api/java/JavaSparkStatusTracker.scala
@@ -31,7 +31,7 @@ import org.apache.spark.{SparkContext, SparkJobInfo, SparkStageInfo}
  * will provide information for the last `spark.ui.retainedStages` stages and
  * `spark.ui.retainedJobs` jobs.
  *
- * NOTE: this class's constructor should be considered private and may be subject to change.
+ * @note This class's constructor should be considered private and may be subject to change.
  */
 class JavaSparkStatusTracker private[spark] (sc: SparkContext) {
 

http://git-wip-us.apache.org/repos/asf/spark/blob/4b396a65/core/src/main/scala/org/apache/spark/io/CompressionCodec.scala
----------------------------------------------------------------------
diff --git a/core/src/main/scala/org/apache/spark/io/CompressionCodec.scala b/core/src/main/scala/org/apache/spark/io/CompressionCodec.scala
index ae014be..6ba79e5 100644
--- a/core/src/main/scala/org/apache/spark/io/CompressionCodec.scala
+++ b/core/src/main/scala/org/apache/spark/io/CompressionCodec.scala
@@ -32,9 +32,8 @@ import org.apache.spark.util.Utils
  * CompressionCodec allows the customization of choosing different compression implementations
  * to be used in block storage.
  *
- * Note: The wire protocol for a codec is not guaranteed compatible across versions of Spark.
- *       This is intended for use as an internal compression utility within a single
- *       Spark application.
+ * @note The wire protocol for a codec is not guaranteed compatible across versions of Spark.
+ * This is intended for use as an internal compression utility within a single Spark application.
  */
 @DeveloperApi
 trait CompressionCodec {
@@ -103,9 +102,9 @@ private[spark] object CompressionCodec {
  * LZ4 implementation of [[org.apache.spark.io.CompressionCodec]].
  * Block size can be configured by `spark.io.compression.lz4.blockSize`.
  *
- * Note: The wire protocol for this codec is not guaranteed to be compatible across versions
- *       of Spark. This is intended for use as an internal compression utility within a single Spark
- *       application.
+ * @note The wire protocol for this codec is not guaranteed to be compatible across versions
+ * of Spark. This is intended for use as an internal compression utility within a single Spark
+ * application.
  */
 @DeveloperApi
 class LZ4CompressionCodec(conf: SparkConf) extends CompressionCodec {
@@ -123,9 +122,9 @@ class LZ4CompressionCodec(conf: SparkConf) extends CompressionCodec {
  * :: DeveloperApi ::
  * LZF implementation of [[org.apache.spark.io.CompressionCodec]].
  *
- * Note: The wire protocol for this codec is not guaranteed to be compatible across versions
- *       of Spark. This is intended for use as an internal compression utility within a single Spark
- *       application.
+ * @note The wire protocol for this codec is not guaranteed to be compatible across versions
+ * of Spark. This is intended for use as an internal compression utility within a single Spark
+ * application.
  */
 @DeveloperApi
 class LZFCompressionCodec(conf: SparkConf) extends CompressionCodec {
@@ -143,9 +142,9 @@ class LZFCompressionCodec(conf: SparkConf) extends CompressionCodec {
  * Snappy implementation of [[org.apache.spark.io.CompressionCodec]].
  * Block size can be configured by `spark.io.compression.snappy.blockSize`.
  *
- * Note: The wire protocol for this codec is not guaranteed to be compatible across versions
- *       of Spark. This is intended for use as an internal compression utility within a single Spark
- *       application.
+ * @note The wire protocol for this codec is not guaranteed to be compatible across versions
+ * of Spark. This is intended for use as an internal compression utility within a single Spark
+ * application.
  */
 @DeveloperApi
 class SnappyCompressionCodec(conf: SparkConf) extends CompressionCodec {

http://git-wip-us.apache.org/repos/asf/spark/blob/4b396a65/core/src/main/scala/org/apache/spark/partial/BoundedDouble.scala
----------------------------------------------------------------------
diff --git a/core/src/main/scala/org/apache/spark/partial/BoundedDouble.scala b/core/src/main/scala/org/apache/spark/partial/BoundedDouble.scala
index ab6aba6..8f579c5 100644
--- a/core/src/main/scala/org/apache/spark/partial/BoundedDouble.scala
+++ b/core/src/main/scala/org/apache/spark/partial/BoundedDouble.scala
@@ -28,7 +28,7 @@ class BoundedDouble(val mean: Double, val confidence: Double, val low: Double, v
     this.mean.hashCode ^ this.confidence.hashCode ^ this.low.hashCode ^ this.high.hashCode
 
   /**
-   * Note that consistent with Double, any NaN value will make equality false
+   * @note Consistent with Double, any NaN value will make equality false
    */
   override def equals(that: Any): Boolean =
     that match {

http://git-wip-us.apache.org/repos/asf/spark/blob/4b396a65/core/src/main/scala/org/apache/spark/rdd/CoGroupedRDD.scala
----------------------------------------------------------------------
diff --git a/core/src/main/scala/org/apache/spark/rdd/CoGroupedRDD.scala b/core/src/main/scala/org/apache/spark/rdd/CoGroupedRDD.scala
index 2381f54..a091f06 100644
--- a/core/src/main/scala/org/apache/spark/rdd/CoGroupedRDD.scala
+++ b/core/src/main/scala/org/apache/spark/rdd/CoGroupedRDD.scala
@@ -66,14 +66,14 @@ private[spark] class CoGroupPartition(
 
 /**
  * :: DeveloperApi ::
- * A RDD that cogroups its parents. For each key k in parent RDDs, the resulting RDD contains a
+ * An RDD that cogroups its parents. For each key k in parent RDDs, the resulting RDD contains a
  * tuple with the list of values for that key.
  *
- * Note: This is an internal API. We recommend users use RDD.cogroup(...) instead of
- * instantiating this directly.
- *
  * @param rdds parent RDDs.
  * @param part partitioner used to partition the shuffle output
+ *
+ * @note This is an internal API. We recommend users use RDD.cogroup(...) instead of
+ * instantiating this directly.
  */
 @DeveloperApi
 class CoGroupedRDD[K: ClassTag](

http://git-wip-us.apache.org/repos/asf/spark/blob/4b396a65/core/src/main/scala/org/apache/spark/rdd/DoubleRDDFunctions.scala
----------------------------------------------------------------------
diff --git a/core/src/main/scala/org/apache/spark/rdd/DoubleRDDFunctions.scala b/core/src/main/scala/org/apache/spark/rdd/DoubleRDDFunctions.scala
index a05a770..f3ab324 100644
--- a/core/src/main/scala/org/apache/spark/rdd/DoubleRDDFunctions.scala
+++ b/core/src/main/scala/org/apache/spark/rdd/DoubleRDDFunctions.scala
@@ -158,7 +158,7 @@ class DoubleRDDFunctions(self: RDD[Double]) extends Logging with Serializable {
    *  e.g 1<=x<10 , 10<=x<20, 20<=x<=50
    *  And on the input of 1 and 50 we would have a histogram of 1, 0, 1
    *
-   * Note: if your histogram is evenly spaced (e.g. [0, 10, 20, 30]) this can be switched
+   * @note If your histogram is evenly spaced (e.g. [0, 10, 20, 30]) this can be switched
    * from an O(log n) insertion to O(1) per element. (where n = # buckets) if you set evenBuckets
    * to true.
    * buckets must be sorted and not contain any duplicates.

http://git-wip-us.apache.org/repos/asf/spark/blob/4b396a65/core/src/main/scala/org/apache/spark/rdd/HadoopRDD.scala
----------------------------------------------------------------------
diff --git a/core/src/main/scala/org/apache/spark/rdd/HadoopRDD.scala b/core/src/main/scala/org/apache/spark/rdd/HadoopRDD.scala
index 36a2f5c..86351b8 100644
--- a/core/src/main/scala/org/apache/spark/rdd/HadoopRDD.scala
+++ b/core/src/main/scala/org/apache/spark/rdd/HadoopRDD.scala
@@ -84,9 +84,6 @@ private[spark] class HadoopPartition(rddId: Int, override val index: Int, s: Inp
  * An RDD that provides core functionality for reading data stored in Hadoop (e.g., files in HDFS,
  * sources in HBase, or S3), using the older MapReduce API (`org.apache.hadoop.mapred`).
  *
- * Note: Instantiating this class directly is not recommended, please use
- * [[org.apache.spark.SparkContext.hadoopRDD()]]
- *
  * @param sc The SparkContext to associate the RDD with.
  * @param broadcastedConf A general Hadoop Configuration, or a subclass of it. If the enclosed
  *   variable references an instance of JobConf, then that JobConf will be used for the Hadoop job.
@@ -97,6 +94,9 @@ private[spark] class HadoopPartition(rddId: Int, override val index: Int, s: Inp
  * @param keyClass Class of the key associated with the inputFormatClass.
  * @param valueClass Class of the value associated with the inputFormatClass.
  * @param minPartitions Minimum number of HadoopRDD partitions (Hadoop Splits) to generate.
+ *
+ * @note Instantiating this class directly is not recommended, please use
+ * [[org.apache.spark.SparkContext.hadoopRDD()]]
  */
 @DeveloperApi
 class HadoopRDD[K, V](

http://git-wip-us.apache.org/repos/asf/spark/blob/4b396a65/core/src/main/scala/org/apache/spark/rdd/NewHadoopRDD.scala
----------------------------------------------------------------------
diff --git a/core/src/main/scala/org/apache/spark/rdd/NewHadoopRDD.scala b/core/src/main/scala/org/apache/spark/rdd/NewHadoopRDD.scala
index 488e777..a5965f5 100644
--- a/core/src/main/scala/org/apache/spark/rdd/NewHadoopRDD.scala
+++ b/core/src/main/scala/org/apache/spark/rdd/NewHadoopRDD.scala
@@ -57,13 +57,13 @@ private[spark] class NewHadoopPartition(
  * An RDD that provides core functionality for reading data stored in Hadoop (e.g., files in HDFS,
  * sources in HBase, or S3), using the new MapReduce API (`org.apache.hadoop.mapreduce`).
  *
- * Note: Instantiating this class directly is not recommended, please use
- * [[org.apache.spark.SparkContext.newAPIHadoopRDD()]]
- *
  * @param sc The SparkContext to associate the RDD with.
  * @param inputFormatClass Storage format of the data to be read.
  * @param keyClass Class of the key associated with the inputFormatClass.
  * @param valueClass Class of the value associated with the inputFormatClass.
+ *
+ * @note Instantiating this class directly is not recommended, please use
+ * [[org.apache.spark.SparkContext.newAPIHadoopRDD()]]
  */
 @DeveloperApi
 class NewHadoopRDD[K, V](

http://git-wip-us.apache.org/repos/asf/spark/blob/4b396a65/core/src/main/scala/org/apache/spark/rdd/PairRDDFunctions.scala
----------------------------------------------------------------------
diff --git a/core/src/main/scala/org/apache/spark/rdd/PairRDDFunctions.scala b/core/src/main/scala/org/apache/spark/rdd/PairRDDFunctions.scala
index 67baad1..9ed0f3d 100644
--- a/core/src/main/scala/org/apache/spark/rdd/PairRDDFunctions.scala
+++ b/core/src/main/scala/org/apache/spark/rdd/PairRDDFunctions.scala
@@ -59,8 +59,8 @@ class PairRDDFunctions[K, V](self: RDD[(K, V)])
    * :: Experimental ::
    * Generic function to combine the elements for each key using a custom set of aggregation
    * functions. Turns an RDD[(K, V)] into a result of type RDD[(K, C)], for a "combined type" C
-   * Note that V and C can be different -- for example, one might group an RDD of type
-   * (Int, Int) into an RDD of type (Int, Seq[Int]). Users provide three functions:
+   *
+   * Users provide three functions:
    *
    *  - `createCombiner`, which turns a V into a C (e.g., creates a one-element list)
    *  - `mergeValue`, to merge a V into a C (e.g., adds it to the end of a list)
@@ -68,6 +68,9 @@ class PairRDDFunctions[K, V](self: RDD[(K, V)])
    *
    * In addition, users can control the partitioning of the output RDD, and whether to perform
    * map-side aggregation (if a mapper can produce multiple items with the same key).
+   *
+   * @note V and C can be different -- for example, one might group an RDD of type
+   * (Int, Int) into an RDD of type (Int, Seq[Int]).
    */
   @Experimental
   def combineByKeyWithClassTag[C](
@@ -363,7 +366,7 @@ class PairRDDFunctions[K, V](self: RDD[(K, V)])
   /**
    * Count the number of elements for each key, collecting the results to a local Map.
    *
-   * Note that this method should only be used if the resulting map is expected to be small, as
+   * @note This method should only be used if the resulting map is expected to be small, as
    * the whole thing is loaded into the driver's memory.
    * To handle very large results, consider using rdd.mapValues(_ => 1L).reduceByKey(_ + _), which
    * returns an RDD[T, Long] instead of a map.
@@ -490,11 +493,11 @@ class PairRDDFunctions[K, V](self: RDD[(K, V)])
    * The ordering of elements within each group is not guaranteed, and may even differ
    * each time the resulting RDD is evaluated.
    *
-   * Note: This operation may be very expensive. If you are grouping in order to perform an
+   * @note This operation may be very expensive. If you are grouping in order to perform an
    * aggregation (such as a sum or average) over each key, using [[PairRDDFunctions.aggregateByKey]]
    * or [[PairRDDFunctions.reduceByKey]] will provide much better performance.
    *
-   * Note: As currently implemented, groupByKey must be able to hold all the key-value pairs for any
+   * @note As currently implemented, groupByKey must be able to hold all the key-value pairs for any
    * key in memory. If a key has too many values, it can result in an [[OutOfMemoryError]].
    */
   def groupByKey(partitioner: Partitioner): RDD[(K, Iterable[V])] = self.withScope {
@@ -514,11 +517,11 @@ class PairRDDFunctions[K, V](self: RDD[(K, V)])
    * resulting RDD with into `numPartitions` partitions. The ordering of elements within
    * each group is not guaranteed, and may even differ each time the resulting RDD is evaluated.
    *
-   * Note: This operation may be very expensive. If you are grouping in order to perform an
+   * @note This operation may be very expensive. If you are grouping in order to perform an
    * aggregation (such as a sum or average) over each key, using [[PairRDDFunctions.aggregateByKey]]
    * or [[PairRDDFunctions.reduceByKey]] will provide much better performance.
    *
-   * Note: As currently implemented, groupByKey must be able to hold all the key-value pairs for any
+   * @note As currently implemented, groupByKey must be able to hold all the key-value pairs for any
    * key in memory. If a key has too many values, it can result in an [[OutOfMemoryError]].
    */
   def groupByKey(numPartitions: Int): RDD[(K, Iterable[V])] = self.withScope {
@@ -635,7 +638,7 @@ class PairRDDFunctions[K, V](self: RDD[(K, V)])
    * within each group is not guaranteed, and may even differ each time the resulting RDD is
    * evaluated.
    *
-   * Note: This operation may be very expensive. If you are grouping in order to perform an
+   * @note This operation may be very expensive. If you are grouping in order to perform an
    * aggregation (such as a sum or average) over each key, using [[PairRDDFunctions.aggregateByKey]]
    * or [[PairRDDFunctions.reduceByKey]] will provide much better performance.
    */
@@ -1016,7 +1019,7 @@ class PairRDDFunctions[K, V](self: RDD[(K, V)])
    * Output the RDD to any Hadoop-supported file system, using a Hadoop `OutputFormat` class
    * supporting the key and value types K and V in this RDD.
    *
-   * Note that, we should make sure our tasks are idempotent when speculation is enabled, i.e. do
+   * @note We should make sure our tasks are idempotent when speculation is enabled, i.e. do
    * not use output committer that writes data directly.
    * There is an example in https://issues.apache.org/jira/browse/SPARK-10063 to show the bad
    * result of using direct output committer with speculation enabled.
@@ -1070,7 +1073,7 @@ class PairRDDFunctions[K, V](self: RDD[(K, V)])
    * output paths required (e.g. a table name to write to) in the same way as it would be
    * configured for a Hadoop MapReduce job.
    *
-   * Note that, we should make sure our tasks are idempotent when speculation is enabled, i.e. do
+   * @note We should make sure our tasks are idempotent when speculation is enabled, i.e. do
    * not use output committer that writes data directly.
    * There is an example in https://issues.apache.org/jira/browse/SPARK-10063 to show the bad
    * result of using direct output committer with speculation enabled.

http://git-wip-us.apache.org/repos/asf/spark/blob/4b396a65/core/src/main/scala/org/apache/spark/rdd/PartitionPruningRDD.scala
----------------------------------------------------------------------
diff --git a/core/src/main/scala/org/apache/spark/rdd/PartitionPruningRDD.scala b/core/src/main/scala/org/apache/spark/rdd/PartitionPruningRDD.scala
index 0c6ddda..ce75a16 100644
--- a/core/src/main/scala/org/apache/spark/rdd/PartitionPruningRDD.scala
+++ b/core/src/main/scala/org/apache/spark/rdd/PartitionPruningRDD.scala
@@ -48,7 +48,7 @@ private[spark] class PruneDependency[T](rdd: RDD[T], partitionFilterFunc: Int =>
 
 /**
  * :: DeveloperApi ::
- * A RDD used to prune RDD partitions/partitions so we can avoid launching tasks on
+ * An RDD used to prune RDD partitions/partitions so we can avoid launching tasks on
  * all partitions. An example use case: If we know the RDD is partitioned by range,
  * and the execution DAG has a filter on the key, we can avoid launching tasks
  * on partitions that don't have the range covering the key.

http://git-wip-us.apache.org/repos/asf/spark/blob/4b396a65/core/src/main/scala/org/apache/spark/rdd/PartitionwiseSampledRDD.scala
----------------------------------------------------------------------
diff --git a/core/src/main/scala/org/apache/spark/rdd/PartitionwiseSampledRDD.scala b/core/src/main/scala/org/apache/spark/rdd/PartitionwiseSampledRDD.scala
index 3b1acac..6a89ea8 100644
--- a/core/src/main/scala/org/apache/spark/rdd/PartitionwiseSampledRDD.scala
+++ b/core/src/main/scala/org/apache/spark/rdd/PartitionwiseSampledRDD.scala
@@ -32,7 +32,7 @@ class PartitionwiseSampledRDDPartition(val prev: Partition, val seed: Long)
 }
 
 /**
- * A RDD sampled from its parent RDD partition-wise. For each partition of the parent RDD,
+ * An RDD sampled from its parent RDD partition-wise. For each partition of the parent RDD,
  * a user-specified [[org.apache.spark.util.random.RandomSampler]] instance is used to obtain
  * a random sample of the records in the partition. The random seeds assigned to the samplers
  * are guaranteed to have different values.

http://git-wip-us.apache.org/repos/asf/spark/blob/4b396a65/core/src/main/scala/org/apache/spark/rdd/RDD.scala
----------------------------------------------------------------------
diff --git a/core/src/main/scala/org/apache/spark/rdd/RDD.scala b/core/src/main/scala/org/apache/spark/rdd/RDD.scala
index cded899..bff2b8f 100644
--- a/core/src/main/scala/org/apache/spark/rdd/RDD.scala
+++ b/core/src/main/scala/org/apache/spark/rdd/RDD.scala
@@ -428,7 +428,7 @@ abstract class RDD[T: ClassTag](
    * current upstream partitions will be executed in parallel (per whatever
    * the current partitioning is).
    *
-   * Note: With shuffle = true, you can actually coalesce to a larger number
+   * @note With shuffle = true, you can actually coalesce to a larger number
    * of partitions. This is useful if you have a small number of partitions,
    * say 100, potentially with a few partitions being abnormally large. Calling
    * coalesce(1000, shuffle = true) will result in 1000 partitions with the
@@ -466,14 +466,14 @@ abstract class RDD[T: ClassTag](
   /**
    * Return a sampled subset of this RDD.
    *
-   * Note: this is NOT guaranteed to provide exactly the fraction of the count
-   * of the given [[RDD]].
-   *
    * @param withReplacement can elements be sampled multiple times (replaced when sampled out)
    * @param fraction expected size of the sample as a fraction of this RDD's size
    *  without replacement: probability that each element is chosen; fraction must be [0, 1]
    *  with replacement: expected number of times each element is chosen; fraction must be >= 0
    * @param seed seed for the random number generator
+   *
+   * @note This is NOT guaranteed to provide exactly the fraction of the count
+   * of the given [[RDD]].
    */
   def sample(
       withReplacement: Boolean,
@@ -537,13 +537,13 @@ abstract class RDD[T: ClassTag](
   /**
    * Return a fixed-size sampled subset of this RDD in an array
    *
-   * @note this method should only be used if the resulting array is expected to be small, as
-   * all the data is loaded into the driver's memory.
-   *
    * @param withReplacement whether sampling is done with replacement
    * @param num size of the returned sample
    * @param seed seed for the random number generator
    * @return sample of specified size in an array
+   *
+   * @note this method should only be used if the resulting array is expected to be small, as
+   * all the data is loaded into the driver's memory.
    */
   def takeSample(
       withReplacement: Boolean,
@@ -618,7 +618,7 @@ abstract class RDD[T: ClassTag](
    * Return the intersection of this RDD and another one. The output will not contain any duplicate
    * elements, even if the input RDDs did.
    *
-   * Note that this method performs a shuffle internally.
+   * @note This method performs a shuffle internally.
    */
   def intersection(other: RDD[T]): RDD[T] = withScope {
     this.map(v => (v, null)).cogroup(other.map(v => (v, null)))
@@ -630,7 +630,7 @@ abstract class RDD[T: ClassTag](
    * Return the intersection of this RDD and another one. The output will not contain any duplicate
    * elements, even if the input RDDs did.
    *
-   * Note that this method performs a shuffle internally.
+   * @note This method performs a shuffle internally.
    *
    * @param partitioner Partitioner to use for the resulting RDD
    */
@@ -646,7 +646,7 @@ abstract class RDD[T: ClassTag](
    * Return the intersection of this RDD and another one. The output will not contain any duplicate
    * elements, even if the input RDDs did.  Performs a hash partition across the cluster
    *
-   * Note that this method performs a shuffle internally.
+   * @note This method performs a shuffle internally.
    *
    * @param numPartitions How many partitions to use in the resulting RDD
    */
@@ -674,7 +674,7 @@ abstract class RDD[T: ClassTag](
    * mapping to that key. The ordering of elements within each group is not guaranteed, and
    * may even differ each time the resulting RDD is evaluated.
    *
-   * Note: This operation may be very expensive. If you are grouping in order to perform an
+   * @note This operation may be very expensive. If you are grouping in order to perform an
    * aggregation (such as a sum or average) over each key, using [[PairRDDFunctions.aggregateByKey]]
    * or [[PairRDDFunctions.reduceByKey]] will provide much better performance.
    */
@@ -687,7 +687,7 @@ abstract class RDD[T: ClassTag](
    * mapping to that key. The ordering of elements within each group is not 
guaranteed, and
    * may even differ each time the resulting RDD is evaluated.
    *
-   * Note: This operation may be very expensive. If you are grouping in order 
to perform an
+   * @note This operation may be very expensive. If you are grouping in order 
to perform an
    * aggregation (such as a sum or average) over each key, using 
[[PairRDDFunctions.aggregateByKey]]
    * or [[PairRDDFunctions.reduceByKey]] will provide much better performance.
    */
@@ -702,7 +702,7 @@ abstract class RDD[T: ClassTag](
    * mapping to that key. The ordering of elements within each group is not 
guaranteed, and
    * may even differ each time the resulting RDD is evaluated.
    *
-   * Note: This operation may be very expensive. If you are grouping in order 
to perform an
+   * @note This operation may be very expensive. If you are grouping in order 
to perform an
    * aggregation (such as a sum or average) over each key, using 
[[PairRDDFunctions.aggregateByKey]]
    * or [[PairRDDFunctions.reduceByKey]] will provide much better performance.
    */
@@ -921,7 +921,7 @@ abstract class RDD[T: ClassTag](
   /**
    * Return an array that contains all of the elements in this RDD.
    *
-   * @note this method should only be used if the resulting array is expected 
to be small, as
+   * @note This method should only be used if the resulting array is expected 
to be small, as
    * all the data is loaded into the driver's memory.
    */
   def collect(): Array[T] = withScope {
@@ -934,7 +934,7 @@ abstract class RDD[T: ClassTag](
    *
    * The iterator will consume as much memory as the largest partition in this 
RDD.
    *
-   * Note: this results in multiple Spark jobs, and if the input RDD is the 
result
+   * @note This results in multiple Spark jobs, and if the input RDD is the 
result
    * of a wide transformation (e.g. join with different partitioners), to avoid
    * recomputing the input RDD should be cached first.
    */
@@ -1182,7 +1182,7 @@ abstract class RDD[T: ClassTag](
   /**
    * Return the count of each unique value in this RDD as a local map of 
(value, count) pairs.
    *
-   * Note that this method should only be used if the resulting map is 
expected to be small, as
+   * @note This method should only be used if the resulting map is expected to 
be small, as
    * the whole thing is loaded into the driver's memory.
    * To handle very large results, consider using rdd.map(x =&gt; (x, 
1L)).reduceByKey(_ + _), which
    * returns an RDD[T, Long] instead of a map.
@@ -1272,7 +1272,7 @@ abstract class RDD[T: ClassTag](
    * This is similar to Scala's zipWithIndex but it uses Long instead of Int 
as the index type.
    * This method needs to trigger a spark job when this RDD contains more than 
one partitions.
    *
-   * Note that some RDDs, such as those returned by groupBy(), do not 
guarantee order of
+   * @note Some RDDs, such as those returned by groupBy(), do not guarantee 
order of
    * elements in a partition. The index assigned to each element is therefore 
not guaranteed,
    * and may even change if the RDD is reevaluated. If a fixed ordering is 
required to guarantee
    * the same index assignments, you should sort the RDD with sortByKey() or 
save it to a file.
@@ -1286,7 +1286,7 @@ abstract class RDD[T: ClassTag](
    * 2*n+k, ..., where n is the number of partitions. So there may exist gaps, 
but this method
    * won't trigger a spark job, which is different from 
[[org.apache.spark.rdd.RDD#zipWithIndex]].
    *
-   * Note that some RDDs, such as those returned by groupBy(), do not 
guarantee order of
+   * @note Some RDDs, such as those returned by groupBy(), do not guarantee 
order of
    * elements in a partition. The unique ID assigned to each element is 
therefore not guaranteed,
    * and may even change if the RDD is reevaluated. If a fixed ordering is 
required to guarantee
    * the same index assignments, you should sort the RDD with sortByKey() or 
save it to a file.
@@ -1305,10 +1305,10 @@ abstract class RDD[T: ClassTag](
    * results from that partition to estimate the number of additional 
partitions needed to satisfy
    * the limit.
    *
-   * @note this method should only be used if the resulting array is expected 
to be small, as
+   * @note This method should only be used if the resulting array is expected 
to be small, as
    * all the data is loaded into the driver's memory.
    *
-   * @note due to complications in the internal implementation, this method 
will raise
+   * @note Due to complications in the internal implementation, this method 
will raise
    * an exception if called on an RDD of `Nothing` or `Null`.
    */
   def take(num: Int): Array[T] = withScope {
@@ -1370,7 +1370,7 @@ abstract class RDD[T: ClassTag](
    *   // returns Array(6, 5)
    * }}}
    *
-   * @note this method should only be used if the resulting array is expected 
to be small, as
+   * @note This method should only be used if the resulting array is expected 
to be small, as
    * all the data is loaded into the driver's memory.
    *
    * @param num k, the number of top elements to return
@@ -1393,7 +1393,7 @@ abstract class RDD[T: ClassTag](
    *   // returns Array(2, 3)
    * }}}
    *
-   * @note this method should only be used if the resulting array is expected 
to be small, as
+   * @note This method should only be used if the resulting array is expected 
to be small, as
    * all the data is loaded into the driver's memory.
    *
    * @param num k, the number of elements to return
@@ -1438,7 +1438,7 @@ abstract class RDD[T: ClassTag](
   }
 
   /**
-   * @note due to complications in the internal implementation, this method 
will raise an
+   * @note Due to complications in the internal implementation, this method 
will raise an
    * exception if called on an RDD of `Nothing` or `Null`. This may be come up 
in practice
    * because, for example, the type of `parallelize(Seq())` is `RDD[Nothing]`.
    * (`parallelize(Seq())` should be avoided anyway in favor of 
`parallelize(Seq[T]())`.)

http://git-wip-us.apache.org/repos/asf/spark/blob/4b396a65/core/src/main/scala/org/apache/spark/rdd/RDDCheckpointData.scala
----------------------------------------------------------------------
diff --git a/core/src/main/scala/org/apache/spark/rdd/RDDCheckpointData.scala b/core/src/main/scala/org/apache/spark/rdd/RDDCheckpointData.scala
index 429514b..1070bb9 100644
--- a/core/src/main/scala/org/apache/spark/rdd/RDDCheckpointData.scala
+++ b/core/src/main/scala/org/apache/spark/rdd/RDDCheckpointData.scala
@@ -32,7 +32,7 @@ private[spark] object CheckpointState extends Enumeration {
 
 /**
  * This class contains all the information related to RDD checkpointing. Each 
instance of this
- * class is associated with a RDD. It manages process of checkpointing of the 
associated RDD,
+ * class is associated with an RDD. It manages process of checkpointing of the 
associated RDD,
  * as well as, manages the post-checkpoint state by providing the updated 
partitions,
  * iterator and preferred locations of the checkpointed RDD.
  */

http://git-wip-us.apache.org/repos/asf/spark/blob/4b396a65/core/src/main/scala/org/apache/spark/rdd/ReliableCheckpointRDD.scala
----------------------------------------------------------------------
diff --git a/core/src/main/scala/org/apache/spark/rdd/ReliableCheckpointRDD.scala b/core/src/main/scala/org/apache/spark/rdd/ReliableCheckpointRDD.scala
index eac901d..7f399ec 100644
--- a/core/src/main/scala/org/apache/spark/rdd/ReliableCheckpointRDD.scala
+++ b/core/src/main/scala/org/apache/spark/rdd/ReliableCheckpointRDD.scala
@@ -151,7 +151,7 @@ private[spark] object ReliableCheckpointRDD extends Logging 
{
   }
 
   /**
-   * Write a RDD partition's data to a checkpoint file.
+   * Write an RDD partition's data to a checkpoint file.
    */
   def writePartitionToCheckpointFile[T: ClassTag](
       path: String,

http://git-wip-us.apache.org/repos/asf/spark/blob/4b396a65/core/src/main/scala/org/apache/spark/rdd/SequenceFileRDDFunctions.scala
----------------------------------------------------------------------
diff --git a/core/src/main/scala/org/apache/spark/rdd/SequenceFileRDDFunctions.scala b/core/src/main/scala/org/apache/spark/rdd/SequenceFileRDDFunctions.scala
index 1311b48..86a3327 100644
--- a/core/src/main/scala/org/apache/spark/rdd/SequenceFileRDDFunctions.scala
+++ b/core/src/main/scala/org/apache/spark/rdd/SequenceFileRDDFunctions.scala
@@ -27,9 +27,10 @@ import org.apache.spark.internal.Logging
 
 /**
  * Extra functions available on RDDs of (key, value) pairs to create a Hadoop 
SequenceFile,
- * through an implicit conversion. Note that this can't be part of 
PairRDDFunctions because
- * we need more implicit parameters to convert our keys and values to Writable.
+ * through an implicit conversion.
  *
+ * @note This can't be part of PairRDDFunctions because we need more implicit 
parameters to
+ * convert our keys and values to Writable.
  */
 class SequenceFileRDDFunctions[K <% Writable: ClassTag, V <% Writable : 
ClassTag](
     self: RDD[(K, V)],

http://git-wip-us.apache.org/repos/asf/spark/blob/4b396a65/core/src/main/scala/org/apache/spark/rdd/ZippedWithIndexRDD.scala
----------------------------------------------------------------------
diff --git a/core/src/main/scala/org/apache/spark/rdd/ZippedWithIndexRDD.scala b/core/src/main/scala/org/apache/spark/rdd/ZippedWithIndexRDD.scala
index b0e5ba0..8425b21 100644
--- a/core/src/main/scala/org/apache/spark/rdd/ZippedWithIndexRDD.scala
+++ b/core/src/main/scala/org/apache/spark/rdd/ZippedWithIndexRDD.scala
@@ -29,7 +29,7 @@ class ZippedWithIndexRDDPartition(val prev: Partition, val 
startIndex: Long)
 }
 
 /**
- * Represents a RDD zipped with its element indices. The ordering is first 
based on the partition
+ * Represents an RDD zipped with its element indices. The ordering is first 
based on the partition
  * index and then the ordering of items within each partition. So the first 
item in the first
  * partition gets index 0, and the last item in the last partition receives 
the largest index.
  *

http://git-wip-us.apache.org/repos/asf/spark/blob/4b396a65/core/src/main/scala/org/apache/spark/scheduler/AccumulableInfo.scala
----------------------------------------------------------------------
diff --git a/core/src/main/scala/org/apache/spark/scheduler/AccumulableInfo.scala b/core/src/main/scala/org/apache/spark/scheduler/AccumulableInfo.scala
index cedacad..0a5fe5a 100644
--- a/core/src/main/scala/org/apache/spark/scheduler/AccumulableInfo.scala
+++ b/core/src/main/scala/org/apache/spark/scheduler/AccumulableInfo.scala
@@ -24,11 +24,6 @@ import org.apache.spark.annotation.DeveloperApi
  * :: DeveloperApi ::
  * Information about an [[org.apache.spark.Accumulable]] modified during a 
task or stage.
  *
- * Note: once this is JSON serialized the types of `update` and `value` will 
be lost and be
- * cast to strings. This is because the user can define an accumulator of any 
type and it will
- * be difficult to preserve the type in consumers of the event log. This does 
not apply to
- * internal accumulators that represent task level metrics.
- *
  * @param id accumulator ID
  * @param name accumulator name
  * @param update partial value from a task, may be None if used on driver to 
describe a stage
@@ -36,6 +31,11 @@ import org.apache.spark.annotation.DeveloperApi
  * @param internal whether this accumulator was internal
  * @param countFailedValues whether to count this accumulator's partial value 
if the task failed
  * @param metadata internal metadata associated with this accumulator, if any
+ *
+ * @note Once this is JSON serialized the types of `update` and `value` will 
be lost and be
+ * cast to strings. This is because the user can define an accumulator of any 
type and it will
+ * be difficult to preserve the type in consumers of the event log. This does 
not apply to
+ * internal accumulators that represent task level metrics.
  */
 @DeveloperApi
 case class AccumulableInfo private[spark] (

http://git-wip-us.apache.org/repos/asf/spark/blob/4b396a65/core/src/main/scala/org/apache/spark/serializer/JavaSerializer.scala
----------------------------------------------------------------------
diff --git a/core/src/main/scala/org/apache/spark/serializer/JavaSerializer.scala b/core/src/main/scala/org/apache/spark/serializer/JavaSerializer.scala
index 8b72da2..f60dcfd 100644
--- a/core/src/main/scala/org/apache/spark/serializer/JavaSerializer.scala
+++ b/core/src/main/scala/org/apache/spark/serializer/JavaSerializer.scala
@@ -131,7 +131,7 @@ private[spark] class JavaSerializerInstance(
  * :: DeveloperApi ::
  * A Spark serializer that uses Java's built-in serialization.
  *
- * Note that this serializer is not guaranteed to be wire-compatible across 
different versions of
+ * @note This serializer is not guaranteed to be wire-compatible across 
different versions of
  * Spark. It is intended to be used to serialize/de-serialize data within a 
single
  * Spark application.
  */

http://git-wip-us.apache.org/repos/asf/spark/blob/4b396a65/core/src/main/scala/org/apache/spark/serializer/KryoSerializer.scala
----------------------------------------------------------------------
diff --git a/core/src/main/scala/org/apache/spark/serializer/KryoSerializer.scala b/core/src/main/scala/org/apache/spark/serializer/KryoSerializer.scala
index 0d26281..19e020c 100644
--- a/core/src/main/scala/org/apache/spark/serializer/KryoSerializer.scala
+++ b/core/src/main/scala/org/apache/spark/serializer/KryoSerializer.scala
@@ -45,7 +45,7 @@ import org.apache.spark.util.collection.CompactBuffer
 /**
  * A Spark serializer that uses the [[https://code.google.com/p/kryo/ Kryo 
serialization library]].
  *
- * Note that this serializer is not guaranteed to be wire-compatible across 
different versions of
+ * @note This serializer is not guaranteed to be wire-compatible across 
different versions of
  * Spark. It is intended to be used to serialize/de-serialize data within a 
single
  * Spark application.
  */

http://git-wip-us.apache.org/repos/asf/spark/blob/4b396a65/core/src/main/scala/org/apache/spark/serializer/Serializer.scala
----------------------------------------------------------------------
diff --git a/core/src/main/scala/org/apache/spark/serializer/Serializer.scala b/core/src/main/scala/org/apache/spark/serializer/Serializer.scala
index cb95246..afe6cd8 100644
--- a/core/src/main/scala/org/apache/spark/serializer/Serializer.scala
+++ b/core/src/main/scala/org/apache/spark/serializer/Serializer.scala
@@ -40,7 +40,7 @@ import org.apache.spark.util.NextIterator
  *
  * 2. Java serialization interface.
  *
- * Note that serializers are not required to be wire-compatible across 
different versions of Spark.
+ * @note Serializers are not required to be wire-compatible across different 
versions of Spark.
  * They are intended to be used to serialize/de-serialize data within a single 
Spark application.
  */
 @DeveloperApi

http://git-wip-us.apache.org/repos/asf/spark/blob/4b396a65/core/src/main/scala/org/apache/spark/storage/StorageUtils.scala
----------------------------------------------------------------------
diff --git a/core/src/main/scala/org/apache/spark/storage/StorageUtils.scala b/core/src/main/scala/org/apache/spark/storage/StorageUtils.scala
index fb9941b..e12f2e6 100644
--- a/core/src/main/scala/org/apache/spark/storage/StorageUtils.scala
+++ b/core/src/main/scala/org/apache/spark/storage/StorageUtils.scala
@@ -71,7 +71,7 @@ class StorageStatus(val blockManagerId: BlockManagerId, val 
maxMem: Long) {
   /**
    * Return the blocks stored in this block manager.
    *
-   * Note that this is somewhat expensive, as it involves cloning the 
underlying maps and then
+   * @note This is somewhat expensive, as it involves cloning the underlying 
maps and then
    * concatenating them together. Much faster alternatives exist for common 
operations such as
    * contains, get, and size.
    */
@@ -80,7 +80,7 @@ class StorageStatus(val blockManagerId: BlockManagerId, val 
maxMem: Long) {
   /**
    * Return the RDD blocks stored in this block manager.
    *
-   * Note that this is somewhat expensive, as it involves cloning the 
underlying maps and then
+   * @note This is somewhat expensive, as it involves cloning the underlying 
maps and then
    * concatenating them together. Much faster alternatives exist for common 
operations such as
    * getting the memory, disk, and off-heap memory sizes occupied by this RDD.
    */
@@ -128,7 +128,8 @@ class StorageStatus(val blockManagerId: BlockManagerId, val 
maxMem: Long) {
 
   /**
    * Return whether the given block is stored in this block manager in O(1) 
time.
-   * Note that this is much faster than `this.blocks.contains`, which is 
O(blocks) time.
+   *
+   * @note This is much faster than `this.blocks.contains`, which is O(blocks) 
time.
    */
   def containsBlock(blockId: BlockId): Boolean = {
     blockId match {
@@ -141,7 +142,8 @@ class StorageStatus(val blockManagerId: BlockManagerId, val 
maxMem: Long) {
 
   /**
    * Return the given block stored in this block manager in O(1) time.
-   * Note that this is much faster than `this.blocks.get`, which is O(blocks) 
time.
+   *
+   * @note This is much faster than `this.blocks.get`, which is O(blocks) time.
    */
   def getBlock(blockId: BlockId): Option[BlockStatus] = {
     blockId match {
@@ -154,19 +156,22 @@ class StorageStatus(val blockManagerId: BlockManagerId, 
val maxMem: Long) {
 
   /**
    * Return the number of blocks stored in this block manager in O(RDDs) time.
-   * Note that this is much faster than `this.blocks.size`, which is O(blocks) 
time.
+   *
+   * @note This is much faster than `this.blocks.size`, which is O(blocks) 
time.
    */
   def numBlocks: Int = _nonRddBlocks.size + numRddBlocks
 
   /**
    * Return the number of RDD blocks stored in this block manager in O(RDDs) 
time.
-   * Note that this is much faster than `this.rddBlocks.size`, which is O(RDD 
blocks) time.
+   *
+   * @note This is much faster than `this.rddBlocks.size`, which is O(RDD 
blocks) time.
    */
   def numRddBlocks: Int = _rddBlocks.values.map(_.size).sum
 
   /**
    * Return the number of blocks that belong to the given RDD in O(1) time.
-   * Note that this is much faster than `this.rddBlocksById(rddId).size`, 
which is
+   *
+   * @note This is much faster than `this.rddBlocksById(rddId).size`, which is
    * O(blocks in this RDD) time.
    */
   def numRddBlocksById(rddId: Int): Int = 
_rddBlocks.get(rddId).map(_.size).getOrElse(0)

http://git-wip-us.apache.org/repos/asf/spark/blob/4b396a65/core/src/main/scala/org/apache/spark/util/AccumulatorV2.scala
----------------------------------------------------------------------
diff --git a/core/src/main/scala/org/apache/spark/util/AccumulatorV2.scala b/core/src/main/scala/org/apache/spark/util/AccumulatorV2.scala
index d3ddd39..1326f09 100644
--- a/core/src/main/scala/org/apache/spark/util/AccumulatorV2.scala
+++ b/core/src/main/scala/org/apache/spark/util/AccumulatorV2.scala
@@ -59,8 +59,9 @@ abstract class AccumulatorV2[IN, OUT] extends Serializable {
   }
 
   /**
-   * Returns true if this accumulator has been registered.  Note that all 
accumulators must be
-   * registered before use, or it will throw exception.
+   * Returns true if this accumulator has been registered.
+   *
+   * @note All accumulators must be registered before use, or it will throw 
exception.
    */
   final def isRegistered: Boolean =
     metadata != null && AccumulatorContext.get(metadata.id).isDefined

http://git-wip-us.apache.org/repos/asf/spark/blob/4b396a65/core/src/test/scala/org/apache/spark/scheduler/DAGSchedulerSuite.scala
----------------------------------------------------------------------
diff --git a/core/src/test/scala/org/apache/spark/scheduler/DAGSchedulerSuite.scala b/core/src/test/scala/org/apache/spark/scheduler/DAGSchedulerSuite.scala
index bec95d1..5e8a854 100644
--- a/core/src/test/scala/org/apache/spark/scheduler/DAGSchedulerSuite.scala
+++ b/core/src/test/scala/org/apache/spark/scheduler/DAGSchedulerSuite.scala
@@ -2076,7 +2076,7 @@ class DAGSchedulerSuite extends SparkFunSuite with 
LocalSparkContext with Timeou
   }
 
   /**
-   * Checks the DAGScheduler's internal logic for traversing a RDD DAG by 
making sure that
+   * Checks the DAGScheduler's internal logic for traversing an RDD DAG by 
making sure that
    * getShuffleDependencies correctly returns the direct shuffle dependencies 
of a particular
    * RDD. The test creates the following RDD graph (where n denotes a narrow 
dependency and s
    * denotes a shuffle dependency):

http://git-wip-us.apache.org/repos/asf/spark/blob/4b396a65/docs/mllib-isotonic-regression.md
----------------------------------------------------------------------
diff --git a/docs/mllib-isotonic-regression.md b/docs/mllib-isotonic-regression.md
index d90905a..ca84551 100644
--- a/docs/mllib-isotonic-regression.md
+++ b/docs/mllib-isotonic-regression.md
@@ -27,7 +27,7 @@ best fitting the original data points.
 [pool adjacent violators algorithm](http://doi.org/10.1198/TECH.2010.10111)
 which uses an approach to
 [parallelizing isotonic 
regression](http://doi.org/10.1007/978-3-642-99789-1_10).
-The training input is a RDD of tuples of three double values that represent
+The training input is an RDD of tuples of three double values that represent
 label, feature and weight in this order. Additionally IsotonicRegression 
algorithm has one
 optional parameter called $isotonic$ defaulting to true.
 This argument specifies if the isotonic regression is

http://git-wip-us.apache.org/repos/asf/spark/blob/4b396a65/docs/streaming-programming-guide.md
----------------------------------------------------------------------
diff --git a/docs/streaming-programming-guide.md b/docs/streaming-programming-guide.md
index 0b0315b..18fc1cd 100644
--- a/docs/streaming-programming-guide.md
+++ b/docs/streaming-programming-guide.md
@@ -2191,7 +2191,7 @@ consistent batch processing times. Make sure you set the 
CMS GC on both the driv
 
 - When data is received from a stream source, receiver creates blocks of data. 
 A new block of data is generated every blockInterval milliseconds. N blocks of 
data are created during the batchInterval where N = 
batchInterval/blockInterval. These blocks are distributed by the BlockManager 
of the current executor to the block managers of other executors. After that, 
the Network Input Tracker running on the driver is informed about the block 
locations for further processing.
 
-- A RDD is created on the driver for the blocks created during the 
batchInterval. The blocks generated during the batchInterval are partitions of 
the RDD. Each partition is a task in spark. blockInterval== batchinterval would 
mean that a single partition is created and probably it is processed locally.
+- An RDD is created on the driver for the blocks created during the 
batchInterval. The blocks generated during the batchInterval are partitions of 
the RDD. Each partition is a task in spark. blockInterval== batchinterval would 
mean that a single partition is created and probably it is processed locally.
 
 - The map tasks on the blocks are processed in the executors (one that 
received the block, and another where the block was replicated) that has the 
blocks irrespective of block interval, unless non-local scheduling kicks in.
 Having bigger blockinterval means bigger blocks. A high value of 
`spark.locality.wait` increases the chance of processing a block on the local 
node. A balance needs to be found out between these two parameters to ensure 
that the bigger blocks are processed locally.

http://git-wip-us.apache.org/repos/asf/spark/blob/4b396a65/external/kafka-0-10-sql/src/main/scala/org/apache/spark/sql/kafka010/KafkaSource.scala
----------------------------------------------------------------------
diff --git a/external/kafka-0-10-sql/src/main/scala/org/apache/spark/sql/kafka010/KafkaSource.scala b/external/kafka-0-10-sql/src/main/scala/org/apache/spark/sql/kafka010/KafkaSource.scala
index 5bcc512..341081a 100644
--- a/external/kafka-0-10-sql/src/main/scala/org/apache/spark/sql/kafka010/KafkaSource.scala
+++ b/external/kafka-0-10-sql/src/main/scala/org/apache/spark/sql/kafka010/KafkaSource.scala
@@ -279,7 +279,7 @@ private[kafka010] case class KafkaSource(
       }
     }.toArray
 
-    // Create a RDD that reads from Kafka and get the (key, value) pair as 
byte arrays.
+    // Create an RDD that reads from Kafka and get the (key, value) pair as 
byte arrays.
     val rdd = new KafkaSourceRDD(
       sc, executorKafkaParams, offsetRanges, pollTimeoutMs).map { cr =>
       Row(cr.key, cr.value, cr.topic, cr.partition, cr.offset, cr.timestamp, 
cr.timestampType.id)
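
For illustration only (not part of the commit): a minimal sketch of the Scaladoc layout this change standardizes on, where a caveat that used to live in the prose as `Note that ...` or `Note: ...` becomes a capitalized `@note` tag in its own paragraph. `NoteStyleExample` and `takeFirst` are hypothetical names chosen for the example and are not Spark APIs; the plain `Seq` is an assumed stand-in for an RDD.

```scala
object NoteStyleExample {
  /**
   * Return the first `num` elements of `xs` as an array.
   *
   * @note This method should only be used if the resulting array is expected to be small,
   * as all the returned elements are copied into a new in-memory array.
   *
   * @param xs source sequence (hypothetical stand-in for an RDD)
   * @param num number of elements to return
   */
  def takeFirst(xs: Seq[Int], num: Int): Array[Int] = xs.take(num).toArray
}
```

Scaladoc should render the `@note` paragraph as a separate "Note" entry rather than as running text in the description, which is the consistency the commit message describes.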

