Repository: spark
Updated Branches:
  refs/heads/master 1b144455b -> 024482bf5


[MINOR][DOCS] Fix all typos in markdown files under `docs` and similar patterns in other comments

## What changes were proposed in this pull request?

This PR fixes all typos in the markdown files under the `docs` module,
and fixes similar typos in other comments as well.

## How was this patch tested?

Manual tests.

Author: Dongjoon Hyun <[email protected]>

Closes #11300 from dongjoon-hyun/minor_fix_typos.


Project: http://git-wip-us.apache.org/repos/asf/spark/repo
Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/024482bf
Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/024482bf
Diff: http://git-wip-us.apache.org/repos/asf/spark/diff/024482bf

Branch: refs/heads/master
Commit: 024482bf51e8158eed08a7dc0758f585baf86e1f
Parents: 1b14445
Author: Dongjoon Hyun <[email protected]>
Authored: Mon Feb 22 09:52:07 2016 +0000
Committer: Sean Owen <[email protected]>
Committed: Mon Feb 22 09:52:07 2016 +0000

----------------------------------------------------------------------
 R/pkg/R/functions.R                                            | 6 +++---
 R/pkg/R/sparkR.R                                               | 2 +-
 .../main/java/org/apache/spark/util/sketch/CountMinSketch.java | 2 +-
 core/src/main/scala/org/apache/spark/CacheManager.scala        | 2 +-
 .../main/scala/org/apache/spark/rdd/OrderedRDDFunctions.scala  | 2 +-
 .../main/scala/org/apache/spark/scheduler/DAGScheduler.scala   | 2 +-
 .../org/apache/spark/deploy/history/HistoryServerSuite.scala   | 6 +++---
 docs/ml-classification-regression.md                           | 2 +-
 docs/ml-features.md                                            | 6 +++---
 docs/ml-guide.md                                               | 2 +-
 docs/mllib-clustering.md                                       | 6 +++---
 docs/mllib-evaluation-metrics.md                               | 6 +++---
 docs/mllib-frequent-pattern-mining.md                          | 2 +-
 docs/monitoring.md                                             | 2 +-
 docs/programming-guide.md                                      | 2 +-
 docs/running-on-mesos.md                                       | 4 ++--
 docs/spark-standalone.md                                       | 2 +-
 docs/sql-programming-guide.md                                  | 2 +-
 docs/streaming-flume-integration.md                            | 2 +-
 docs/streaming-kinesis-integration.md                          | 4 ++--
 docs/streaming-programming-guide.md                            | 2 +-
 .../main/scala/org/apache/spark/graphx/impl/GraphImpl.scala    | 2 +-
 .../scala/org/apache/spark/graphx/util/GraphGenerators.scala   | 2 +-
 .../main/scala/org/apache/spark/ml/recommendation/ALS.scala    | 2 +-
 python/pyspark/sql/functions.py                                | 6 +++---
 .../sql/catalyst/analysis/DistinctAggregationRewriter.scala    | 2 +-
 .../catalyst/expressions/aggregate/HyperLogLogPlusPlus.scala   | 2 +-
 .../expressions/codegen/GenerateMutableProjection.scala        | 2 +-
 .../test/scala/org/apache/spark/sql/RandomDataGenerator.scala  | 4 ++--
 .../spark/sql/execution/datasources/WriterContainer.scala      | 2 +-
 .../spark/sql/execution/exchange/ExchangeCoordinator.scala     | 2 +-
 sql/core/src/main/scala/org/apache/spark/sql/functions.scala   | 6 +++---
 sql/core/src/test/scala/org/apache/spark/sql/StreamTest.scala  | 4 ++--
 .../spark/sql/hive/thriftserver/HiveThriftServer2Suites.scala  | 4 ++--
 .../main/scala/org/apache/spark/streaming/util/StateMap.scala  | 2 +-
 .../apache/spark/deploy/yarn/AMDelegationTokenRenewer.scala    | 2 +-
 36 files changed, 55 insertions(+), 55 deletions(-)
----------------------------------------------------------------------


http://git-wip-us.apache.org/repos/asf/spark/blob/024482bf/R/pkg/R/functions.R
----------------------------------------------------------------------
diff --git a/R/pkg/R/functions.R b/R/pkg/R/functions.R
index 8f8651c..e5521f3 100644
--- a/R/pkg/R/functions.R
+++ b/R/pkg/R/functions.R
@@ -1962,7 +1962,7 @@ setMethod("sha2", signature(y = "Column", x = "numeric"),
 
 #' shiftLeft
 #'
-#' Shift the the given value numBits left. If the given value is a long value, this function
+#' Shift the given value numBits left. If the given value is a long value, this function
 #' will return a long value else it will return an integer value.
 #'
 #' @family math_funcs
@@ -1980,7 +1980,7 @@ setMethod("shiftLeft", signature(y = "Column", x = "numeric"),
 
 #' shiftRight
 #'
-#' Shift the the given value numBits right. If the given value is a long value, it will return
+#' Shift the given value numBits right. If the given value is a long value, it will return
 #' a long value else it will return an integer value.
 #'
 #' @family math_funcs
@@ -1998,7 +1998,7 @@ setMethod("shiftRight", signature(y = "Column", x = "numeric"),
 
 #' shiftRightUnsigned
 #'
-#' Unsigned shift the the given value numBits right. If the given value is a long value,
+#' Unsigned shift the given value numBits right. If the given value is a long value,
 #' it will return a long value else it will return an integer value.
 #'
 #' @family math_funcs

http://git-wip-us.apache.org/repos/asf/spark/blob/024482bf/R/pkg/R/sparkR.R
----------------------------------------------------------------------
diff --git a/R/pkg/R/sparkR.R b/R/pkg/R/sparkR.R
index d2bfad5..3e9eafc 100644
--- a/R/pkg/R/sparkR.R
+++ b/R/pkg/R/sparkR.R
@@ -299,7 +299,7 @@ sparkRHive.init <- function(jsc = NULL) {
 #'
 #' @param sc existing spark context
 #' @param groupid the ID to be assigned to job groups
-#' @param description description for the the job group ID
+#' @param description description for the job group ID
 #' @param interruptOnCancel flag to indicate if the job is interrupted on job cancellation
 #' @examples
 #'\dontrun{

http://git-wip-us.apache.org/repos/asf/spark/blob/024482bf/common/sketch/src/main/java/org/apache/spark/util/sketch/CountMinSketch.java
----------------------------------------------------------------------
diff --git a/common/sketch/src/main/java/org/apache/spark/util/sketch/CountMinSketch.java b/common/sketch/src/main/java/org/apache/spark/util/sketch/CountMinSketch.java
index 48f9868..2c9aa93 100644
--- a/common/sketch/src/main/java/org/apache/spark/util/sketch/CountMinSketch.java
+++ b/common/sketch/src/main/java/org/apache/spark/util/sketch/CountMinSketch.java
@@ -39,7 +39,7 @@ import java.io.OutputStream;
  * Suppose you want to estimate the number of times an element {@code x} has appeared in a data
  * stream so far.  With probability {@code delta}, the estimate of this frequency is within the
  * range {@code true frequency <= estimate <= true frequency + eps * N}, where {@code N} is the
- * total count of items have appeared the the data stream so far.
+ * total count of items have appeared the data stream so far.
  *
  * Under the cover, a {@link CountMinSketch} is essentially a two-dimensional {@code long} array
  * with depth {@code d} and width {@code w}, where
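
The error bound quoted in the comment above can be exercised through the public sketch API in `org.apache.spark.util.sketch`. The following Scala snippet is a minimal, illustrative sketch; the sample items and parameter values are made up and are not part of this commit:

    import org.apache.spark.util.sketch.CountMinSketch

    // eps and confidence (delta) control the sketch's width and depth:
    // with probability `confidence`, estimate <= true frequency + eps * totalCount.
    val cms = CountMinSketch.create(0.001, 0.99, 42)
    Seq("a", "b", "a", "c", "a").foreach(item => cms.add(item))
    println(cms.estimateCount("a"))  // never below 3; close to 3 with high probability
    println(cms.totalCount())        // 5 items observed so far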

http://git-wip-us.apache.org/repos/asf/spark/blob/024482bf/core/src/main/scala/org/apache/spark/CacheManager.scala
----------------------------------------------------------------------
diff --git a/core/src/main/scala/org/apache/spark/CacheManager.scala b/core/src/main/scala/org/apache/spark/CacheManager.scala
index 923ff41..1ec9ba7 100644
--- a/core/src/main/scala/org/apache/spark/CacheManager.scala
+++ b/core/src/main/scala/org/apache/spark/CacheManager.scala
@@ -120,7 +120,7 @@ private[spark] class CacheManager(blockManager: BlockManager) extends Logging {
    * The effective storage level refers to the level that actually specifies BlockManager put
    * behavior, not the level originally specified by the user. This is mainly for forcing a
    * MEMORY_AND_DISK partition to disk if there is not enough room to unroll the partition,
-   * while preserving the the original semantics of the RDD as specified by the application.
+   * while preserving the original semantics of the RDD as specified by the application.
    */
   private def putInBlockManager[T](
       key: BlockId,

http://git-wip-us.apache.org/repos/asf/spark/blob/024482bf/core/src/main/scala/org/apache/spark/rdd/OrderedRDDFunctions.scala
----------------------------------------------------------------------
diff --git a/core/src/main/scala/org/apache/spark/rdd/OrderedRDDFunctions.scala b/core/src/main/scala/org/apache/spark/rdd/OrderedRDDFunctions.scala
index d71bb63..2096a37 100644
--- a/core/src/main/scala/org/apache/spark/rdd/OrderedRDDFunctions.scala
+++ b/core/src/main/scala/org/apache/spark/rdd/OrderedRDDFunctions.scala
@@ -76,7 +76,7 @@ class OrderedRDDFunctions[K : Ordering : ClassTag,
   }
 
   /**
-   * Returns an RDD containing only the elements in the the inclusive range `lower` to `upper`.
+   * Returns an RDD containing only the elements in the inclusive range `lower` to `upper`.
    * If the RDD has been partitioned using a `RangePartitioner`, then this operation can be
    * performed efficiently by only scanning the partitions that might contain matching elements.
    * Otherwise, a standard `filter` is applied to all partitions.
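
The range filter described above is available on any RDD of key/value pairs with an ordered key type. The snippet below is a small, self-contained sketch; the sample data and app name are illustrative only, not code from this commit:

    import org.apache.spark.{SparkConf, SparkContext}

    val sc = new SparkContext(new SparkConf().setAppName("filterByRange-demo").setMaster("local[*]"))
    // sortByKey installs a RangePartitioner, so the range filter can skip whole partitions.
    val pairs = sc.parallelize(1 to 100).map(i => (i, i.toString)).sortByKey()
    val inRange = pairs.filterByRange(10, 20)  // inclusive bounds `lower` to `upper`
    println(inRange.count())  // 11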

http://git-wip-us.apache.org/repos/asf/spark/blob/024482bf/core/src/main/scala/org/apache/spark/scheduler/DAGScheduler.scala
----------------------------------------------------------------------
diff --git a/core/src/main/scala/org/apache/spark/scheduler/DAGScheduler.scala b/core/src/main/scala/org/apache/spark/scheduler/DAGScheduler.scala
index 379dc14..ba773e1 100644
--- a/core/src/main/scala/org/apache/spark/scheduler/DAGScheduler.scala
+++ b/core/src/main/scala/org/apache/spark/scheduler/DAGScheduler.scala
@@ -655,7 +655,7 @@ class DAGScheduler(
 
   /**
    * Submit a shuffle map stage to run independently and get a JobWaiter object back. The waiter
-   * can be used to block until the the job finishes executing or can be used to cancel the job.
+   * can be used to block until the job finishes executing or can be used to cancel the job.
    * This method is used for adaptive query planning, to run map stages and look at statistics
    * about their outputs before submitting downstream stages.
    *

http://git-wip-us.apache.org/repos/asf/spark/blob/024482bf/core/src/test/scala/org/apache/spark/deploy/history/HistoryServerSuite.scala
----------------------------------------------------------------------
diff --git a/core/src/test/scala/org/apache/spark/deploy/history/HistoryServerSuite.scala b/core/src/test/scala/org/apache/spark/deploy/history/HistoryServerSuite.scala
index 4b05469..e5cd2ed 100644
--- a/core/src/test/scala/org/apache/spark/deploy/history/HistoryServerSuite.scala
+++ b/core/src/test/scala/org/apache/spark/deploy/history/HistoryServerSuite.scala
@@ -47,7 +47,7 @@ import org.apache.spark.util.{ResetSystemProperties, Utils}
 /**
  * A collection of tests against the historyserver, including comparing responses from the json
  * metrics api to a set of known "golden files".  If new endpoints / parameters are added,
- * cases should be added to this test suite.  The expected outcomes can be genered by running
+ * cases should be added to this test suite.  The expected outcomes can be generated by running
  * the HistoryServerSuite.main.  Note that this will blindly generate new expectation files matching
  * the current behavior -- the developer must verify that behavior is correct.
  *
@@ -274,12 +274,12 @@ class HistoryServerSuite extends SparkFunSuite with BeforeAndAfter with Matchers
     implicit val webDriver: WebDriver = new HtmlUnitDriver
     implicit val formats = org.json4s.DefaultFormats
 
-    // this test dir is explictly deleted on successful runs; retained for diagnostics when
+    // this test dir is explicitly deleted on successful runs; retained for diagnostics when
     // not
     val logDir = Utils.createDirectory(System.getProperty("java.io.tmpdir", "logs"))

     // a new conf is used with the background thread set and running at its fastest
-    // alllowed refresh rate (1Hz)
+    // allowed refresh rate (1Hz)
     val myConf = new SparkConf()
       .set("spark.history.fs.logDirectory", logDir.getAbsolutePath)
       .set("spark.eventLog.dir", logDir.getAbsolutePath)

http://git-wip-us.apache.org/repos/asf/spark/blob/024482bf/docs/ml-classification-regression.md
----------------------------------------------------------------------
diff --git a/docs/ml-classification-regression.md b/docs/ml-classification-regression.md
index 9569a06..45155c8 100644
--- a/docs/ml-classification-regression.md
+++ b/docs/ml-classification-regression.md
@@ -252,7 +252,7 @@ Nodes in the output layer use softmax function:
 \]`
 The number of nodes `$N$` in the output layer corresponds to the number of classes.

-MLPC employes backpropagation for learning the model. We use logistic loss function for optimization and L-BFGS as optimization routine.
+MLPC employs backpropagation for learning the model. We use logistic loss function for optimization and L-BFGS as optimization routine.
 
 **Example**
 

http://git-wip-us.apache.org/repos/asf/spark/blob/024482bf/docs/ml-features.md
----------------------------------------------------------------------
diff --git a/docs/ml-features.md b/docs/ml-features.md
index 5809f65..68d3ea2 100644
--- a/docs/ml-features.md
+++ b/docs/ml-features.md
@@ -185,7 +185,7 @@ for more details on the API.
 <div data-lang="python" markdown="1">
 
 Refer to the [Tokenizer Python docs](api/python/pyspark.ml.html#pyspark.ml.feature.Tokenizer) and
-the the [RegexTokenizer Python docs](api/python/pyspark.ml.html#pyspark.ml.feature.RegexTokenizer)
+the [RegexTokenizer Python docs](api/python/pyspark.ml.html#pyspark.ml.feature.RegexTokenizer)
 for more details on the API.
 
 {% include_example python/ml/tokenizer_example.py %}
@@ -459,7 +459,7 @@ column, we should get the following:
 "a" gets index `0` because it is the most frequent, followed by "c" with index `1` and "b" with
 index `2`.

-Additionaly, there are two strategies regarding how `StringIndexer` will handle
+Additionally, there are two strategies regarding how `StringIndexer` will handle
 unseen labels when you have fit a `StringIndexer` on one dataset and then use it
 to transform another:
 
@@ -779,7 +779,7 @@ for more details on the API.
 
 * `splits`: Parameter for mapping continuous features into buckets. With n+1 splits, there are n buckets. A bucket defined by splits x,y holds values in the range [x,y) except the last bucket, which also includes y. Splits should be strictly increasing. Values at -inf, inf must be explicitly provided to cover all Double values; Otherwise, values outside the splits specified will be treated as errors. Two examples of `splits` are `Array(Double.NegativeInfinity, 0.0, 1.0, Double.PositiveInfinity)` and `Array(0.0, 1.0, 2.0)`.

-Note that if you have no idea of the upper bound and lower bound of the targeted column, you would better add the `Double.NegativeInfinity` and `Double.PositiveInfinity` as the bounds of your splits to prevent a potenial out of Bucketizer bounds exception.
+Note that if you have no idea of the upper bound and lower bound of the targeted column, you would better add the `Double.NegativeInfinity` and `Double.PositiveInfinity` as the bounds of your splits to prevent a potential out of Bucketizer bounds exception.

 Note also that the splits that you provided have to be in strictly increasing order, i.e. `s0 < s1 < s2 < ... < sn`.
 
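To make the `splits` discussion above concrete, a minimal Scala sketch of `Bucketizer` usage follows; the column names and sample values are illustrative, and a `sqlContext` is assumed to be in scope:

    import org.apache.spark.ml.feature.Bucketizer

    // Cover the whole Double range, as recommended above, to avoid out-of-bounds errors.
    val splits = Array(Double.NegativeInfinity, 0.0, 1.0, Double.PositiveInfinity)

    val data = sqlContext.createDataFrame(Seq(-0.5, 0.3, 1.5).map(Tuple1.apply)).toDF("features")
    val bucketizer = new Bucketizer()
      .setInputCol("features")
      .setOutputCol("bucketedFeatures")
      .setSplits(splits)
    bucketizer.transform(data).show()  // -0.5 -> bucket 0, 0.3 -> bucket 1, 1.5 -> bucket 2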

http://git-wip-us.apache.org/repos/asf/spark/blob/024482bf/docs/ml-guide.md
----------------------------------------------------------------------
diff --git a/docs/ml-guide.md b/docs/ml-guide.md
index 1770aab..8eee2fb 100644
--- a/docs/ml-guide.md
+++ b/docs/ml-guide.md
@@ -628,7 +628,7 @@ Currently, `spark.ml` supports model selection using the [`CrossValidator`](api/
 The `Evaluator` can be a [`RegressionEvaluator`](api/scala/index.html#org.apache.spark.ml.evaluation.RegressionEvaluator)
 for regression problems, a [`BinaryClassificationEvaluator`](api/scala/index.html#org.apache.spark.ml.evaluation.BinaryClassificationEvaluator)
 for binary data, or a [`MultiClassClassificationEvaluator`](api/scala/index.html#org.apache.spark.ml.evaluation.MulticlassClassificationEvaluator)
-for multiclass problems. The default metric used to choose the best `ParamMap` can be overriden by the `setMetricName`
+for multiclass problems. The default metric used to choose the best `ParamMap` can be overridden by the `setMetricName`
 method in each of these evaluators.

 The `ParamMap` which produces the best evaluation metric (averaged over the `$k$` folds) is selected as the best model.

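As a small sketch of the `setMetricName` override mentioned above (the metric name shown is one of the values `BinaryClassificationEvaluator` accepts; using it here is illustrative, not part of this commit):

    import org.apache.spark.ml.evaluation.BinaryClassificationEvaluator

    // Switch the selection metric from the default areaUnderROC.
    val evaluator = new BinaryClassificationEvaluator()
      .setMetricName("areaUnderPR")
    // Pass this evaluator to a CrossValidator via setEvaluator(evaluator).
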
http://git-wip-us.apache.org/repos/asf/spark/blob/024482bf/docs/mllib-clustering.md
----------------------------------------------------------------------
diff --git a/docs/mllib-clustering.md b/docs/mllib-clustering.md
index d0be032..8e724fb 100644
--- a/docs/mllib-clustering.md
+++ b/docs/mllib-clustering.md
@@ -300,7 +300,7 @@ for i in range(2):
 ## Power iteration clustering (PIC)
 
 Power iteration clustering (PIC) is a scalable and efficient algorithm for clustering vertices of a
-graph given pairwise similarties as edge properties,
+graph given pairwise similarities as edge properties,
 described in [Lin and Cohen, Power Iteration Clustering](http://www.icml2010.org/papers/387.pdf).
 It computes a pseudo-eigenvector of the normalized affinity matrix of the graph via
 [power iteration](http://en.wikipedia.org/wiki/Power_iteration)  and uses it to cluster vertices.
@@ -786,7 +786,7 @@ This example shows how to estimate clusters on streaming data.
 <div data-lang="scala" markdown="1">
 Refer to the [`StreamingKMeans` Scala docs](api/scala/index.html#org.apache.spark.mllib.clustering.StreamingKMeans) for details on the API.
 
-First we import the neccessary classes.
+First we import the necessary classes.
 
 {% highlight scala %}
 
@@ -837,7 +837,7 @@ ssc.awaitTermination()
 <div data-lang="python" markdown="1">
 Refer to the [`StreamingKMeans` Python docs](api/python/pyspark.mllib.html#pyspark.mllib.clustering.StreamingKMeans) for more details on the API.
 
-First we import the neccessary classes.
+First we import the necessary classes.
 
 {% highlight python %}
 from pyspark.mllib.linalg import Vectors

http://git-wip-us.apache.org/repos/asf/spark/blob/024482bf/docs/mllib-evaluation-metrics.md
----------------------------------------------------------------------
diff --git a/docs/mllib-evaluation-metrics.md b/docs/mllib-evaluation-metrics.md
index 774826c..a269dbf 100644
--- a/docs/mllib-evaluation-metrics.md
+++ b/docs/mllib-evaluation-metrics.md
@@ -67,7 +67,7 @@ plots (recall, false positive rate) points.
   </thead>
   <tbody>
     <tr>
-      <td>Precision (Postive Predictive Value)</td>
+      <td>Precision (Positive Predictive Value)</td>
       <td>$PPV=\frac{TP}{TP + FP}$</td>
     </tr>
     <tr>
@@ -360,7 +360,7 @@ $$I_A(x) = \begin{cases}1 & \text{if $x \in A$}, \\ 0 & \text{otherwise}.\end{ca
 
 **Examples**
 
-The following code snippets illustrate how to evaluate the performance of a multilabel classifer. The examples
+The following code snippets illustrate how to evaluate the performance of a multilabel classifier. The examples
 use the fake prediction and label data for multilabel classification that is shown below.
 
 Document predictions:
@@ -558,7 +558,7 @@ variable from a number of independent variables.
       <td>$RMSE = \sqrt{\frac{\sum_{i=0}^{N-1} (\mathbf{y}_i - \hat{\mathbf{y}}_i)^2}{N}}$</td>
     </tr>
     <tr>
-      <td>Mean Absoloute Error (MAE)</td>
+      <td>Mean Absolute Error (MAE)</td>
       <td>$MAE=\sum_{i=0}^{N-1} \left|\mathbf{y}_i - \hat{\mathbf{y}}_i\right|$</td>
     </tr>
     <tr>

http://git-wip-us.apache.org/repos/asf/spark/blob/024482bf/docs/mllib-frequent-pattern-mining.md
----------------------------------------------------------------------
diff --git a/docs/mllib-frequent-pattern-mining.md b/docs/mllib-frequent-pattern-mining.md
index 2c8a8f2..a7b55dc 100644
--- a/docs/mllib-frequent-pattern-mining.md
+++ b/docs/mllib-frequent-pattern-mining.md
@@ -135,7 +135,7 @@ pattern mining problem.
   included in the results.
 * `maxLocalProjDBSize`: the maximum number of items allowed in a
   prefix-projected database before local iterative processing of the
-  projected databse begins. This parameter should be tuned with respect
+  projected database begins. This parameter should be tuned with respect
   to the size of your executors.
 
 **Examples**

http://git-wip-us.apache.org/repos/asf/spark/blob/024482bf/docs/monitoring.md
----------------------------------------------------------------------
diff --git a/docs/monitoring.md b/docs/monitoring.md
index c37f6fb..c139e1c 100644
--- a/docs/monitoring.md
+++ b/docs/monitoring.md
@@ -108,7 +108,7 @@ The history server can be configured as follows:
     <td>spark.history.fs.update.interval</td>
     <td>10s</td>
     <td>
-      The period at which the the filesystem history provider checks for new or
+      The period at which the filesystem history provider checks for new or
       updated logs in the log directory. A shorter interval detects new applications faster,
       at the expense of more server load re-reading updated applications.
       As soon as an update has completed, listings of the completed and incomplete applications

http://git-wip-us.apache.org/repos/asf/spark/blob/024482bf/docs/programming-guide.md
----------------------------------------------------------------------
diff --git a/docs/programming-guide.md b/docs/programming-guide.md
index 2d6f776..5ebafa4 100644
--- a/docs/programming-guide.md
+++ b/docs/programming-guide.md
@@ -629,7 +629,7 @@ class MyClass {
 }
 {% endhighlight %}
 
-is equilvalent to writing `rdd.map(x => this.field + x)`, which references all of `this`. To avoid this
+is equivalent to writing `rdd.map(x => this.field + x)`, which references all of `this`. To avoid this
 issue, the simplest way is to copy `field` into a local variable instead of accessing it externally:
 
 {% highlight scala %}

http://git-wip-us.apache.org/repos/asf/spark/blob/024482bf/docs/running-on-mesos.md
----------------------------------------------------------------------
diff --git a/docs/running-on-mesos.md b/docs/running-on-mesos.md
index 35f6caa..b9f64c7 100644
--- a/docs/running-on-mesos.md
+++ b/docs/running-on-mesos.md
@@ -188,7 +188,7 @@ overhead, but at the cost of reserving the Mesos resources for the complete dura
 application.
 
 Coarse-grained is the default mode. You can also set `spark.mesos.coarse` property to true
-to turn it on explictly in [SparkConf](configuration.html#spark-properties):
+to turn it on explicitly in [SparkConf](configuration.html#spark-properties):
 
 {% highlight scala %}
 conf.set("spark.mesos.coarse", "true")
@@ -384,7 +384,7 @@ See the [configuration page](configuration.html) for information on Spark config
       <li>Scalar constraints are matched with "less than equal" semantics i.e. value in the constraint must be less than or equal to the value in the resource offer.</li>
       <li>Range constraints are matched with "contains" semantics i.e. value in the constraint must be within the resource offer's value.</li>
       <li>Set constraints are matched with "subset of" semantics i.e. value in the constraint must be a subset of the resource offer's value.</li>
-      <li>Text constraints are metched with "equality" semantics i.e. value in the constraint must be exactly equal to the resource offer's value.</li>
+      <li>Text constraints are matched with "equality" semantics i.e. value in the constraint must be exactly equal to the resource offer's value.</li>
       <li>In case there is no value present as a part of the constraint any offer with the corresponding attribute will be accepted (without value check).</li>
     </ul>
   </td>

http://git-wip-us.apache.org/repos/asf/spark/blob/024482bf/docs/spark-standalone.md
----------------------------------------------------------------------
diff --git a/docs/spark-standalone.md b/docs/spark-standalone.md
index 3de72bc..fd94c34 100644
--- a/docs/spark-standalone.md
+++ b/docs/spark-standalone.md
@@ -335,7 +335,7 @@ By default, standalone scheduling clusters are resilient to Worker failures (ins
 
 **Overview**
 
-Utilizing ZooKeeper to provide leader election and some state storage, you can launch multiple Masters in your cluster connected to the same ZooKeeper instance. One will be elected "leader" and the others will remain in standby mode. If the current leader dies, another Master will be elected, recover the old Master's state, and then resume scheduling. The entire recovery process (from the time the the first leader goes down) should take between 1 and 2 minutes. Note that this delay only affects scheduling _new_ applications -- applications that were already running during Master failover are unaffected.
+Utilizing ZooKeeper to provide leader election and some state storage, you can launch multiple Masters in your cluster connected to the same ZooKeeper instance. One will be elected "leader" and the others will remain in standby mode. If the current leader dies, another Master will be elected, recover the old Master's state, and then resume scheduling. The entire recovery process (from the time the first leader goes down) should take between 1 and 2 minutes. Note that this delay only affects scheduling _new_ applications -- applications that were already running during Master failover are unaffected.
 
 Learn more about getting started with ZooKeeper [here](http://zookeeper.apache.org/doc/trunk/zookeeperStarted.html).
 

http://git-wip-us.apache.org/repos/asf/spark/blob/024482bf/docs/sql-programming-guide.md
----------------------------------------------------------------------
diff --git a/docs/sql-programming-guide.md b/docs/sql-programming-guide.md
index d246100..c4d277f 100644
--- a/docs/sql-programming-guide.md
+++ b/docs/sql-programming-guide.md
@@ -1372,7 +1372,7 @@ Hive metastore Parquet table to a Spark SQL Parquet table. The reconciliation ru
 1. The reconciled schema contains exactly those fields defined in Hive metastore schema.

    - Any fields that only appear in the Parquet schema are dropped in the reconciled schema.
-   - Any fileds that only appear in the Hive metastore schema are added as nullable field in the
+   - Any fields that only appear in the Hive metastore schema are added as nullable field in the
      reconciled schema.
 
 #### Metadata Refreshing

http://git-wip-us.apache.org/repos/asf/spark/blob/024482bf/docs/streaming-flume-integration.md
----------------------------------------------------------------------
diff --git a/docs/streaming-flume-integration.md b/docs/streaming-flume-integration.md
index e2d589b..8eeeee7 100644
--- a/docs/streaming-flume-integration.md
+++ b/docs/streaming-flume-integration.md
@@ -30,7 +30,7 @@ See the [Flume's documentation](https://flume.apache.org/documentation.html) for
 configuring Flume agents.
 
 #### Configuring Spark Streaming Application
-1. **Linking:** In your SBT/Maven projrect definition, link your streaming application against the following artifact (see [Linking section](streaming-programming-guide.html#linking) in the main programming guide for further information).
+1. **Linking:** In your SBT/Maven project definition, link your streaming application against the following artifact (see [Linking section](streaming-programming-guide.html#linking) in the main programming guide for further information).
 
                groupId = org.apache.spark
                artifactId = spark-streaming-flume_{{site.SCALA_BINARY_VERSION}}

http://git-wip-us.apache.org/repos/asf/spark/blob/024482bf/docs/streaming-kinesis-integration.md
----------------------------------------------------------------------
diff --git a/docs/streaming-kinesis-integration.md b/docs/streaming-kinesis-integration.md
index 5f5e2b9..2a868e8 100644
--- a/docs/streaming-kinesis-integration.md
+++ b/docs/streaming-kinesis-integration.md
@@ -95,7 +95,7 @@ A Kinesis stream can be set up at one of the valid Kinesis endpoints with 1 or m
        </div>
        </div>
 
-       - `streamingContext`: StreamingContext containg an application name used by Kinesis to tie this Kinesis application to the Kinesis stream
+       - `streamingContext`: StreamingContext containing an application name used by Kinesis to tie this Kinesis application to the Kinesis stream
 
        - `[Kinesis app name]`: The application name that will be used to checkpoint the Kinesis
                sequence numbers in DynamoDB table.
@@ -216,6 +216,6 @@ de-aggregate records during consumption.
 
 - Checkpointing too frequently will cause excess load on the AWS checkpoint storage layer and may lead to AWS throttling.  The provided example handles this throttling with a random-backoff-retry strategy.

-- If no Kinesis checkpoint info exists when the input DStream starts, it will start either from the oldest record available (InitialPositionInStream.TRIM_HORIZON) or from the latest tip (InitialPostitionInStream.LATEST).  This is configurable.
+- If no Kinesis checkpoint info exists when the input DStream starts, it will start either from the oldest record available (InitialPositionInStream.TRIM_HORIZON) or from the latest tip (InitialPositionInStream.LATEST).  This is configurable.
 - InitialPositionInStream.LATEST could lead to missed records if data is added to the stream while no input DStreams are running (and no checkpoint info is being stored).
 - InitialPositionInStream.TRIM_HORIZON may lead to duplicate processing of records where the impact is dependent on checkpoint frequency and processing idempotency.

http://git-wip-us.apache.org/repos/asf/spark/blob/024482bf/docs/streaming-programming-guide.md
----------------------------------------------------------------------
diff --git a/docs/streaming-programming-guide.md b/docs/streaming-programming-guide.md
index 4d1932b..5d67a0a 100644
--- a/docs/streaming-programming-guide.md
+++ b/docs/streaming-programming-guide.md
@@ -158,7 +158,7 @@ JavaReceiverInputDStream<String> lines = jssc.socketTextStream("localhost", 9999
 {% endhighlight %}
 
 This `lines` DStream represents the stream of data that will be received from the data
-server. Each record in this stream is a line of text. Then, we want to split the the lines by
+server. Each record in this stream is a line of text. Then, we want to split the lines by
 space into words.
 
 {% highlight java %}

http://git-wip-us.apache.org/repos/asf/spark/blob/024482bf/graphx/src/main/scala/org/apache/spark/graphx/impl/GraphImpl.scala
----------------------------------------------------------------------
diff --git a/graphx/src/main/scala/org/apache/spark/graphx/impl/GraphImpl.scala b/graphx/src/main/scala/org/apache/spark/graphx/impl/GraphImpl.scala
index c5cb533..699731b 100644
--- a/graphx/src/main/scala/org/apache/spark/graphx/impl/GraphImpl.scala
+++ b/graphx/src/main/scala/org/apache/spark/graphx/impl/GraphImpl.scala
@@ -266,7 +266,7 @@ class GraphImpl[VD: ClassTag, ED: ClassTag] protected (
     }
   }
 
-  /** Test whether the closure accesses the the attribute with name `attrName`. */
+  /** Test whether the closure accesses the attribute with name `attrName`. */
   private def accessesVertexAttr(closure: AnyRef, attrName: String): Boolean = {
     try {
       BytecodeUtils.invokedMethod(closure, classOf[EdgeTriplet[VD, ED]], attrName)

http://git-wip-us.apache.org/repos/asf/spark/blob/024482bf/graphx/src/main/scala/org/apache/spark/graphx/util/GraphGenerators.scala
----------------------------------------------------------------------
diff --git a/graphx/src/main/scala/org/apache/spark/graphx/util/GraphGenerators.scala b/graphx/src/main/scala/org/apache/spark/graphx/util/GraphGenerators.scala
index 280b6c5..9552229 100644
--- a/graphx/src/main/scala/org/apache/spark/graphx/util/GraphGenerators.scala
+++ b/graphx/src/main/scala/org/apache/spark/graphx/util/GraphGenerators.scala
@@ -166,7 +166,7 @@ object GraphGenerators extends Logging {
   }
 
   /**
-   * This method recursively subdivides the the adjacency matrix into quadrants
+   * This method recursively subdivides the adjacency matrix into quadrants
    * until it picks a single cell. The naming conventions in this paper match
    * those of the R-MAT paper. There are a power of 2 number of nodes in the graph.
    * The adjacency matrix looks like:

http://git-wip-us.apache.org/repos/asf/spark/blob/024482bf/mllib/src/main/scala/org/apache/spark/ml/recommendation/ALS.scala
----------------------------------------------------------------------
diff --git a/mllib/src/main/scala/org/apache/spark/ml/recommendation/ALS.scala b/mllib/src/main/scala/org/apache/spark/ml/recommendation/ALS.scala
index 8f49423..4be4d6a 100644
--- a/mllib/src/main/scala/org/apache/spark/ml/recommendation/ALS.scala
+++ b/mllib/src/main/scala/org/apache/spark/ml/recommendation/ALS.scala
@@ -1301,7 +1301,7 @@ object ALS extends DefaultParamsReadable[ALS] with Logging {
 
   /**
    * Partitioner used by ALS. We requires that getPartition is a projection. That is, for any key k,
-   * we have getPartition(getPartition(k)) = getPartition(k). Since the the default HashPartitioner
+   * we have getPartition(getPartition(k)) = getPartition(k). Since the default HashPartitioner
    * satisfies this requirement, we simply use a type alias here.
    */
   private[recommendation] type ALSPartitioner = org.apache.spark.HashPartitioner

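The projection property stated in that comment is easy to check against the default `HashPartitioner`; the snippet below is only an illustration of the invariant, not code from this commit:

    import org.apache.spark.HashPartitioner

    val partitioner = new HashPartitioner(8)
    val key = "someKey"
    val once = partitioner.getPartition(key)
    // Partition ids lie in [0, 8) and an Int hashes to itself, so re-partitioning an id is a no-op.
    val twice = partitioner.getPartition(once)
    assert(once == twice)
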
http://git-wip-us.apache.org/repos/asf/spark/blob/024482bf/python/pyspark/sql/functions.py
----------------------------------------------------------------------
diff --git a/python/pyspark/sql/functions.py b/python/pyspark/sql/functions.py
index fdae05d..6894c27 100644
--- a/python/pyspark/sql/functions.py
+++ b/python/pyspark/sql/functions.py
@@ -479,7 +479,7 @@ def round(col, scale=0):
 
 @since(1.5)
 def shiftLeft(col, numBits):
-    """Shift the the given value numBits left.
+    """Shift the given value numBits left.
 
     >>> sqlContext.createDataFrame([(21,)], ['a']).select(shiftLeft('a', 1).alias('r')).collect()
     [Row(r=42)]
@@ -490,7 +490,7 @@ def shiftLeft(col, numBits):
 
 @since(1.5)
 def shiftRight(col, numBits):
-    """Shift the the given value numBits right.
+    """Shift the given value numBits right.
 
     >>> sqlContext.createDataFrame([(42,)], ['a']).select(shiftRight('a', 1).alias('r')).collect()
     [Row(r=21)]
@@ -502,7 +502,7 @@ def shiftRight(col, numBits):
 
 @since(1.5)
 def shiftRightUnsigned(col, numBits):
-    """Unsigned shift the the given value numBits right.
+    """Unsigned shift the given value numBits right.
 
     >>> df = sqlContext.createDataFrame([(-42,)], ['a'])
     >>> df.select(shiftRightUnsigned('a', 1).alias('r')).collect()

http://git-wip-us.apache.org/repos/asf/spark/blob/024482bf/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/DistinctAggregationRewriter.scala
----------------------------------------------------------------------
diff --git a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/DistinctAggregationRewriter.scala b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/DistinctAggregationRewriter.scala
index b49885d..7518946 100644
--- a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/DistinctAggregationRewriter.scala
+++ b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/DistinctAggregationRewriter.scala
@@ -88,7 +88,7 @@ import org.apache.spark.sql.types.IntegerType
  *    this aggregate consists of the original group by clause, all the requested distinct columns
  *    and the group id. Both de-duplication of distinct column and the aggregation of the
  *    non-distinct group take advantage of the fact that we group by the group id (gid) and that we
- *    have nulled out all non-relevant columns for the the given group.
+ *    have nulled out all non-relevant columns the given group.
  * 3. Aggregating the distinct groups and combining this with the results of the non-distinct
  *    aggregation. In this step we use the group id to filter the inputs for the aggregate
  *    functions. The result of the non-distinct group are 'aggregated' by using the first operator,

http://git-wip-us.apache.org/repos/asf/spark/blob/024482bf/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/aggregate/HyperLogLogPlusPlus.scala
----------------------------------------------------------------------
diff --git a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/aggregate/HyperLogLogPlusPlus.scala b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/aggregate/HyperLogLogPlusPlus.scala
index ec833d6..a474017 100644
--- a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/aggregate/HyperLogLogPlusPlus.scala
+++ b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/aggregate/HyperLogLogPlusPlus.scala
@@ -238,7 +238,7 @@ case class HyperLogLogPlusPlus(
       diff * diff
     }
 
-    // Keep moving bounds as long as the the (exclusive) high bound is closer to the estimate than
+    // Keep moving bounds as long as the (exclusive) high bound is closer to the estimate than
     // the lower (inclusive) bound.
     var low = math.max(nearestEstimateIndex - K + 1, 0)
     var high = math.min(low + K, numEstimates)

http://git-wip-us.apache.org/repos/asf/spark/blob/024482bf/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/codegen/GenerateMutableProjection.scala
----------------------------------------------------------------------
diff --git a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/codegen/GenerateMutableProjection.scala b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/codegen/GenerateMutableProjection.scala
index 5b4dc8d..9abe92b 100644
--- a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/codegen/GenerateMutableProjection.scala
+++ b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/codegen/GenerateMutableProjection.scala
@@ -83,7 +83,7 @@ object GenerateMutableProjection extends CodeGenerator[Seq[Expression], () => Mu
         }
     }
 
-    // Evaluate all the the subexpressions.
+    // Evaluate all the subexpressions.
     val evalSubexpr = ctx.subexprFunctions.mkString("\n")
 
     val updates = validExpr.zip(index).map {

http://git-wip-us.apache.org/repos/asf/spark/blob/024482bf/sql/catalyst/src/test/scala/org/apache/spark/sql/RandomDataGenerator.scala
----------------------------------------------------------------------
diff --git a/sql/catalyst/src/test/scala/org/apache/spark/sql/RandomDataGenerator.scala b/sql/catalyst/src/test/scala/org/apache/spark/sql/RandomDataGenerator.scala
index 7c173cb..8207d64 100644
--- a/sql/catalyst/src/test/scala/org/apache/spark/sql/RandomDataGenerator.scala
+++ b/sql/catalyst/src/test/scala/org/apache/spark/sql/RandomDataGenerator.scala
@@ -148,7 +148,7 @@ object RandomDataGenerator {
             // for "0001-01-01 00:00:00.000000". We need to find a
             // number that is greater or equals to this number as a valid timestamp value.
             while (milliseconds < -62135740800000L) {
-              // 253402329599999L is the the number of milliseconds since
+              // 253402329599999L is the number of milliseconds since
               // January 1, 1970, 00:00:00 GMT for "9999-12-31 23:59:59.999999".
               milliseconds = rand.nextLong() % 253402329599999L
             }
@@ -163,7 +163,7 @@ object RandomDataGenerator {
             // for "0001-01-01 00:00:00.000000". We need to find a
             // number that is greater or equals to this number as a valid timestamp value.
             while (milliseconds < -62135740800000L) {
-              // 253402329599999L is the the number of milliseconds since
+              // 253402329599999L is the number of milliseconds since
               // January 1, 1970, 00:00:00 GMT for "9999-12-31 23:59:59.999999".
               milliseconds = rand.nextLong() % 253402329599999L
             }

http://git-wip-us.apache.org/repos/asf/spark/blob/024482bf/sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/WriterContainer.scala
----------------------------------------------------------------------
diff --git a/sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/WriterContainer.scala b/sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/WriterContainer.scala
index 6340229..7e5c8f2 100644
--- a/sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/WriterContainer.scala
+++ b/sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/WriterContainer.scala
@@ -145,7 +145,7 @@ private[sql] abstract class BaseWriterContainer(
       // If we are appending data to an existing dir, we will only use the output committer
       // associated with the file output format since it is not safe to use a custom
       // committer for appending. For example, in S3, direct parquet output committer may
-      // leave partial data in the destination dir when the the appending job fails.
+      // leave partial data in the destination dir when the appending job fails.
       //
       // See SPARK-8578 for more details
       logInfo(

http://git-wip-us.apache.org/repos/asf/spark/blob/024482bf/sql/core/src/main/scala/org/apache/spark/sql/execution/exchange/ExchangeCoordinator.scala
----------------------------------------------------------------------
diff --git a/sql/core/src/main/scala/org/apache/spark/sql/execution/exchange/ExchangeCoordinator.scala b/sql/core/src/main/scala/org/apache/spark/sql/execution/exchange/ExchangeCoordinator.scala
index 6f3bb0a..7f54ea9 100644
--- a/sql/core/src/main/scala/org/apache/spark/sql/execution/exchange/ExchangeCoordinator.scala
+++ b/sql/core/src/main/scala/org/apache/spark/sql/execution/exchange/ExchangeCoordinator.scala
@@ -55,7 +55,7 @@ import org.apache.spark.sql.execution.{ShuffledRowRDD, SparkPlan}
  *    If this coordinator has made the decision on how to shuffle data, this [[ShuffleExchange]]
  *    will immediately get its corresponding post-shuffle [[ShuffledRowRDD]].
  *  - If this coordinator has not made the decision on how to shuffle data, it will ask those
- *    registered [[ShuffleExchange]]s to submit their pre-shuffle stages. Then, based on the the
+ *    registered [[ShuffleExchange]]s to submit their pre-shuffle stages. Then, based on the
  *    size statistics of pre-shuffle partitions, this coordinator will determine the number of
  *    post-shuffle partitions and pack multiple pre-shuffle partitions with continuous indices
  *    to a single post-shuffle partition whenever necessary.

http://git-wip-us.apache.org/repos/asf/spark/blob/024482bf/sql/core/src/main/scala/org/apache/spark/sql/functions.scala
----------------------------------------------------------------------
diff --git a/sql/core/src/main/scala/org/apache/spark/sql/functions.scala b/sql/core/src/main/scala/org/apache/spark/sql/functions.scala
index 97c6992..510894a 100644
--- a/sql/core/src/main/scala/org/apache/spark/sql/functions.scala
+++ b/sql/core/src/main/scala/org/apache/spark/sql/functions.scala
@@ -1782,7 +1782,7 @@ object functions extends LegacyFunctions {
   def round(e: Column, scale: Int): Column = withExpr { Round(e.expr, Literal(scale)) }
 
   /**
-   * Shift the the given value numBits left. If the given value is a long value, this function
+   * Shift the given value numBits left. If the given value is a long value, this function
    * will return a long value else it will return an integer value.
    *
    * @group math_funcs
@@ -1791,7 +1791,7 @@ object functions extends LegacyFunctions {
   def shiftLeft(e: Column, numBits: Int): Column = withExpr { ShiftLeft(e.expr, lit(numBits).expr) }
 
   /**
-   * Shift the the given value numBits right. If the given value is a long value, it will return
+   * Shift the given value numBits right. If the given value is a long value, it will return
    * a long value else it will return an integer value.
    *
    * @group math_funcs
@@ -1802,7 +1802,7 @@ object functions extends LegacyFunctions {
   }
 
   /**
-   * Unsigned shift the the given value numBits right. If the given value is a long value,
+   * Unsigned shift the given value numBits right. If the given value is a long value,
    * it will return a long value else it will return an integer value.
    *
    * @group math_funcs
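
The shift functions documented above can be used from the DataFrame API as in the minimal sketch below; the DataFrame, column name, and the `sqlContext` assumed to be in scope are illustrative assumptions, not code from this commit:

    import org.apache.spark.sql.functions.{col, shiftLeft, shiftRight, shiftRightUnsigned}

    val df = sqlContext.range(0, 4).toDF("a")
    // shiftLeft by 1 doubles each value; shiftRight by 1 halves it (arithmetic shift).
    df.select(shiftLeft(col("a"), 1), shiftRight(col("a"), 1), shiftRightUnsigned(col("a"), 1)).show()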

http://git-wip-us.apache.org/repos/asf/spark/blob/024482bf/sql/core/src/test/scala/org/apache/spark/sql/StreamTest.scala
----------------------------------------------------------------------
diff --git a/sql/core/src/test/scala/org/apache/spark/sql/StreamTest.scala b/sql/core/src/test/scala/org/apache/spark/sql/StreamTest.scala
index 62710e7..bb51358 100644
--- a/sql/core/src/test/scala/org/apache/spark/sql/StreamTest.scala
+++ b/sql/core/src/test/scala/org/apache/spark/sql/StreamTest.scala
@@ -173,10 +173,10 @@ trait StreamTest extends QueryTest with Timeouts {
     testStream(stream.toDF())(actions: _*)
 
   /**
-   * Executes the specified actions on the the given streaming DataFrame and provides helpful
+   * Executes the specified actions on the given streaming DataFrame and provides helpful
    * error messages in the case of failures or incorrect answers.
    *
-   * Note that if the stream is not explictly started before an action that requires it to be
+   * Note that if the stream is not explicitly started before an action that requires it to be
    * running then it will be automatically started before performing any other actions.
    */
   def testStream(stream: DataFrame)(actions: StreamAction*): Unit = {

http://git-wip-us.apache.org/repos/asf/spark/blob/024482bf/sql/hive-thriftserver/src/test/scala/org/apache/spark/sql/hive/thriftserver/HiveThriftServer2Suites.scala
----------------------------------------------------------------------
diff --git a/sql/hive-thriftserver/src/test/scala/org/apache/spark/sql/hive/thriftserver/HiveThriftServer2Suites.scala b/sql/hive-thriftserver/src/test/scala/org/apache/spark/sql/hive/thriftserver/HiveThriftServer2Suites.scala
index 865197e..5f9952a 100644
--- a/sql/hive-thriftserver/src/test/scala/org/apache/spark/sql/hive/thriftserver/HiveThriftServer2Suites.scala
+++ b/sql/hive-thriftserver/src/test/scala/org/apache/spark/sql/hive/thriftserver/HiveThriftServer2Suites.scala
@@ -721,13 +721,13 @@ abstract class HiveThriftServer2Test extends SparkFunSuite with BeforeAndAfterAl
   }
 
   /**
-   * String to scan for when looking for the the thrift binary endpoint running.
+   * String to scan for when looking for the thrift binary endpoint running.
    * This can change across Hive versions.
    */
   val THRIFT_BINARY_SERVICE_LIVE = "Starting ThriftBinaryCLIService on port"
 
   /**
-   * String to scan for when looking for the the thrift HTTP endpoint running.
+   * String to scan for when looking for the thrift HTTP endpoint running.
    * This can change across Hive versions.
    */
   val THRIFT_HTTP_SERVICE_LIVE = "Started ThriftHttpCLIService in http"

http://git-wip-us.apache.org/repos/asf/spark/blob/024482bf/streaming/src/main/scala/org/apache/spark/streaming/util/StateMap.scala
----------------------------------------------------------------------
diff --git a/streaming/src/main/scala/org/apache/spark/streaming/util/StateMap.scala b/streaming/src/main/scala/org/apache/spark/streaming/util/StateMap.scala
index 4ccc905..2be1d6d 100644
--- a/streaming/src/main/scala/org/apache/spark/streaming/util/StateMap.scala
+++ b/streaming/src/main/scala/org/apache/spark/streaming/util/StateMap.scala
@@ -364,7 +364,7 @@ private[streaming] object OpenHashMapBasedStateMap {
   }
 
   /**
-   * Internal class to represent a marker the demarkate the the end of all state data in the
+   * Internal class to represent a marker the demarkate the end of all state data in the
    * serialized bytes.
    */
   class LimitMarker(val num: Int) extends Serializable

http://git-wip-us.apache.org/repos/asf/spark/blob/024482bf/yarn/src/main/scala/org/apache/spark/deploy/yarn/AMDelegationTokenRenewer.scala
----------------------------------------------------------------------
diff --git a/yarn/src/main/scala/org/apache/spark/deploy/yarn/AMDelegationTokenRenewer.scala b/yarn/src/main/scala/org/apache/spark/deploy/yarn/AMDelegationTokenRenewer.scala
index b8daa50..2ac9e33 100644
--- a/yarn/src/main/scala/org/apache/spark/deploy/yarn/AMDelegationTokenRenewer.scala
+++ b/yarn/src/main/scala/org/apache/spark/deploy/yarn/AMDelegationTokenRenewer.scala
@@ -151,7 +151,7 @@ private[yarn] class AMDelegationTokenRenewer(
     // passed in already has tokens for that FS even if the tokens are expired (it really only
     // checks if there are tokens for the service, and not if they are valid). So the only real
     // way to get new tokens is to make sure a different Credentials object is used each time to
-    // get new tokens and then the new tokens are copied over the the current user's Credentials.
+    // get new tokens and then the new tokens are copied over the current user's Credentials.
     // So:
     // - we login as a different user and get the UGI
     // - use that UGI to get the tokens (see doAs block below)


---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]
