Repository: spark
Updated Branches:
  refs/heads/master b0cafdb6c -> 5d188a697
[DOC][MINOR] Fixed minor errors in feature.ml user guide

## What changes were proposed in this pull request?

Fixed some minor errors found when reviewing feature.ml user guide

## How was this patch tested?

built docs locally

Author: Bryan Cutler <[email protected]>

Closes #12940 from BryanCutler/feature.ml-doc_fixes-DOCS-MINOR.

Project: http://git-wip-us.apache.org/repos/asf/spark/repo
Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/5d188a69
Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/5d188a69
Diff: http://git-wip-us.apache.org/repos/asf/spark/diff/5d188a69

Branch: refs/heads/master
Commit: 5d188a6970ef97d11656ab39255109fefc42203d
Parents: b0cafdb
Author: Bryan Cutler <[email protected]>
Authored: Sat May 7 11:20:38 2016 +0200
Committer: Nick Pentreath <[email protected]>
Committed: Sat May 7 11:20:38 2016 +0200

----------------------------------------------------------------------
 docs/ml-features.md | 8 +++++---
 1 file changed, 5 insertions(+), 3 deletions(-)
----------------------------------------------------------------------


http://git-wip-us.apache.org/repos/asf/spark/blob/5d188a69/docs/ml-features.md
----------------------------------------------------------------------
diff --git a/docs/ml-features.md b/docs/ml-features.md
index 237e93a..c79bcac 100644
--- a/docs/ml-features.md
+++ b/docs/ml-features.md
@@ -127,7 +127,7 @@ Assume that we have the following DataFrame with columns `id` and `texts`:
  1 | Array("a", "b", "b", "c", "a")
 ~~~~
-each row in`texts` is a document of type Array[String].
+each row in `texts` is a document of type Array[String].
 Invoking fit of `CountVectorizer` produces a `CountVectorizerModel` with vocabulary (a, b, c),
 then the output column "vector" after transformation contains:
@@ -185,7 +185,7 @@ for more details on the API.
 <div data-lang="scala" markdown="1">
 Refer to the [Tokenizer Scala docs](api/scala/index.html#org.apache.spark.ml.feature.Tokenizer)
-and the [RegexTokenizer Scala docs](api/scala/index.html#org.apache.spark.ml.feature.Tokenizer)
+and the [RegexTokenizer Scala docs](api/scala/index.html#org.apache.spark.ml.feature.RegexTokenizer)
 for more details on the API.
 {% include_example scala/org/apache/spark/examples/ml/TokenizerExample.scala %}
@@ -775,7 +775,7 @@ The rescaled value for a feature E is calculated as,
 \end{equation}`
 For the case `E_{max} == E_{min}`, `Rescaled(e_i) = 0.5 * (max + min)`
-Note that since zero values will probably be transformed to non-zero values, output of the transformer will be DenseVector even for sparse input.
+Note that since zero values will probably be transformed to non-zero values, output of the transformer will be `DenseVector` even for sparse input.
 The following example demonstrates how to load a dataset in libsvm format and then rescale each feature to [0, 1].
@@ -801,6 +801,7 @@ for more details on the API.
 <div data-lang="python" markdown="1">
 Refer to the [MinMaxScaler Python docs](api/python/pyspark.ml.html#pyspark.ml.feature.MinMaxScaler)
+and the [MinMaxScalerModel Python docs](api/python/pyspark.ml.html#pyspark.ml.feature.MinMaxScalerModel)
 for more details on the API.
 {% include_example python/ml/min_max_scaler_example.py %}
@@ -841,6 +842,7 @@ for more details on the API.
 <div data-lang="python" markdown="1">
 Refer to the [MaxAbsScaler Python docs](api/python/pyspark.ml.html#pyspark.ml.feature.MaxAbsScaler)
+and the [MaxAbsScalerModel Python docs](api/python/pyspark.ml.html#pyspark.ml.feature.MaxAbsScalerModel)
 for more details on the API.
 {% include_example python/ml/max_abs_scaler_example.py %}

---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]
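The first hunk touches the CountVectorizer section of the guide: fitting over a corpus builds a vocabulary (a, b, c), and transforming a document yields its term-count vector. As a plain-Python sketch of that behavior (not Spark's implementation; the function names are illustrative, and the first document `["a", "b", "c"]` is assumed from the surrounding guide, since the diff context only shows row 1):

```python
from collections import Counter

def fit_vocabulary(docs):
    # Order terms by descending corpus-wide frequency, ties broken
    # alphabetically here for determinism.
    counts = Counter(token for doc in docs for token in doc)
    return sorted(counts, key=lambda term: (-counts[term], term))

def transform(doc, vocabulary):
    # Map one document to its term-count vector over the vocabulary.
    counts = Counter(doc)
    return [counts.get(term, 0) for term in vocabulary]

docs = [["a", "b", "c"], ["a", "b", "b", "c", "a"]]
vocab = fit_vocabulary(docs)
print(vocab)                        # ['a', 'b', 'c']
print(transform(docs[1], vocab))    # [2, 2, 1]
```

For row 1, `Array("a", "b", "b", "c", "a")`, this reproduces the (2, 2, 1) counts the guide describes for the output column "vector".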
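The MinMaxScaler hunk cites the guide's rescaling formula, including the degenerate case `E_{max} == E_{min}` where every value maps to the midpoint `0.5 * (max + min)`. A minimal plain-Python sketch of that formula (not Spark's implementation; the function name and defaults are illustrative):

```python
def min_max_rescale(column, range_min=0.0, range_max=1.0):
    # Rescale one feature column to [range_min, range_max].
    e_min, e_max = min(column), max(column)
    if e_max == e_min:
        # Constant column: all values map to the midpoint.
        return [0.5 * (range_max + range_min)] * len(column)
    scale = (range_max - range_min) / (e_max - e_min)
    return [(e - e_min) * scale + range_min for e in column]

print(min_max_rescale([1.0, 2.0, 4.0]))  # min -> 0.0, max -> 1.0
print(min_max_rescale([3.0, 3.0, 3.0]))  # constant column -> all 0.5
```

Note that, as the patched sentence says, a zero input value generally lands on a non-zero output here, which is why Spark's transformer emits a `DenseVector` even for sparse input.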
