This is an automated email from the ASF dual-hosted git repository.
dongjoon pushed a commit to branch branch-4.0
in repository https://gitbox.apache.org/repos/asf/spark.git
The following commit(s) were added to refs/heads/branch-4.0 by this push:
new 068e44877a3f [SPARK-50997][DOCS] Remove `\t` character in `docs`
068e44877a3f is described below
commit 068e44877a3f37f6111b90142efb4d92f22d998d
Author: Dongjoon Hyun <[email protected]>
AuthorDate: Sun Jan 26 16:48:37 2025 -0800
[SPARK-50997][DOCS] Remove `\t` character in `docs`
### What changes were proposed in this pull request?
This PR aims to remove `\t` characters in `docs`.
### Why are the changes needed?
This is a clean-up in order to be consistent in `docs`.
### Does this PR introduce _any_ user-facing change?
No, these are white-space character changes in docs.
### How was this patch tested?
Manual review.
### Was this patch authored or co-authored using generative AI tooling?
No.
Closes #49682 from dongjoon-hyun/SPARK-50997.
Authored-by: Dongjoon Hyun <[email protected]>
Signed-off-by: Dongjoon Hyun <[email protected]>
(cherry picked from commit 77096a29e4fe4ccaf1df8f90b5223e88b03251bf)
Signed-off-by: Dongjoon Hyun <[email protected]>
---
docs/mllib-decision-tree.md                 | 14 +++++++-------
docs/mllib-ensembles.md                     | 12 ++++++------
docs/running-on-kubernetes.md               |  2 +-
docs/sql-ref-ansi-compliance.md             |  4 ++--
docs/sql-ref-syntax-qry-select-aggregate.md | 22 +++++++++++-----------
docs/sql-ref-syntax-qry-select-transform.md | 22 +++++++++++-----------
docs/streaming-kafka-0-10-integration.md    |  6 +++---
docs/streaming-programming-guide.md         |  6 +++---
docs/streaming/performance-tips.md          |  4 ++--
docs/web-ui.md                              | 30 +++++++++++++++---------------
10 files changed, 61 insertions(+), 61 deletions(-)
diff --git a/docs/mllib-decision-tree.md b/docs/mllib-decision-tree.md
index 0d9886315e28..601eb1513ca8 100644
--- a/docs/mllib-decision-tree.md
+++ b/docs/mllib-decision-tree.md
@@ -58,19 +58,19 @@ impurity measure for regression (variance).
<tbody>
<tr>
<td>Gini impurity</td>
- <td>Classification</td>
- <td>$\sum_{i=1}^{C} f_i(1-f_i)$</td><td>$f_i$ is the frequency of label $i$ at a node and $C$ is the number of unique labels.</td>
+ <td>Classification</td>
+ <td>$\sum_{i=1}^{C} f_i(1-f_i)$</td><td>$f_i$ is the frequency of label $i$ at a node and $C$ is the number of unique labels.</td>
</tr>
<tr>
<td>Entropy</td>
- <td>Classification</td>
- <td>$\sum_{i=1}^{C} -f_ilog(f_i)$</td><td>$f_i$ is the frequency of label $i$ at a node and $C$ is the number of unique labels.</td>
+ <td>Classification</td>
+ <td>$\sum_{i=1}^{C} -f_ilog(f_i)$</td><td>$f_i$ is the frequency of label $i$ at a node and $C$ is the number of unique labels.</td>
</tr>
<tr>
<td>Variance</td>
- <td>Regression</td>
- <td>$\frac{1}{N} \sum_{i=1}^{N} (y_i - \mu)^2$</td><td>$y_i$ is label for an instance,
- $N$ is the number of instances and $\mu$ is the mean given by $\frac{1}{N} \sum_{i=1}^N y_i$.</td>
+ <td>Regression</td>
+ <td>$\frac{1}{N} \sum_{i=1}^{N} (y_i - \mu)^2$</td><td>$y_i$ is label for an instance,
+ $N$ is the number of instances and $\mu$ is the mean given by $\frac{1}{N} \sum_{i=1}^N y_i$.</td>
</tr>
</tbody>
</table>
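For reference, the Gini impurity row in the table above is easy to evaluate by hand; here is a minimal Scala sketch, assuming hypothetical label frequencies $f_i$ at a node:

```scala
// Gini impurity: sum over the C labels of f_i * (1 - f_i).
// The frequencies below are hypothetical and must sum to 1.
val frequencies = Seq(0.5, 0.3, 0.2)
val gini = frequencies.map(f => f * (1 - f)).sum // 0.25 + 0.21 + 0.16 = 0.62
```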
diff --git a/docs/mllib-ensembles.md b/docs/mllib-ensembles.md
index 8f4e6b1088b3..a161293072bc 100644
--- a/docs/mllib-ensembles.md
+++ b/docs/mllib-ensembles.md
@@ -198,18 +198,18 @@ Notation: $N$ = number of instances. $y_i$ = label of instance $i$. $x_i$ = fea
<tbody>
<tr>
<td>Log Loss</td>
- <td>Classification</td>
- <td>$2 \sum_{i=1}^{N} \log(1+\exp(-2 y_i F(x_i)))$</td><td>Twice binomial negative log likelihood.</td>
+ <td>Classification</td>
+ <td>$2 \sum_{i=1}^{N} \log(1+\exp(-2 y_i F(x_i)))$</td><td>Twice binomial negative log likelihood.</td>
</tr>
<tr>
<td>Squared Error</td>
- <td>Regression</td>
- <td>$\sum_{i=1}^{N} (y_i - F(x_i))^2$</td><td>Also called L2 loss. Default loss for regression tasks.</td>
+ <td>Regression</td>
+ <td>$\sum_{i=1}^{N} (y_i - F(x_i))^2$</td><td>Also called L2 loss. Default loss for regression tasks.</td>
</tr>
<tr>
<td>Absolute Error</td>
- <td>Regression</td>
- <td>$\sum_{i=1}^{N} |y_i - F(x_i)|$</td><td>Also called L1 loss. Can be more robust to outliers than Squared Error.</td>
+ <td>Regression</td>
+ <td>$\sum_{i=1}^{N} |y_i - F(x_i)|$</td><td>Also called L1 loss. Can be more robust to outliers than Squared Error.</td>
</tr>
</tbody>
</table>
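Likewise, the Squared Error (L2) loss in the table above can be checked by hand; a minimal Scala sketch, assuming hypothetical labels $y_i$ and model predictions $F(x_i)$:

```scala
// Squared Error (L2) loss: sum over instances of (y_i - F(x_i))^2.
val labels      = Seq(1.0, -1.0, 1.0) // y_i (hypothetical)
val predictions = Seq(0.8, -0.6, 0.4) // F(x_i) (hypothetical)
val l2 = labels.zip(predictions).map { case (y, f) => (y - f) * (y - f) }.sum
// 0.04 + 0.16 + 0.36 = 0.56
```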
diff --git a/docs/running-on-kubernetes.md b/docs/running-on-kubernetes.md
index bd5e1956a627..79d0554a5ae2 100644
--- a/docs/running-on-kubernetes.md
+++ b/docs/running-on-kubernetes.md
@@ -1470,7 +1470,7 @@ See the [configuration page](configuration.html) for information on Spark config
<td><code>spark.kubernetes.executor.scheduler.name</code></td>
<td>(none)</td>
<td>
- Specify the scheduler name for each executor pod.
+ Specify the scheduler name for each executor pod.
</td>
<td>3.0.0</td>
</tr>
diff --git a/docs/sql-ref-ansi-compliance.md b/docs/sql-ref-ansi-compliance.md
index 3b1138b9ee0e..45b8e9a4dcea 100644
--- a/docs/sql-ref-ansi-compliance.md
+++ b/docs/sql-ref-ansi-compliance.md
@@ -281,8 +281,8 @@ Note, arithmetic operations have special rules to calculate the least common typ
| Operation | Result precision | Result scale |
|------------|------------------------------------------|---------------------|
| e1 + e2 | max(s1, s2) + max(p1 - s1, p2 - s2) + 1 | max(s1, s2) |
-| e1 - e2 | max(s1, s2) + max(p1 - s1, p2 - s2) + 1 | max(s1, s2) |
-| e1 * e2 | p1 + p2 + 1 | s1 + s2 |
+| e1 - e2 | max(s1, s2) + max(p1 - s1, p2 - s2) + 1 | max(s1, s2) |
+| e1 * e2 | p1 + p2 + 1 | s1 + s2 |
| e1 / e2 | p1 - s1 + s2 + max(6, s1 + p2 + 1) | max(6, s1 + p2 + 1) |
| e1 % e2 | min(p1 - s1, p2 - s2) + max(s1, s2) | max(s1, s2) |
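As a worked example of the rules above: for `e1` of type `DECIMAL(10, 2)` and `e2` of type `DECIMAL(5, 3)`, `e1 + e2` has precision `max(2, 3) + max(10 - 2, 5 - 3) + 1 = 12` and scale `max(2, 3) = 3`. A minimal Scala sketch, assuming an active `SparkSession` named `spark`:

```scala
// The addition result type follows the table above: DECIMAL(12, 3).
val df = spark.sql(
  "SELECT CAST(1.25 AS DECIMAL(10,2)) + CAST(2.125 AS DECIMAL(5,3)) AS s")
df.printSchema() // s: decimal(12,3)
```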
diff --git a/docs/sql-ref-syntax-qry-select-aggregate.md b/docs/sql-ref-syntax-qry-select-aggregate.md
index e0e294cc50c2..8cb371486335 100644
--- a/docs/sql-ref-syntax-qry-select-aggregate.md
+++ b/docs/sql-ref-syntax-qry-select-aggregate.md
@@ -94,23 +94,23 @@ SELECT * FROM basic_pays;
+-----------------+----------+------+
| employee_name|department|salary|
+-----------------+----------+------+
-| Anthony Bow|Accounting| 6627|
+| Anthony Bow|Accounting| 6627|
| Barry Jones| SCM| 10586|
-| Diane Murphy|Accounting| 8435|
-| Foon Yue Tseng| Sales| 6660|
+| Diane Murphy|Accounting| 8435|
+| Foon Yue Tseng| Sales| 6660|
| George Vanauf| Sales| 10563|
| Gerard Bondur|Accounting| 11472|
-| Gerard Hernandez| SCM| 6949|
-| Jeff Firrelli|Accounting| 8992|
-| Julie Firrelli| Sales| 9181|
+| Gerard Hernandez| SCM| 6949|
+| Jeff Firrelli|Accounting| 8992|
+| Julie Firrelli| Sales| 9181|
| Larry Bott| SCM| 11798|
-| Leslie Jennings| IT| 8113|
-| Leslie Thompson| IT| 5186|
+| Leslie Jennings| IT| 8113|
+| Leslie Thompson| IT| 5186|
| Loui Bondur| SCM| 10449|
-| Mary Patterson|Accounting| 9998|
+| Mary Patterson|Accounting| 9998|
| Pamela Castillo| SCM| 11303|
-| Steve Patterson| Sales| 9441|
-|William Patterson|Accounting| 8870|
+| Steve Patterson| Sales| 9441|
+|William Patterson|Accounting| 8870|
+-----------------+----------+------+
SELECT
diff --git a/docs/sql-ref-syntax-qry-select-transform.md b/docs/sql-ref-syntax-qry-select-transform.md
index 2ca69727a704..18d05db3ddf4 100644
--- a/docs/sql-ref-syntax-qry-select-transform.md
+++ b/docs/sql-ref-syntax-qry-select-transform.md
@@ -238,17 +238,17 @@ SELECT TRANSFORM(zip_code, name, age)
USING 'cat'
FROM person
WHERE zip_code > 94500;
-+-------+---------------------+
-| key| value|
-+-------+---------------------+
-| 94588| Anil K 27|
-| 94588| John V \N|
-| 94511| Aryan B. 18|
-| 94511| David K 42|
-| 94588| Zen Hui 50|
-| 94588| Dan Li 18|
-| 94511| Lalit B. \N|
-+-------+---------------------+
++-------+----------------+
+| key| value|
++-------+----------------+
+| 94588| Anil K 27|
+| 94588| John V \N|
+| 94511| Aryan B. 18|
+| 94511| David K 42|
+| 94588| Zen Hui 50|
+| 94588| Dan Li 18|
+| 94511| Lalit B. \N|
++-------+----------------+
```
### Related Statements
diff --git a/docs/streaming-kafka-0-10-integration.md b/docs/streaming-kafka-0-10-integration.md
index 0f5964786fbc..6af14c17476d 100644
--- a/docs/streaming-kafka-0-10-integration.md
+++ b/docs/streaming-kafka-0-10-integration.md
@@ -26,9 +26,9 @@ there are notable differences in usage.
### Linking
For Scala/Java applications using SBT/Maven project definitions, link your streaming application with the following artifact (see [Linking section](streaming-programming-guide.html#linking) in the main programming guide for further information).
- groupId = org.apache.spark
- artifactId = spark-streaming-kafka-0-10_{{site.SCALA_BINARY_VERSION}}
- version = {{site.SPARK_VERSION_SHORT}}
+ groupId = org.apache.spark
+ artifactId = spark-streaming-kafka-0-10_{{site.SCALA_BINARY_VERSION}}
+ version = {{site.SPARK_VERSION_SHORT}}
**Do not** manually add dependencies on `org.apache.kafka` artifacts (e.g. `kafka-clients`). The `spark-streaming-kafka-0-10` artifact has the appropriate transitive dependencies already, and different versions may be incompatible in hard to diagnose ways.
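For sbt users, the coordinates above translate to a one-line dependency, a sketch mirroring the `libraryDependencies` form shown later in this diff for `spark-streaming`; the `{{site.*}}` placeholders are resolved by the docs templating:

```scala
libraryDependencies += "org.apache.spark" % "spark-streaming-kafka-0-10_{{site.SCALA_BINARY_VERSION}}" % "{{site.SPARK_VERSION_SHORT}}"
```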
diff --git a/docs/streaming-programming-guide.md b/docs/streaming-programming-guide.md
index 3d39331eb15f..51d50d7e1bf8 100644
--- a/docs/streaming-programming-guide.md
+++ b/docs/streaming-programming-guide.md
@@ -414,7 +414,7 @@ Similar to Spark, Spark Streaming is available through Maven Central. To write y
<div class="codetabs">
<div data-lang="Maven" markdown="1">
- <dependency>
+ <dependency>
<groupId>org.apache.spark</groupId>
<artifactId>spark-streaming_{{site.SCALA_BINARY_VERSION}}</artifactId>
<version>{{site.SPARK_VERSION}}</version>
@@ -423,7 +423,7 @@ Similar to Spark, Spark Streaming is available through Maven Central. To write y
</div>
<div data-lang="SBT" markdown="1">
- libraryDependencies += "org.apache.spark" % "spark-streaming_{{site.SCALA_BINARY_VERSION}}" % "{{site.SPARK_VERSION}}" % "provided"
+ libraryDependencies += "org.apache.spark" % "spark-streaming_{{site.SCALA_BINARY_VERSION}}" % "{{site.SPARK_VERSION}}" % "provided"
</div>
</div>
@@ -2191,7 +2191,7 @@ improve the performance of your application. At a high level, you need to consid
1. Reducing the processing time of each batch of data by efficiently using cluster resources.
2. Setting the right batch size such that the batches of data can be processed as fast as they
- are received (that is, data processing keeps up with the data ingestion).
+ are received (that is, data processing keeps up with the data ingestion).
## Reducing the Batch Processing Times
There are a number of optimizations that can be done in Spark to minimize the processing time of
diff --git a/docs/streaming/performance-tips.md b/docs/streaming/performance-tips.md
index 25b29ee7097b..7fdebddbbeee 100644
--- a/docs/streaming/performance-tips.md
+++ b/docs/streaming/performance-tips.md
@@ -43,9 +43,9 @@ val stream = spark.readStream
.load()
val query = stream.writeStream
.format("kafka")
- .option("topic", "out")
+ .option("topic", "out")
.option("checkpointLocation", "/tmp/checkpoint")
- .option("asyncProgressTrackingEnabled", "true")
+ .option("asyncProgressTrackingEnabled", "true")
.start()
```
diff --git a/docs/web-ui.md b/docs/web-ui.md
index c500860a201b..9173ddef81d3 100644
--- a/docs/web-ui.md
+++ b/docs/web-ui.md
@@ -80,15 +80,15 @@ This page displays the details of a specific job identified by its job ID.
</p>
* List of stages (grouped by state active, pending, completed, skipped, and failed)
- * Stage ID
- * Description of the stage
- * Submitted timestamp
- * Duration of the stage
- * Tasks progress bar
- * Input: Bytes read from storage in this stage
- * Output: Bytes written in storage in this stage
- * Shuffle read: Total shuffle bytes and records read, includes both data read locally and data read from remote executors
- * Shuffle write: Bytes and records written to disk in order to be read by a shuffle in a future stage
+ * Stage ID
+ * Description of the stage
+ * Submitted timestamp
+ * Duration of the stage
+ * Tasks progress bar
+ * Input: Bytes read from storage in this stage
+ * Output: Bytes written in storage in this stage
+ * Shuffle read: Total shuffle bytes and records read, includes both data read locally and data read from remote executors
+ * Shuffle write: Bytes and records written to disk in order to be read by a shuffle in a future stage
<p style="text-align: center;">
<img src="img/JobPageDetail3.png" title="DAG" alt="DAG">
@@ -479,12 +479,12 @@ The third section has the SQL statistics of the submitted operations.
* **Duration time** is the difference between close time and start time.
* **Statement** is the operation being executed.
* **State** of the process.
- * _Started_, first state, when the process begins.
- * _Compiled_, execution plan generated.
- * _Failed_, final state when the execution failed or finished with error.
- * _Canceled_, final state when the execution is canceled.
- * _Finished_ processing and waiting to fetch results.
- * _Closed_, final state when client closed the statement.
+ * _Started_, first state, when the process begins.
+ * _Compiled_, execution plan generated.
+ * _Failed_, final state when the execution failed or finished with error.
+ * _Canceled_, final state when the execution is canceled.
+ * _Finished_ processing and waiting to fetch results.
+ * _Closed_, final state when client closed the statement.
* **Detail** of the execution plan with parsed logical plan, analyzed logical plan, optimized logical plan and physical plan or errors in the SQL statement.
<p style="text-align: center;">
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]