This is an automated email from the ASF dual-hosted git repository.

dongjoon pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git


The following commit(s) were added to refs/heads/master by this push:
     new 11069d4cfded [SPARK-55893][EXAMPLES] Fix `SparkDataFramePi` to match with `SparkPi`
11069d4cfded is described below

commit 11069d4cfdeddaa52c092265793a9b05b0a30d58
Author: Dongjoon Hyun <[email protected]>
AuthorDate: Mon Mar 9 14:51:35 2026 -0700

    [SPARK-55893][EXAMPLES] Fix `SparkDataFramePi` to match with `SparkPi`
    
    ### What changes were proposed in this pull request?
    
    This PR fixes the `spark.range` start value in `SparkDataFramePi` from `0` to `1` to match `SparkPi`'s `1 until n`.
    
    ### Why are the changes needed?
    
    The `SparkDataFramePi` example was newly added in Apache Spark 4.0.0 to mirror the `SparkPi` example.
    - https://github.com/apache/spark/pull/49617
    
    `SparkPi` uses `1 until n`, which generates `n - 1` samples, and divides by `(n - 1)`. However, `SparkDataFramePi` uses `spark.range(0, n)`, which generates `n` samples but still divides by `(n - 1)`, resulting in an inaccurate pi approximation.
    
    
https://github.com/apache/spark/blob/897e1b828b1a66e0aa7b8a959897fc23f7c29c0c/examples/src/main/scala/org/apache/spark/examples/SparkPi.scala#L34-L39
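    As a rough illustration of the mismatch (plain Scala, no Spark required; the variable names are hypothetical, not taken from the examples):

    ```scala
    // Minimal sketch: sample counts in the two examples before this fix.
    val n = 100000
    val sparkPiSamples = (1 until n).size // SparkPi: n - 1 = 99999 samples
    val oldDataFrameSamples = n           // spark.range(0, n) produces n rows
    val newDataFrameSamples = n - 1       // spark.range(1, n) produces n - 1 rows
    // Both examples divide the hit count by (n - 1), so only the
    // corrected range keeps the sample count consistent with the divisor.
    assert(sparkPiSamples == newDataFrameSamples)
    assert(oldDataFrameSamples != sparkPiSamples)
    ```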
    
    ### Does this PR introduce _any_ user-facing change?
    
    No.
    
    ### How was this patch tested?
    
    This is a simple example fix. Verified by code inspection against `SparkPi.scala`.
    
    ### Was this patch authored or co-authored using generative AI tooling?
    
    Generated-by: Claude Code (claude-opus-4-6)
    
    Closes #54696 from dongjoon-hyun/SPARK-55893.
    
    Authored-by: Dongjoon Hyun <[email protected]>
    Signed-off-by: Dongjoon Hyun <[email protected]>
---
 .../src/main/scala/org/apache/spark/examples/sql/SparkDataFramePi.scala | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/examples/src/main/scala/org/apache/spark/examples/sql/SparkDataFramePi.scala b/examples/src/main/scala/org/apache/spark/examples/sql/SparkDataFramePi.scala
index 0102b2d291e9..bddd6f9f206c 100644
--- a/examples/src/main/scala/org/apache/spark/examples/sql/SparkDataFramePi.scala
+++ b/examples/src/main/scala/org/apache/spark/examples/sql/SparkDataFramePi.scala
@@ -31,7 +31,7 @@ object SparkDataFramePi {
     import spark.implicits._
     val slices = if (args.length > 0) args(0).toInt else 2
     val n = math.min(100000L * slices, Int.MaxValue).toInt // avoid overflow
-    val count = spark.range(0, n, 1, slices)
+    val count = spark.range(1, n, 1, slices)
       .select((pow(rand() * 2 - 1, lit(2)) + pow(rand() * 2 - 1, lit(2))).as("v"))
       .where($"v" <= 1)
       .count()


---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]
