This is an automated email from the ASF dual-hosted git repository.
dongjoon pushed a commit to branch branch-4.1
in repository https://gitbox.apache.org/repos/asf/spark.git
The following commit(s) were added to refs/heads/branch-4.1 by this push:
new e7aee3ade915 [SPARK-55893][EXAMPLES] Fix `SparkDataFramePi` to match with `SparkPi`
e7aee3ade915 is described below
commit e7aee3ade9158e14997ebf271413e4cf2c328a84
Author: Dongjoon Hyun <[email protected]>
AuthorDate: Mon Mar 9 14:51:35 2026 -0700
[SPARK-55893][EXAMPLES] Fix `SparkDataFramePi` to match with `SparkPi`
### What changes were proposed in this pull request?
This PR fixes the `spark.range` start value in `SparkDataFramePi` from `0`
to `1` to match `SparkPi`'s `1 until n`.
### Why are the changes needed?
The `SparkDataFramePi` example was newly added in Apache Spark 4.0.0 to mirror
the `SparkPi` example.
- https://github.com/apache/spark/pull/49617
`SparkPi` uses `1 until n` which generates `n - 1` samples and divides by
`(n - 1)`. However, `SparkDataFramePi` uses `spark.range(0, n)` which generates
`n` samples but still divides by `(n - 1)`, resulting in an inaccurate pi
approximation.
https://github.com/apache/spark/blob/897e1b828b1a66e0aa7b8a959897fc23f7c29c0c/examples/src/main/scala/org/apache/spark/examples/SparkPi.scala#L34-L39
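The off-by-one can be illustrated with plain Scala ranges, no Spark session needed, since `spark.range(start, end)` and Scala's `until` both use a half-open `[start, end)` interval (a minimal sketch, not the example code itself):

```scala
object RangeMismatchSketch {
  def main(args: Array[String]): Unit = {
    val n = 10
    // SparkPi iterates `1 until n`: n - 1 samples (1, 2, ..., n - 1)
    val sparkPiSamples = (1 until n).length          // 9
    // Old SparkDataFramePi used spark.range(0, n): n samples (0, ..., n - 1)
    val oldDfSamples = (0 until n).length            // 10
    // Both estimators divide by (n - 1), so the old DataFrame version
    // fed one extra sample into the same denominator, skewing the estimate.
    println(s"SparkPi samples: $sparkPiSamples, old DataFrame samples: $oldDfSamples")
  }
}
```

With the fix, `spark.range(1, n, 1, slices)` also yields `n - 1` rows, so the numerator and the `(n - 1)` denominator agree again.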
### Does this PR introduce _any_ user-facing change?
No.
### How was this patch tested?
This is a simple example fix. Verified by code inspection against
`SparkPi.scala`.
### Was this patch authored or co-authored using generative AI tooling?
Generated-by: Claude Code (claude-opus-4-6)
Closes #54696 from dongjoon-hyun/SPARK-55893.
Authored-by: Dongjoon Hyun <[email protected]>
Signed-off-by: Dongjoon Hyun <[email protected]>
(cherry picked from commit 11069d4cfdeddaa52c092265793a9b05b0a30d58)
Signed-off-by: Dongjoon Hyun <[email protected]>
---
.../src/main/scala/org/apache/spark/examples/sql/SparkDataFramePi.scala | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/examples/src/main/scala/org/apache/spark/examples/sql/SparkDataFramePi.scala b/examples/src/main/scala/org/apache/spark/examples/sql/SparkDataFramePi.scala
index 0102b2d291e9..bddd6f9f206c 100644
--- a/examples/src/main/scala/org/apache/spark/examples/sql/SparkDataFramePi.scala
+++ b/examples/src/main/scala/org/apache/spark/examples/sql/SparkDataFramePi.scala
@@ -31,7 +31,7 @@ object SparkDataFramePi {
import spark.implicits._
val slices = if (args.length > 0) args(0).toInt else 2
val n = math.min(100000L * slices, Int.MaxValue).toInt // avoid overflow
- val count = spark.range(0, n, 1, slices)
+ val count = spark.range(1, n, 1, slices)
.select((pow(rand() * 2 - 1, lit(2)) + pow(rand() * 2 - 1, lit(2))).as("v"))
.where($"v" <= 1)
.count()
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]