Re: [PR] [SEDONA-625] Add ST_GeneratePoints [sedona]

via GitHub Mon, 15 Jul 2024 06:22:34 -0700


Kontinuation commented on code in PR #1520:
URL: https://github.com/apache/sedona/pull/1520#discussion_r1677822649



##########
spark/common/src/main/scala/org/apache/spark/sql/sedona_sql/expressions/Functions.scala:
##########
@@ -1437,6 +1439,60 @@ case class ST_ForceRHR(inputExpressions: Seq[Expression])
   }
 }
 
+case class ST_GeneratePoints(inputExpressions: Seq[Expression], randomSeed: Option[Long] = None)
+    extends Expression
+    with Nondeterministic
+    with ExpectsInputTypes
+    with CodegenFallback
+    with ExpressionWithRandomSeed {
+
+  def this(inputExpressions: Seq[Expression]) = this(inputExpressions, Some(0L))

Review Comment:
   `ExpressionWithRandomSeed` has had binary-incompatible changes since Spark 3.0,
   so we cannot use it and still produce code that is binary compatible across
   Spark 3.0 through 3.5. Maintaining compatibility with older Spark versions
   (<= 3.2) is a real burden for us.
   
   We can initialize `ST_GeneratePoints` in the old Spark 2.3 way:
   
   ```
   case class ST_GeneratePoints(inputExpressions: Seq[Expression], randomSeed: Long) extends RDG {
   
     def this(inputExpressions: Seq[Expression]) = this(inputExpressions, Utils.random.nextLong())
   ```
   
   The consequence is that the random seed is picked once, when the expression
   is constructed, so streaming jobs will reuse the same seed across multiple
   micro-batches. However, that's the best we can do while staying compatible
   with the various Spark versions.
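   
   To illustrate the trade-off, here is a minimal sketch in plain Scala with no
   Spark dependency. `SeededGenerator`, `generate`, and `SeedDemo` are
   hypothetical names, and reseeding with `seed + partitionIndex` is an
   assumption of this sketch rather than the actual Sedona implementation. It
   only shows that a seed drawn in an auxiliary constructor stays with the
   instance and with any copy made through the primary constructor:
   
   ```scala
   import scala.util.Random
   
   // Hypothetical stand-in for the expression; it does not use Spark's Expression API.
   final case class SeededGenerator(numPoints: Int, seed: Long) {
   
     // Same shape as the suggested auxiliary constructor: the seed is drawn once, here.
     def this(numPoints: Int) = this(numPoints, SeededGenerator.seedSource.nextLong())
   
     // Re-create the RNG from the stored seed plus the partition index
     // (an assumed reseeding scheme for this sketch).
     def generate(partitionIndex: Int): Seq[Double] = {
       val rng = new Random(seed + partitionIndex)
       Seq.fill(numPoints)(rng.nextDouble())
     }
   }
   
   object SeededGenerator {
     // Shared source used only to pick a seed when none is supplied.
     private val seedSource = new Random()
   }
   
   object SeedDemo {
     def main(args: Array[String]): Unit = {
       val gen = new SeededGenerator(3)
       // Two "micro-batches" evaluated with the same instance see identical values.
       println(gen.generate(0) == gen.generate(0))
       // Copying through the primary constructor carries the seed along.
       println(gen.copy(numPoints = 5).seed == gen.seed)
     }
   }
   ```
   
   Under these assumptions, fresh randomness only appears when a new instance
   is built through the one-argument constructor, which is why a long-lived
   streaming plan would keep producing the same sequence per partition.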



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]
