mbutrovich commented on PR #4417:
URL: 
https://github.com/apache/datafusion-comet/pull/4417#issuecomment-4545399499

   Awesome to see the codegen framework being put to good use, and bugs being 
found and fixed!
   
   A couple of things on the broader dispatcher pattern worth thinking about 
before the second wave of expressions lands.
   
   The sentinel `query` after each `expect_error` block is cool. I checked 
`make_timestamp_ansi.sql`, `timestamp_millis_ansi.sql`, and 
`to_unix_timestamp_ansi.sql` and the sentinel is in all three. The concern is 
that nothing in `CometSqlFileTestSuite` or `SqlFileTestParser` enforces "any 
fixture combining `expect_error` with 
`spark.comet.exec.scalaUDF.codegen.enabled=true` must include a non-error 
sentinel using `checkSparkAnswerAndOperator`." As more codegen-dispatched ANSI 
expressions adopt this fixture pattern, a forgotten sentinel would let a 
regression silently pass. Could the parser flag that shape, or at minimum could 
a comment in `SqlFileTestParser` document the requirement so the next 
contributor isn't relying on conventions absorbed by osmosis?
   
   On the cache-key side, three of the new expressions are 
`failOnError`-sensitive (`MakeTimestamp`, `MillisToTimestamp`, 
`ToUnixTimestamp`), `MonthsBetween` carries a `roundOff` flag, and several are 
timezone-aware. The dispatcher caches kernels by closure-serialized bytes, 
which should naturally differentiate them, but I don't see a test pinning that 
two plans of the same expression class differing only in `failOnError` (or 
`roundOff`, or session timezone) compile to distinct kernels rather than 
colliding in the cache. A small unit test in `CometScalaUDFCodegen`'s suite 
that asserts the serialized bytes diverge for those pairs would lock in the 
invariant cheaply. Worth adding while the shapes are still fresh?
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to