mbutrovich commented on PR #4417: URL: https://github.com/apache/datafusion-comet/pull/4417#issuecomment-4545399499
Awesome to see the codegen framework being put to good use, and bugs being found and fixed! A couple of things on the broader dispatcher pattern worth thinking about before the second wave of expressions lands. The sentinel `query` after each `expect_error` block is cool. I checked `make_timestamp_ansi.sql`, `timestamp_millis_ansi.sql`, and `to_unix_timestamp_ansi.sql` and the sentinel is in all three. The concern is that nothing in `CometSqlFileTestSuite` or `SqlFileTestParser` enforces "any fixture combining `expect_error` with `spark.comet.exec.scalaUDF.codegen.enabled=true` must include a non-error sentinel using `checkSparkAnswerAndOperator`." As more codegen-dispatched ANSI expressions adopt this fixture pattern, a forgotten sentinel would let a regression silently pass. Could the parser flag that shape, or at minimum could a comment in `SqlFileTestParser` document the requirement so the next contributor isn't relying on conventions absorbed by osmosis? On the cache-key side, three of the new expressions are `failOnError`-sensitive (`MakeTimestamp`, `MillisToTimestamp`, `ToUnixTimestamp`), `MonthsBetween` carries a `roundOff` flag, and several are timezone-aware. The dispatcher caches kernels by closure-serialized bytes, which should naturally differentiate them, but I don't see a test pinning that two plans of the same expression class differing only in `failOnError` (or `roundOff`, or session timezone) compile to distinct kernels rather than colliding in the cache. A small unit test in `CometScalaUDFCodegen`'s suite that asserts the serialized bytes diverge for those pairs would lock in the invariant cheaply. Worth adding while the shapes are still fresh? -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: [email protected] For queries about this service, please contact Infrastructure at: [email protected] --------------------------------------------------------------------- To unsubscribe, e-mail: [email protected] For additional commands, e-mail: [email protected]
