andygrove opened a new pull request, #4579:
URL: https://github.com/apache/datafusion-comet/pull/4579
## Which issue does this PR close?
Closes #4554.
## Rationale for this change
`try_make_timestamp` rewrites to `MakeTimestamp(failOnError = false)`, which
Spark's `doGenCode` implements with a `try { ... } catch (DateTimeException e)
{ ev.isNull = true; }` block. With Comet's codegen dispatcher routing
`MakeTimestamp` through Spark's own codegen, the kernel was honoring the
input-null short-circuit but skipping the post-eval `ev.isNull` guard, so
invalid date/time components (month=13, day=32, hour=25, ...) returned stale
`ev.value` bytes instead of NULL.
The dispatcher's `defaultBody` conflated `NullIntolerant`'s "null in → null
out" guarantee with the converse "non-null in → non-null out", which
`MakeTimestamp(failOnError=false)` does not satisfy. The same path also affects
non-ANSI `make_timestamp` directly, since that is the same expression with the
same flag.
## What changes are included in this PR?
- `CometBatchKernelCodegen.defaultBody`: in the `NullIntolerant &&
allNullIntolerant` branch, when the bound expression is nullable, wrap the
post-`ev.code` write in `if (ev.isNull) setNull else write`. Non-nullable roots
keep the direct write (the existing optimization).
- `CometCodegenSourceSuite`: lock in the fix with a generated-source
assertion that `MakeTimestamp(failOnError=false)` emits two `setNull` sites —
input short-circuit + post-eval guard.
- `make_timestamp.sql`: add column and literal queries with invalid
components covering month/day/hour/minute out-of-range. These ride the default
`failOnError=false` path (ANSI off in `CometSqlFileTestSuite`).
- `try_make_timestamp.sql` (new, Spark 4.0+): direct coverage for
`try_make_timestamp` with both column and literal invalid inputs, plus a
sentinel valid query so the dispatcher must run natively.
## How are these changes tested?
- `CometCodegenSourceSuite` (Spark 3.5): 51/51 pass; the new test fails on
the unmodified codegen and passes with the fix.
- `CometSqlFileTestSuite` `make_timestamp` filter (Spark 3.5 and 4.0): all
three fixtures pass; on 3.5 `try_make_timestamp.sql` is skipped via
`MinSparkVersion: 4.0`.
- All datetime SQL fixtures (Spark 3.5): 89/89 pass.
- `CometCodegenSuite`, `CometCodegenHOFSuite`,
`CometTemporalExpressionSuite`: pass.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]