andygrove opened a new pull request, #3890: URL: https://github.com/apache/datafusion-comet/pull/3890
## Which issue does this PR close? N/A - proactive audit of existing expression. ## Rationale for this change Audited the `array_insert` expression against Spark 3.4.3, 3.5.8, and 4.0.1 source code to verify correctness and identify test coverage gaps. Found a metadata bug and significant missing test coverage. ## What changes are included in this PR? **Bug fix:** - Fixed `nullable()` in `ArrayInsert` (Rust) to account for `pos_expr` nullability, matching Spark's `first.nullable | second.nullable`. Previously only checked `src_array_expr`, so if the position column was nullable but the array was not, the metadata incorrectly reported "not nullable". **Test coverage (expanded from 1 query to 27+ queries):** - Column and literal arguments for int, string, boolean, double, float, long, short, and byte array types - NULL handling: null array, null position, null value, arrays with existing nulls - Positive out-of-bounds insertion with null padding (pos=5, pos=10) - Negative indices: in-range (-1 through -4) and out-of-bounds (-5, -10) with null padding - Special double values: NaN, Infinity, -Infinity, negative zero - Multibyte UTF-8 characters and empty strings - Legacy negative index mode (`spark.sql.legacy.negativeIndexInArrayInsert=true`) in a separate test file - Parquet dictionary encoding via ConfigMatrix **Documentation:** - Added `docs/source/contributor-guide/expression-audit-log.md` to track audited expressions - Updated audit skill with Step 8 to maintain the audit log ## How are these changes tested? All new SQL file tests pass against both Spark and Comet: ``` ./mvnw test -Dsuites="org.apache.comet.CometSqlFileTestSuite array_insert" -Dtest=none ``` Rust unit tests and clippy pass. Full build succeeds. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: [email protected] For queries about this service, please contact Infrastructure at: [email protected] --------------------------------------------------------------------- To unsubscribe, e-mail: [email protected] For additional commands, e-mail: [email protected]
