crm26 opened a new pull request, #21861: URL: https://github.com/apache/datafusion/pull/21861
## Which issue does this PR close?

Part of #21536 (split of #21371 into one-function-per-PR).

## Rationale for this change

Adds `inner_product(array1, array2)`: the dot product of two equal-length numeric arrays, returning `Float64`. Computed as `sum(array1[i] * array2[i])`.

## What changes are included in this PR?

Mirrors the structural pattern of merged #21542 (`cosine_distance`):

- Same `coerce_types` for `List`/`LargeList`/`FixedSizeList` of any numeric inner type, with widening to `LargeList` when any input is `LargeList` (per the #21704 pattern)
- Same NULL semantics: bare `NULL` → `NULL`, NULL row → NULL, NULL element in list → NULL
- Same Arrow-idiomatic implementation: single `as_float64_array(list_array.values())` downcast, slice by `value_offsets()`, iterate via `ScalarBuffer<f64>`
- No alias, no shared module: standalone, inline math

The arithmetic is the only semantic divergence from `cosine_distance`:

- `dot += a*b` (no magnitude or normalization)
- Empty arrays return `0.0` (sum of empty set), not `NULL`
- No zero-magnitude special case (`inner_product([0,0], [1,2])` returns `0`, which is well-defined for inner product)

## Are these changes tested?

Yes. SLT covers:

- Orthogonal, identical, opposite, general non-trivial vectors
- Single zero vector, both zero vectors
- Bare `NULL` in either or both positions
- NULL element inside a list (returns NULL for that row)
- Mismatched lengths (error)
- `LargeList` inputs
- Mixed `(List, LargeList)` in both orders
- `(FixedSizeList, FixedSizeList)` and `(FixedSizeList, LargeList)`
- `Float32` and `Int64` inner type coercion
- Multi-row query with NULL row propagation
- Empty arrays (returns `0`)
- No-args error
- Return-type assertion (`Float64`)

## Are there any user-facing changes?

New scalar function `inner_product`, documented in `docs/source/user-guide/sql/scalar_functions.md`.
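For readers who want the per-row semantics in one place, here is a minimal sketch in plain Rust. It is not the PR's implementation, which works on Arrow list arrays. The `Row` alias and the `inner_product_row` helper are hypothetical names introduced only for illustration, and the order of the checks (length mismatch before NULL elements) is an assumption, since the description does not say which takes precedence.

```rust
/// Hypothetical stand-in for one list row: an outer `None` is a NULL row,
/// an inner `None` is a NULL element. The real code uses Arrow arrays.
type Row = Option<Vec<Option<f64>>>;

/// Sketch of the per-row behaviour described in the PR text.
/// Returns Ok(None) for SQL NULL and Err for mismatched lengths.
fn inner_product_row(a: &Row, b: &Row) -> Result<Option<f64>, String> {
    // A NULL row on either side yields NULL.
    let (a, b) = match (a, b) {
        (Some(a), Some(b)) => (a, b),
        _ => return Ok(None),
    };

    // Assumption: lengths are checked before elements are inspected.
    if a.len() != b.len() {
        return Err(format!("length mismatch: {} vs {}", a.len(), b.len()));
    }

    // dot += a * b, with no magnitude or normalization step.
    // Two empty rows skip the loop and return 0.0 (sum of the empty set).
    let mut dot = 0.0;
    for (x, y) in a.iter().zip(b.iter()) {
        match (x, y) {
            (Some(x), Some(y)) => dot += x * y,
            // A NULL element anywhere makes the whole row NULL.
            _ => return Ok(None),
        }
    }
    Ok(Some(dot))
}
```

Under these assumptions, calling `inner_product_row` on `[1, 2, 3]` and `[4, 5, 6]` gives `Ok(Some(32.0))`, two empty rows give `Ok(Some(0.0))`, and a zero vector against any vector of the same length gives `Ok(Some(0.0))` with no special case.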
