crm26 opened a new issue, #21536:
URL: https://github.com/apache/datafusion/issues/21536

   ## Summary
   
   This issue tracks adding vector math and array aggregate scalar functions to DataFusion. These functions close feature gaps relative to DuckDB and LanceDB for vector search and array analytics workloads.
   
   Replaces #21371 and #21376, which were asked to be split into one PR per function (per @alamb's review).
   
   ## Functions
   
   ### Vector math (with shared `vector_math.rs` primitives)
   
   | Function | Signature | Reference |
   |----------|-----------|-----------|
   | `cosine_distance` | `(array, array) → float64` | [DuckDB `array_cosine_similarity`](https://duckdb.org/docs/sql/functions/array.html) |
   | `inner_product` | `(array, array) → float64` | [DuckDB `array_inner_product`](https://duckdb.org/docs/sql/functions/array.html) |
   | `array_normalize` | `(array) → array` | NumPy / scipy convention |
   
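To make the intended semantics of the three vector functions concrete, here is a minimal sketch over plain `f64` slices. It is not the proposed `vector_math.rs` code and does not use Arrow arrays; the helper `l2_norm`, the use of `Option` for errors, and the choice to return `None` on length mismatch or a zero-norm input are assumptions made for illustration only.

```rust
/// Dot product of two equal-length vectors.
/// Returns `None` on length mismatch (assumed behavior, not from the issue).
pub fn inner_product(a: &[f64], b: &[f64]) -> Option<f64> {
    if a.len() != b.len() {
        return None;
    }
    Some(a.iter().zip(b).map(|(x, y)| x * y).sum())
}

/// Euclidean (L2) norm; hypothetical shared helper.
pub fn l2_norm(a: &[f64]) -> f64 {
    a.iter().map(|x| x * x).sum::<f64>().sqrt()
}

/// Cosine distance, defined here as 1 - cosine similarity.
/// Returns `None` when either vector has zero norm (assumed behavior).
pub fn cosine_distance(a: &[f64], b: &[f64]) -> Option<f64> {
    let dot = inner_product(a, b)?;
    let denom = l2_norm(a) * l2_norm(b);
    if denom == 0.0 {
        return None;
    }
    Some(1.0 - dot / denom)
}

/// Scales a vector to unit L2 norm.
/// Returns `None` for a zero-norm input (assumed behavior).
pub fn array_normalize(a: &[f64]) -> Option<Vec<f64>> {
    let norm = l2_norm(a);
    if norm == 0.0 {
        return None;
    }
    Some(a.iter().map(|x| x / norm).collect())
}
```

Under these assumptions, orthogonal vectors have a cosine distance of 1 and identical non-zero vectors a distance of 0. NULL-element handling and non-`float64` element types are left open here, as the issue does not specify them.
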
   ### Array element-wise math
   
   | Function | Signature | Reference |
   |----------|-----------|-----------|
   | `array_add` | `(array, array) → array` | Element-wise addition |
   | `array_subtract` | `(array, array) → array` | Element-wise subtraction |
   | `array_scale` | `(array, scalar) → array` | Scalar multiply |
   
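The element-wise signatures above can be sketched on plain `f64` slices as follows. This is an illustrative sketch only, not the proposed implementation: the private helper `zip_with` and the `None` result on length mismatch are assumptions, and the real functions would operate on Arrow list arrays.

```rust
/// Applies `f` pairwise; `None` if the lengths differ (assumed behavior).
fn zip_with(a: &[f64], b: &[f64], f: impl Fn(f64, f64) -> f64) -> Option<Vec<f64>> {
    if a.len() != b.len() {
        return None;
    }
    Some(a.iter().zip(b).map(|(x, y)| f(*x, *y)).collect())
}

/// Element-wise addition of two equal-length arrays.
pub fn array_add(a: &[f64], b: &[f64]) -> Option<Vec<f64>> {
    zip_with(a, b, |x, y| x + y)
}

/// Element-wise subtraction of two equal-length arrays.
pub fn array_subtract(a: &[f64], b: &[f64]) -> Option<Vec<f64>> {
    zip_with(a, b, |x, y| x - y)
}

/// Multiplies every element by the scalar `k`.
pub fn array_scale(a: &[f64], k: f64) -> Vec<f64> {
    a.iter().map(|x| x * k).collect()
}
```

For example, adding `[1, 2]` and `[3, 4]` yields `[4, 6]`, and scaling `[1, 2]` by `2` yields `[2, 4]`.
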
   ### Array aggregate scalars
   
   | Function | Signature | Reference |
   |----------|-----------|-----------|
   | `array_sum` / `list_sum` | `(array) → numeric` | [DuckDB `list_sum`](https://duckdb.org/docs/sql/functions/list.html#list_sumlist) |
   | `array_product` / `list_product` | `(array) → numeric` | [DuckDB `list_product`](https://duckdb.org/docs/sql/functions/list.html) |
   | `array_avg` / `list_avg` | `(array) → float64` | [DuckDB `list_avg`](https://duckdb.org/docs/sql/functions/list.html) |
   
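As a rough illustration of the aggregate scalars, the sketch below reduces a single `f64` slice to one value. It simplifies the table above in two labeled ways: it handles only `f64` rather than the full `numeric` range, and it returns `None` for an empty array, which is an assumption and not something the issue specifies.

```rust
/// Sum of all elements; `None` for an empty array (assumed behavior).
pub fn array_sum(a: &[f64]) -> Option<f64> {
    if a.is_empty() {
        None
    } else {
        Some(a.iter().sum())
    }
}

/// Product of all elements; `None` for an empty array (assumed behavior).
pub fn array_product(a: &[f64]) -> Option<f64> {
    if a.is_empty() {
        None
    } else {
        Some(a.iter().product())
    }
}

/// Arithmetic mean; `None` for an empty array (assumed behavior).
pub fn array_avg(a: &[f64]) -> Option<f64> {
    array_sum(a).map(|s| s / a.len() as f64)
}
```

With these definitions, `[1, 2, 3, 4]` has a sum of 10, a product of 24, and an average of 2.5.
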
   ### Alias fix
   
   | Fix | Description |
   |-----|-------------|
   | `list_min` | Missing alias on `ArrayMin` (parity with existing `list_max` on `ArrayMax`) |
   
   ## Submission plan
   
   One PR per function, submitted serially. Each PR will reference this issue.
   
   ## References
   
   - [DuckDB list functions](https://duckdb.org/docs/sql/functions/list.html)
   - [DuckDB array functions](https://duckdb.org/docs/sql/functions/array.html)
   - [LanceDB distance metrics](https://lancedb.github.io/lancedb/)
   - [Trino array functions](https://trino.io/docs/current/functions/array.html)


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

