Jefffrey commented on code in PR #21883:
URL: https://github.com/apache/datafusion/pull/21883#discussion_r3154272123
##########
datafusion/sqllogictest/test_files/binary.slt:
##########
@@ -321,3 +321,20 @@ query T
SELECT split_part(CAST(binary AS VARCHAR), 'o', 2) FROM t WHERE binary =
X'466f6f';
----
(empty)
+
+# Pipe concatenation of binaries always provides a binary
+query ?
+SELECT x'636166c3a9' || x'68656c6c6f';
+----
+636166c3a968656c6c6f
+
+# Byte pipe operator is forbidden for mixed binary and text
Review Comment:
What about mixed binary types, e.g. `Binary || LargeBinary`?
```sql
1. query failed: DataFusion error: Error during planning: Cannot infer common string type for string concat operation Binary || LargeBinary
[SQL] SELECT x'636166c3a9' || arrow_cast(x'68656c6c6f', 'LargeBinary');
at /Users/jeffrey/Code/datafusion/datafusion/sqllogictest/test_files/binary.slt:331
```
Is this intentional?
##########
datafusion/physical-expr/src/expressions/binary/kernels.rs:
##########
@@ -204,6 +204,95 @@ pub fn concat_elements_utf8view(
Ok(result.finish())
}
+/// Concatenates two `GenericBinaryArray`s element-wise.
+/// If either element is `Null`, the result element is also `Null`.
+///
+/// # Errors
+/// - Returns an error if the input arrays have different lengths.
+/// - Panics if any concatenated string exceeds `T::Offset::MAX` in length.
+pub fn concat_elements_binary_array<T: OffsetSizeTrait>(
Review Comment:
We can follow what we already do for strings and just use arrow's `concat_element_binary`:
https://github.com/apache/datafusion/blob/22bb4e6b752c7a62b677d94a63bcf08b68e8d5ec/datafusion/physical-expr/src/expressions/binary.rs#L944-L951
https://docs.rs/arrow/latest/arrow/compute/kernels/concat_elements/fn.concat_element_binary.html
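
For reference, the element-wise behaviour described in the doc comment above (null in, null out; error on length mismatch) can be modelled with plain std types. This is only an illustrative sketch: `concat_binary_model` is a hypothetical name, `None` stands in for an Arrow null slot, and it is not how arrow or this PR implements the kernel.

```rust
/// Hypothetical std-only model of element-wise binary concatenation.
/// `None` plays the role of a null slot in an Arrow binary array.
fn concat_binary_model(
    left: &[Option<Vec<u8>>],
    right: &[Option<Vec<u8>>],
) -> Result<Vec<Option<Vec<u8>>>, String> {
    // Inputs of different lengths are rejected rather than truncated.
    if left.len() != right.len() {
        return Err(format!(
            "length mismatch: left has {} elements, right has {}",
            left.len(),
            right.len()
        ));
    }

    Ok(left
        .iter()
        .zip(right.iter())
        .map(|(l, r)| match (l, r) {
            // Both sides present: append the right bytes to the left bytes.
            (Some(l), Some(r)) => {
                let mut out = Vec::with_capacity(l.len() + r.len());
                out.extend_from_slice(l);
                out.extend_from_slice(r);
                Some(out)
            }
            // Either side null: the result element is null.
            _ => None,
        })
        .collect())
}
```

Under this model, concatenating the bytes `63 61 66 c3 a9` with `68 65 6c 6c 6f` gives `63 61 66 c3 a9 68 65 6c 6c 6f`, matching the expected output of the new `binary.slt` test.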
##########
datafusion/sqllogictest/test_files/spark/string/concat.slt:
##########
@@ -135,3 +135,25 @@ query T
SELECT concat(x'636166c3', x'a968656c6c6f');
----
caféhello
+
+# UDF concatenation for valid UTF-8 arguments
Review Comment:
How are these concat tests for the Spark UDF version related to this PR?
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]