This is an automated email from the ASF dual-hosted git repository.
dheres pushed a commit to branch main
in repository https://gitbox.apache.org/repos/asf/arrow-rs.git
The following commit(s) were added to refs/heads/main by this push:
new 71ac9bd714 Extend the fast path in GenericByteViewArray::is_eq for
comparing against empty strings (#7767)
71ac9bd714 is described below
commit 71ac9bd7146565ab47be8053c412b30c0a4c01c6
Author: Jörn Horstmann <[email protected]>
AuthorDate: Wed Jun 25 08:59:13 2025 +0200
Extend the fast path in GenericByteViewArray::is_eq for comparing against
empty strings (#7767)
# Which issue does this PR close?
This avoids a call to memcmp for the relatively common case of comparing
against an empty string.
Closes #7766.
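
As an illustration only, the idea can be sketched outside of arrow-rs. The
names `make_view` and `is_eq_sketch` below are hypothetical and are not part
of the arrow-rs API; the sketch only assumes that a byte view keeps its
length in the low 32 bits of a 128-bit word, and it uses a plain slice
comparison as a stand-in for the real comparison that may end in memcmp.

```rust
/// Hypothetical helper: build a simplified 128-bit "view" whose low 32 bits
/// hold the length of the value. Real byte views also carry inline data or a
/// buffer reference; that part is omitted in this sketch.
fn make_view(bytes: &[u8]) -> u128 {
    bytes.len() as u32 as u128
}

/// Sketch of an equality check with early exits before the byte comparison.
fn is_eq_sketch(l_view: u128, l_bytes: &[u8], r_view: u128, r_bytes: &[u8]) -> bool {
    // Casting to u32 truncates the view to its low 32 bits, i.e. the length.
    let l_len = l_view as u32;
    let r_len = r_view as u32;

    // Different lengths can never be equal.
    if l_len != r_len {
        return false;
    }
    // Both values are empty: equal without touching any bytes.
    // (The lengths are already known to match, so one check is enough here.)
    if l_len == 0 {
        return true;
    }
    // Stand-in for the slow path that compares the actual bytes.
    l_bytes == r_bytes
}
```

With this sketch, comparing two empty values returns `true` from the second
check, and comparing an empty value against a non-empty one returns `false`
from the first check, so neither case reaches the byte comparison.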
# Rationale for this change
This speeds up several queries in the `arrow_reader_clickbench`
benchmark, some of them significantly. The biggest benefits are for Q10,
Q11 and Q12. I did not observe slowdowns on any other query.
The benchmark results below are for an uncompressed parquet file.
```
arrow_reader_clickbench/sync/Q10
    time:   [8.3934 ms 8.4411 ms 8.5212 ms]
    change: [-36.714% -36.040% -35.243%] (p = 0.00 < 0.05)
    Performance has improved.
arrow_reader_clickbench/sync/Q11
    time:   [10.180 ms 10.315 ms 10.476 ms]
    change: [-33.571% -32.145% -30.661%] (p = 0.00 < 0.05)
    Performance has improved.
arrow_reader_clickbench/sync/Q12
    time:   [17.262 ms 17.419 ms 17.616 ms]
    change: [-21.201% -19.289% -17.409%] (p = 0.00 < 0.05)
    Performance has improved.
```
# Are there any user-facing changes?
No
---
arrow-ord/src/cmp.rs | 3 +++
1 file changed, 3 insertions(+)
diff --git a/arrow-ord/src/cmp.rs b/arrow-ord/src/cmp.rs
index 46cab1bb8e..6711f4390f 100644
--- a/arrow-ord/src/cmp.rs
+++ b/arrow-ord/src/cmp.rs
@@ -581,6 +581,9 @@ impl<'a, T: ByteViewType> ArrayOrd for &'a GenericByteViewArray<T> {
if l_len != r_len {
return false;
}
+ if l_len == 0 && r_len == 0 {
+ return true;
+ }
// # Safety
// The index is within bounds as it is checked in value()