raushanprabhakar1 opened a new pull request, #21806:
URL: https://github.com/apache/datafusion/pull/21806
## Which issue does this PR close?
Closes #21784
## Rationale for this change
Apache Arrow added `BooleanArray::has_true()` and `has_false()` so callers
can answer “any true/false?” without a full bit count. These methods can
short-circuit as soon as a match is found, avoiding the unnecessary work of
patterns like `true_count() == 0` or `true_count() > 0`.
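
To illustrate why an existential check can be cheaper than a count, here is a minimal std-only sketch. The `Bits` type is hypothetical and is not Arrow's implementation (the real `BooleanArray` also handles offsets and null masks); it only models a packed bitmap with 64 values per word.

```rust
/// Hypothetical stand-in for a packed boolean buffer (64 values per word).
/// Not Arrow's BooleanArray: no offsets, no null mask.
pub struct Bits {
    words: Vec<u64>,
    len: usize,
}

impl Bits {
    /// Pack a slice of bools; unused padding bits in the last word stay 0.
    pub fn from_bools(values: &[bool]) -> Self {
        let mut words = vec![0u64; (values.len() + 63) / 64];
        for (i, &v) in values.iter().enumerate() {
            if v {
                words[i / 64] |= 1u64 << (i % 64);
            }
        }
        Bits { words, len: values.len() }
    }

    /// Full count: always visits every word.
    pub fn true_count(&self) -> usize {
        self.words.iter().map(|w| w.count_ones() as usize).sum()
    }

    /// Existential check: stops at the first non-zero word.
    pub fn has_true(&self) -> bool {
        self.words.iter().any(|&w| w != 0)
    }

    /// Existential check: stops at the first word with a cleared bit
    /// inside the logical length (padding bits are masked out).
    pub fn has_false(&self) -> bool {
        self.words.iter().enumerate().any(|(i, &w)| {
            let remaining = self.len - i * 64;
            let mask = if remaining >= 64 {
                u64::MAX
            } else {
                (1u64 << remaining) - 1
            };
            (w & mask) != mask
        })
    }
}
```

In this model, `has_true()` always agrees with `true_count() > 0`, but it can return after inspecting a single word when an early bit is set.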
This PR applies those APIs across DataFusion where the logic is purely
existential (or equivalent via null-safe “all true” / “no true” checks),
matching the audit suggested in the issue.
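
The null-safe equivalences mentioned above can be sketched as follows. This is a model over `&[Option<bool>]`, not code from the PR; all helper names are hypothetical, and it assumes that `true_count()`, `has_true()`, and `has_false()` consider only valid (non-null) slots.

```rust
// Model: None stands for a null slot. All helpers are hypothetical.
fn true_count(v: &[Option<bool>]) -> usize {
    v.iter().filter(|x| **x == Some(true)).count()
}

fn null_count(v: &[Option<bool>]) -> usize {
    v.iter().filter(|x| x.is_none()).count()
}

// Existential checks: `any` stops at the first matching slot.
fn has_true(v: &[Option<bool>]) -> bool {
    v.iter().any(|x| *x == Some(true))
}

fn has_false(v: &[Option<bool>]) -> bool {
    v.iter().any(|x| *x == Some(false))
}

// "No true": the count-based form and its existential replacement.
fn no_true_by_count(v: &[Option<bool>]) -> bool {
    true_count(v) == 0
}

fn no_true(v: &[Option<bool>]) -> bool {
    !has_true(v)
}

// "All true": every slot is valid and true. Nulls must be excluded
// explicitly, because a null slot is neither a true nor a false.
fn all_true_by_count(v: &[Option<bool>]) -> bool {
    true_count(v) == v.len()
}

fn all_true(v: &[Option<bool>]) -> bool {
    null_count(v) == 0 && !has_false(v)
}
```

The extra `null_count()` check is what keeps the "all true" rewrite equivalent: without it, an input such as `[Some(true), None]` would pass `!has_false()` even though its true count is smaller than its length.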
## What changes are included in this PR?
- Replace hot-path checks that only needed existence or emptiness with
  `has_true()` / `has_false()` (and `null_count()` where needed), including:
  - Nested/array helpers (`array_has`, list replace), Spark `array_contains`
    null-semantics fast path
  - Physical expressions: `evaluate_selection`, binary AND/OR short-circuit,
    CASE/IN list loops
  - `scatter` fast paths
  - Top-K filter handling, sort-merge join filter, nested-loop join bitmap
    checks
  - Parquet column stats (`metadata.rs`, `has_any_exact_match`)
- Keep `true_count()` / `false_count()` where an actual count is required
  (row counts, metrics, selectivity, `to_array(n)`, etc.)
- Import `arrow::array::Array` where `null_count()` is used on
  `BooleanArray` in trait-heavy paths
## Are these changes tested?
Existing tests cover this behavior; the edits are semantics-preserving
refactors (same conditions, cheaper primitives). No new tests were added.
## Are there any user-facing changes?
No. Behavior should be unchanged; this is an internal performance/clarity
improvement.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]