bert-beyondloops opened a new pull request, #21934:
URL: https://github.com/apache/datafusion/pull/21934
## Which issue does this PR close?
- Closes #21928.
## Rationale for this change
ScalarValue::compact copies array data for container types to release
sliced-buffer overhead, but previously never called .gc() on nested
StringViewArray / BinaryViewArray values. A scalar extracted from a large batch
would therefore still hold a reference to the entire original view backing
buffer, negating the benefit of compaction for any list, struct, or map whose
values are view-typed.
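
To illustrate the retention problem described above, here is a minimal std-only sketch. It is a toy model, not arrow's implementation: `ToyViewArray`, `from_strs`, `slice`, `gc`, and `buffer_len` are hypothetical names, and the model ignores details of the real layout such as inlining of short strings and multiple data buffers. It only shows why a zero-copy slice keeps the whole buffer alive and how a gc-style copy trims it.

```rust
use std::sync::Arc;

/// Toy stand-in for a view array (hypothetical, not arrow's layout):
/// each view is an (offset, len) pair into a shared, reference-counted buffer.
#[derive(Clone, Debug)]
struct ToyViewArray {
    buffer: Arc<Vec<u8>>,
    views: Vec<(usize, usize)>,
}

impl ToyViewArray {
    /// Build an array by appending every value to one shared buffer.
    fn from_strs(values: &[&str]) -> Self {
        let mut buffer = Vec::new();
        let mut views = Vec::with_capacity(values.len());
        for v in values {
            views.push((buffer.len(), v.len()));
            buffer.extend_from_slice(v.as_bytes());
        }
        Self { buffer: Arc::new(buffer), views }
    }

    /// Zero-copy slice: only the views are narrowed, so the slice still
    /// holds a reference to the entire original buffer.
    fn slice(&self, start: usize, len: usize) -> Self {
        Self {
            buffer: Arc::clone(&self.buffer),
            views: self.views[start..start + len].to_vec(),
        }
    }

    /// gc-style compaction: copy only the referenced bytes into a fresh
    /// buffer and rewrite the views to point into it.
    fn gc(&self) -> Self {
        let mut buffer = Vec::new();
        let mut views = Vec::with_capacity(self.views.len());
        for &(offset, len) in &self.views {
            views.push((buffer.len(), len));
            buffer.extend_from_slice(&self.buffer[offset..offset + len]);
        }
        Self { buffer: Arc::new(buffer), views }
    }

    /// Read the i-th value back as a string.
    fn value(&self, i: usize) -> &str {
        let (offset, len) = self.views[i];
        std::str::from_utf8(&self.buffer[offset..offset + len]).expect("valid UTF-8")
    }

    /// Size in bytes of the buffer this array keeps alive.
    fn buffer_len(&self) -> usize {
        self.buffer.len()
    }
}
```

In this model, `batch.slice(1, 1)` reports the same `buffer_len()` as the full batch, while `batch.slice(1, 1).gc()` reports only the length of the one referenced string and still returns the same value.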
## What changes are included in this PR?
- ScalarValue::compact now compacts nested view buffers for all container
types (List, LargeList, FixedSizeList, ListView, LargeListView, Struct, Map),
trimming StringViewArray / BinaryViewArray backing buffers to only the
referenced bytes.
- The internal compact_view_buffers helper recursively handles
FixedSizeList, ListView, LargeListView, and Map.
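
The recursive shape of such a helper can be sketched as follows. This is an assumption-laden illustration, not the code in this PR: `ToyArray`, `compact_views`, and `view_bytes` are hypothetical, and the enum collapses the real container types into one single-child variant and one multi-child variant. The point is only that compaction has to walk through every container level to reach view-typed leaves.

```rust
use std::sync::Arc;

/// Hypothetical nested-array model, not arrow's types.
#[derive(Clone, Debug)]
enum ToyArray {
    /// View-typed leaf: shared data buffer plus (offset, len) views into it.
    View { buffer: Arc<Vec<u8>>, views: Vec<(usize, usize)> },
    /// Single-child container (stands in for List, ListView, FixedSizeList, ...).
    List(Box<ToyArray>),
    /// Multi-child container (stands in for Struct or Map entries).
    Struct(Vec<ToyArray>),
    /// Any other leaf; nothing to compact.
    Other,
}

/// Rebuild the tree, trimming every view leaf to only its referenced bytes.
fn compact_views(array: &ToyArray) -> ToyArray {
    match array {
        ToyArray::View { buffer, views } => {
            let mut new_buffer = Vec::new();
            let mut new_views = Vec::with_capacity(views.len());
            for &(offset, len) in views {
                new_views.push((new_buffer.len(), len));
                new_buffer.extend_from_slice(&buffer[offset..offset + len]);
            }
            ToyArray::View { buffer: Arc::new(new_buffer), views: new_views }
        }
        // Containers hold no view bytes themselves; recurse into children.
        ToyArray::List(child) => ToyArray::List(Box::new(compact_views(child))),
        ToyArray::Struct(children) => {
            ToyArray::Struct(children.iter().map(compact_views).collect())
        }
        ToyArray::Other => ToyArray::Other,
    }
}

/// Total bytes held by view buffers anywhere in the tree.
fn view_bytes(array: &ToyArray) -> usize {
    match array {
        ToyArray::View { buffer, .. } => buffer.len(),
        ToyArray::List(child) => view_bytes(child),
        ToyArray::Struct(children) => children.iter().map(view_bytes).sum(),
        ToyArray::Other => 0,
    }
}
```

A container variant missing from the `match` would be a compile error here; in a helper that matches on a runtime data type with a fallthrough arm, a missing container is silently skipped, which is the kind of gap the bullet above describes closing.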
## Are these changes tested?
New unit tests are added in scalar::tests, one per container type. Each test
verifies that after compact() the backing buffer is reduced to exactly the
bytes of the referenced string, and that the scalar value is preserved.
## Are there any user-facing changes?
No. The public compact / compacted API is unchanged; this PR only fixes the
behaviour for view-typed nested arrays.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]