avantgardnerio commented on PR #23094:
URL: https://github.com/apache/datafusion/pull/23094#issuecomment-4770368684
Pushed `e6f846ed9b` — adds `HaloSpec` and a `halo: Option<HaloSpec>` field
on `DynamicRangePartitioning`.
**Why the scope grew:** an honest answer to "what do we need before a
runtime range repartitioner is implementable" surfaced halo as a missing piece.
The routing operator and the downstream halo-strip operator need to agree at
plan time on how far each bucket extends beyond its primary range; the field on
the partitioning type is the natural carrier for that agreement. Without it,
the two operators would need a side channel.
**Shape:**
```rust
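/// How far each bucket extends beyond its primary range,
/// expressed in the leading sort key's domain.
pub struct HaloSpec {
    preceding: ScalarValue,
    following: ScalarValue,
}

pub struct DynamicRangePartitioning {
    ordering: LexOrdering,
    partition_count: usize,
    /// `None` means disjoint buckets (no halo).
    halo: Option<HaloSpec>,
}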
```
Distances are expressed in the leading sort key's domain. The builder keeps
the common (no-halo) case terse: `DynamicRangePartitioning::new(ordering, k)` vs
`…::new(ordering, k).with_halo(halo)`.
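
For readers who want to see the builder shape in isolation, here is a minimal sketch. It is not the code from the commit: `ScalarValue` and `LexOrdering` are replaced by simplified stand-ins so the snippet compiles on its own, and `HaloSpec::new` and the `halo()` accessor are hypothetical helpers added for illustration.

```rust
// Hypothetical stand-ins; the PR uses DataFusion's `ScalarValue` and `LexOrdering`.
#[derive(Debug, Clone, PartialEq)]
pub enum ScalarValue {
    Int64(i64),
}
pub type LexOrdering = Vec<String>;

#[derive(Debug, Clone, PartialEq)]
pub struct HaloSpec {
    preceding: ScalarValue,
    following: ScalarValue,
}

impl HaloSpec {
    // Assumed constructor; the comment does not show how a `HaloSpec` is built.
    pub fn new(preceding: ScalarValue, following: ScalarValue) -> Self {
        Self { preceding, following }
    }
}

#[derive(Debug, Clone, PartialEq)]
pub struct DynamicRangePartitioning {
    ordering: LexOrdering,
    partition_count: usize,
    halo: Option<HaloSpec>,
}

impl DynamicRangePartitioning {
    // The common case: no halo, so callers pay nothing for the new field.
    pub fn new(ordering: LexOrdering, partition_count: usize) -> Self {
        Self { ordering, partition_count, halo: None }
    }

    // Opt-in builder step for the halo case.
    pub fn with_halo(mut self, halo: HaloSpec) -> Self {
        self.halo = Some(halo);
        self
    }

    // Assumed accessor, used here only to inspect the result.
    pub fn halo(&self) -> Option<&HaloSpec> {
        self.halo.as_ref()
    }
}
```

Keeping `halo` out of `new` means existing call sites stay unchanged, and only plans that need overlapping buckets touch `with_halo`.
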
This is the API hook the runtime range repartitioner needs to publish
`ExtremaKind::Expanded` extrema (proposed in #23089 / implemented in #23090) to
a downstream halo-strip filter. With halo unset, the partitioning produces
disjoint buckets and downstream sees `Observed` extrema.
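
To make the routing/strip agreement concrete, the sketch below shows one possible reading of it. Everything here is an assumption for illustration: keys are plain `i64`, buckets are half-open primary ranges, `preceding` widens a bucket's lower bound and `following` widens its upper bound, and the names `Bucket`, `Halo`, `route`, and `halo_strip` are hypothetical. Only `ExtremaKind::Observed` and `ExtremaKind::Expanded` come from the proposal, and their real semantics in #23089 / #23090 may differ.

```rust
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum ExtremaKind {
    Observed,
    Expanded,
}

// Simplified halo: distances as i64 instead of `ScalarValue` (assumption).
#[derive(Debug, Clone, Copy, PartialEq)]
pub struct Halo {
    pub preceding: i64,
    pub following: i64,
}

// Primary range of a bucket, half-open: [lo, hi).
#[derive(Debug, Clone, Copy, PartialEq)]
pub struct Bucket {
    pub lo: i64,
    pub hi: i64,
}

impl Bucket {
    // Bounds a downstream consumer would see, tagged with their kind.
    pub fn extrema(&self, halo: Option<Halo>) -> (i64, i64, ExtremaKind) {
        match halo {
            None => (self.lo, self.hi, ExtremaKind::Observed),
            Some(h) => (
                self.lo - h.preceding,
                self.hi + h.following,
                ExtremaKind::Expanded,
            ),
        }
    }
}

// Routing side: a row goes to every bucket whose (possibly expanded) range holds its key.
pub fn route(key: i64, buckets: &[Bucket], halo: Option<Halo>) -> Vec<usize> {
    buckets
        .iter()
        .enumerate()
        .filter(|(_, b)| {
            let (lo, hi, _) = b.extrema(halo);
            lo <= key && key < hi
        })
        .map(|(i, _)| i)
        .collect()
}

// Strip side: keep only rows that fall in the bucket's primary range.
pub fn halo_strip(key: i64, bucket: &Bucket) -> bool {
    bucket.lo <= key && key < bucket.hi
}
```

Under these assumptions, with buckets `[0, 10)`, `[10, 20)`, `[20, 30)` and a halo of `preceding = 3`, a row with key `8` is routed to buckets 0 and 1, and the strip step on bucket 1 later drops it. Both sides derive their bounds from the same `Halo` value, which is the plan-time agreement the field is meant to carry.
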
ROWS-frame halo (a count of neighbor rows rather than a domain distance) is
intentionally not represented; a separate variant can be added later if
motivated.
`compatible_with` requires halo equality; `project` passes halo through
unchanged; `Display` adds the `halo(preceding=…, following=…)` suffix when set.
Three new tests cover metadata, compatibility, and projection preservation.
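
A rough sketch of those three behaviors, under stated assumptions: the types are simplified stand-ins (`i64` distances, `Vec<String>` ordering, public fields for brevity), the method signatures are hypothetical, and the `Display` prefix is invented. Only the halo-equality requirement, the pass-through on projection, and the `halo(preceding=…, following=…)` suffix shape come from the description above.

```rust
use std::fmt;

#[derive(Debug, Clone, PartialEq)]
pub struct HaloSpec {
    pub preceding: i64,
    pub following: i64,
}

#[derive(Debug, Clone, PartialEq)]
pub struct DynamicRangePartitioning {
    pub ordering: Vec<String>,
    pub partition_count: usize,
    pub halo: Option<HaloSpec>,
}

impl DynamicRangePartitioning {
    // Halo equality is required; the ordering and count checks are assumed.
    pub fn compatible_with(&self, other: &Self) -> bool {
        self.ordering == other.ordering
            && self.partition_count == other.partition_count
            && self.halo == other.halo
    }

    // Projection rewrites the ordering columns but carries the halo through unchanged.
    pub fn project<F: Fn(&str) -> String>(&self, rename: F) -> Self {
        Self {
            ordering: self.ordering.iter().map(|c| rename(c.as_str())).collect(),
            partition_count: self.partition_count,
            halo: self.halo.clone(),
        }
    }
}

impl fmt::Display for DynamicRangePartitioning {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        // Prefix format is made up for this sketch.
        write!(
            f,
            "DynamicRange([{}], {})",
            self.ordering.join(", "),
            self.partition_count
        )?;
        // The suffix appears only when a halo is set.
        if let Some(h) = &self.halo {
            write!(f, " halo(preceding={}, following={})", h.preceding, h.following)?;
        }
        Ok(())
    }
}
```

Treating a halo mismatch as incompatible is the conservative choice: two partitionings with the same ordering and count but different overlap would hand a downstream strip filter rows it does not expect.
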
I'll update the discussion at #23093 to mirror this scope shift.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]