schenksj opened a new pull request, #4700:
URL: https://github.com/apache/datafusion-comet/pull/4700

   **Part 1 of the Delta Lake contrib PR breakup.** The native Delta scan work 
(delta-kernel-rs, Iceberg-style contrib) was first posted as a single ~27k-line 
tracking PR, #4366, which is impractical to review as one unit. This is the 
first of a sequence of small, independently-reviewable, independently-mergeable 
PRs that reconstruct that work. The full sequence and its dependency graph live 
in #4366.
   
   This first slice touches **core only**. It adds a small extension contract 
that lets out-of-tree Comet contrib leaf scans (Delta now, Hudi and others 
later) take part in native planning **without core holding a compile-time 
reference to them**. This is the same "the edge keeps the source-specific code" 
shape Iceberg already uses. It ships **no Delta code** and is inert on default 
builds.
   
   ## Changes
   
   - **`trait CometScanWithPlanData`** — `sourceKey` / `commonData` / 
`perPartitionData`, plus optional `dynamicPruningFilters` / 
`withDynamicPruningFilters` (for scans whose DPP filters live in a `@transient` 
field that `TreeNode.makeCopy` cannot carry, #3510). `CometNativeScanExec` 
mixes it in.
   - **`foreachUntilCometInput`** now matches `case _: CometLeafExec`. This is 
a superset of the previous fixed scan list, and on default builds the two are 
equal: the three leaf scans it replaces (`CometNativeScanExec`, 
`CometIcebergNativeScanExec`, `CometCsvNativeScanExec`) are exactly the classes 
that extend `CometLeafExec`.
   - **`PlanDataInjector.findAllPlanData`** collects per-partition planning 
data via the trait instead of a hardcoded `CometNativeScanExec` match.
   - **`PlanDataInjector` registry** gains one reflective 
`DeltaPlanDataInjector$` slot, appended to the existing `injectorsByKind` 
registry (#4535) **only** when a contrib build bundles the class (`-Pcontrib-delta`). 
Default builds get `ClassNotFoundException -> None` and an unchanged registry. 
A class that is present but fails to bind (a misbuilt contrib jar) is logged, 
not silently swallowed.
   - **`CometPlanAdaptiveDynamicPruningFilters`** rewrites AQE DPP filters in 
place for trait scans whose filters cannot survive `makeCopy`.
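
   To make the contract concrete, here is a minimal, self-contained sketch of 
the trait and of collecting plan data through it. Only the member names come 
from this PR. The types (`String`, `Array[Byte]`, `Seq[String]`), the default 
bodies, `DemoScan`, and the simplified `findAllPlanData` signature are 
assumptions for illustration and are not the real Spark/Comet signatures.

```scala
object PlanDataSketch {
  // Stand-in for the trait. Member names match the PR; all types are assumed.
  trait CometScanWithPlanData {
    def sourceKey: String
    def commonData: Array[Byte]
    def perPartitionData: Array[Array[Byte]]

    // Optional members: a scan that has no DPP filters inherits these defaults.
    def dynamicPruningFilters: Seq[String] = Seq.empty
    def withDynamicPruningFilters(filters: Seq[String]): CometScanWithPlanData = this
  }

  // Hypothetical out-of-tree scan; core never references this class by name.
  final case class DemoScan(sourceKey: String, partitions: Int)
      extends CometScanWithPlanData {
    def commonData: Array[Byte] = sourceKey.getBytes("UTF-8")
    def perPartitionData: Array[Array[Byte]] =
      Array.tabulate(partitions)(i => Array(i.toByte))
  }

  // Collect planning data via the trait instead of a concrete-class match.
  def findAllPlanData(
      nodes: Seq[Any]): Map[String, (Array[Byte], Array[Array[Byte]])] =
    nodes.collect { case s: CometScanWithPlanData =>
      (s.sourceKey, (s.commonData, s.perPartitionData))
    }.toMap
}
```

   In this sketch any node that mixes in the trait is picked up, which is how a 
contrib scan could take part without core naming its class.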
   
   ## What this part deliberately does NOT do yet
   
   - **No `perPartitionFilePaths` on the trait.** That member only feeds 
`FAILED_READ_FILE` error conversion and lands in a later part, after #4536 (now 
merged).
   - **No Delta code.** There is no `DeltaPlanDataInjector` on the classpath 
yet, so the reflective slot resolves to nothing. This part is inert.
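
   The reflective slot behavior described in this PR (absent class gives 
`None`, a present but unbindable class is logged) can be sketched as below. 
This is an illustrative approximation, not the code in this PR: 
`loadOptionalModule` is a hypothetical helper, logging goes to stderr as a 
stand-in for the real logger, and it relies only on the standard JVM encoding of 
a Scala `object` (class name ending in `$`, singleton in the static `MODULE$` 
field).

```scala
object ReflectiveSlotSketch {
  /** Resolve a Scala `object` by its JVM class name, e.g. "pkg.Name$". */
  def loadOptionalModule(className: String): Option[AnyRef] =
    try {
      val cls = Class.forName(className)
      // A Scala `object` exposes its singleton through the static MODULE$ field.
      Some(cls.getField("MODULE$").get(null))
    } catch {
      // Expected on default builds: the contrib class is simply not bundled.
      case _: ClassNotFoundException => None
      // Class is present but cannot be bound: report it instead of hiding it.
      case e @ (_: LinkageError | _: ReflectiveOperationException) =>
        System.err.println(s"Failed to bind optional module $className: $e")
        None
    }
}
```

   Matching `ClassNotFoundException` first keeps the default-build path silent, 
while every other reflective failure still leaves a trace.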
   
   ## Why it is safe on default builds
   
   With no contrib on the classpath the change is behavior-preserving. The leaf 
match is a proven superset of the old enumeration. The trait match catches the 
same `CometNativeScanExec` and still drives its subquery resolution. The 
reflective slot resolves to `None`. And the new DPP arm never fires because 
`CometNativeScanExec` leaves `dynamicPruningFilters` empty.
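
   The superset argument can be checked on a toy hierarchy. The classes below 
are empty stand-ins that reuse the real names only for readability; 
`HypotheticalContribScanExec`, `OtherNode`, `oldMatch`, and `newMatch` are 
invented for this sketch and do not exist in Comet.

```scala
object LeafMatchSketch {
  // Stand-ins only: the real classes are Spark physical plan nodes.
  trait CometLeafExec
  final class CometNativeScanExec extends CometLeafExec
  final class CometIcebergNativeScanExec extends CometLeafExec
  final class CometCsvNativeScanExec extends CometLeafExec
  final class HypotheticalContribScanExec extends CometLeafExec // not in this PR
  final class OtherNode

  // Old shape: a fixed enumeration of the three known leaf scans.
  def oldMatch(node: Any): Boolean = node match {
    case _: CometNativeScanExec | _: CometIcebergNativeScanExec |
        _: CometCsvNativeScanExec => true
    case _ => false
  }

  // New shape: anything that extends the leaf trait.
  def newMatch(node: Any): Boolean = node match {
    case _: CometLeafExec => true
    case _ => false
  }
}
```

   On the three core scans and on non-leaf nodes the two matches agree. They 
differ only for a leaf class the old list did not name, which is the case a 
contrib build introduces.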
   
   ## Verification
   
   - `CometScanWithPlanDataSuite` (new): trait-contract defaults plus 
reflective-slot graceful absence. 2/2 passing.
   - `CometJoinSuite` (native scan fusion and the DPP path): 28/28 passing.
   - spotless and scalastyle: clean.
   - No native changes in this part.
   
   ## Roadmap
   
   This is part 1 of the breakup. Subsequent parts add the build gate and inert 
wiring, the Rust planning and read path, the Scala claim/decline and execution, 
Change Data Feed reads, the test battery, and docs. Each later part is gated 
behind `-Pcontrib-delta`, so every intermediate state on `main` is safe for 
default builds. Tracking umbrella: #4366.
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

