andygrove commented on PR #3977:
URL: https://github.com/apache/datafusion-comet/pull/3977#issuecomment-4269945397

   Honest update: the decision flip is real, but it is **not sufficient** to 
reproduce the full #3949 crash.
   
   I tried many DPP-shaped queries that match the general pattern (inner DPP 
join + outer broadcast join + aggregate/topK/intersect/IN-subquery, across 
BHJ/SMJ/coalesce/localRead variants). Every query executed cleanly — no 
`ColumnarToRowExec` canonicalization assertion. So the decision flip 
demonstrated in `CometDppFallbackConsistencySuite` is a real inconsistency, but 
something else — larger scale, specific stats, or a plan shape unique to 
q14a/q14b/q31/q47/q57 — is needed to actually crash.
   
   Where that leaves us:
   
   1. Making `stageContainsDPPScan` descend into `QueryStageExec.plan` is likely 
still worth doing as a correctness fix, but I cannot claim it will close 
#3949 until we have a real repro.
   2. The fuzz + canonicalization infrastructure in this PR is still useful as 
regression coverage going forward.
   3. To make progress, we likely need the actual plan from a failing awslabs 
run, or a diagnostic build that logs the plan at the moment 
`AdaptiveSparkPlanExec.createQueryStages` crashes.
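
   To illustrate why item 1 matters, here is a minimal toy sketch. It assumes the stage wrapper exposes its wrapped plan as a field rather than as a child, so a children-only traversal stops at it. `Plan`, `Scan`, `Join`, `Stage`, and `DppCheck` are hypothetical stand-ins, not Spark or Comet classes, and this is not the actual `stageContainsDPPScan` implementation.

```scala
// Toy model only: hypothetical stand-ins, not Spark or Comet classes.
sealed trait Plan { def children: Seq[Plan] }

// A scan that may or may not carry a dynamic partition pruning filter.
case class Scan(hasDpp: Boolean) extends Plan {
  val children: Seq[Plan] = Nil
}

case class Join(left: Plan, right: Plan) extends Plan {
  val children: Seq[Plan] = Seq(left, right)
}

// Mimics a materialized stage: the wrapped plan is a field, not a child,
// so the node looks like a leaf to a children-only traversal.
case class Stage(plan: Plan) extends Plan {
  val children: Seq[Plan] = Nil
}

object DppCheck {
  // Walks children only, so anything wrapped in a Stage is invisible.
  def shallow(p: Plan): Boolean = p match {
    case Scan(dpp) => dpp
    case other     => other.children.exists(shallow)
  }

  // Also descends into the plan wrapped by a Stage.
  def deep(p: Plan): Boolean = p match {
    case Scan(dpp)    => dpp
    case Stage(inner) => deep(inner)
    case other        => other.children.exists(deep)
  }
}
```

   For a plan such as `Join(Stage(Scan(hasDpp = true)), Scan(hasDpp = false))`, `DppCheck.shallow` returns `false` while `DppCheck.deep` returns `true`. In this toy model, the two checks disagree about the same plan once a subtree has been wrapped in a stage, which is the kind of decision flip described above.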


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

