mbutrovich commented on issue #20324:
URL: https://github.com/apache/datafusion/issues/20324#issuecomment-4801629397

   So for kicks I turned it on in Comet with TPC-DS SF1000. We still see pretty 
large regressions:
   
   <img width="3500" height="600" alt="Image" src="https://github.com/user-attachments/assets/eba54a2b-25f1-41e9-8a25-963ea1df42b1" />
   <img width="1000" height="600" alt="Image" src="https://github.com/user-attachments/assets/6d127cfc-45e9-434a-b743-75e8dbb03a7d" />
   
   If I dig into the biggest regression on Q88 with Spark UI:
   
   No row-level filtering/filter pushdown/late materialization:
   <img width="756" height="675" alt="Image" src="https://github.com/user-attachments/assets/0184c0b0-aae8-494f-ac10-d6385962eff1" />
   
   Row-level filtering/filter pushdown/late materialization:
   <img width="756" height="712" alt="Image" src="https://github.com/user-attachments/assets/b3143fe0-da2d-46ae-830d-1ca4215ff986" />
   
   Even if we add an optimization that omits the CometFilter node when 
everything is pushed into the scan, the time saved by eliding the CometFilter 
doesn't offset the extra time spent in the scan, so it's strictly slower. I'm 
not sure what the next steps would be to help optimize this, and I don't have 
cycles for it in the immediate future.
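
For readers unfamiliar with the idea, here is a minimal sketch of what "omit the filter node when everything is pushed into the scan" could look like. It is illustrative only: `Scan`, `Filter`, and `elide_filter` are hypothetical names, not Comet or DataFusion APIs, and predicates are modeled as plain strings.

```python
from dataclasses import dataclass


@dataclass
class Scan:
    # Hypothetical: predicates the scan evaluates itself (row-level filtering).
    pushed: frozenset = frozenset()


@dataclass
class Filter:
    # Hypothetical: predicates evaluated by a separate filter node above `child`.
    predicates: frozenset
    child: object


def elide_filter(node):
    """Return the child scan if it already evaluates every filter predicate.

    Otherwise return the node unchanged, since the filter is still needed.
    """
    if isinstance(node, Filter) and isinstance(node.child, Scan):
        if node.predicates <= node.child.pushed:
            return node.child
    return node
```

The rewrite only removes redundant work in the filter node. It does nothing about the cost of evaluating the predicates inside the scan, which is where the extra time in the Q88 plans above shows up.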

