singhpk234 commented on issue #6456:
URL: https://github.com/apache/iceberg/issues/6456#issuecomment-1360077027

   It looks like, for some reason, the splits created for the left-side source 
are heavily skewed, and as far as I understand this skew is the main reason 
for the slowdown. Please note that the min / 25th percentile / median / 75th 
percentile tasks each read only KBs of data, whereas the max task reads 100s of 
MB of data, and spilling is also happening for that task. 
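
To make the comparison concrete, here is a minimal sketch of how the same min / 25th / median / 75th / max summary could be computed from per-task input sizes. The function name `summarize_split_sizes` and the sample sizes are hypothetical and only illustrate the shape of the skew; they are not taken from the job in this issue.

```python
import statistics


def summarize_split_sizes(sizes_bytes):
    """Return min / p25 / median / p75 / max plus a max-to-median skew ratio."""
    ordered = sorted(sizes_bytes)
    # The "inclusive" method treats the data as the whole population,
    # so the quartiles stay within the observed min and max.
    q1, median, q3 = statistics.quantiles(ordered, n=4, method="inclusive")
    return {
        "min": ordered[0],
        "p25": q1,
        "median": median,
        "p75": q3,
        "max": ordered[-1],
        "max_over_median": ordered[-1] / median if median else float("inf"),
    }


# Hypothetical per-task input sizes in bytes: most tasks read a few KB,
# while one task reads hundreds of MB.
sizes = [4_096, 8_192, 8_192, 16_384, 300 * 1024 * 1024]
summary = summarize_split_sizes(sizes)
for name, value in summary.items():
    print(f"{name}: {value}")
```

A large `max_over_median` means a single task dominates the stage runtime, which would be consistent with spilling showing up only on that task.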
   
   P.S.: It would be really helpful to see what the distribution looked like 
prior to the 1.1 release. Could you please attach that as well?
   
   ![Screen Shot 2022-12-20 at 11 45 18 
AM](https://user-images.githubusercontent.com/35593236/208753629-28e0c31a-8f14-4d1d-8f98-5cdbd4d552b2.png)
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

