singhpk234 commented on issue #6456: URL: https://github.com/apache/iceberg/issues/6456#issuecomment-1360077027
It looks like, for some reason, the splits created for the left-side source are heavily skewed, and as far as I understand this skew is the main reason for the slowdown. Please refer to the task distribution: the min / 25th percentile / median / 75th percentile tasks read only KBs of data, whereas the max task reads hundreds of MBs, and that task is also spilling. P.S.: It would be really helpful to see what the distribution looked like prior to the 1.1 release; can you please attach that as well?
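
To make the comparison with the pre-1.1 distribution easier, here is a minimal sketch of how the skew could be quantified from per-task input sizes. The function names and the sample sizes are hypothetical and only for illustration; they are not taken from the job in this issue, and the sizes would need to be collected from the actual task metrics.

```python
from statistics import quantiles


def summarize_split_sizes(sizes_bytes):
    """Return min / 25th percentile / median / 75th percentile / max of task input sizes."""
    if len(sizes_bytes) < 2:
        raise ValueError("need at least two task sizes")
    ordered = sorted(sizes_bytes)
    # n=4 yields the three quartile cut points; "inclusive" treats the data as the full population.
    p25, median, p75 = quantiles(ordered, n=4, method="inclusive")
    return {
        "min": ordered[0],
        "p25": p25,
        "median": median,
        "p75": p75,
        "max": ordered[-1],
    }


def skew_ratio(sizes_bytes):
    """Ratio of the largest task input to the median task input (1.0 means no skew)."""
    summary = summarize_split_sizes(sizes_bytes)
    if summary["median"] == 0:
        return float("inf")
    return summary["max"] / summary["median"]


KB = 1024
MB = 1024 * KB

# Hypothetical sizes mimicking the shape described above: most tasks in KBs, one in 100s of MBs.
hypothetical_sizes = [4 * KB, 6 * KB, 8 * KB, 10 * KB, 12 * KB, 300 * MB]

if __name__ == "__main__":
    print(summarize_split_sizes(hypothetical_sizes))
    print(f"max / median = {skew_ratio(hypothetical_sizes):.0f}x")
```

Running the same summary on the task sizes from before and after the upgrade would show whether the max / median ratio changed between the two versions.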