chenwyi2 commented on PR #7161: URL: https://github.com/apache/iceberg/pull/7161#issuecomment-1761169778
Hi @stevenzwu @kengtin this PR can be create too many small files when parition with dt,hout,minute and bucekt(id), suppose paralisim is 120 and bucke number is 8, then 15 writes can write into same one bucket, but there is problem, data from the previous few hours can be into one commit because of data latency, there can be 15000 and more data files if changed partition is up to 1000, can we use complete parition name instead of just bucket? -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org --------------------------------------------------------------------- To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org For additional commands, e-mail: issues-h...@iceberg.apache.org