jerryzhujing commented on PR #14797: URL: https://github.com/apache/iceberg/pull/14797#issuecomment-4009215865
> > @t3hw @rainerschamm In my testing there are still duplicated records if records are updated frequently. My commit time is 3 minutes. Below are two records that were updated within one minute; they are the only two duplicated records in the table. There are 434192 records in the table, with 434191 distinct ids.
> >
> > ## updated_at
> >
> > 2026-03-05 03:37:58.685000
> > 2026-03-05 03:37:59.076000
>
> Hmm, we have not seen any duplicates yet in our tests, but we only tested it in this setup:
>
> * no partitioning
> * merge-on-read
> * 5 minute commit
>
> ...
> iceberg.tables.auto-create-props.write.delete.mode: merge-on-read
> iceberg.tables.auto-create-props.write.merge.mode: merge-on-read
> iceberg.tables.auto-create-props.write.update.mode: merge-on-read
> ...
>
> Also, we make sure all identifier fields are strictly non-null in the resulting Iceberg table schema.

@rainerschamm Would you mind sharing the complete sink properties? Let me check whether anything is wrong with my config.
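
For anyone who wants to reproduce the total-versus-distinct comparison quoted above, here is a minimal sketch of the check in Python. It assumes the table rows have already been read into a list of dicts; the `id` values, the third row, and the helper name `find_duplicate_ids` are hypothetical, and only the two `updated_at` timestamps come from the report.

```python
from collections import Counter


def find_duplicate_ids(rows):
    """Return {id: count} for every id that appears more than once."""
    counts = Counter(row["id"] for row in rows)
    return {key: n for key, n in counts.items() if n > 1}


# Hypothetical sample rows: the ids and the third row are made up for
# illustration; only the first two updated_at values are from the report.
rows = [
    {"id": 1, "updated_at": "2026-03-05 03:37:58.685000"},
    {"id": 1, "updated_at": "2026-03-05 03:37:59.076000"},
    {"id": 2, "updated_at": "2026-03-05 03:40:00.000000"},
]

duplicates = find_duplicate_ids(rows)
total = len(rows)
distinct = len({row["id"] for row in rows})

# A gap of 1 between total and distinct means one id has one extra row.
print(f"{total} rows, {distinct} distinct ids, duplicates: {duplicates}")
```

A gap of one between the row count and the distinct id count, as in the 434192 versus 434191 figures, would correspond to a single id that has one extra row.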
