W-I-D-EE commented on issue #8702: URL: https://github.com/apache/iceberg/issues/8702#issuecomment-1765308407
Further to this, I have had a lot of trouble getting DELETE FROM or MERGE INTO to remove duplicate rows. So far the only approach that has worked for me is to select the dataset, call Dataframe.dropDuplicates in Spark, and then use a dynamic overwrite to rewrite the partition. Does anyone know of a better way to do this? Everything I have tried with MERGE INTO or DELETE FROM results in all records being removed instead of just the duplicated rows.

--
This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org
For queries about this service, please contact Infrastructure at: us...@infra.apache.org
For additional commands, e-mail: issues-h...@iceberg.apache.org
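
One plausible reason the DELETE FROM and MERGE INTO attempts remove every copy is that exact duplicates are indistinguishable to a row predicate: any condition that matches one copy matches all of them. The sketch below models this in plain Python. It is not Spark or Iceberg code; the sample rows and the helper names `delete_where` and `dedupe_and_overwrite` are hypothetical and exist only to illustrate the difference between the two approaches described in the comment.

```python
from collections import Counter

# Hypothetical partition contents: (id, value) rows, with (2, "b") stored three times.
rows = [(1, "a"), (2, "b"), (2, "b"), (2, "b"), (3, "c")]


def delete_where(table, predicate):
    """Model of a predicate-based delete: every row matching the predicate goes."""
    return [row for row in table if not predicate(row)]


def dedupe_and_overwrite(table):
    """Model of read, drop duplicates, overwrite: keep the first copy of each row."""
    return list(dict.fromkeys(table))


# Rows that occur more than once in the partition.
duplicated = {row for row, n in Counter(rows).items() if n > 1}

# The predicate cannot tell the copies apart, so all three (2, "b") rows are deleted.
after_delete = delete_where(rows, lambda row: row in duplicated)

# Rewriting the partition from a de-duplicated read keeps exactly one copy.
after_overwrite = dedupe_and_overwrite(rows)

print(after_delete)     # [(1, 'a'), (3, 'c')]
print(after_overwrite)  # [(1, 'a'), (2, 'b'), (3, 'c')]
```

If this is what is happening, a delete-based approach would need some column that differs between the copies to single out the extras; when the copies are identical in every column, rewriting the partition from a de-duplicated read, as described above, avoids the problem.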