singhpk234 commented on code in PR #11273:
URL: https://github.com/apache/iceberg/pull/11273#discussion_r1794316342
##########
core/src/main/java/org/apache/iceberg/TableProperties.java:
##########
@@ -383,4 +383,8 @@ private TableProperties() {}
   public static final int ENCRYPTION_DEK_LENGTH_DEFAULT = 16;

   public static final int ENCRYPTION_AAD_LENGTH_DEFAULT = 16;
+
+  public static final String MAINTAIN_POSITION_DELETES_DURING_WRITE =

Review Comment:
   > write.delete.granularity to file

   [doubt] Are any other writers apart from Spark respecting this property? Are other writers also going to respect it going forward, and if yes, how? (A hedged sketch of what I'd expect cross-engine handling to look like is at the end of this review.)

##########
spark/v3.5/spark/src/main/java/org/apache/iceberg/spark/source/SparkBatchQueryScan.java:
##########
@@ -158,6 +163,26 @@ public void filter(Predicate[] predicates) {
     }
   }

+  protected Map<String, DeleteFileSet> dataToFileScopedDeletes() {

Review Comment:
   [doubt] Why do we need this whole hash map of all the files with deletes to be broadcast from the driver to the executors? The entries are derived from the scan tasks anyway, and each Spark executor should already have its scan tasks, so can we not build a local hash map within each executor and merge? Am I missing something here? A rough sketch of the alternative is below.

   [a bit orthogonal] Can we put an estimate on the size of this hash map? If it grows very large it can fail the query; the broadcast size limit is 8 GB, IIRC.
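   To make the executor-local alternative concrete, here is a minimal sketch, assuming each executor already holds its `FileScanTask`s and that `DeleteFileSet.create()` is the factory. The class and method names are mine for illustration, not the PR's API:

   ```java
   import java.util.HashMap;
   import java.util.List;
   import java.util.Map;
   import org.apache.iceberg.DeleteFile;
   import org.apache.iceberg.FileScanTask;
   import org.apache.iceberg.util.DeleteFileSet;

   class ExecutorLocalDeleteIndex {
     private ExecutorLocalDeleteIndex() {}

     // Derive the data-file -> delete-files mapping from the tasks this executor
     // already holds, instead of receiving a broadcast map built on the driver.
     static Map<String, DeleteFileSet> fromTasks(List<FileScanTask> tasks) {
       Map<String, DeleteFileSet> deletesByDataFile = new HashMap<>();
       for (FileScanTask task : tasks) {
         List<DeleteFile> deletes = task.deletes();
         if (deletes.isEmpty()) {
           continue; // only data files that actually have deletes need an entry
         }
         deletesByDataFile
             .computeIfAbsent(task.file().path().toString(), ignored -> DeleteFileSet.create())
             .addAll(deletes);
       }
       return deletesByDataFile;
     }
   }
   ```

   This would keep the memory cost proportional to the tasks assigned to each executor rather than to all files with deletes in the scan, which also sidesteps the broadcast size concern above.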
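   And going back to the first comment on `TableProperties`: what I'd expect "other writers respecting this property" to mean is that each engine reads the flag from table metadata through the usual `PropertyUtil` path, roughly like the sketch below. The key string and default here are placeholders, since the actual value of the new constant is not visible in this diff:

   ```java
   import org.apache.iceberg.Table;
   import org.apache.iceberg.util.PropertyUtil;

   class MaintainPositionDeletesCheck {
     // Placeholder key: the real string lives in the new TableProperties constant.
     private static final String MAINTAIN_POSITION_DELETES_KEY =
         "write.maintain-position-deletes.placeholder";

     // Any engine's writer could honor the flag the same way Spark would:
     // read it from the table's properties with a default.
     static boolean maintainPositionDeletes(Table table) {
       return PropertyUtil.propertyAsBoolean(
           table.properties(), MAINTAIN_POSITION_DELETES_KEY, false);
     }
   }
   ```

   If that is the intent, it would be good to call it out in the docs so non-Spark writers know they are expected to check this flag.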