mukund-thakur commented on code in PR #16190:
URL: https://github.com/apache/iceberg/pull/16190#discussion_r3183252963
##########
core/src/main/java/org/apache/iceberg/actions/BinPackRewriteFilePlanner.java:
##########
@@ -91,6 +91,19 @@ public class BinPackRewriteFilePlanner
*/
public static final String MAX_FILES_TO_REWRITE = "max-files-to-rewrite";
+ /**
+ * Controls whether to rewrite files written with a partition spec different from the configured
+ * output spec.
+ *
+ * <p>This can be used to migrate files created before partition spec evolution (for example, when
+ * the spec evolved from month to month plus day).
+ *
+ * <p>Defaults to false.
+ */
+ public static final String REWRITE_PARTITION_SPEC_MISMATCH = "rewrite-partition-spec-mismatch";
Review Comment:
Thanks for looking, @nastra!
I went through the code but couldn't find any existing flag that achieves this.
For the first rewrite we could use a filter on some column value (for
example, a timestamp) to select the old data, but that breaks down when there
are many files and the job fails halfway. On a rerun, the same filter picks up
the same files again, even if 50% of them were already rewritten
successfully. Currently, we can't filter data files based on the spec ID.
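
To illustrate the rerun problem, here is a minimal sketch in plain Java. It
does not use any Iceberg API: `FileInfo`, `selectByTimestamp`,
`selectBySpecMismatch`, and `rewriteFirst` are hypothetical names made up for
this example, and the model assumes that a rewritten file keeps the same
column values (so the same timestamps) and only changes its spec ID.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.stream.Collectors;

public class SpecMismatchSketch {

  // Hypothetical stand-in for a data file; only the fields the argument needs.
  public static final class FileInfo {
    public final String path;
    public final int specId;
    public final long maxEventTs;

    public FileInfo(String path, int specId, long maxEventTs) {
      this.path = path;
      this.specId = specId;
      this.maxEventTs = maxEventTs;
    }
  }

  // Column-value filter: selects files whose data is older than the cutoff.
  public static List<FileInfo> selectByTimestamp(List<FileInfo> files, long cutoffTs) {
    return files.stream().filter(f -> f.maxEventTs < cutoffTs).collect(Collectors.toList());
  }

  // Spec-based selection: selects files not yet written with the output spec.
  public static List<FileInfo> selectBySpecMismatch(List<FileInfo> files, int outputSpecId) {
    return files.stream().filter(f -> f.specId != outputSpecId).collect(Collectors.toList());
  }

  // Simulates a job that fails after rewriting only the first `count` files.
  // Rewritten files carry the same rows, so maxEventTs is unchanged.
  public static List<FileInfo> rewriteFirst(List<FileInfo> files, int count, int outputSpecId) {
    List<FileInfo> result = new ArrayList<>();
    for (int i = 0; i < files.size(); i++) {
      FileInfo f = files.get(i);
      result.add(i < count ? new FileInfo(f.path + ".rewritten", outputSpecId, f.maxEventTs) : f);
    }
    return result;
  }

  public static void main(String[] args) {
    List<FileInfo> table = List.of(
        new FileInfo("a", 0, 100L), new FileInfo("b", 0, 200L),
        new FileInfo("c", 0, 300L), new FileInfo("d", 0, 400L));

    // The job rewrites 2 of 4 files to spec 1, then fails.
    List<FileInfo> afterPartialRun = rewriteFirst(table, 2, 1);

    // On rerun, the timestamp filter still matches every file.
    System.out.println(selectByTimestamp(afterPartialRun, 500L).size()); // prints 4
    // Selecting on spec mismatch matches only the files still left to migrate.
    System.out.println(selectBySpecMismatch(afterPartialRun, 1).size()); // prints 2
  }
}
```

Under these assumptions, only the spec-based selection shrinks as the
migration progresses, which is the behavior the new option is meant to give.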