rdblue commented on code in PR #14533:
URL: https://github.com/apache/iceberg/pull/14533#discussion_r2520505191


##########
api/src/main/java/org/apache/iceberg/FileContent.java:
##########
@@ -18,11 +18,27 @@
  */
 package org.apache.iceberg;
 
-/** Content type stored in a file, one of DATA, POSITION_DELETES, or EQUALITY_DELETES. */
+/**
+ * Content type stored in a file.
+ *
+ * <p>For V1-V3 tables: DATA, POSITION_DELETES, or EQUALITY_DELETES.
+ *
+ * <p>For V4 tables: DATA, POSITION_DELETES, EQUALITY_DELETES, DATA_MANIFEST, DELETE_MANIFEST, or
+ * MANIFEST_DV.
+ */
 public enum FileContent {
   DATA(0),
   POSITION_DELETES(1),
-  EQUALITY_DELETES(2);
+  EQUALITY_DELETES(2),
+  /** Data manifest entry (V4+ only) - references data files in a root manifest. */
+  DATA_MANIFEST(3),
+  /** Delete manifest entry (V4+ only) - references delete files in a root manifest. */
+  DELETE_MANIFEST(4),
+  /**
+   * Manifest deletion vector entry (V4+ only) - marks entries in a manifest as deleted without
+   * rewriting the manifest.
+   */
+  MANIFEST_DV(5);

Review Comment:
   I prefer the option of having the DV located in a field of the data or 
delete manifest record. That way we don't have to wait to find the DV before 
processing a manifest file. Not sure what others think here, but since the DV 
metadata/content is likely going to be different between the Metadata DV 
(inline) and Data DV (stored in Puffin), I don't see much value in trying to 
reuse metadata fields for it.
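
   To make the inline option concrete, here is a minimal sketch of a
   manifest record that carries its DV as a field. Every name in it
   (InlineDvSketch, ManifestRecord, deletedPositions, livePositions, and
   the "m1.avro" path in the usage note) is hypothetical and not part of
   the Iceberg API or of this PR. The BitSet is only a stand-in for
   whatever inline DV encoding is eventually chosen.

```java
import java.util.ArrayList;
import java.util.BitSet;
import java.util.List;

// Hypothetical illustration only: none of these types exist in Iceberg.
final class InlineDvSketch {
  enum Content { DATA_MANIFEST, DELETE_MANIFEST }

  /** A root-manifest record for one leaf manifest, with its DV stored inline. */
  static final class ManifestRecord {
    final String path;
    final Content content;
    // Positions of deleted entries in the referenced manifest; empty if none.
    final BitSet deletedPositions;

    ManifestRecord(String path, Content content, BitSet deletedPositions) {
      this.path = path;
      this.content = content;
      this.deletedPositions = deletedPositions == null ? new BitSet() : deletedPositions;
    }
  }

  /**
   * Returns the positions of live entries among the first entryCount entries.
   * The DV travels with the record, so no separate DV entry has to be
   * located before the manifest can be processed.
   */
  static List<Integer> livePositions(ManifestRecord record, int entryCount) {
    List<Integer> live = new ArrayList<>();
    for (int pos = 0; pos < entryCount; pos++) {
      if (!record.deletedPositions.get(pos)) {
        live.add(pos);
      }
    }
    return live;
  }
}
```

   With a record for "m1.avro" whose DV marks position 1 as deleted,
   livePositions(record, 3) yields positions 0 and 2. By contrast, a
   separate MANIFEST_DV entry would have to be found and matched to its
   manifest before the same filtering could start.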



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

