ajantha-bhat commented on code in PR #10835:
URL: https://github.com/apache/iceberg/pull/10835#discussion_r1699922544


##########
format/spec.md:
##########
@@ -241,7 +241,9 @@ Struct evolution requires the following rules for default values:
 
 #### Column Projection
 
-Columns in Iceberg data files are selected by field id. The table schema's column names and order may change after a data file is written, and projection must be done using field ids. If a field id is missing from a data file, its value for each row should be `null`.
+Columns in Iceberg data files are selected by field id. The table schema's column names and order may change after a data file is written, and projection must be done using field ids.
+
+When a projected column has an [identity partition transform](#partition-transforms) applied to it for a data file, the value from the [manifest file](#manifests) must be used for that column (i.e. the column should not be read from the data file). This is to support tables that were migrated from other table formats (notably Hive) that do not write partition values to data files. Otherwise, if a field id is missing from a data file, its value for each row should be `null`.

Review Comment:
   Question: if a projected column has some other (non-identity) partition 
transform applied to it, shouldn't its value also be read from the manifest 
file? It is not clear to me why this rule is specific to the identity 
partition transform.
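
For reference, a minimal sketch of the resolution order that the proposed text describes. The function and parameter names are hypothetical and are not Iceberg APIs; the sketch assumes a row is modeled as a mapping from field id to value, and that the identity partition values for the data file have already been extracted from the manifest entry.

```python
from typing import Any, Dict, Optional


def project_value(
    field_id: int,
    data_file_row: Dict[int, Any],
    identity_partition_values: Dict[int, Any],
) -> Optional[Any]:
    """Resolve one projected column for one row (hypothetical helper).

    data_file_row: field id -> value as stored in the data file.
    identity_partition_values: source field id -> partition value taken
        from the manifest, only for identity-transformed partition fields.
    """
    # Rule 1: an identity-partitioned column takes its value from the
    # manifest, even if the data file also happens to store the column.
    if field_id in identity_partition_values:
        return identity_partition_values[field_id]
    # Rule 2: otherwise read the column from the data file by field id.
    if field_id in data_file_row:
        return data_file_row[field_id]
    # Rule 3: a field id missing from the data file projects as null.
    return None
```

With a Hive-style file that omits the partition column, `project_value(1, {2: "a"}, {1: "2024-01-01"})` takes the value for field 1 from the manifest, while a field id found in neither mapping resolves to `None`.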



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org