zhongyujiang commented on code in PR #8611:
URL: https://github.com/apache/iceberg/pull/8611#discussion_r1778399294


##########
format/spec.md:
##########
@@ -434,7 +434,8 @@ The schema of a manifest file is a struct called `manifest_entry` with the follo
 | _optional_ |            | ~~**`107  sort_columns`**~~       | `list<112: int>`             | **Deprecated. Do not write.** |
 | _optional_ | _optional_ | **`108  column_sizes`**           | `map<117: int, 118: long>`   | Map from column id to the total size on disk of all regions that store the column. Does not include bytes necessary to read other columns, like footers. Leave null for row-oriented formats (Avro) |
 | _optional_ | _optional_ | **`109  value_counts`**           | `map<119: int, 120: long>`   | Map from column id to number of values in the column (including null and NaN values) |
-| _optional_ | _optional_ | **`110  null_value_counts`**      | `map<121: int, 122: long>`   | Map from column id to number of null values in the column |
+| _optional_ | _optional_ | **`110  null_value_counts`**      | `map<121: int, 122: long>`   | Map from column id to number of null values in the column. If the 
+null value cannot be correctly determined for a column, the field can remain unpopulated. |

Review Comment:
   Perhaps it is worth adding a note somewhere to remind readers that the null value counts currently collected in the manifest for nested columns may be inaccurate?
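
   To illustrate the behavior the proposed wording allows, here is a minimal sketch. It is not Iceberg's API: `build_null_value_counts`, `may_contain_nulls`, and the `reliable_ids` parameter are hypothetical names introduced only for this example. It assumes a writer that leaves a column out of `null_value_counts` when it cannot count nulls reliably (for example, a nested column), and a reader that treats a missing entry as "unknown" rather than as zero.

```python
from typing import Dict, Iterable, Mapping, Optional


def build_null_value_counts(
    rows: Iterable[Mapping[int, object]],
    reliable_ids: Iterable[int],
) -> Dict[int, int]:
    """Writer side (hypothetical): count nulls only for trusted column ids.

    Columns whose null count cannot be determined correctly are simply
    absent from the returned map. A missing key in a row counts as null.
    """
    counts = {field_id: 0 for field_id in reliable_ids}
    for row in rows:
        for field_id in counts:
            if row.get(field_id) is None:
                counts[field_id] += 1
    return counts


def may_contain_nulls(
    null_value_counts: Optional[Mapping[int, int]],
    field_id: int,
) -> bool:
    """Reader side (hypothetical): a missing entry means 'unknown'.

    Unknown is handled conservatively: assume the column may hold nulls.
    """
    if null_value_counts is None or field_id not in null_value_counts:
        return True
    return null_value_counts[field_id] > 0


# Column 1 is a top-level column; column 2 stands in for a nested column
# whose null count the writer does not trust, so it is left out.
rows = [{1: "a", 2: None}, {1: None, 2: None}, {1: "b"}]
counts = build_null_value_counts(rows, reliable_ids=[1])
```

   In this sketch, a reader that defaulted a missing entry to zero could wrongly conclude that column 2 has no nulls. Treating absence as unknown avoids that.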



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

