Fokko commented on code in PR #8611:
URL: https://github.com/apache/iceberg/pull/8611#discussion_r1362621678


##########
format/spec.md:
##########
@@ -450,6 +451,48 @@ Notes:
 2. For `float` and `double`, the value `-0.0` must precede `+0.0`, as in the 
IEEE 754 `totalOrder` predicate. NaNs are not permitted as lower or upper 
bounds.
 3. If sort order ID is missing or unknown, then the order is assumed to be 
unsorted. Only data files and equality delete files should be written with a 
non-null order id. [Position deletes](#position-delete-files) are required to 
be sorted by file and position, not a table order, and should set sort order id 
to null. Readers must ignore sort order id for position delete files.
 4. The following field ids are reserved on `data_file`: 141.
+5. For nested structures, null counts are defined as follows:
+   ##### Struct
+   Counts are only for explicit nulls in a field. A nested field is not 
counted as null if its parent is null.
+   ```
+   schema {
+     1: nested_struct<2: int, 3: boolean>
+   }
+   ```
+   The following holds true:
+   ```
+   null               null_value_counts={1: 1, 2: 0, 3: 0}
+   struct<1, True>    null_value_counts={1: 0, 2: 0, 3: 0}
+   struct<1, null>    null_value_counts={1: 0, 2: 0, 3: 1}
+   ```
+   ##### List
+   For list types, the number of null elements in the list is counted. The 
elements are not counted if the parent is null.
+   ```
+   schema {
+     1: list[2: int]
+   }
+   ```
+   The following holds true:
+   ```
+   null               null_value_counts={1: 1, 2: 0}
+   [1, 2, 3]          null_value_counts={1: 0, 2: 0}
+   [1, null, 3]       null_value_counts={1: 0, 2: 1}
+   [null, null, 3]    null_value_counts={1: 0, 2: 2}
+   ```
+   ##### Map
+   For map types, the number of null values in the map is counted. The 
values are not counted if the parent is null. Map keys can't be null, so 
the null count for the key field is always zero.
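
As an illustration of the counting rules quoted above, here is a minimal Python sketch that reproduces the struct and list examples for a single value. The dict-based schema layout, the struct field names `a` and `b`, the map field ids, and the function names are all hypothetical and are not part of the spec or of any Iceberg library. Real writers would also sum these counts over all rows in a data file.

```python
def field_ids(field):
    """Yield the id of this field and of every field nested below it."""
    yield field["id"]
    kind = field["type"]
    if kind == "struct":
        for child in field["fields"].values():
            yield from field_ids(child)
    elif kind == "list":
        yield from field_ids(field["element"])
    elif kind == "map":
        yield from field_ids(field["key"])
        yield from field_ids(field["value"])


def count_nulls(value, field, counts):
    """Add the explicit nulls found in `value` to `counts` (field id -> count)."""
    if value is None:
        counts[field["id"]] += 1
        return  # children of a null parent are not counted
    kind = field["type"]
    if kind == "struct":
        for name, child in field["fields"].items():
            count_nulls(value[name], child, counts)
    elif kind == "list":
        for item in value:
            count_nulls(item, field["element"], counts)
    elif kind == "map":
        for key, val in value.items():
            count_nulls(key, field["key"], counts)  # keys are never null
            count_nulls(val, field["value"], counts)


def null_value_counts(value, field):
    """Return null counts for one value, with every field id initialised to 0."""
    counts = {fid: 0 for fid in field_ids(field)}
    count_nulls(value, field, counts)
    return counts


# Hypothetical schemas mirroring the examples in the diff.
STRUCT = {"id": 1, "type": "struct", "fields": {
    "a": {"id": 2, "type": "int"},
    "b": {"id": 3, "type": "boolean"},
}}
LIST = {"id": 1, "type": "list", "element": {"id": 2, "type": "int"}}
# The map field ids below are assumed; the diff gives no map schema.
MAP = {"id": 1, "type": "map",
       "key": {"id": 2, "type": "string"},
       "value": {"id": 3, "type": "int"}}

print(null_value_counts(None, STRUCT))
print(null_value_counts({"a": 1, "b": None}, STRUCT))
print(null_value_counts([None, None, 3], LIST))
print(null_value_counts({"x": None, "y": 1}, MAP))
```

The early return on a null parent is what keeps nested fields at zero when the enclosing struct, list, or map is itself null.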

Review Comment:
   Great suggestion, thanks!



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

