kevinjqliu commented on issue #1255:
URL: https://github.com/apache/iceberg-python/issues/1255#issuecomment-2442979795

   > So I expect that the value of the second record other than "string_field_1" to be null when I insert these records into the iceberg table using pyiceberg.
   ```
   import pyarrow as pa
   
   schema = pa.schema(
       [
           pa.field("string_field_1", pa.string(), True),
           pa.field("int_field_1", pa.int32(), True),
           pa.field("float_field_1", pa.float32(), True),
           pa.field(
               "struct_field_1",
               pa.struct(
                   [
                       pa.field("string_nested_1", pa.string()),
                       pa.field("int_item_2", pa.int32()),
                       pa.field("float_item_2", pa.float32()),
                   ]
               ),
           ),
           pa.field("list_field_1", pa.list_(pa.string())),
           pa.field("list_field_2", pa.list_(pa.int32())),
           pa.field("list_field_3", pa.list_(pa.float32())),
           pa.field("map_field_1", pa.map_(pa.string(), pa.string())),
           pa.field("map_field_2", pa.map_(pa.string(), pa.int32())),
           pa.field("map_field_3", pa.map_(pa.string(), pa.float32())),
       ]
   )
   records = [
       {
           "string_field_1": "field_1",
           "int_field_1": 123,
           "float_field_1": 1.23,
           "struct_field_1": {
               "string_nested_1": "nest_1",
               "int_item_2": 1234,
               "float_item_2": 1.234,
           },
           "list_field_1": ["a", "b", "c"],
           "list_field_2": [1, 2, 3],
           "list_field_3": [0.1, 0.2, 0.3],
           "map_field_1": {"a": "b", "b": "c"},
           "map_field_2": {"a": 1, "b": 2},
           "map_field_3": {"a": 0.1, "b": 0.2},
       },
       {
           "string_field_1": "field_1_b",
       },
   ]
   pyarrow_table: pa.Table = pa.Table.from_pylist(records, schema=schema)
   
   print(pyarrow_table["struct_field_1"].to_pandas())
   ```
   returns
   ```
   0    {'string_nested_1': 'nest_1', 'int_item_2': 12...
   1                                                 None
   Name: struct_field_1, dtype: object
   ```
   which confirms the value is None in pyarrow. 
   
   After appending to the table, is the value still None when you read it back with `table.scan().to_pandas()`?
   
   > I then checked the table using AWS Athena, but the "struct_field_1" of the second record is not null.
   
   It is not clear to me that Athena returns a non-null value in the example you provided.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

