Fokko commented on issue #584:
URL: https://github.com/apache/iceberg-python/issues/584#issuecomment-2042077008

   I generated a Parquet file with each of Spark and Python:
   
   
![image](https://github.com/apache/iceberg-python/assets/1134248/b382632a-e5ef-4c3d-82bd-6efbe2ced53f)
   
![image](https://github.com/apache/iceberg-python/assets/1134248/a79c1cc3-f4ff-41d4-99d8-837475cf4be8)
   
   Looking at the Python file:
   ```
   parq 00000-0-624f6b17-7d8d-435e-9fbe-217caeb25ca4.parquet --schema 
   
    # Schema 
    <pyarrow._parquet.ParquetSchema object at 0x108ba68c0>
   required group field_id=-1 schema {
     optional double field_id=1 ABC-GG-1-A;
   }
   ```
   
   And the Spark file:
   ```
   parq 00000-0-875786b8-dbcd-4f8c-80a7-d261205a0333-0-00001.parquet --schema 
   
    # Schema 
    <pyarrow._parquet.ParquetSchema object at 0x126623780>
   required group field_id=-1 table {
     required double field_id=1 ABC_x2DGG_x2D1_x2DA;
   }
   ```
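Comparing the two column names, the Spark-written one looks like the Python one with every `-` replaced by `_x2D` (0x2D is the code point of `-`). A minimal sketch of such an escaping rule follows. The function name `sanitize` and the exact rule are assumptions inferred from these two names only, not a confirmed description of what Spark or Iceberg does:

```python
def sanitize(name: str) -> str:
    """Escape characters other than letters, digits and '_' as _x<HEX>.

    Hypothetical helper: the rule is guessed from the two schemas above.
    """
    out = []
    for ch in name:
        if ch.isalnum() or ch == "_":
            out.append(ch)
        else:
            # Encode the character as its uppercase hexadecimal code point.
            out.append(f"_x{ord(ch):X}")
    return "".join(out)


python_name = "ABC-GG-1-A"
spark_name = sanitize(python_name)
```

Under this assumption, `sanitize("ABC-GG-1-A")` gives the name seen in the Spark file, while names that contain only letters, digits and underscores pass through unchanged.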
   
   I would argue that the Python one is correct, but I'm probably missing some
context. The names shouldn't matter in the end, so we should probably look the
columns up by field ID.
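To illustrate looking columns up by ID rather than by name, here is a rough sketch. It relies on PyArrow exposing the Parquet field ID in each field's metadata under the key `b"PARQUET:field_id"`. The helper `names_by_field_id` and the stand-in `FakeField` class are hypothetical, used only so the example runs without the actual files:

```python
from dataclasses import dataclass
from typing import Dict, Iterable, Optional

# Metadata key under which PyArrow surfaces the Parquet field ID.
FIELD_ID_KEY = b"PARQUET:field_id"


@dataclass
class FakeField:
    """Stand-in for pyarrow.Field (hypothetical, for illustration only)."""
    name: str
    metadata: Optional[Dict[bytes, bytes]]


def names_by_field_id(fields: Iterable) -> Dict[int, str]:
    """Map field ID -> column name for every field that carries an ID."""
    result = {}
    for field in fields:
        raw = (field.metadata or {}).get(FIELD_ID_KEY)
        if raw is not None:
            result[int(raw.decode())] = field.name
    return result


# The two top-level columns from the schemas above.
python_file = [FakeField("ABC-GG-1-A", {FIELD_ID_KEY: b"1"})]
spark_file = [FakeField("ABC_x2DGG_x2D1_x2DA", {FIELD_ID_KEY: b"1"})]

python_ids = names_by_field_id(python_file)
spark_ids = names_by_field_id(spark_file)
```

Both files resolve field ID 1 to their respective column, even though the names differ. With real files, the same helper should accept the schema returned by `pyarrow.parquet.read_schema(path)`, since iterating a PyArrow schema yields fields with `name` and `metadata` attributes.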


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

