Re: [I] [BUG] Valid column characters fail on to_arrow() or to_pandas() ArrowInvalid: No match for FieldRef.Name [iceberg-python]

via GitHub Mon, 08 Apr 2024 00:46:48 -0700


Fokko commented on issue #584:
URL: https://github.com/apache/iceberg-python/issues/584#issuecomment-2042077008


   I generated a Parquet file for the same table using both Spark and Python:
   
   
![image](https://github.com/apache/iceberg-python/assets/1134248/b382632a-e5ef-4c3d-82bd-6efbe2ced53f)
   
![image](https://github.com/apache/iceberg-python/assets/1134248/a79c1cc3-f4ff-41d4-99d8-837475cf4be8)
   
   Looking at the Python file:
   ```
   parq 00000-0-624f6b17-7d8d-435e-9fbe-217caeb25ca4.parquet --schema 
   
    # Schema 
    <pyarrow._parquet.ParquetSchema object at 0x108ba68c0>
   required group field_id=-1 schema {
     optional double field_id=1 ABC-GG-1-A;
   }
   ```
   
   And the Spark file:
   ```
   parq 00000-0-875786b8-dbcd-4f8c-80a7-d261205a0333-0-00001.parquet --schema 
   
    # Schema 
    <pyarrow._parquet.ParquetSchema object at 0x126623780>
   required group field_id=-1 table {
     required double field_id=1 ABC_x2DGG_x2D1_x2DA;
   }
   ```
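The Spark-written name looks like the original name with each `-` replaced by `_x2D` (0x2D is the code point of `-`). A minimal sketch of that escaping, inferred only from the single example above: the helper name `sanitize_name` is hypothetical, and the treatment of other cases (for example a leading digit) is not covered here.

```python
def sanitize_name(name: str) -> str:
    """Escape characters outside [A-Za-z0-9_] as _x<HEX code point>.

    Assumption: this rule is inferred from the one column name shown in
    the schema dumps above, not taken from the Spark or Iceberg sources.
    """
    out = []
    for ch in name:
        if ch == "_" or (ch.isascii() and ch.isalnum()):
            out.append(ch)  # character is kept as-is
        else:
            out.append(f"_x{ord(ch):X}")  # e.g. "-" becomes "_x2D"
    return "".join(out)


print(sanitize_name("ABC-GG-1-A"))
```

Applied to the Python-side name, this reproduces the name found in the Spark file, which would explain why a lookup by the unescaped name finds no match there.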
   
   I would argue that the Python one is correct, but I'm probably missing some 
context. The names shouldn't matter in the end, since both files carry 
field_id=1 for this column, so we should probably look the fields up by ID 
instead of by name.
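A rough sketch of what an ID-based lookup could look like. This is not the PyIceberg implementation: `FileField` is a hypothetical stand-in for a file schema field, and it assumes the field ID is stored in the field metadata under the `PARQUET:field_id` key, which is how PyArrow typically exposes it when reading a Parquet schema.

```python
from dataclasses import dataclass, field

# Assumption: metadata key under which the Parquet field ID is exposed.
FIELD_ID_KEY = b"PARQUET:field_id"


@dataclass
class FileField:
    """Hypothetical stand-in for a field read from a Parquet file schema."""
    name: str
    metadata: dict = field(default_factory=dict)


def names_by_field_id(fields):
    """Map field ID -> column name as it is physically stored in the file."""
    mapping = {}
    for f in fields:
        raw = (f.metadata or {}).get(FIELD_ID_KEY)
        if raw is not None:  # skip fields that carry no ID
            mapping[int(raw)] = f.name
    return mapping


# The two schemas from the dumps above, reduced to name and field ID.
python_file = [FileField("ABC-GG-1-A", {FIELD_ID_KEY: b"1"})]
spark_file = [FileField("ABC_x2DGG_x2D1_x2DA", {FIELD_ID_KEY: b"1"})]

# Field ID 1 resolves in both files, whatever the stored name is.
print(names_by_field_id(python_file)[1])
print(names_by_field_id(spark_file)[1])
```

With a mapping like this, a reader would select the column by the name stored in each file and then present it under the table schema's name, so the escaped and unescaped variants would both resolve.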


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

