syun64 commented on issue #226: URL: https://github.com/apache/iceberg-python/issues/226#issuecomment-1900604096
Thank you for raising this, @asheeshgarg. To me, the issue seems to be that some of the "large" types are not yet accounted for in PyIceberg's type conversions.

https://arrow.apache.org/docs/python/generated/pyarrow.types.is_string.html#pyarrow.types.is_string
`pa.types.is_string(pa.large_string()) == False`

https://arrow.apache.org/docs/python/generated/pyarrow.types.is_large_string.html#pyarrow.types.is_large_string
`pa.types.is_large_string(pa.large_string()) == True`

On that note, should we review the PyArrow schema to Iceberg schema type mappings in the repository and ensure that every type supported by the existing **parquet type -> Spark data type -> Iceberg data type** conversions is also supported by the **parquet type -> PyArrow data type -> Iceberg data type** conversions?

@Fokko - should we rename this issue to "large primitive pyarrow type support"?