jayceslesar opened a new issue, #1037: URL: https://github.com/apache/iceberg-python/issues/1037
### Question

How would I go about using a field with mixed datatypes? Is that recommended/possible? I am a fan of tall-tidy data and am wondering how to properly go about the following?

```py
from pydantic import BaseModel
from datetime import datetime
import pyarrow as pa
from pyiceberg.catalog.sql import SqlCatalog


class Message(BaseModel):
    system: str
    node: str
    message_name: str
    signal: str
    bus: str
    timestamp: datetime
    value: int | float | bool | str

    @staticmethod
    def to_pyarrow_schema():
        return pa.schema([
            pa.field('system', pa.string()),
            pa.field('node', pa.string()),
            pa.field('message_name', pa.string()),
            pa.field('signal', pa.string()),
            pa.field('bus', pa.string()),
            pa.field('timestamp', pa.timestamp('s', tz='UTC')),
            pa.field(pa.union([pa.field("value", pa.int32()), pa.field("value", pa.float64()), pa.field("value", pa.bool_()), pa.field("value", pa.string())], mode=pa.lib.UnionMode_SPARSE)),
        ])


catalog = SqlCatalog(
    "default",
    **{
        "uri": "my_uri/catalog",
    },
)

catalog.create_table(
    identifier="default.messages",
    schema=Message.to_pyarrow_schema(),
)
```

Right now it throws an error `TypeError: Expected primitive type, got: <class 'pyarrow.lib.SparseUnionType'>` which makes sense as what I am attempting isn't supported. Should I be using a string type and casting in my queries?
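
For reference, here is a minimal sketch of what the string-plus-cast idea could look like on the Python side. It assumes `value` is declared as `pa.string()` and that a second string column records the original type. The `value_type` column and the helpers `encode_value` and `decode_value` are hypothetical names made up for this illustration; they are not part of pyiceberg or pyarrow.

```python
from typing import Union

Value = Union[int, float, bool, str]

# Hypothetical type tags that would live in an extra `value_type` string column.
_DECODERS = {
    "int": int,
    "float": float,
    "bool": lambda raw: raw == "true",
    "str": str,
}


def encode_value(value: Value) -> tuple[str, str]:
    """Return (value_type, value_as_string) for one signal value."""
    # bool must be checked before int because bool is a subclass of int.
    if isinstance(value, bool):
        return "bool", "true" if value else "false"
    if isinstance(value, int):
        return "int", str(value)
    if isinstance(value, float):
        # repr() gives a string that float() parses back to the same finite value.
        return "float", repr(value)
    if isinstance(value, str):
        return "str", value
    raise TypeError(f"unsupported value type: {type(value).__name__}")


def decode_value(value_type: str, raw: str) -> Value:
    """Cast a stored string back to its original Python type."""
    return _DECODERS[value_type](raw)
```

With something like this, each row would store the pair returned by `encode_value`, and readers would call `decode_value` (or the equivalent cast in their query engine) after filtering on the type tag. Whether this is preferable to separate nullable columns per type is a design choice this sketch does not settle.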