wgtmac commented on issue #2: URL: https://github.com/apache/iceberg-cpp/issues/2#issuecomment-2496086676
Thanks @zeroshade for the detail! The table below is the type mapping between Iceberg and Arrow. I think we can provide a wrapper around Arrow data types so that only a subset of them is used.

On the read path, the mapping is clear except for String/LargeString and Binary/LargeBinary. We can use String/Binary by default unless explicitly configured otherwise. On the write path, we can simply error out on unsupported Arrow types.

I also want to add that the in-progress Iceberg `variant` and `geometry` types will not cause any issue: parquet-cpp will implement them anyway because they are part of the Parquet spec. Therefore I don't think there is a compelling reason not to use `arrow::DataType` directly.

| iceberg | arrow |
|---------|-------|
| unknown | Null |
| boolean | Boolean |
| int | Int32 |
| long | Int64 |
| float | Float32 |
| double | Float64 |
| decimal(P,S) | Decimal(P,S) |
| date | Date32 |
| time | Time64 |
| timestamp | Timestamp(MICRO) |
| timestamptz | Timestamp(MICRO,UTC) |
| timestamp_ns | Timestamp(NANO) |
| timestamptz_ns | Timestamp(NANO,UTC) |
| string | String/LargeString |
| uuid | UUID canonical extension type |
| fixed(L) | FixedSizeBinary(L) |
| binary | Binary/LargeBinary |
| struct | Struct |
| list | List/LargeList |
| map | Map |
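
To make the read-path default and the write-path error behavior described in this comment concrete, here is a minimal sketch. It depends only on the C++ standard library and represents Arrow types by the labels used in the mapping, not by real `arrow::DataType` objects. Every name in it (`IcebergType`, `ReadOptions`, `ToArrowTypeName`, `FromArrowTypeName`) is hypothetical and is not part of iceberg-cpp or Arrow. Parameterized and nested types (decimal, fixed, struct, list, map) are left out for brevity.

```cpp
#include <stdexcept>
#include <string>

// Hypothetical enum covering the non-parameterized Iceberg primitive types.
enum class IcebergType {
  kUnknown, kBoolean, kInt, kLong, kFloat, kDouble, kDate, kTime,
  kTimestamp, kTimestampTz, kTimestampNs, kTimestampTzNs,
  kString, kUuid, kBinary
};

// Hypothetical read-path option: prefer LargeString/LargeBinary when set.
struct ReadOptions {
  bool use_large_types = false;
};

// Read path: every Iceberg type has exactly one Arrow target, except
// string/binary, where the default is String/Binary unless configured.
std::string ToArrowTypeName(IcebergType type, ReadOptions options = ReadOptions{}) {
  switch (type) {
    case IcebergType::kUnknown:       return "Null";
    case IcebergType::kBoolean:       return "Boolean";
    case IcebergType::kInt:           return "Int32";
    case IcebergType::kLong:          return "Int64";
    case IcebergType::kFloat:         return "Float32";
    case IcebergType::kDouble:        return "Float64";
    case IcebergType::kDate:          return "Date32";
    case IcebergType::kTime:          return "Time64";
    case IcebergType::kTimestamp:     return "Timestamp(MICRO)";
    case IcebergType::kTimestampTz:   return "Timestamp(MICRO,UTC)";
    case IcebergType::kTimestampNs:   return "Timestamp(NANO)";
    case IcebergType::kTimestampTzNs: return "Timestamp(NANO,UTC)";
    case IcebergType::kString:
      return options.use_large_types ? "LargeString" : "String";
    case IcebergType::kUuid:          return "UUID";
    case IcebergType::kBinary:
      return options.use_large_types ? "LargeBinary" : "Binary";
  }
  throw std::invalid_argument("unhandled Iceberg type");
}

// Write path: accept both the regular and the large variants, and error
// out on any Arrow type that has no Iceberg counterpart.
// Only a few representative types are handled in this sketch.
IcebergType FromArrowTypeName(const std::string& name) {
  if (name == "Boolean") return IcebergType::kBoolean;
  if (name == "Int32") return IcebergType::kInt;
  if (name == "Int64") return IcebergType::kLong;
  if (name == "String" || name == "LargeString") return IcebergType::kString;
  if (name == "Binary" || name == "LargeBinary") return IcebergType::kBinary;
  throw std::invalid_argument("unsupported Arrow type: " + name);
}
```

With these assumptions, `ToArrowTypeName(IcebergType::kString)` yields `String`, the same call with `use_large_types` set yields `LargeString`, and `FromArrowTypeName("UInt8")` throws because unsigned integers have no Iceberg counterpart in the mapping.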