scravy opened a new issue, #44350: URL: https://github.com/apache/arrow/issues/44350
### Describe the bug, including details regarding any error messages, version, and platform.

When defining a column as a dictionary from int32 → binary16, saving it as Parquet, and reading it back, the schema read is not the same as the one written. See the example:

```
from tempfile import NamedTemporaryFile

import pyarrow as pa
import pyarrow.parquet as pq

if __name__ == "__main__":
    z = b"\0" * 16
    B16 = pa.binary(16)
    D32dict = pa.dictionary(pa.int32(), B16)
    tbl = pa.Table.from_arrays([[z, z, z, z]], names=["clmn"])
    tbl = tbl.set_column(0, "clmn", tbl["clmn"].cast(D32dict))
    assert (
        tbl.schema.field("clmn").type == D32dict
    ), f"{tbl.schema.field('clmn').type} ≠ {D32dict}"
    with NamedTemporaryFile() as fn:
        pq.write_table(tbl, fn)
        t = pq.read_table(fn)
        assert (
            t.schema.field("clmn").type == D32dict
        ), f"schema of table read is broken: {t.schema.field('clmn').type} ≠ {D32dict}"
```

### Component(s)

Parquet