Rich-T-kid commented on issue #8791: URL: https://github.com/apache/datafusion/issues/8791#issuecomment-4200102916
@alamb I'm a bit confused by what you meant by: "However, those are not reading from Parquet files (as I could not figure out how to make a parquet file have the same schema I wanted)"

In `dictionary.slt` I see that this statement exists:

```
statement ok
COPY (SELECT arrow_cast(column1, 'Dictionary(Int32, Utf8)') AS column1, column2 FROM test0)
TO 'test_files/scratch/dictionary/part_dict_test'
STORED AS PARQUET PARTITIONED BY (column1);
```

which writes the table out as a partitioned Parquet dataset.

I've also included my own test in a [draft PR](https://github.com/apache/datafusion/pull/21444) where I create a dictionary-encoded column, write it out to a Parquet file, read it back in, and check the schema (confirming it remains `Dictionary(...)`), along with a basic filter.

Is this what you meant by "reading from a Parquet file"? Or were you referring to a more specific Parquet scenario (e.g., a particular schema/layout/encoding that you weren't able to reproduce) for the coverage you're looking for?
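For context, the roundtrip check described above follows roughly this shape (a minimal `.slt`-style sketch; the table name, scratch path, and input values are placeholders I made up, not the exact test from the draft PR — `arrow_cast`/`arrow_typeof` and the `COPY ... STORED AS PARQUET` / `CREATE EXTERNAL TABLE` statements are real DataFusion features):

```
# Write a dictionary-encoded column out to Parquet
statement ok
COPY (SELECT arrow_cast(column1, 'Dictionary(Int32, Utf8)') AS column1
      FROM (VALUES ('a'), ('b'), ('a')) AS t(column1))
TO 'test_files/scratch/dictionary/roundtrip.parquet'
STORED AS PARQUET;

# Read the file back in
statement ok
CREATE EXTERNAL TABLE roundtrip
STORED AS PARQUET
LOCATION 'test_files/scratch/dictionary/roundtrip.parquet';

# Schema check: the column should still be dictionary-encoded
query T
SELECT arrow_typeof(column1) FROM roundtrip LIMIT 1;
----
Dictionary(Int32, Utf8)

# Basic filter on the dictionary column
statement ok
SELECT column1 FROM roundtrip WHERE column1 = 'a';
```

The `arrow_typeof` assertion is the interesting part here: it only passes if the dictionary encoding survives the Parquet write/read roundtrip rather than being unwrapped to plain `Utf8`.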
