zhuqi-lucas commented on issue #21290: URL: https://github.com/apache/datafusion/issues/21290#issuecomment-4169936540
Actually, let me correct my previous comment. After further analysis:

In DF 51, `SchemaAdapterFactory` was built into `ParquetOpener` itself, meaning **all** users of `ParquetSource` got batch-level schema adaptation automatically, regardless of whether they went through `ListingTable` or created `ParquetSource` directly via a custom `TableProvider`/`ExecutionPlanFactory`.

In DF 52, `SchemaAdapterFactory` was removed from `ParquetOpener` and replaced with `PhysicalExprAdapterFactory`. The new adapter handles coercion at the expression level (e.g. inserting CAST in projections). However, this may not be fully wired up for all code paths, specifically when `ParquetSource` is created directly (not through `ListingTable`).

Our custom `ParquetExecFactory` creates `ParquetSource::new(table_schema)` directly. This worked in DF 51 because the opener's built-in `SchemaAdapter` handled everything. In DF 52, the same code path fails because:

1. The opener no longer has `SchemaAdapter::map_batch()` for batch-level coercion
2. `PhysicalExprAdapterFactory` may not be correctly initialized for the direct `ParquetSource` path

So this does appear to be a regression for anyone who uses `ParquetSource` directly rather than through `ListingTable`. The `SchemaAdapterFactory` was an implicit contract of the `ParquetOpener` that downstream code relied on.

Could you confirm whether `PhysicalExprAdapterFactory` is intended to cover all the same paths that `SchemaAdapterFactory` did? Or is there additional setup needed when creating `ParquetSource` directly?
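
To make the distinction between the two adaptation styles concrete, here is a minimal, self-contained toy model in Rust. It does not use DataFusion at all: `DataType`, `Column`, `Expr`, `map_batch`, `adapt_expr`, and `check_output` are hypothetical names invented for this sketch, and it only assumes the behaviour described above (batch-level casting in the opener versus a CAST inserted into the expression). It shows why a code path that gets neither kind of adaptation ends up with a type mismatch.

```rust
// Toy model only: none of these types or functions are DataFusion APIs.

#[derive(Debug, Clone, Copy, PartialEq)]
enum DataType {
    Int32,
    Int64,
}

#[derive(Debug, Clone, PartialEq)]
enum Column {
    Int32(Vec<i32>),
    Int64(Vec<i64>),
}

impl Column {
    fn data_type(&self) -> DataType {
        match self {
            Column::Int32(_) => DataType::Int32,
            Column::Int64(_) => DataType::Int64,
        }
    }

    /// Widening and identity casts succeed; anything else is rejected.
    fn cast(&self, to: DataType) -> Result<Column, String> {
        match (self, to) {
            (Column::Int32(v), DataType::Int64) => {
                Ok(Column::Int64(v.iter().map(|&x| i64::from(x)).collect()))
            }
            (c, t) if c.data_type() == t => Ok(c.clone()),
            (c, t) => Err(format!("unsupported cast {:?} -> {:?}", c.data_type(), t)),
        }
    }
}

/// A one-column expression language: read the column, optionally cast it.
#[derive(Debug, Clone, PartialEq)]
enum Expr {
    Col,
    Cast(Box<Expr>, DataType),
}

impl Expr {
    fn eval(&self, input: &Column) -> Result<Column, String> {
        match self {
            Expr::Col => Ok(input.clone()),
            Expr::Cast(inner, to) => inner.eval(input)?.cast(*to),
        }
    }
}

/// Batch-level adaptation: cast the data itself to the table type.
fn map_batch(file_col: &Column, table_type: DataType) -> Result<Column, String> {
    file_col.cast(table_type)
}

/// Expression-level adaptation: leave the data alone, wrap the expression in a cast.
fn adapt_expr(expr: Expr, file_type: DataType, table_type: DataType) -> Expr {
    if file_type == table_type {
        expr
    } else {
        Expr::Cast(Box::new(expr), table_type)
    }
}

/// Stand-in for a downstream operator that requires the table schema's type.
fn check_output(col: &Column, table_type: DataType) -> Result<(), String> {
    if col.data_type() == table_type {
        Ok(())
    } else {
        Err(format!(
            "schema mismatch: expected {:?}, got {:?}",
            table_type,
            col.data_type()
        ))
    }
}

fn main() {
    let file_col = Column::Int32(vec![1, 2, 3]); // type stored in the file
    let table_type = DataType::Int64; // type declared by the table schema

    // 1. Batch-level adaptation: the batch is cast before any expression runs.
    let adapted = map_batch(&file_col, table_type).unwrap();
    assert!(check_output(&Expr::Col.eval(&adapted).unwrap(), table_type).is_ok());

    // 2. Expression-level adaptation: the projection is rewritten to include a cast.
    let expr = adapt_expr(Expr::Col, file_col.data_type(), table_type);
    assert!(check_output(&expr.eval(&file_col).unwrap(), table_type).is_ok());

    // 3. Neither is wired up: the file type leaks through and the check fails.
    let raw = Expr::Col.eval(&file_col).unwrap();
    assert!(check_output(&raw, table_type).is_err());
}
```

In this model, cases 1 and 2 are interchangeable only if one of them is always applied. Case 3 corresponds to the suspected situation for a directly constructed source: batch-level adaptation is gone and the expression rewrite never happens.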
