zhuqi-lucas commented on issue #21290:
URL: https://github.com/apache/datafusion/issues/21290#issuecomment-4169936540

   Actually, let me correct my previous comment. After further analysis:
   
   In DF 51, `SchemaAdapterFactory` was built into `ParquetOpener` itself — 
meaning **all** users of `ParquetSource` got batch-level schema adaptation 
automatically, regardless of whether they went through `ListingTable` or 
created `ParquetSource` directly via a custom 
`TableProvider`/`ExecutionPlanFactory`.
   
   In DF 52, `SchemaAdapterFactory` was removed from `ParquetOpener` and 
replaced with `PhysicalExprAdapterFactory`. The new adapter handles coercion at 
the expression level (e.g. inserting CAST in projections). However, this may 
not be fully wired up for all code paths — specifically when `ParquetSource` is 
created directly (not through `ListingTable`).
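
   To make the distinction concrete, here is a toy Rust sketch of batch-level 
versus expression-level adaptation. It does not use DataFusion at all: 
`Column`, `Expr`, `map_batch_to_int64`, `adapt_expr`, and `evaluate` are 
hypothetical names invented for illustration, and the real adapters are far 
more general than this single Int32-to-Int64 case.

```rust
// Toy model only: these types are hypothetical and are NOT DataFusion's API.

/// A single column of values, as stored in a file or expected by a table.
#[derive(Debug, Clone, PartialEq)]
enum Column {
    Int32(Vec<i32>),
    Int64(Vec<i64>),
}

/// Batch-level adaptation: rewrite the data itself to the table's type
/// before any expression sees it.
fn map_batch_to_int64(col: Column) -> Column {
    match col {
        Column::Int32(v) => Column::Int64(v.into_iter().map(i64::from).collect()),
        other => other,
    }
}

/// A tiny expression tree over one input column.
#[derive(Debug, Clone, PartialEq)]
enum Expr {
    /// Reference to the single input column.
    Col,
    /// Cast the child expression's output to Int64.
    CastToInt64(Box<Expr>),
}

/// Expression-level adaptation: leave the data alone and rewrite the
/// expression so that it casts when the file type differs from the table type.
fn adapt_expr(expr: Expr, file_is_int32: bool) -> Expr {
    match expr {
        Expr::Col if file_is_int32 => Expr::CastToInt64(Box::new(Expr::Col)),
        other => other,
    }
}

/// Evaluate an expression against the raw (unadapted) file column.
fn evaluate(expr: &Expr, input: &Column) -> Column {
    match expr {
        Expr::Col => input.clone(),
        Expr::CastToInt64(child) => map_batch_to_int64(evaluate(child, input)),
    }
}
```

   In this model both approaches produce the same `Int64` data for an `Int32` 
file column; the only difference is where the cast happens. If the expression 
rewrite is skipped, `evaluate(&Expr::Col, ...)` returns the raw file type, 
which is the kind of mismatch I suspect we are hitting.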
   
   Our custom `ParquetExecFactory` creates `ParquetSource::new(table_schema)` 
directly. This worked in DF 51 because the opener's built-in `SchemaAdapter` 
coerced every batch to the table schema. In DF 52, the same code path fails 
because:
   1. The opener no longer calls `SchemaAdapter::map_batch()` for batch-level 
coercion.
   2. `PhysicalExprAdapterFactory` may not be correctly initialized for the 
direct `ParquetSource` path.
   
   So this does appear to be a regression for anyone who uses `ParquetSource` 
directly rather than through `ListingTable`. The batch-level adaptation that 
`SchemaAdapterFactory` provided was an implicit contract of `ParquetOpener` 
that downstream code relied on.
   
   Could you confirm whether `PhysicalExprAdapterFactory` is intended to cover 
all the same paths that `SchemaAdapterFactory` did? Or is there additional 
setup needed when creating `ParquetSource` directly?


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

