adriangb commented on PR #21996:
URL: https://github.com/apache/datafusion/pull/21996#issuecomment-4372503809

   Thanks @asolimando !
   
   Thinking about it a bit more, the axis that makes the split hard in 
DataFusion is that DataFusion is often federated to, or federates to, other 
systems. E.g. if the TableProvider is a Postgres table, it's completely 
reasonable that there is overlap in functionality (Postgres also has a planner, 
stats system, etc.). But DataFusion also talks to dumb Parquet files...
   
   > rather than pushing expression semantics into the provider, which might 
have their own very different notion (classic impedance mismatch in DB systems)
   
   I think the key / nice thing would be to allow the implementer to decide 
what it can and can't provide (which I've kind of attempted in this PR). The 
next step would be to let it provide statistics for some parts of expressions 
and not others (i.e. split up the expression tree).
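   
   A rough sketch of the "split up the expression tree" idea, purely for 
illustration. Everything here is hypothetical and not the API in this PR: the 
toy `Expr` enum stands in for DataFusion's real expression types, and `Range`, 
`ExprStatsProvider`, `ColumnOnlyProvider` and `estimate_range` are made-up 
names. The provider answers only for the sub-expressions it understands, and 
the engine derives the rest from the children where it can.

```rust
use std::collections::HashMap;

/// Toy expression tree; a stand-in for the real expression types.
#[derive(Debug, Clone, PartialEq)]
enum Expr {
    Column(String),
    Literal(i64),
    Add(Box<Expr>, Box<Expr>),
    /// An opaque function that a provider may or may not understand.
    Func(String, Vec<Expr>),
}

/// Hypothetical min/max summary of an expression's output.
#[derive(Debug, Clone, Copy, PartialEq)]
struct Range {
    min: i64,
    max: i64,
}

/// Hypothetical provider hook: return `None` for anything it cannot answer.
trait ExprStatsProvider {
    fn expr_range(&self, expr: &Expr) -> Option<Range>;
}

/// A "dumb file" style provider that only knows per-column min/max.
struct ColumnOnlyProvider {
    columns: HashMap<String, Range>,
}

impl ExprStatsProvider for ColumnOnlyProvider {
    fn expr_range(&self, expr: &Expr) -> Option<Range> {
        match expr {
            Expr::Column(name) => self.columns.get(name).copied(),
            // Anything beyond a bare column is declined.
            _ => None,
        }
    }
}

/// Ask the provider first; if it declines, split the tree and combine
/// whatever can be derived from the children.
fn estimate_range(provider: &dyn ExprStatsProvider, expr: &Expr) -> Option<Range> {
    if let Some(range) = provider.expr_range(expr) {
        return Some(range);
    }
    match expr {
        Expr::Literal(v) => Some(Range { min: *v, max: *v }),
        Expr::Add(l, r) => {
            let l = estimate_range(provider, l)?;
            let r = estimate_range(provider, r)?;
            // Give up rather than overflow.
            Some(Range {
                min: l.min.checked_add(r.min)?,
                max: l.max.checked_add(r.max)?,
            })
        }
        // Unknown column, or a function nobody can describe: no statistics.
        Expr::Column(_) | Expr::Func(_, _) => None,
    }
}
```

   With a provider that only knows column `a` is in [1, 10], `a + 5` would 
still get a range because the engine handles the `+` and the literal itself, 
while an opaque function over `a` would come back as `None`. A smarter provider 
(e.g. one backed by Postgres) could answer for the whole expression at the top 
of `estimate_range` and the recursion would never be needed.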


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

