adriangb commented on PR #22680: URL: https://github.com/apache/datafusion/pull/22680#issuecomment-4597926142
> Note that IMDB_FILE_TYPE=csv will OOM on most systems because csv doesn't infer statistics and thus won't get scan predicates and dynamic filters pushed into DataSourceExec. This results in queries such as 16a joining large tables/intermediates before the selective filters have reduced the data size enough to avoid an OOM (tested on a 96GB system). Setting PARTITION=1 does not solve the issue.

I assume this was already the case? Thanks for investigating the root cause.
