kosiew commented on code in PR #20047:
URL: https://github.com/apache/datafusion/pull/20047#discussion_r3158582525
##########
datafusion/execution/src/cache/cache_manager.rs:
##########
@@ -410,8 +454,10 @@ pub const DEFAULT_METADATA_CACHE_LIMIT: usize = 50 * 1024
* 1024; // 50M
pub struct CacheManagerConfig {
/// Enable caching of file statistics when listing files.
/// Enabling the cache avoids repeatedly reading file statistics in a
DataFusion session.
- /// Default is disabled. Currently only Parquet files are supported.
- pub table_files_statistics_cache: Option<Arc<dyn FileStatisticsCache>>,
+ /// Default is enabled with 1MiB. Currently only Parquet files are
supported.
Review Comment:
Small doc nit here. The comment is a bit out of sync with the current
behavior.
First, "Default is enabled" is misleading. The field itself defaults to
None, and the cache only gets created in CacheManager::try_new when
file_statistics_cache_limit > 0.
Second, the "1MiB" value is stale. The default limit was bumped to 20 MiB,
and the nearby docs already reflect that.
Could you update this comment to match the current behavior and default? It
will save future readers a bit of head scratching 🙂
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]