kosiew commented on code in PR #21993:
URL: https://github.com/apache/datafusion/pull/21993#discussion_r3180936389


##########
datafusion/physical-plan/src/operator_statistics/mod.rs:
##########
@@ -160,26 +159,22 @@ impl ExtendedStatistics {
 
     /// Get a reference to a custom statistics extension by type.
     pub fn get_extension<T: 'static + Send + Sync>(&self) -> Option<&T> {
-        self.extensions
-            .get(&TypeId::of::<T>())
-            .and_then(|ext| ext.downcast_ref())
+        self.extensions.get::<T>()
     }
 
     /// Set a custom statistics extension.
-    pub fn set_extension<T: 'static + Send + Sync>(&mut self, value: T) {
-        self.extensions.insert(TypeId::of::<T>(), Arc::new(value));
+    pub fn set_extension<T: 'static + Send + Sync>(&mut self, value: Arc<T>) {

Review Comment:
   `ExtendedStatistics::set_extension` now asks callers to allocate and pass an 
`Arc<T>`, while the existing public API accepted `T` and wrapped it internally. 
That makes this a source-breaking API change, even though the refactor is 
mainly about sharing the backing map.
   
   The documented example nearby still shows 
`stats.set_extension(HistogramStats { ... })`, so existing users following that 
pattern will no longer compile. Could we keep the existing method shape, for 
example `set_extension<T>(value: T) { self.extensions.insert(Arc::new(value)); 
}`, and add a separate `set_extension_arc` only if callers need to avoid the 
allocation?
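
To make the suggestion concrete, here is a minimal sketch of the two-method shape. It assumes a plain `HashMap<TypeId, Arc<dyn Any + Send + Sync>>` as the backing store, which is not the shared map type this PR introduces; `ExtendedStatsSketch` and `DemoStats` are hypothetical names used only for illustration.

```rust
use std::any::{Any, TypeId};
use std::collections::HashMap;
use std::sync::Arc;

/// Hypothetical stand-in for the statistics container; NOT the DataFusion type.
#[derive(Default)]
pub struct ExtendedStatsSketch {
    // Assumed backing store: one entry per concrete extension type.
    extensions: HashMap<TypeId, Arc<dyn Any + Send + Sync>>,
}

impl ExtendedStatsSketch {
    /// Existing method shape: take `T` by value and wrap it internally,
    /// so callers written against the old API keep compiling.
    pub fn set_extension<T: 'static + Send + Sync>(&mut self, value: T) {
        self.set_extension_arc(Arc::new(value));
    }

    /// Optional addition for callers that already hold an `Arc<T>`
    /// and want to avoid a second allocation.
    pub fn set_extension_arc<T: 'static + Send + Sync>(&mut self, value: Arc<T>) {
        // `Arc<T>` coerces to `Arc<dyn Any + Send + Sync>` here.
        self.extensions.insert(TypeId::of::<T>(), value);
    }

    /// Look up an extension by its concrete type.
    pub fn get_extension<T: 'static + Send + Sync>(&self) -> Option<&T> {
        self.extensions
            .get(&TypeId::of::<T>())
            .and_then(|ext| ext.downcast_ref::<T>())
    }
}

/// Hypothetical extension payload used only in this sketch.
#[derive(Debug, PartialEq)]
pub struct DemoStats {
    pub bucket_count: usize,
}
```

With this shape, `stats.set_extension(DemoStats { bucket_count: 4 })` keeps working unchanged, and a caller that already shares an `Arc<DemoStats>` can pass a clone of it to `set_extension_arc` instead.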



##########
datafusion/core/tests/parquet/custom_reader.rs:
##########
@@ -71,7 +71,7 @@ async fn route_data_access_ops_to_parquet_file_reader_factory() {
         .into_iter()
         .map(|meta| {
             PartitionedFile::new_from_meta(meta)

Review Comment:
   It would be great to add one integration regression test that attaches both 
a `ParquetAccessPlan` and a custom reader payload to the same 
`PartitionedFile`. The test could then assert that the custom reader still sees 
its payload and that the parquet access plan is honored.
   
   The current tests cover the generic map and the two consumers separately, 
but not quite the end-to-end invariant this PR is trying to protect.
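
As a rough illustration of the invariant, the sketch below attaches two differently typed payloads to one type-keyed map and checks that each simulated consumer still finds its own. Everything here is a hypothetical stand-in (`FileExtensionsSketch`, `AccessPlanSketch`, `ReaderPayloadSketch`, and both consumer functions); a real regression test would go through `PartitionedFile`, `ParquetAccessPlan`, and the custom reader factory instead.

```rust
use std::any::{Any, TypeId};
use std::collections::HashMap;
use std::sync::Arc;

/// Hypothetical stand-in for a per-file, type-keyed extension map.
#[derive(Default, Clone)]
pub struct FileExtensionsSketch {
    map: HashMap<TypeId, Arc<dyn Any + Send + Sync>>,
}

impl FileExtensionsSketch {
    /// Builder-style insert keyed by the payload's concrete type.
    pub fn with<T: 'static + Send + Sync>(mut self, value: T) -> Self {
        self.map.insert(TypeId::of::<T>(), Arc::new(value));
        self
    }

    /// Typed lookup; returns `None` if no payload of type `T` was attached.
    pub fn get<T: 'static + Send + Sync>(&self) -> Option<&T> {
        self.map
            .get(&TypeId::of::<T>())
            .and_then(|v| v.downcast_ref::<T>())
    }
}

/// Hypothetical payload standing in for a parquet access plan.
#[derive(Debug, PartialEq)]
pub struct AccessPlanSketch {
    pub row_groups_to_scan: Vec<usize>,
}

/// Hypothetical payload standing in for custom reader data.
#[derive(Debug, PartialEq)]
pub struct ReaderPayloadSketch(pub String);

/// Simulated consumer 1: reads only the access plan.
pub fn planned_row_groups(ext: &FileExtensionsSketch) -> Option<Vec<usize>> {
    ext.get::<AccessPlanSketch>()
        .map(|plan| plan.row_groups_to_scan.clone())
}

/// Simulated consumer 2: reads only the custom reader payload.
pub fn reader_payload(ext: &FileExtensionsSketch) -> Option<&str> {
    ext.get::<ReaderPayloadSketch>().map(|p| p.0.as_str())
}
```

The point of the end-to-end test is the same as in this toy version: once both payloads are attached to one file, neither consumer's lookup should be shadowed or dropped by the other.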



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

