kosiew commented on code in PR #22755:
URL: https://github.com/apache/datafusion/pull/22755#discussion_r3360981579


##########
datafusion/functions-aggregate/src/sum.rs:
##########
@@ -533,25 +533,50 @@ impl SlidingDistinctSumAccumulator {
             data_type: data_type.clone(),
         })
     }
+
+    fn update_value(&mut self, value: i64) {
+        let cnt = self.counts.entry(value).or_insert(0);
+        if *cnt == 0 {
+            // first occurrence in window
+            self.sum = self.sum.wrapping_add(value);
+        }
+        *cnt += 1;
+    }
+
+    fn retract_value(&mut self, value: i64) {
+        if let Some(cnt) = self.counts.get_mut(&value) {
+            *cnt -= 1;
+            if *cnt == 0 {
+                // last copy leaving window
+                self.sum = self.sum.wrapping_sub(value);
+                self.counts.remove(&value);
+            }
+        }
+    }
 }
 
 impl Accumulator for SlidingDistinctSumAccumulator {
     fn update_batch(&mut self, values: &[ArrayRef]) -> Result<()> {
         let arr = values[0].as_primitive::<Int64Type>();
-        for &v in arr.values() {
-            let cnt = self.counts.entry(v).or_insert(0);
-            if *cnt == 0 {
-                // first occurrence in window
-                self.sum = self.sum.wrapping_add(v);
+        if arr.null_count() == 0 {

Review Comment:
   Nice cleanup overall. One small thought: the NULL-aware iteration logic 
looks very similar between `update_batch` and `retract_batch`. Would it make 
sense to extract a small helper that keeps the current no-null fast path and 
accepts the value operation, something like `apply_valid_values(values, 
Self::update_value)` and `apply_valid_values(values, Self::retract_value)`?
   
   That would keep the invariant that only valid slots affect `counts` and 
`sum` in one place and help avoid the two paths drifting apart over time.
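
A rough sketch of what I have in mind, written against a simplified stand-in so it compiles on its own. `Int64Column`, `SlidingDistinctSum`, and the exact signature of `apply_valid_values` are hypothetical names for illustration, not the PR's code or Arrow's API. In the real accumulator the helper would take the Arrow array and branch on `arr.null_count() == 0` as the PR already does.

```rust
use std::collections::HashMap;

/// Hypothetical stand-in for an Arrow Int64 array: raw values plus an
/// optional validity mask (`None` means the column has no nulls).
struct Int64Column {
    values: Vec<i64>,
    validity: Option<Vec<bool>>,
}

/// Simplified stand-in for the accumulator state touched by the PR.
#[derive(Default)]
struct SlidingDistinctSum {
    counts: HashMap<i64, usize>,
    sum: i64,
}

impl SlidingDistinctSum {
    fn update_value(&mut self, value: i64) {
        let cnt = self.counts.entry(value).or_insert(0);
        if *cnt == 0 {
            // first occurrence in window
            self.sum = self.sum.wrapping_add(value);
        }
        *cnt += 1;
    }

    fn retract_value(&mut self, value: i64) {
        if let Some(cnt) = self.counts.get_mut(&value) {
            *cnt -= 1;
            if *cnt == 0 {
                // last copy leaving window
                self.sum = self.sum.wrapping_sub(value);
                self.counts.remove(&value);
            }
        }
    }

    /// Single place that decides which slots may affect `counts` and `sum`.
    /// Keeps the no-null fast path and skips null slots otherwise.
    fn apply_valid_values(&mut self, col: &Int64Column, op: fn(&mut Self, i64)) {
        match &col.validity {
            // Fast path: no validity mask, so every slot is valid.
            None => {
                for &v in &col.values {
                    op(self, v);
                }
            }
            // Slow path: apply `op` only where the mask marks the slot valid.
            Some(mask) => {
                for (&v, &valid) in col.values.iter().zip(mask) {
                    if valid {
                        op(self, v);
                    }
                }
            }
        }
    }

    fn update_batch(&mut self, col: &Int64Column) {
        self.apply_valid_values(col, Self::update_value);
    }

    fn retract_batch(&mut self, col: &Int64Column) {
        self.apply_valid_values(col, Self::retract_value);
    }
}
```

With this shape, `update_batch` and `retract_batch` each reduce to one line, and any future change to the null handling only needs to be made in the helper.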



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

