Copilot commented on code in PR #17186:
URL: https://github.com/apache/pinot/pull/17186#discussion_r2517082227


##########
pinot-segment-local/src/main/java/org/apache/pinot/segment/local/segment/creator/impl/stats/NoDictColumnStatisticsCollector.java:
##########
@@ -52,6 +54,9 @@ public class NoDictColumnStatisticsCollector extends AbstractColumnStatisticsCol
   private boolean _sealed = false;
   // HLL Plus generally returns approximate cardinality >= actual cardinality which is desired
   private final HyperLogLogPlus _hllPlus;
+  // Track exact uniques up to a threshold to avoid small-N underestimation and test flakiness
+  private static final int EXACT_UNIQUE_TRACKING_THRESHOLD = 2048;
+  private Set<Object> _exactUniques = new HashSet<>();

Review Comment:
   The `_exactUniques` field is reassigned to `null` once the threshold is 
exceeded, but it is neither marked `volatile` nor guarded by synchronization. 
If this class is used concurrently, other threads may not see the 
reassignment. Note that `volatile` only makes the reference update visible; 
the underlying `HashSet` is still not thread-safe. Consider documenting the 
thread-safety assumptions, or use `volatile` if concurrent reads are expected.
   ```suggestion
     private volatile Set<Object> _exactUniques = new HashSet<>();
   ```
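
For context, here is a minimal sketch of the pattern under discussion, assuming the collector could be called from more than one thread. The class name `ExactUniqueTracker` and both method names are hypothetical and are not taken from the PR; the HLL fallback is left out so the block stays self-contained. It guards the set with `synchronized` instead of `volatile`, because the `HashSet` itself is mutated on every call.

```java
import java.util.HashSet;
import java.util.Set;

/** Hypothetical illustration only; this is not the actual Pinot collector. */
class ExactUniqueTracker {
  private static final int EXACT_UNIQUE_TRACKING_THRESHOLD = 2048;

  // Guarded by "this". Becomes null once the threshold has been exceeded.
  private Set<Object> _exactUniques = new HashSet<>();

  /** Records a value while exact tracking is still active. */
  synchronized void collect(Object value) {
    if (_exactUniques == null) {
      return; // exact tracking was already abandoned
    }
    _exactUniques.add(value);
    if (_exactUniques.size() > EXACT_UNIQUE_TRACKING_THRESHOLD) {
      // Release the set; a caller would rely on the approximate estimate from here on.
      _exactUniques = null;
    }
  }

  /** Returns the exact distinct count, or -1 if exact tracking was abandoned. */
  synchronized int exactCardinalityOrMinusOne() {
    return _exactUniques == null ? -1 : _exactUniques.size();
  }
}
```

If the collector is only ever driven by a single thread, neither `volatile` nor `synchronized` is needed, and a comment stating that assumption would likely be enough.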



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

