tarun11Mavani commented on code in PR #16344:
URL: https://github.com/apache/pinot/pull/16344#discussion_r2306200098


##########
pinot-segment-local/src/main/java/org/apache/pinot/segment/local/segment/readers/CompactedPinotSegmentRecordReader.java:
##########
@@ -40,60 +42,159 @@ public class CompactedPinotSegmentRecordReader implements RecordReader {
   private final String _deleteRecordColumn;
   // Reusable generic row to store the next row to return
   private final GenericRow _nextRow = new GenericRow();
-  // Valid doc ids iterator
+
+  // Iterator approach for valid document IDs
   private PeekableIntIterator _validDocIdsIterator;
+
+  // Index-based approach for sorted valid document IDs
+  private int[] _sortedValidDocIds;
+  private int _currentDocIndex = 0;
+
   // Flag to mark whether we need to fetch another row
   private boolean _nextRowReturned = true;
 
   public CompactedPinotSegmentRecordReader(RoaringBitmap validDocIds) {
     this(validDocIds, null);
   }
 
-  public CompactedPinotSegmentRecordReader(RoaringBitmap validDocIds,
-      @Nullable String deleteRecordColumn) {
+  public CompactedPinotSegmentRecordReader(RoaringBitmap validDocIds, @Nullable String deleteRecordColumn) {
     _pinotSegmentRecordReader = new PinotSegmentRecordReader();
     _validDocIdsBitmap = validDocIds;
     _validDocIdsIterator = validDocIds.getIntIterator();
     _deleteRecordColumn = deleteRecordColumn;
   }
 
+  public CompactedPinotSegmentRecordReader(ThreadSafeMutableRoaringBitmap validDocIds) {
+    this(validDocIds, null);
+  }
+
+  public CompactedPinotSegmentRecordReader(ThreadSafeMutableRoaringBitmap validDocIds,

Review Comment:
   Your understanding is correct. This only removes the records that were
invalidated within the same segment during ingestion. It will not compact
already committed segments. To compact those, we will still need the
UpsertCompaction or UpsertMergeCompaction minion tasks.
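
   To make the scope concrete, here is a rough, self-contained sketch of the
idea, not the code in this PR. It assumes a plain `java.util.BitSet` as a
stand-in for the valid doc IDs bitmap, and `CompactionSketch` and
`readValidRows` are hypothetical names used only for illustration. Only rows
whose doc ID is still valid in this one segment are returned; rows in other,
already committed segments are never touched.

```java
import java.util.ArrayList;
import java.util.BitSet;
import java.util.List;

// Hypothetical illustration only; this is not the reader from the PR.
class CompactionSketch {
  // Returns the rows of a single segment whose doc ID is still marked valid.
  // Assumption: BitSet stands in for the RoaringBitmap of valid doc IDs.
  static List<String> readValidRows(String[] rows, BitSet validDocIds) {
    List<String> result = new ArrayList<>();
    // Walk the set bits in ascending doc ID order, skipping invalidated docs.
    for (int docId = validDocIds.nextSetBit(0);
        docId >= 0 && docId < rows.length;
        docId = validDocIds.nextSetBit(docId + 1)) {
      result.add(rows[docId]);
    }
    return result;
  }

  public static void main(String[] args) {
    // Three versions of key "a" and one of key "b", all ingested into the
    // same consuming segment.
    String[] rows = {"a:v1", "b:v1", "a:v2", "a:v3"};
    BitSet valid = new BitSet(rows.length);
    valid.set(1); // latest "b"
    valid.set(3); // latest "a"; doc IDs 0 and 2 were invalidated by later upserts
    System.out.println(readValidRows(rows, valid));
  }
}
```

   In this sketch the stale versions at doc IDs 0 and 2 are dropped because
they were invalidated inside the same segment. A stale row sitting in a
previously committed segment would be out of reach here, which is why the
minion tasks are still needed for those.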



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

