Jackie-Jiang commented on code in PR #12976:
URL: https://github.com/apache/pinot/pull/12976#discussion_r1605907484


##########
pinot-segment-local/src/main/java/org/apache/pinot/segment/local/upsert/BasePartitionUpsertMetadataManager.java:
##########
@@ -1058,11 +1149,31 @@ protected void removeDocId(IndexSegment segment, int docId) {
    * Use the segmentContexts to collect the contexts for selected segments. Reuse the segmentContext object if
    * present, to avoid overwriting the contexts specified at the others places.
    */
-  public void setSegmentContexts(List<SegmentContext> segmentContexts) {
+  public void setSegmentContexts(List<SegmentContext> segmentContexts, Map<String, String> queryOptions) {
+    if (_consistencyMode == UpsertConfig.ConsistencyMode.NONE) {
+      setSegmentContexts(segmentContexts);
+      return;
+    }
+    if (_consistencyMode == UpsertConfig.ConsistencyMode.SYNC) {
+      _upsertViewLock.readLock().lock();
+      try {
+        setSegmentContexts(segmentContexts);
+        return;
+      } finally {
+        _upsertViewLock.readLock().unlock();
+      }
+    }
+    // If batch refresh is enabled, the copy of bitmaps is kept updated and ready to use for a consistent view.
+    // The locking between query threads and upsert threads can be avoided when using batch refresh.
+    // Besides, queries can share the copy of bitmaps, w/o cloning the bitmaps by every single query.
+    // If query has specified a need for certain freshness, check the view and refresh it as needed.
+    // When refreshing the copy of map, we need to take the WLock so only one thread is refreshing view.
+    long upsertViewFreshnessMs = QueryOptionsUtils.getUpsertViewFreshnessMs(queryOptions);
+    doBatchRefreshUpsertView(upsertViewFreshnessMs);
     for (SegmentContext segmentContext : segmentContexts) {
       IndexSegment segment = segmentContext.getIndexSegment();
-      if (_trackedSegments.contains(segment)) {
-        segmentContext.setQueryableDocIdsSnapshot(getQueryableDocIdsSnapshotFromSegment(segment));
+      if (_segmentQueryableDocIdsMap.containsKey(segment)) {
+        segmentContext.setQueryableDocIdsSnapshot(_segmentQueryableDocIdsMap.get(segment));

Review Comment:
   We can call `get()` once and `null`-check the result to save one map lookup.
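
A minimal sketch of the single-lookup pattern suggested here. `SingleLookupSketch`, its `Context` class, and the `String`/`int[]` types are hypothetical stand-ins for the PR's `SegmentContext`, `IndexSegment`, and bitmap types, and the sketch assumes the map never stores `null` values:

```java
import java.util.List;
import java.util.Map;

final class SingleLookupSketch {
  // Hypothetical stand-in for SegmentContext: a segment name plus its snapshot.
  static final class Context {
    final String segment;
    int[] snapshot;

    Context(String segment) {
      this.segment = segment;
    }
  }

  // Sets a snapshot on every context whose segment is present in the map.
  static void setContexts(Map<String, int[]> snapshots, List<Context> contexts) {
    for (Context context : contexts) {
      // One lookup instead of containsKey() followed by get().
      int[] snapshot = snapshots.get(context.segment);
      if (snapshot != null) {
        context.snapshot = snapshot;
      }
    }
  }
}
```

Because the value is held in a local variable, the check and the use see the same object even if another thread replaces or removes the map entry in between.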



##########
pinot-segment-local/src/main/java/org/apache/pinot/segment/local/upsert/BasePartitionUpsertMetadataManager.java:
##########
@@ -1058,11 +1149,31 @@ protected void removeDocId(IndexSegment segment, int docId) {
    * Use the segmentContexts to collect the contexts for selected segments. Reuse the segmentContext object if
    * present, to avoid overwriting the contexts specified at the others places.
    */
-  public void setSegmentContexts(List<SegmentContext> segmentContexts) {
+  public void setSegmentContexts(List<SegmentContext> segmentContexts, Map<String, String> queryOptions) {
+    if (_consistencyMode == UpsertConfig.ConsistencyMode.NONE) {
+      setSegmentContexts(segmentContexts);
+      return;
+    }
+    if (_consistencyMode == UpsertConfig.ConsistencyMode.SYNC) {
+      _upsertViewLock.readLock().lock();
+      try {
+        setSegmentContexts(segmentContexts);
+        return;
+      } finally {
+        _upsertViewLock.readLock().unlock();
+      }
+    }
+    // If batch refresh is enabled, the copy of bitmaps is kept updated and ready to use for a consistent view.
+    // The locking between query threads and upsert threads can be avoided when using batch refresh.
+    // Besides, queries can share the copy of bitmaps, w/o cloning the bitmaps by every single query.
+    // If query has specified a need for certain freshness, check the view and refresh it as needed.
+    // When refreshing the copy of map, we need to take the WLock so only one thread is refreshing view.
+    long upsertViewFreshnessMs = QueryOptionsUtils.getUpsertViewFreshnessMs(queryOptions);
+    doBatchRefreshUpsertView(upsertViewFreshnessMs);

Review Comment:
   I think we always need to do this check, even if the query doesn't set the query option. There is no guarantee that there were updates within the past `upsertViewFreshnessMs`, and we might miss updates.



##########
pinot-segment-local/src/main/java/org/apache/pinot/segment/local/upsert/BasePartitionUpsertMetadataManager.java:
##########
@@ -1019,34 +1038,106 @@ public synchronized void close()
     _logger.info("Closed the metadata manager");
   }
 
-  protected void replaceDocId(ThreadSafeMutableRoaringBitmap validDocIds,
+  /**
+   * Use WLock to make updates on two segments' bitmaps atomically.
+   */
+  protected void replaceDocId(IndexSegment newSegment, ThreadSafeMutableRoaringBitmap validDocIds,
       @Nullable ThreadSafeMutableRoaringBitmap queryableDocIds, IndexSegment oldSegment, int oldDocId, int newDocId,
       RecordInfo recordInfo) {
-    removeDocId(oldSegment, oldDocId);
-    addDocId(validDocIds, queryableDocIds, newDocId, recordInfo);
+    // For SNAPSHOT consistency mode, we can use RLock here. But for simplicity and considering there is only one

Review Comment:
   Suggest keeping it consistent. I find it easier to understand if we always use the read lock (RL) for updates and the write lock (WL) for sync.
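
A rough sketch of that convention, reading "sync" as the step that takes a consistent view across segments. `UpsertViewLockSketch` and its counters are hypothetical; the two `AtomicInteger` fields stand in for two segments' thread-safe bitmaps:

```java
import java.util.concurrent.atomic.AtomicInteger;
import java.util.concurrent.locks.ReentrantReadWriteLock;

final class UpsertViewLockSketch {
  private final ReentrantReadWriteLock _lock = new ReentrantReadWriteLock();
  // Hypothetical stand-ins for the valid-doc counts of two segments.
  private final AtomicInteger _oldSegmentDocs = new AtomicInteger(1);
  private final AtomicInteger _newSegmentDocs = new AtomicInteger(0);

  // Update path: shared read lock, so concurrent updaters do not block each other.
  void replaceDoc() {
    _lock.readLock().lock();
    try {
      _oldSegmentDocs.decrementAndGet();
      _newSegmentDocs.incrementAndGet();
    } finally {
      _lock.readLock().unlock();
    }
  }

  // Sync path: exclusive write lock, so no update is half-applied while the view is taken.
  int snapshotTotal() {
    _lock.writeLock().lock();
    try {
      return _oldSegmentDocs.get() + _newSegmentDocs.get();
    } finally {
      _lock.writeLock().unlock();
    }
  }
}
```

The lock names are inverted relative to the usual reader/writer intuition: updaters share the read lock because the underlying structures are individually thread-safe, while the view-taking step needs exclusivity to see both segments in a consistent state.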



##########
pinot-segment-local/src/main/java/org/apache/pinot/segment/local/upsert/BasePartitionUpsertMetadataManager.java:
##########
@@ -1058,11 +1149,31 @@ protected void removeDocId(IndexSegment segment, int docId) {
    * Use the segmentContexts to collect the contexts for selected segments. Reuse the segmentContext object if
    * present, to avoid overwriting the contexts specified at the others places.
    */
-  public void setSegmentContexts(List<SegmentContext> segmentContexts) {
+  public void setSegmentContexts(List<SegmentContext> segmentContexts, Map<String, String> queryOptions) {
+    if (_consistencyMode == UpsertConfig.ConsistencyMode.NONE) {
+      setSegmentContexts(segmentContexts);
+      return;
+    }
+    if (_consistencyMode == UpsertConfig.ConsistencyMode.SYNC) {
+      _upsertViewLock.readLock().lock();
+      try {
+        setSegmentContexts(segmentContexts);
+        return;
+      } finally {
+        _upsertViewLock.readLock().unlock();
+      }
+    }
+    // If batch refresh is enabled, the copy of bitmaps is kept updated and ready to use for a consistent view.
+    // The locking between query threads and upsert threads can be avoided when using batch refresh.
+    // Besides, queries can share the copy of bitmaps, w/o cloning the bitmaps by every single query.
+    // If query has specified a need for certain freshness, check the view and refresh it as needed.
+    // When refreshing the copy of map, we need to take the WLock so only one thread is refreshing view.
+    long upsertViewFreshnessMs = QueryOptionsUtils.getUpsertViewFreshnessMs(queryOptions);
+    doBatchRefreshUpsertView(upsertViewFreshnessMs);
     for (SegmentContext segmentContext : segmentContexts) {
       IndexSegment segment = segmentContext.getIndexSegment();
-      if (_trackedSegments.contains(segment)) {
-        segmentContext.setQueryableDocIdsSnapshot(getQueryableDocIdsSnapshotFromSegment(segment));
+      if (_segmentQueryableDocIdsMap.containsKey(segment)) {

Review Comment:
   We need to read the value into a local variable; otherwise the map entry might be updated between the `containsKey()` check and the later `get()`.



##########
pinot-segment-local/src/main/java/org/apache/pinot/segment/local/upsert/BasePartitionUpsertMetadataManager.java:
##########
@@ -1081,6 +1192,52 @@ private static MutableRoaringBitmap getQueryableDocIdsSnapshotFromSegment(IndexS
     return queryableDocIdsSnapshot;
   }
 
+  private void setSegmentContexts(List<SegmentContext> segmentContexts) {
+    for (SegmentContext segmentContext : segmentContexts) {
+      IndexSegment segment = segmentContext.getIndexSegment();
+      if (_trackedSegments.contains(segment)) {
+        segmentContext.setQueryableDocIdsSnapshot(getQueryableDocIdsSnapshotFromSegment(segment));
+      }
+    }
+  }
+
+  private boolean skipUpsertViewRefresh(long upsertViewFreshnessMs) {
+    long nowMs = System.currentTimeMillis();

Review Comment:
   (minor) Check `upsertViewFreshnessMs` before getting the current time. We could also check for `== 0` and return `false` directly.
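
One way the reordered check could look, as a sketch only. The method body is not visible in the hunk, so everything here is assumed: `FreshnessCheckSketch`, the injected clock, the "negative value means no freshness requirement" rule, and the comparison against the last refresh time are all hypothetical:

```java
import java.util.function.LongSupplier;

final class FreshnessCheckSketch {
  private final long _lastRefreshTimeMs;

  FreshnessCheckSketch(long lastRefreshTimeMs) {
    _lastRefreshTimeMs = lastRefreshTimeMs;
  }

  // Returns true when the existing view is fresh enough to reuse.
  boolean skipRefresh(long upsertViewFreshnessMs, LongSupplier clock) {
    // Assumption: a negative value means the caller has no freshness requirement.
    if (upsertViewFreshnessMs < 0) {
      return true;
    }
    // Zero means "always refresh", so the clock is never read.
    if (upsertViewFreshnessMs == 0) {
      return false;
    }
    // Only now pay for the clock read.
    return _lastRefreshTimeMs + upsertViewFreshnessMs > clock.getAsLong();
  }
}
```

Passing the clock as a `LongSupplier` (in production, `System::currentTimeMillis`) is only a testing convenience; it makes it easy to verify that the early-return paths never touch the clock.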



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@pinot.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

