swaminathanmanish commented on code in PR #14451:
URL: https://github.com/apache/pinot/pull/14451#discussion_r1860973442


##########
pinot-controller/src/main/java/org/apache/pinot/controller/util/ServerSegmentMetadataReader.java:
##########
@@ -397,6 +398,35 @@ public ValidDocIdsBitmapResponse 
getValidDocIdsBitmapFromServer(String tableName
     return response;
   }
 
+  public Map<String, List<NeedRefreshResponse>> 
getSegmentsForRefreshFromServer(
+      String tableNameWithType, Set<String> serverInstances, BiMap<String, 
String> endpoints, int timeoutMs) {
+    LOGGER.debug("Getting list of segments for refresh from servers for table 
{}.", tableNameWithType);
+    List<String> serverURLs = new ArrayList<>();
+    for (String serverInstance : serverInstances) {
+      serverURLs.add(generateNeedRefreshSegmentsServerURL(tableNameWithType, 
endpoints.get(serverInstance)));
+    }
+    BiMap<String, String> endpointsToServers = endpoints.inverse();
+    CompletionServiceHelper completionServiceHelper =
+        new CompletionServiceHelper(_executor, _connectionManager, 
endpointsToServers);
+    CompletionServiceHelper.CompletionServiceResponse serviceResponse =
+        completionServiceHelper.doMultiGetRequest(serverURLs, 
tableNameWithType, true, timeoutMs);
+    Map<String, List<NeedRefreshResponse>> serverResponses = new HashMap<>();
+
+    for (Map.Entry<String, String> streamResponse : 
serviceResponse._httpResponses.entrySet()) {
+      try {
+        // TODO: RV - get the instance name instead of the endpoint

Review Comment:
   What does this TODO mean?



##########
pinot-segment-local/src/main/java/org/apache/pinot/segment/local/data/manager/TableDataManager.java:
##########
@@ -323,4 +323,12 @@ default void onConsumingToDropped(String segmentNameStr) {
    */
   default void onConsumingToOnline(String segmentNameStr) {
   }
+
+  /**
+   * Return list of segment names that need to be refreshed along with reason.
+   * @param tableConfig Table Config of the table
+   * @param schema Schema of the table
+   * @return List of {@link NeedRefreshResponse} with segment names and reason 
for refresh
+   */
+  List<NeedRefreshResponse> getSegmentsForRefresh(TableConfig tableConfig, 
Schema schema);

Review Comment:
   
   Would a name like getSegmentsOutOfSync be better. We use both reload and 
refresh to bring the segment in sync with table config, so we can separate the 
action from current state. 
   



##########
pinot-core/src/main/java/org/apache/pinot/core/data/manager/BaseTableDataManager.java:
##########
@@ -1046,6 +1064,261 @@ public boolean needReloadSegments()
     return needReload;
   }
 
+  @Override
+  public List<NeedRefreshResponse> getSegmentsForRefresh(TableConfig 
tableConfig, Schema schema) {
+    List<NeedRefreshResponse> segmentsRequiringRefresh = new ArrayList<>();
+    List<SegmentDataManager> segmentDataManagers = acquireAllSegments();

Review Comment:
   @vrajat and I chatted . I think we can start with current code 
(acquireAllSegments) . These checks are local and in-memory and should be 
quick. We can record the start/end times for this operation. If needed we can 
optimize this further. 
   



##########
pinot-controller/src/main/java/org/apache/pinot/controller/api/resources/PinotSegmentRestletResource.java:
##########
@@ -891,6 +892,31 @@ public String getTableReloadMetadata(
     }
   }
 
+  @GET
+  @Path("segments/{tableNameWithType}/needRefresh")
+  @Authorize(targetType = TargetType.TABLE, paramName = "tableNameWithType", 
action = Actions.Table.GET_METADATA)
+  @Produces(MediaType.APPLICATION_JSON)
+  @ApiOperation(value = "Gets a list of segments that need to be refreshed 
from servers hosting the table", notes =
+      "Gets a list of segments that need to be refreshed from servers hosting 
the table")
+  public Map<String, List<NeedRefreshResponse>> getTableRefreshMetadata(

Review Comment:
   Discussed offline. 
   1. For a successful entry we will have non-empty/empty list indicating 
segments that need to be refreshed
   2. For unsuccessful entry, we will have null response. 



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@pinot.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscr...@pinot.apache.org
For additional commands, e-mail: commits-h...@pinot.apache.org

Reply via email to