swaminathanmanish commented on code in PR #14451: URL: https://github.com/apache/pinot/pull/14451#discussion_r1860973442
########## pinot-controller/src/main/java/org/apache/pinot/controller/util/ServerSegmentMetadataReader.java: ########## @@ -397,6 +398,35 @@ public ValidDocIdsBitmapResponse getValidDocIdsBitmapFromServer(String tableName return response; } + public Map<String, List<NeedRefreshResponse>> getSegmentsForRefreshFromServer( + String tableNameWithType, Set<String> serverInstances, BiMap<String, String> endpoints, int timeoutMs) { + LOGGER.debug("Getting list of segments for refresh from servers for table {}.", tableNameWithType); + List<String> serverURLs = new ArrayList<>(); + for (String serverInstance : serverInstances) { + serverURLs.add(generateNeedRefreshSegmentsServerURL(tableNameWithType, endpoints.get(serverInstance))); + } + BiMap<String, String> endpointsToServers = endpoints.inverse(); + CompletionServiceHelper completionServiceHelper = + new CompletionServiceHelper(_executor, _connectionManager, endpointsToServers); + CompletionServiceHelper.CompletionServiceResponse serviceResponse = + completionServiceHelper.doMultiGetRequest(serverURLs, tableNameWithType, true, timeoutMs); + Map<String, List<NeedRefreshResponse>> serverResponses = new HashMap<>(); + + for (Map.Entry<String, String> streamResponse : serviceResponse._httpResponses.entrySet()) { + try { + // TODO: RV - get the instance name instead of the endpoint Review Comment: What does this TODO mean? ########## pinot-segment-local/src/main/java/org/apache/pinot/segment/local/data/manager/TableDataManager.java: ########## @@ -323,4 +323,12 @@ default void onConsumingToDropped(String segmentNameStr) { */ default void onConsumingToOnline(String segmentNameStr) { } + + /** + * Return list of segment names that need to be refreshed along with reason. + * @param tableConfig Table Config of the table + * @param schema Schema of the table + * @return List of {@link NeedRefreshResponse} with segment names and reason for refresh + */ + List<NeedRefreshResponse> getSegmentsForRefresh(TableConfig tableConfig, Schema schema); Review Comment: Would a name like getSegmentsOutOfSync be better. We use both reload and refresh to bring the segment in sync with table config, so we can separate the action from current state. ########## pinot-core/src/main/java/org/apache/pinot/core/data/manager/BaseTableDataManager.java: ########## @@ -1046,6 +1064,261 @@ public boolean needReloadSegments() return needReload; } + @Override + public List<NeedRefreshResponse> getSegmentsForRefresh(TableConfig tableConfig, Schema schema) { + List<NeedRefreshResponse> segmentsRequiringRefresh = new ArrayList<>(); + List<SegmentDataManager> segmentDataManagers = acquireAllSegments(); Review Comment: @vrajat and I chatted . I think we can start with current code (acquireAllSegments) . These checks are local and in-memory and should be quick. We can record the start/end times for this operation. If needed we can optimize this further. ########## pinot-controller/src/main/java/org/apache/pinot/controller/api/resources/PinotSegmentRestletResource.java: ########## @@ -891,6 +892,31 @@ public String getTableReloadMetadata( } } + @GET + @Path("segments/{tableNameWithType}/needRefresh") + @Authorize(targetType = TargetType.TABLE, paramName = "tableNameWithType", action = Actions.Table.GET_METADATA) + @Produces(MediaType.APPLICATION_JSON) + @ApiOperation(value = "Gets a list of segments that need to be refreshed from servers hosting the table", notes = + "Gets a list of segments that need to be refreshed from servers hosting the table") + public Map<String, List<NeedRefreshResponse>> getTableRefreshMetadata( Review Comment: Discussed offline. 1. For a successful entry we will have non-empty/empty list indicating segments that need to be refreshed 2. For unsuccessful entry, we will have null response. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: commits-unsubscr...@pinot.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org --------------------------------------------------------------------- To unsubscribe, e-mail: commits-unsubscr...@pinot.apache.org For additional commands, e-mail: commits-h...@pinot.apache.org