suddendust edited a comment on issue #7338: URL: https://github.com/apache/pinot/issues/7338#issuecomment-993418445
@Jackie-Jiang Reload already piggybacks on the refresh semaphore to control the number of parallel `SegmentReloadMessage` it is processing: ``` HelixTaskResult helixTaskResult = new HelixTaskResult(); _logger.info("Handling message: {}", _message); try { if (_segmentName.equals("")) { acquireSema("ALL", _logger); // NOTE: the method aborts if any segment reload encounters an unhandled exception, // and can lead to inconsistent state across segments _instanceDataManager.reloadAllSegments(_tableNameWithType, _forceDownload, _reloadParallelism); } else { // Reload one segment acquireSema(_segmentName, _logger); _instanceDataManager.reloadSegment(_tableNameWithType, _segmentName, _forceDownload); } ``` So my first concern is that perhaps it should get its own semaphore to not mixup refresh and reload? We can control using `max.parallel.reload.threads` just like we are doing `max.parallel.refresh.threads` for refresh. Second concern is that we're trying to parallelise the actual reloading of segments in `reloadAllSegments` which currently is sequential. As you proposed, the parallelism factor can be a part of the `SegmentReloadMessage` but how will the user configure it? Are we planning to add a new configuration? -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: commits-unsubscr...@pinot.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org --------------------------------------------------------------------- To unsubscribe, e-mail: commits-unsubscr...@pinot.apache.org For additional commands, e-mail: commits-h...@pinot.apache.org