klsince commented on a change in pull request #7301: URL: https://github.com/apache/pinot/pull/7301#discussion_r689228030
##########
File path: pinot-segment-local/src/main/java/org/apache/pinot/segment/local/segment/store/SingleFileIndexDirectory.java
##########
@@ -327,51 +336,176 @@ private void persistIndexMap(IndexEntry entry)
       throws IOException {
     File mapFile = new File(_segmentDirectory, V1Constants.INDEX_MAP_FILE_NAME);
     try (PrintWriter writer = new PrintWriter(new BufferedWriter(new FileWriter(mapFile, true)))) {
-      String startKey = getKey(entry.key.name, entry.key.type.getIndexName(), true);
-
-      StringBuilder sb = new StringBuilder();
-      sb.append(startKey).append(" = ").append(entry.startOffset);
-      writer.println(sb.toString());
-
-      String endKey = getKey(entry.key.name, entry.key.type.getIndexName(), false);
-      sb = new StringBuilder();
-      sb.append(endKey).append(" = ").append(entry.size);
-      writer.println(sb.toString());
+      persistIndexMap(entry, writer);
     }
   }
 
-  private String getKey(String column, String indexName, boolean isStartOffset) {
-    return column + MAP_KEY_SEPARATOR + indexName + MAP_KEY_SEPARATOR + (isStartOffset ? "startOffset" : "size");
-  }
-
   private String allocationContext(IndexKey key) {
     return this.getClass().getSimpleName() + key.toString();
   }
 
+  /**
+   * This method sweeps the indices marked for removal. Exception is simply bubbled up w/o
+   * trying to recover disk states from failure. This method is expected to run during segment
+   * reloading, which has failure handling by creating a backup folder before conduct reloading.
+   */
+  private void cleanupRemovedIndices()
+      throws IOException {
+    if (!_shouldCleanupRemovedIndices) {
+      return;
+    }
+
+    // To keep track of indices to be retained and put them together
+    // compactly in the new index file.
+    long nextOffset = 0;
+    List<IndexEntry> retained = new ArrayList<>();
+    File tmpIdxFile = new File(_segmentDirectory, V1Constants.INDEX_FILE_NAME + ".tmp");
+
+    // With FileChannel, we can seek to the data flexibly.
+    try (FileChannel srcCh = new RandomAccessFile(_indexFile, "r").getChannel();

Review comment:
Yeah, the changes here do pretty much what you've mentioned. The two copy methods below (L368, L371) iterate through the indices still in [_columnEntries](https://github.com/apache/pinot/blob/master/pinot-segment-local/src/main/java/org/apache/pinot/segment/local/segment/store/SingleFileIndexDirectory.java#L82), which keeps track of the indices that should be `retained`. The only difference is that they store forward indices first, followed by all kinds of inverted indices, in the file, following the current convention of the data layout.
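
To make the compaction described in the comment concrete, here is a minimal, self-contained sketch. It is not the code in the PR: the `Entry` class, the `compact` method, and the use of `FileChannel.transferTo` are assumptions made for illustration, and the real `IndexEntry` and copy methods in `SingleFileIndexDirectory` may look quite different. It only shows the general idea of copying the retained byte ranges back to back into a temporary file, forward indices first, while recording the new offsets.

```java
import java.io.IOException;
import java.nio.channels.FileChannel;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;
import java.util.ArrayList;
import java.util.List;

// Illustrative sketch only; names here are hypothetical and not taken from Pinot.
class CompactIndexFileSketch {

  /** Hypothetical stand-in for an index entry: a named byte range in the index file. */
  static final class Entry {
    final String key;
    final boolean forward;   // true for a forward index, false for any other index type
    final long startOffset;
    final long size;

    Entry(String key, boolean forward, long startOffset, long size) {
      this.key = key;
      this.forward = forward;
      this.startOffset = startOffset;
      this.size = size;
    }
  }

  /**
   * Copies the retained entries from src into dst with no gaps between them,
   * forward indices first, and returns the entries with their new offsets.
   */
  static List<Entry> compact(Path src, Path dst, List<Entry> retained) throws IOException {
    // Order the entries: forward indices first, then everything else.
    List<Entry> ordered = new ArrayList<>();
    for (Entry e : retained) {
      if (e.forward) {
        ordered.add(e);
      }
    }
    for (Entry e : retained) {
      if (!e.forward) {
        ordered.add(e);
      }
    }

    List<Entry> relocated = new ArrayList<>();
    long nextOffset = 0;
    try (FileChannel srcCh = FileChannel.open(src, StandardOpenOption.READ);
        FileChannel dstCh = FileChannel.open(dst, StandardOpenOption.CREATE,
            StandardOpenOption.WRITE, StandardOpenOption.TRUNCATE_EXISTING)) {
      for (Entry e : ordered) {
        // transferTo may copy fewer bytes than requested, so loop until the range is done.
        long copied = 0;
        while (copied < e.size) {
          long n = srcCh.transferTo(e.startOffset + copied, e.size - copied, dstCh);
          if (n <= 0) {
            throw new IOException("Unexpected end of file while copying " + e.key);
          }
          copied += n;
        }
        relocated.add(new Entry(e.key, e.forward, nextOffset, e.size));
        nextOffset += e.size;
      }
    }
    return relocated;
  }
}
```

For example, if the source file holds the bytes `AAAABBBBCCCC` and the retained entries are a non-forward index at offset 0 (size 4) and a forward index at offset 8 (size 4), the destination file ends up as `CCCCAAAA`, with the forward index relocated to offset 0 and the other index to offset 4. The returned entries would then be what gets written to a fresh index map.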