doom369 commented on code in PR #14920:
URL: https://github.com/apache/lucene/pull/14920#discussion_r2192873292
##########
lucene/core/src/java/org/apache/lucene/index/IndexFileNames.java:
##########
@@ -145,14 +145,28 @@ public static String stripSegmentName(String filename) {
   /** Returns the generation from this file name, or 0 if there is no generation. */
   public static long parseGeneration(String filename) {
     assert filename.startsWith("_");
-    String[] parts = stripExtension(filename).substring(1).split("_");
+    int dot = filename.indexOf('.');
+    int end = (dot != -1) ? dot : filename.length();
+    int start = 1; // skip initial '_'
+
+    int first = filename.indexOf('_', start);
+    if (first == -1 || first >= end) {
+      return 0;
+    }
+
+    int second = filename.indexOf('_', first + 1);
+    int third = (second != -1) ? filename.indexOf('_', second + 1) : -1;
+
+    int parts = (second == -1 || second >= end) ? 2 : (third == -1 || third >= end) ? 3 : 4;
+

Review Comment:
   In our scenario, index updates (and we have ~6 different indexes) can happen every 5 seconds. This is profiling data from a 5-minute production run with async-profiler (one of the best on the market at the moment, imho). So, if these allocations were captured, they're not rare.
   
   I agree about the complexity. But this method should now also be 5-10x faster, if that matters to you. If you want, I can provide you with a microbenchmark.
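
For reference, here is a minimal sketch of how the split-based and index-based variants could be checked against each other before benchmarking. The hunk above does not show what happens after `parts` is computed, so the sketch assumes a generation is present only when the name has 2 or 4 underscore-separated parts, and that it is encoded in base 36 (`Character.MAX_RADIX`). The class and method names are hypothetical, and `Long.parseLong(CharSequence, int, int, int)` needs Java 9 or later.

```java
public class ParseGenerationSketch {

  // Baseline: mirrors the removed line. Allocates a String[] plus substrings per call.
  public static long parseGenerationSplit(String filename) {
    int dot = filename.indexOf('.');
    String base = (dot != -1) ? filename.substring(0, dot) : filename;
    String[] parts = base.substring(1).split("_");
    // Assumption: only the 2-part and 4-part forms carry a generation.
    if (parts.length == 2 || parts.length == 4) {
      return Long.parseLong(parts[1], Character.MAX_RADIX);
    }
    return 0;
  }

  // Index-based variant: counts parts with indexOf and parses the generation in place.
  public static long parseGenerationIndexed(String filename) {
    int dot = filename.indexOf('.');
    int end = (dot != -1) ? dot : filename.length();

    int first = filename.indexOf('_', 1); // skip initial '_'
    if (first == -1 || first >= end) {
      return 0;
    }

    int second = filename.indexOf('_', first + 1);
    int third = (second != -1) ? filename.indexOf('_', second + 1) : -1;

    int parts = (second == -1 || second >= end) ? 2 : (third == -1 || third >= end) ? 3 : 4;

    // Assumption: same 2-or-4 rule as the baseline above.
    if (parts == 2 || parts == 4) {
      int genEnd = (parts == 2) ? end : second;
      // Parses the range [first + 1, genEnd) without creating a substring (Java 9+).
      return Long.parseLong(filename, first + 1, genEnd, Character.MAX_RADIX);
    }
    return 0;
  }

  public static void main(String[] args) {
    String[] names = {
      "_0.cfs", "_0_1.liv", "_0_Lucene90_0.dvd", "_0_1_Lucene90_0.dvd", "_a_z.fnm"
    };
    for (String name : names) {
      long expected = parseGenerationSplit(name);
      long actual = parseGenerationIndexed(name);
      System.out.println(name + " -> split=" + expected + ", indexed=" + actual);
    }
  }
}
```

The two methods should agree on well-formed segment file names like the ones in `main`. Names with more than four parts or trailing underscores are not covered here and may behave differently between the two variants.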