https://gcc.gnu.org/bugzilla/show_bug.cgi?id=118168
--- Comment #14 from GCC Commits <cvs-commit at gcc dot gnu.org> --- The trunk branch has been updated by Andi Kleen <a...@gcc.gnu.org>: https://gcc.gnu.org/g:4a992ecad0f302f69c4f6c42708c737eabaa60dc commit r15-7327-g4a992ecad0f302f69c4f6c42708c737eabaa60dc Author: Andi Kleen <a...@gcc.gnu.org> Date: Wed Dec 25 14:41:49 2024 -0800 Rebalance file_cache input line cache dynamically The input context file_cache maintains an array of anchors to speed up accessing lines before the previous line. The array has a fixed upper size and the algorithm relies on the linemap reporting the maximum number of lines in the file in advance to compute the position of each anchor in the cache. This doesn't work for C which doesn't know the maximum number of lines before the files has finished parsing. The code has a fallback for this, but it is quite inefficient and effectively defeats the cache, so many accesses have to go through most of the input buffer to compute line boundaries. For large files this can be very costly as demonstrated in PR118168. Use a different algorithm to maintain the cache without needing the maximum number of lines in advance. When the cache runs out of entries and the gap to the last line anchor gets too large, prune every second entry in the cache. This maintains even spacing of the line anchors without requiring the maximum index. For the original PR this moves the overhead of enabling -Wmisleading-indentation to 32% with the default cache size. With a 10k entry cache it becomes noise. cc1 -O0 -fsyntax-only mypy.c -quiet ran 1.03 ± 0.05 times faster than cc1 -O0 -fsyntax-only mypy.c -quiet -Wmisleading-indentation --param=file-cache-lines=10000 1.09 ± 0.08 times faster than cc1 -O0 -fsyntax-only mypy.c -quiet -Wmisleading-indentation --param=file-cache-lines=1000 1.32 ± 0.07 times faster than cc1 -O0 -fsyntax-only mypy.c -quiet -Wmisleading-indentation The code could be further optimized, e.g. use the vectorized line search functions the preprocessor uses. Also it seems the input cache always reads the whole file into memory, so perhaps it should just be using file mmap if possible. gcc/ChangeLog: PR preprocessor/118168 * input.cc (file_cache_slot::get_next_line): Use new algorithm to maintain (file_cache_slot::read_line_num): Use binary search for lookup.