On Sat, 2025-01-25 at 23:30 -0800, Andi Kleen wrote:
> From: Andi Kleen <a...@gcc.gnu.org>
>
> The input context file_cache maintains an array of anchors
> to speed up accessing lines before the previous line.
> The array has a fixed upper size and the algorithm relies
> on the linemap reporting the maximum number of lines in the file
> in advance to compute the position of each anchor in the cache.
>
> This doesn't work for C, which doesn't know the maximum number
> of lines before the file has finished parsing.  The code
> has a fallback for this, but it is quite inefficient and
> effectively defeats the cache, so many accesses have to
> go through most of the input buffer to compute line
> boundaries.  For large files this can be very costly,
> as demonstrated in PR118168.
>
> Use a different algorithm to maintain the cache without
> needing the maximum number of lines in advance.  When the cache
> runs out of entries and the gap to the last line anchor gets
> too large, prune every second entry in the cache.  This maintains
> even spacing of the line anchors without requiring the maximum
> index.
>
> For the original PR this moves the overhead of enabling
> -Wmisleading-indentation to 32% with the default cache size.
> With a 10k entry cache it becomes noise.
>
> cc1 -O0 -fsyntax-only mypy.c -quiet ran
>     1.03 ± 0.05 times faster than cc1 -O0 -fsyntax-only mypy.c -quiet -Wmisleading-indentation --param=file-cache-lines=10000
>     1.09 ± 0.08 times faster than cc1 -O0 -fsyntax-only mypy.c -quiet -Wmisleading-indentation --param=file-cache-lines=1000
>     1.32 ± 0.07 times faster than cc1 -O0 -fsyntax-only mypy.c -quiet -Wmisleading-indentation
>
> The code could be further optimized, e.g. use the vectorized
> line search functions the preprocessor uses.
>
> Also it seems the input cache always reads the whole file into
> memory, so perhaps it should just be using file mmap if possible.
>
> gcc/ChangeLog:
>
> 	PR preprocessor/118168
> 	* input.cc (file_cache_slot::get_next_line): Use new algorithm
> 	to maintain the cache.
> 	(file_cache_slot::read_line_num): Use binary search for lookup.
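[To make the pruning scheme above concrete, here is a minimal,
self-contained sketch of the "drop every second anchor when the cache
fills up" idea.  All names, sizes, and the stride-doubling detail are
hypothetical illustrations, not the actual gcc/input.cc code.]

#include <cstddef>
#include <vector>

struct line_anchor
{
  size_t line_num;   /* line number this anchor records */
  size_t start_pos;  /* byte offset of the start of that line */
};

static const size_t max_anchors = 100;   /* hypothetical cache size */
static std::vector<line_anchor> anchors;
static size_t anchor_stride = 1;         /* record every Nth line */

static void
note_line_start (size_t line_num, size_t start_pos)
{
  /* Only record a new anchor once we have moved anchor_stride lines
     past the last recorded one.  */
  if (!anchors.empty ()
      && line_num < anchors.back ().line_num + anchor_stride)
    return;

  if (anchors.size () >= max_anchors)
    {
      /* Cache is full: keep every second anchor and double the stride.
	 Spacing stays roughly even without knowing the total number of
	 lines in the file in advance.  */
      size_t j = 0;
      for (size_t i = 0; i < anchors.size (); i += 2)
	anchors[j++] = anchors[i];
      anchors.resize (j);
      anchor_stride *= 2;
    }

  anchors.push_back ({line_num, start_pos});
}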
Thanks; this is OK for trunk.

I spent some time stepping through this to get clear in my mind how
the new algorithm works.

FWIW I found the patch below helpful, to clarify in dumps about the
slot index versus the line number of the slot; I'll commit this as a
followup at some point:

Dave

diff --git a/gcc/input.cc b/gcc/input.cc
index 11dd39412ea1..2abd2cd8eb0f 100644
--- a/gcc/input.cc
+++ b/gcc/input.cc
@@ -663,10 +663,11 @@ file_cache_slot::dump (FILE *out, int indent) const
 	   indent, "", (int)m_missing_trailing_newline);
   fprintf (out, "%*sline records (%i):\n",
 	   indent, "", m_line_record.length ());
+  int idx = 0;
   for (auto &line : m_line_record)
-    fprintf (out, "%*sline %zi: byte offsets: %zi-%zi\n",
+    fprintf (out, "%*s[%i]: line %zi: byte offsets: %zi-%zi\n",
 	     indent + 2, "",
-	     line.line_num, line.start_pos, line.end_pos);
+	     idx++, line.line_num, line.start_pos, line.end_pos);
 }
 
 /* Returns TRUE iff the cache would need to be filled with data coming
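[For reference, the lookup side that the ChangeLog entry for
read_line_num describes can be pictured as a binary search over the
sorted anchor records for the last anchor at or before the requested
line; the per-line scan then only has to cover the span since that
anchor.  This is a rough, hypothetical illustration reusing the
line_anchor struct from the sketch above, not the actual
read_line_num code.]

#include <algorithm>
#include <cstddef>
#include <vector>

static const line_anchor *
find_anchor_before (const std::vector<line_anchor> &anchors,
		    size_t line_num)
{
  /* First record whose line_num is greater than the request...  */
  auto it = std::upper_bound (anchors.begin (), anchors.end (), line_num,
			      [] (size_t n, const line_anchor &a)
			      { return n < a.line_num; });
  /* ...so the record just before it (if any) is the closest anchor
     at or before the requested line.  */
  if (it == anchors.begin ())
    return nullptr;
  return &*(it - 1);
}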