On Sat, 2025-01-25 at 23:30 -0800, Andi Kleen wrote:
> From: Andi Kleen <a...@gcc.gnu.org>
> 
> The input context file_cache maintains an array of anchors
> to speed up accessing lines before the previous line.
> The array has a fixed upper size and the algorithm relies
> on the linemap reporting the maximum number of lines in the file
> in advance to compute the position of each anchor in the cache.
> 
> This doesn't work for C which doesn't know the maximum number
> of lines before the files has finished parsing. The code
> has a fallback for this, but it is quite inefficient and
> effectively defeats the cache, so many accesses have to
> go through most of the input buffer to compute line
> boundaries. For large files this can be very costly
> as demonstrated in PR118168.
> 
> Use a different algorithm to maintain the cache without
> needing the maximum number of lines in advance. When the cache
> runs out of entries and the gap to the last line anchor gets
> too large, prune every second entry in the cache. This maintains
> even spacing of the line anchors without requiring the maximum
> index.
> 
> For the original PR this moves the overhead of enabling
> -Wmisleading-indentation to 32% with the default cache size.
> With a 10k entry cache it becomes noise.
> 
>   cc1 -O0 -fsyntax-only mypy.c   -quiet  ran
>     1.03 ± 0.05 times faster than cc1 -O0 -fsyntax-only  mypy.c   -
> quiet -Wmisleading-indentation --param=file-cache-lines=10000
>     1.09 ± 0.08 times faster than cc1 -O0 -fsyntax-only  mypy.c   -
> quiet -Wmisleading-indentation --param=file-cache-lines=1000
>     1.32 ± 0.07 times faster than cc1 -O0 -fsyntax-only  mypy.c   -
> quiet -Wmisleading-indentation
> 
> The code could be further optimized, e.g. use the vectorized
> line search functions the preprocessor uses.
> 
> Also it seems the input cache always reads the whole file into
> memory, so perhaps it should just be using file mmap if possible.
> 
> gcc/ChangeLog:
> 
>       PR preprocessor/118168
>       * input.cc (file_cache_slot::get_next_line): Use new
> algorithm
>       to maintain
>       (file_cache_slot::read_line_num): Use binary search for
> lookup.

Thanks; this is OK for trunk.

I spent some time stepping through this to get clear in my mind how the
new algorithm works.  FWIW I found the patch below helpful, to clarify
in dumps about the slot index versus the line number of the slot; I'll
commit this as a followup at some point:

Dave

diff --git a/gcc/input.cc b/gcc/input.cc
index 11dd39412ea1..2abd2cd8eb0f 100644
--- a/gcc/input.cc
+++ b/gcc/input.cc
@@ -663,10 +663,11 @@ file_cache_slot::dump (FILE *out, int indent) const
           indent, "", (int)m_missing_trailing_newline);
   fprintf (out, "%*sline records (%i):\n",
           indent, "", m_line_record.length ());
+  int idx = 0;
   for (auto &line : m_line_record)
-    fprintf (out, "%*sline %zi: byte offsets: %zi-%zi\n",
+    fprintf (out, "%*s[%i]: line %zi: byte offsets: %zi-%zi\n",
             indent + 2, "",
-            line.line_num, line.start_pos, line.end_pos);
+            idx++, line.line_num, line.start_pos, line.end_pos);
 }
 
 /* Returns TRUE iff the cache would need to be filled with data coming



Reply via email to