https://gcc.gnu.org/bugzilla/show_bug.cgi?id=118168

--- Comment #6 from Andi Kleen <andi-gcc at firstfloor dot org> ---
So the file cache has a window of 100 lines:

static const size_t line_record_size = 100;

The indentation code rereads the lines of the guard, the body and the next
statement, and those are all cached if they are all within 100 lines of where
the lexer is.

But for some reason in your code the lexer is often very far ahead, e.g. in one
example it was 60k lines ahead. 

So that means the file cache reads the previous window and then moves forward
again, which is lots of extra reading. It also reads the full 100 lines.

(I think sometimes when it is very far away it may even have lost the line
offset and needs to read even more)
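
To make the cost concrete, here is a toy model of the effect described above.
It is only a sketch and not GCC's actual file cache: window_cache and simulate
are made-up names, and it assumes a single window of 100 consecutive lines
that is refilled completely on every miss.

```cpp
#include <cstddef>

// Toy model, NOT GCC's file cache: one window of consecutive lines.
// Any access outside the window refills the whole window.
struct window_cache
{
  static constexpr std::size_t window = 100; // mirrors line_record_size
  std::size_t start = 0;      // first line currently held
  bool valid = false;         // false until the first refill
  std::size_t lines_read = 0; // total lines fetched from the file

  void read_line (std::size_t line)
  {
    if (valid && line >= start && line < start + window)
      return; // hit, nothing to read
    start = line - line % window;
    valid = true;
    lines_read += window; // a miss rereads the full window
  }
};

// Alternate between the lexer position and an indentation check that
// looks at a line `gap` lines behind the lexer.
std::size_t simulate (std::size_t gap, std::size_t statements)
{
  window_cache cache;
  for (std::size_t i = 0; i < statements; ++i)
    {
      cache.read_line (gap + i); // lexer, far ahead
      cache.read_line (i);       // indentation code rereads the guard line
    }
  return cache.lines_read;
}
```

In this model simulate (0, 1000) reads 1000 lines in total, while
simulate (60000, 1000) reads 200000, because every single access misses and
refills the window.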

Also it's strange that ferror is expensive (and feof is not)?

I'm not fully sure why the lexer is so far ahead. Maybe there is lots of
peeking somewhere?

A fix would be to always read the line while the parser is still at the right
spot, and remember the indentation until the end of the statement, when it
needs to be checked.
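
A rough sketch of what "remember the indentation" could look like. All names
here (indent_column, remembered_indent, looks_misleading) are hypothetical,
the tab width of 8 is an assumption, and the check is much cruder than what
the real indentation code does. The point is only that once the columns are
recorded, the check at the end of the statement needs no file access.

```cpp
#include <string>

// Visual column (0-based) of the first non-blank character, with tabs
// expanded to the next multiple of tab_width.
unsigned indent_column (const std::string &line, unsigned tab_width = 8)
{
  unsigned col = 0;
  for (char c : line)
    {
      if (c == ' ')
        ++col;
      else if (c == '\t')
        col += tab_width - col % tab_width;
      else
        break;
    }
  return col;
}

// Hypothetical record, filled in while the parser is still on each line.
struct remembered_indent
{
  unsigned guard_col = 0; // column of the guard, e.g. the "if"
  unsigned body_col = 0;  // column of the guarded body
};

// Simplified check: the next statement lines up with the body, and the
// body is indented deeper than the guard.
bool looks_misleading (const remembered_indent &r, unsigned next_col)
{
  return r.body_col > r.guard_col && next_col == r.body_col;
}
```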

Or use mmap on the whole file to make it a lot cheaper (but it would still
need a line table).
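
A minimal sketch of the mmap variant, assuming a POSIX system. mapped_source
is a made-up class, not anything in GCC. It maps the file once and builds a
table of line start offsets, so fetching any line is a table lookup no matter
how far the lexer has run ahead.

```cpp
#include <cstddef>
#include <cstring>
#include <string>
#include <vector>
#include <fcntl.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

// Hypothetical helper: read-only mapping of a whole file plus a table
// holding the byte offset at which each line starts.
class mapped_source
{
public:
  explicit mapped_source (const char *path)
  {
    int fd = open (path, O_RDONLY);
    if (fd < 0)
      return;
    struct stat st;
    if (fstat (fd, &st) == 0 && st.st_size > 0)
      {
        void *p = mmap (nullptr, st.st_size, PROT_READ, MAP_PRIVATE, fd, 0);
        if (p != MAP_FAILED)
          {
            m_data = static_cast<const char *> (p);
            m_size = st.st_size;
          }
      }
    close (fd); // the mapping stays valid after the descriptor is closed
    build_line_table ();
  }

  ~mapped_source ()
  {
    if (m_data)
      munmap (const_cast<char *> (m_data), m_size);
  }

  mapped_source (const mapped_source &) = delete;
  mapped_source &operator= (const mapped_source &) = delete;

  std::size_t line_count () const { return m_line_starts.size (); }

  // Line N (1-based) without its newline, or "" if N is out of range.
  std::string get_line (std::size_t n) const
  {
    if (n == 0 || n > m_line_starts.size ())
      return std::string ();
    std::size_t begin = m_line_starts[n - 1];
    std::size_t end = m_size;
    if (n < m_line_starts.size ())
      end = m_line_starts[n] - 1;          // position of this line's '\n'
    else if (end > begin && m_data[end - 1] == '\n')
      --end;                               // last line, strip final '\n'
    return std::string (m_data + begin, end - begin);
  }

private:
  void build_line_table ()
  {
    if (!m_data)
      return;
    m_line_starts.push_back (0);
    const char *p = m_data;
    const char *end = m_data + m_size;
    while ((p = static_cast<const char *> (std::memchr (p, '\n', end - p))))
      {
        ++p; // first byte after the newline
        if (p == end)
          break; // a trailing newline does not start a new line
        m_line_starts.push_back (p - m_data);
      }
  }

  const char *m_data = nullptr;
  std::size_t m_size = 0;
  std::vector<std::size_t> m_line_starts;
};
```

The table costs one pass over the file. A real implementation might build it
lazily or keep only every Nth offset to save memory.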
