https://gcc.gnu.org/bugzilla/show_bug.cgi?id=118168
--- Comment #6 from Andi Kleen <andi-gcc at firstfloor dot org> ---
So the file cache has a window of 100 lines:

static const size_t line_record_size = 100;

The indentation code rereads the lines of the guard, the body, and the next statement, and those are all cached if they are within 100 lines of where the lexer is.

But for some reason in your code the lexer is often very far ahead, e.g. in one example it was 60k lines ahead. That means the file cache reads the previous window and then moves forward again, which is a lot of extra reading. It also reads the full 100 lines. (I think sometimes when it is very far away it may even have lost the line offset and needs to read even more.)

Also, it's strange that ferror is expensive (and feof is not)?

I'm not fully sure why the lexer is so far ahead. Maybe there is lots of peeking somewhere?

A fix would be to always reread the line when the parser is in the right spot and remember the indentation until the end of the statement to check. Or use mmap on the whole file to make it a lot cheaper (but that would still need a line table).
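
To make the mmap-plus-line-table idea a bit more concrete, here is a rough sketch. It is not GCC's file cache code: `LineTable` and `indent_column` are hypothetical names, and the buffer is assumed to be the whole file already in memory (for example from mmap). The point is only that once line start offsets are recorded, looking up a line far behind the lexer costs one index lookup and no rereading.

```cpp
#include <cstddef>
#include <string_view>
#include <vector>

// Hypothetical helper, not GCC's file cache: indexes a file that is already
// fully in memory (e.g. mapped with mmap) so any line can be fetched directly.
class LineTable {
public:
    explicit LineTable(std::string_view buf) : buf_(buf) {
        if (buf_.empty()) return;
        starts_.push_back(0);
        for (std::size_t i = 0; i < buf_.size(); ++i) {
            // A new line starts after each '\n', unless that '\n' ends the buffer.
            if (buf_[i] == '\n' && i + 1 < buf_.size())
                starts_.push_back(i + 1);
        }
    }

    std::size_t line_count() const { return starts_.size(); }

    // Returns line n (1-based) without its trailing newline,
    // or an empty view if n is out of range.
    std::string_view line(std::size_t n) const {
        if (n == 0 || n > starts_.size()) return {};
        std::size_t begin = starts_[n - 1];
        std::size_t end = (n < starts_.size()) ? starts_[n] : buf_.size();
        if (end > begin && buf_[end - 1] == '\n') --end;
        return buf_.substr(begin, end - begin);
    }

private:
    std::string_view buf_;             // whole file contents, not owned
    std::vector<std::size_t> starts_;  // byte offset of the start of each line
};

// Column of the first non-blank character, with tabs advancing to the next
// multiple of tab_width (8 is an assumed default here).
inline std::size_t indent_column(std::string_view line, std::size_t tab_width = 8) {
    std::size_t col = 0;
    for (char c : line) {
        if (c == ' ') ++col;
        else if (c == '\t') col += tab_width - (col % tab_width);
        else break;
    }
    return col;
}
```

With something like this, the indentation check could call `indent_column(table.line(n))` for the guard, body, and next statement no matter how far ahead the lexer is. The table costs one pass over the file and one offset per line.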