On Thu, 7 Nov 2013 13:48:14, Jakub Jelinek wrote: > On Thu, Nov 07, 2013 at 01:25:00PM +0100, Bernd Edlinger wrote: >> just some thoughts... >> >> >> fgetc will definitely be much faster than fread 16K and fseek back to the >> end of line position. >> >> Note: fgetc is already buffered and not too slow on average, but only if you >> do not >> fseek around. In that case the buffered file-stream data is lost. > > fgetc (unless you are talking about fgetc_unlocked which is somewhat better) > is actually very expensive. Of course, fseek is highly undersirable, but > I've already said that. > > Jakub
yes. That's true. I think, maybe this patch tries just to reach too many goals at the same time. What is the _real_ bug which should be fixed first, is that even a very short file with NUL-chars goes OOM in read_line. So stay with fgets, just detect that the file is binary and return in that case. The performance and duplicate code can still be improved by a follow-up patch. We do not really need any caret positions on a binary file, after all, we just should not ICE. Maybe that could be done by just detecting that we have a BINARY file, and stop printing any further lines. like this? --- input.c.jj 2013-01-10 21:38:27.000000000 +0100 +++ input.c 2013-11-07 15:33:53.804990865 +0100 @@ -106,12 +106,15 @@ read_line (FILE *file) { size_t len = strlen (string + pos); - if (string[pos + len - 1] == '\n') + if (len && string[pos + len - 1] == '\n') { string[pos + len - 1] = 0; return string; } pos += len; + /* test if binary file or last line incomplete? */ + if (pos < string_len-1) + return pos && feof (file) ? string : NULL; string = XRESIZEVEC (char, string, string_len * 2); string_len *= 2; } Bernd.