On Mon, Nov 01, 2021 at 10:27:40AM -0600, Todd C. Miller wrote:
> On Mon, 01 Nov 2021 10:36:08 -0500, Scott Cheloha wrote:
>
> > My own testing here with pathological inputs didn't show that large of
> > a performance difference between fgets(3) and getline(3).  There was
> > a difference but it was closer to like 5-10%.
>
> With your updated patch I see:
>
> % wc -l /tmp/z
> 10000000 /tmp/z
> % /usr/bin/time uniq /tmp/z > /dev/null
>         0.48 real         0.49 user         0.00 sys
> % /usr/bin/time ./obj/uniq /tmp/z > /dev/null
>         0.53 real         0.52 user         0.00 sys
>
> which seems perfectly reasonable to me.
>
> It may make sense to preallocate prevline and thisline to 8K to
> avoid reallocating in the common case where there are lines of
> varying sizes.

I will leave that optimization for a separate patch.

> Also, I don't think there is any point in the free(prevline); before
> exit when the initial getline() fails.

IMHO, that's just good hygiene.  getline(3) is allowed to allocate
memory even if it fails, and free(3) checks for e.g. buffer overruns.
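
For illustration, here is a minimal sketch of the pattern under
discussion.  It is not the code from the patch: count_runs() is a
hypothetical helper that counts runs of identical adjacent lines,
roughly the number of lines uniq(1) would print.  It frees prevline
even when the very first getline(3) call fails, and it swaps the two
buffers instead of copying.  Lines are compared with strcmp(3), so
embedded NULs and a missing final newline are not handled specially.

```c
#define _POSIX_C_SOURCE 200809L
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical helper, not taken from the patch: count runs of
 * identical adjacent lines read from fp.  Returns -1 on read error.
 */
long
count_runs(FILE *fp)
{
	char *prevline = NULL, *thisline = NULL, *tmp;
	size_t prevsize = 0, thissize = 0, tmpsize;
	long runs;
	int err;

	if (getline(&prevline, &prevsize, fp) == -1) {
		/* getline(3) may have allocated a buffer before failing. */
		err = ferror(fp);
		free(prevline);
		return err ? -1 : 0;
	}
	runs = 1;

	while (getline(&thisline, &thissize, fp) != -1) {
		if (strcmp(prevline, thisline) != 0) {
			runs++;
			/* Swap buffers (and their sizes) instead of copying. */
			tmp = prevline;
			prevline = thisline;
			thisline = tmp;
			tmpsize = prevsize;
			prevsize = thissize;
			thissize = tmpsize;
		}
	}

	err = ferror(fp);
	free(prevline);
	free(thisline);
	return err ? -1 : runs;
}
```

Passing stdin, or any stream opened with fopen(3), as fp is enough to
try it.  Preallocating both buffers is deliberately left out here, in
line with deferring that optimization to a separate patch.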