On Mon, Nov 01, 2021 at 10:27:40AM -0600, Todd C. Miller wrote:
> On Mon, 01 Nov 2021 10:36:08 -0500, Scott Cheloha wrote:
> 
> > My own testing here with pathological inputs didn't show that large of
> > a performance difference between fgets(3) and getline(3).  There was
> > a difference but it was closer to like 5-10%.
> 
> With your updated patch I see:
> 
> % wc -l /tmp/z
>  10000000 /tmp/z
> % /usr/bin/time uniq /tmp/z > /dev/null
>         0.48 real         0.49 user         0.00 sys
> % /usr/bin/time ./obj/uniq /tmp/z > /dev/null
>         0.53 real         0.52 user         0.00 sys
> 
> which seems perfectly reasonable to me.
> 
> It may make sense to preallocate prevline and thisline to 8K to
> avoid reallocating in the common case where there are lines of
> varying sizes.

I will leave that optimization for a separate patch.
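
For reference, a minimal sketch of what that preallocation might look
like.  This is not part of the patch: prealloc_line() and
INITIAL_LINESIZE are hypothetical names, and the 8192 simply mirrors
the "8K" suggested above.  It relies on getline(3) accepting a
malloc(3)'d buffer and its size, and only reallocating when a line
does not fit.

```c
#include <stdio.h>
#include <stdlib.h>

#define INITIAL_LINESIZE 8192	/* assumed value, from the "8K" suggestion */

/*
 * Hypothetical helper: allocate a line buffer up front so that
 * getline(3) only needs to realloc for lines that do not fit in
 * INITIAL_LINESIZE bytes.
 * Returns 0 on success, -1 if malloc(3) fails.
 */
int
prealloc_line(char **linep, size_t *sizep)
{
	*linep = malloc(INITIAL_LINESIZE);
	if (*linep == NULL) {
		*sizep = 0;
		return -1;
	}
	*sizep = INITIAL_LINESIZE;
	return 0;
}
```

The buffer and size would then be passed to getline(3) unchanged, once
for prevline and once for thisline.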

> Also, I don't think there is any point in the free(prevline); before
> exit when the initial getline() fails.

IMHO, that's just good hygiene.  getline(3) is allowed to allocate
memory even when it fails, and calling free(3) gives malloc a chance
to catch problems such as buffer overruns.
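
To illustrate the pattern, here is a minimal sketch.  It is not the
code from the patch: first_line() is a hypothetical helper made up for
this example.  The point is only that the buffer is freed on the
failure path too, since getline(3) may have allocated it before
returning -1.

```c
#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>

/*
 * Hypothetical helper, not taken from the patch: read the first line
 * of fp.  Returns a malloc'd line that the caller must free, or NULL
 * on EOF or error.
 */
char *
first_line(FILE *fp)
{
	char *line = NULL;
	size_t linesize = 0;

	if (getline(&line, &linesize, fp) == -1) {
		/* getline(3) may have allocated line before failing. */
		free(line);
		return NULL;
	}
	return line;
}
```

free(NULL) is a no-op, so the call is harmless when nothing was
allocated.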
