On 2021/11/01 10:36, Scott Cheloha wrote: > How did you generate this input? Is it just ten million lines with a > single 'z' character? `jot -bz 10000000`?
That one was lots of copies of ports/infrastructure/bsd.port.mk catted together. > Updated patch. > > I screwed up. We don't need to free(3) prevline when we print a line. > We can just swap the pointers (and now, sizes) as we did before and > let getline(3) handle reallocation. Aha! That's a lot better and restores performance to how it was before. On a different machine so times aren't directly comparable with the previous, but: $ hyperfine 'uniq /tmp/z' 'obj/uniq /tmp/z' Benchmark 1: uniq /tmp/z Time (mean ± σ): 2.564 s ± 0.105 s [User: 2.380 s, System: 0.173 s] Range (min … max): 2.423 s … 2.748 s 10 runs Benchmark 2: obj/uniq /tmp/z Time (mean ± σ): 2.575 s ± 0.141 s [User: 2.350 s, System: 0.214 s] Range (min … max): 2.478 s … 2.940 s 10 runs Summary 'uniq /tmp/z' ran 1.00 ± 0.07 times faster than 'obj/uniq /tmp/z' Which I think is pretty much a noop in terms of time. OK sthen@ > Index: uniq.c > =================================================================== > RCS file: /cvs/src/usr.bin/uniq/uniq.c,v > retrieving revision 1.27 > diff -u -p -r1.27 uniq.c > --- uniq.c 31 Jul 2018 02:55:57 -0000 1.27 > +++ uniq.c 1 Nov 2021 15:30:55 -0000 > @@ -45,8 +45,6 @@ > #include <wchar.h> > #include <wctype.h> > > -#define MAXLINELEN (8 * 1024) > - > int cflag, dflag, iflag, uflag; > int numchars, numfields, repeats; > > @@ -59,10 +57,10 @@ __dead void usage(void); > int > main(int argc, char *argv[]) > { > - char *t1, *t2; > + char *prevline, *t1, *t2, *thisline; > FILE *ifp = NULL, *ofp = NULL; > + size_t prevsize, thissize, tmpsize; > int ch; > - char *prevline, *thisline; > > setlocale(LC_CTYPE, ""); > > @@ -133,15 +131,18 @@ main(int argc, char *argv[]) > if (pledge("stdio", NULL) == -1) > err(1, "pledge"); > > - prevline = malloc(MAXLINELEN); > - thisline = malloc(MAXLINELEN); > - if (prevline == NULL || thisline == NULL) > - err(1, "malloc"); > - > - if (fgets(prevline, MAXLINELEN, ifp) == NULL) > + prevline = NULL; > + prevsize = 0; > + if (getline(&prevline, &prevsize, ifp) == -1) { > + free(prevline); > + if (ferror(ifp)) > + err(1, "getline"); > exit(0); > - > - while (fgets(thisline, MAXLINELEN, ifp)) { > + } > + > + thisline = NULL; > + thissize = 0; > + while (getline(&thisline, &thissize, ifp) != -1) { > /* If requested get the chosen fields + character offsets. */ > if (numfields || numchars) { > t1 = skip(thisline); > @@ -157,11 +158,20 @@ main(int argc, char *argv[]) > t1 = prevline; > prevline = thisline; > thisline = t1; > + tmpsize = prevsize; > + prevsize = thissize; > + thissize = tmpsize; > repeats = 0; > } else > ++repeats; > } > + free(thisline); > + if (ferror(ifp)) > + err(1, "getline"); > + > show(ofp, prevline); > + free(prevline); > + > exit(0); > } > >