Re: why GNU grep is fast

Gabor Kovesdan Mon, 23 Aug 2010 03:24:17 -0700

Later on, he summarizes some of the existing implementations, including comments about the Plan 9 implementation and his own RE2, both of which efficiently handle international text (which seems to be a major concern of Gabor's).
I believe Gabor is considering TRE for a good replacement regex library.

Yes. Oniguruma is slow; Google RE2 supports only Perl and fgrep syntax, not standard regex; and the Plan 9 implementation, iirc, supports only fgrep syntax and Unicode, but not wchar_t in general.

The key comment in Mike's GNU grep notes is the one about not breaking into lines. That's simply double-scanning the input; instead, run the matcher over blocks of text and, when it finds a match, work backwards from the match to find the appropriate line beginning. This is efficient because most lines don't match.
I do like the idea.

So do I.
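
For anyone following along, the block-scanning idea can be sketched roughly as below. This is only an illustration, not code from GNU grep or bsdgrep: the function name grep_buffer is hypothetical, Python's re module stands in for the real matcher, and the whole input is assumed to sit in one in-memory buffer (a real implementation would also have to deal with lines that straddle block boundaries and with patterns that can match across a newline).

```python
import re


def grep_buffer(pattern: bytes, data: bytes):
    """Yield matching lines by scanning the buffer as a whole.

    Hypothetical sketch: line boundaries are located only around
    actual matches, never for the lines that do not match.
    """
    regex = re.compile(pattern)
    pos = 0
    while pos <= len(data):
        m = regex.search(data, pos)
        if m is None:
            return
        # Work backwards from the match to the beginning of its line.
        start = data.rfind(b"\n", 0, m.start()) + 1
        # Work forwards from the match to the end of that line.
        end = data.find(b"\n", m.start())
        if end == -1:
            end = len(data)
        yield data[start:end]
        # Resume after this line so each line is reported at most once.
        pos = end + 1


if __name__ == "__main__":
    text = b"alpha\nbeta foo\ngamma\nfoo foo\n"
    for line in grep_buffer(b"foo", text):
        print(line.decode())
```

The two newline searches run only when the matcher reports a hit, which is where the saving comes from if most lines don't match.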

BTW, the fastgrep portion of bsdgrep is my fault/contribution to do a faster search bypassing the regex library. :) It certainly was not written with any encodings in mind; it was purely ASCII. As I have not kept up with it, I do not know if anyone improved it or not.

It has been made wchar-compliant.
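
To illustrate what bypassing the regex library can mean, here is a minimal sketch. It is not the fastgrep code from bsdgrep: the names META, is_literal and line_matches are made up for this example, the metacharacter set is the one for Python's re syntax rather than POSIX regex, and it works on raw bytes only, so it ignores the encoding question entirely.

```python
import re

# Bytes that have special meaning in Python's regex syntax (an assumption
# for this sketch; the POSIX basic and extended sets differ).
META = set(b".^$*+?()[]{}|\\")


def is_literal(pattern: bytes) -> bool:
    """Return True if the pattern contains no regex metacharacters."""
    return not any(byte in META for byte in pattern)


def line_matches(pattern: bytes, line: bytes) -> bool:
    """Match a line, skipping the regex engine for plain literal patterns."""
    if is_literal(pattern):
        # Fast path: plain substring search, no regex compilation or matching.
        return pattern in line
    # Slow path: fall back to the regex engine.
    return re.search(pattern, line) is not None
```

A plain byte search like this is only safe as-is for single-byte encodings, which is the limitation the wchar work addresses.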

Gabor
_______________________________________________
freebsd-current@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-current
To unsubscribe, send any mail to "freebsd-current-unsubscr...@freebsd.org"
