Daniel,
Thanks for the help.
I focused on rephrasing and doing some simple pre-processing.
By re-working the problem, the processing time was cut in half.
The CPU utilization is still high, but I think that is the
nature of the problem. The links helped.
The demo at the Jakarta site is terrific!
Thanks again!
Malcolm
PS: I know JavaScript supports regular expressions. Do you plan
on including JavaScript rules in a future package release?
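
On the pre-processing mentioned above, here is a minimal sketch of the
general idea, not the actual code from this project. It assumes the input
can be handled one line at a time and that every match must contain a known
fixed substring. It uses the JDK's java.util.regex as a stand-in rather than
the jakarta-oro classes; with jakarta-oro the analogous step would be to
compile the pattern once with Perl5Compiler and reuse it with Perl5Matcher.
The class name, the "ERROR" literal, and the pattern are all hypothetical.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.regex.Pattern;

// Hypothetical helper, for illustration only.
class LineScanner {

    // Counts the lines that match `pattern`, streaming the input line by
    // line instead of holding the whole file in one buffer.
    // `requiredLiteral` is an assumed fixed substring that every match must
    // contain; lines without it never reach the regex engine.
    static int countMatchingLines(Reader source, String requiredLiteral, Pattern pattern) {
        int count = 0;
        try (BufferedReader in = new BufferedReader(source)) {
            String line;
            while ((line = in.readLine()) != null) {
                if (line.indexOf(requiredLiteral) < 0) {
                    continue; // cheap substring pre-filter
                }
                if (pattern.matcher(line).find()) {
                    count++;
                }
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return count;
    }

    public static void main(String[] args) {
        // Compile once, outside the read loop, and reuse for every line.
        Pattern p = Pattern.compile("ERROR\\s+\\d+");
        String sample = "INFO start\nERROR 42 disk full\nERROR none\nINFO done\n";
        System.out.println(countMatchingLines(new StringReader(sample), "ERROR", p));
    }
}
```

Whether the pre-filter helps depends on how many lines it rules out. If
nearly every line contains the literal, it only adds work.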
> -----Original Message-----
> From: Daniel F. Savarese [mailto:[EMAIL PROTECTED]]
> Sent: Monday, June 04, 2001 1:02 PM
> To: [EMAIL PROTECTED]
> Subject: Re: Different Lib / Performance / Ack&Perl / Testing
>
>
>
> >I also noticed the same tools at http://www.savarese.org/oro/downloads/
>
> This is the old software on which jakarta-oro is based and is no
> longer supported.
>
> >The PerlTools have text/regex/Perl5StreamInput, and Jakarta
> >source does not.
> >Are there any other significant differences?
>
> See the CHANGES file:
> http://jakarta.apache.org/cvsweb/index.cgi/~checkout~/jakarta-oro/CHANGES?content-type=text/plain
>
> >I'm reading and parsing a large text file (50MB) using Perl5Match.
> >Doing a performance analysis I notice much of the processing time
> >is around the Perl5Match.contains(buffer, pattern) routine.
> >Is there anything I can do to resolve performance issues with
> >Perl5Match?
> >On solaris, the cpu takes a hit for several minutes (30%).
>
> You're reading a 50 MB file, something that Java is very bad at doing.
> Performance will be dependent on a large variety of factors including
> the regular expressions you're searching for. Most pattern matching
> performance issues can be resolved by rewriting the regular expression.
> However, your issues would appear to be related to the 50 MB file.
>
> >Awk/Perl:
> >Does anyone have a good source for the difference between Awk & Perl
> >regular expressions? The only thing I have been able to find is
>
> The O'Reilly book "Mastering Regular Expressions" by Jeffrey Friedl is
> a good reference. As for the differences between the .awk and .regex
> packages, you can compare:
> http://jakarta.apache.org/oro/api/org/apache/oro/text/awk/AwkCompiler.html
> and
> http://jakarta.apache.org/oro/api/org/apache/oro/text/regex/package-summary.html
> (just noticed this one is not up to date; it doesn't mention the POSIX
> character classes).
> We'll be writing a new user's guide soon to make it clear how to use the
> packages and what the supported syntax is.
>
> >Testing:
> >I enjoy the applet @ oro that tests Perl regular expressions!
>
> Good to hear. In case you're referring to the old one, you can find
> an updated version for jakarta-oro at:
> http://jakarta.apache.org/oro/demo.html
> The source code is also included in the 2.0.3 distribution.
>
> daniel