I'm using contains. Strange... not sure what's going on.

> Are you using contains() or match()? If you're using match(), then
> switch to contains() and it should work. Here's my sanity check for
> the pattern (to avoid having to write a Java test program):
>
> ~> wget -O - http://www.myspace.com/pain 2> /dev/null | perl -e '@txt =
> <STDIN>; $txt = join("", @txt); $txt =~
> m#<span\s+class="nametext">[^<]*</span><br>[^<]*<font\s[^>]*><strong>([^<]+)</strong></font>#si;
> print "$1\n";'
>
> Metal / Industrial

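For anyone who does want the Java version of that sanity check, here is a rough sketch. It assumes java.util.regex as a stand-in for whatever matcher is actually in use: find() plays the role of a contains()-style search, and matches() stands in for match() on the assumption that match() requires the whole input to match. The class name, method names, and the SAMPLE fragment are made up for illustration; SAMPLE is not the real page source.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class GenreCheck {
    // Same pattern as the Perl one-liner; the two flags mirror Perl's /si.
    static final Pattern GENRE = Pattern.compile(
        "<span\\s+class=\"nametext\">[^<]*</span><br>[^<]*"
        + "<font\\s[^>]*><strong>([^<]+)</strong></font>",
        Pattern.CASE_INSENSITIVE | Pattern.DOTALL);

    // Hypothetical page fragment, invented to fit the shape of the pattern.
    static final String SAMPLE =
        "<html><body><span class=\"nametext\">Pain</span><br>\n"
        + "<font size=\"1\"><strong>Metal / Industrial</strong></font></body></html>";

    // Search anywhere in the input (contains()-style); null if not found.
    static String findGenre(String html) {
        Matcher m = GENRE.matcher(html);
        return m.find() ? m.group(1) : null;
    }

    // Require the pattern to cover the entire input (assumed match()-style).
    static boolean matchesWholeInput(String html) {
        return GENRE.matcher(html).matches();
    }

    public static void main(String[] args) {
        System.out.println(findGenre(SAMPLE));         // Metal / Industrial
        System.out.println(matchesWholeInput(SAMPLE)); // false
    }
}
```

The whole-input check fails only because the sample has markup before and after the fragment the pattern describes, which would explain why a contains()-style search works where an anchored match does not.
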
Ah yes, I figured that was the issue after I saw your pattern. The bit I don't understand, though, is how [^<]* is working. What exactly does that part of the pattern mean?

> In any case, the key to prevent excessive backtracking is to make the
> pattern as specific as possible. The original pattern posed problems
> because of the leading .* as well as following .+ pattern elements which
> caused a lot of backtracking.
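
On the [^<]* question: in standard regex syntax, [^<] is a negated character class that matches any single character except '<', and the trailing * repeats it zero or more times. So [^<]* consumes text up to, but not including, the next '<', which keeps the match inside the current element. A small sketch of the difference from a greedy .+ follows, again assuming java.util.regex; the class name, helper, and input string are invented for illustration.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class NegatedClassDemo {
    // Returns group 1 of the first match, or null when nothing matches.
    static String firstGroup(String regex, String input) {
        Matcher m = Pattern.compile(regex, Pattern.DOTALL).matcher(input);
        return m.find() ? m.group(1) : null;
    }

    public static void main(String[] args) {
        // Made-up input containing two <strong> elements.
        String html = "<strong>Metal</strong> and <strong>Industrial</strong>";

        // [^<]+ cannot cross a '<', so it stops at the first closing tag.
        System.out.println(firstGroup("<strong>([^<]+)</strong>", html));
        // prints: Metal

        // Greedy .+ first runs to the end of the input, then backtracks
        // until </strong> fits, so it lands on the last closing tag.
        System.out.println(firstGroup("<strong>(.+)</strong>", html));
        // prints: Metal</strong> and <strong>Industrial
    }
}
```

On a full page the greedy version has far more text to give back one character at a time, which is the backtracking cost described in the quoted text.
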
