Re: Complex regex help

Rob Dixon Fri, 01 Dec 2006 18:22:44 -0800

Omega -1911 wrote:


Hi Rob & Dani,

Thanks for your help!!! I will try the suggestion you made, Rob, and as soon
as I finish typing this, I'll try Dani's code. Someone by the name of
Chen Ken contacted me off-list and provided me with the following regex, which
appeared to work. Please let me know what you think:

my( $title,  $event) = $data_string =~
    m|([^>]*)(?:</FONT></b>)([^\]]*)([^<]*)|;


Hello Dave

You will need help to use HTML::TreeBuilder as it's fairly complex, and to help
you we need fuller information on the HTML you're processing. Can you publish a
bigger chunk? Or, better still, the URL where it is coming from?

The regex doesn't look right at all. The (?: .. ) around the closing font and
bold tags has no effect, and the ] in the character class needn't be escaped if
it comes first, as in [^]]. Apart from that, it will grab everything from EVENT
up to the end of the Ref # value into $event, and the closing ] into $3, which
is then discarded. Not good at all.
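
To make that concrete, here is a small sketch run against a made-up line. The
sample string is only my guess at the shape of the data (the real HTML hasn't
been posted), and $rest is a name I've added to show what lands in $3.

```perl
use strict;
use warnings;

# Hypothetical sample line; an assumption, not the poster's real data.
my $data_string = '<b><FONT>Some Title</FONT></b> Some event [Ref # 1234]<br>';

# The regex as posted, with a third variable added to expose $3.
my ($title, $event, $rest) = $data_string =~
    m|([^>]*)(?:</FONT></b>)([^\]]*)([^<]*)|;

print "title: '$title'\n";   # 'Some Title'
print "event: '$event'\n";   # ' Some event [Ref # 1234'
print "rest:  '$rest'\n";    # ']'

# Same pattern without the (?: .. ) and the backslash: the first two
# captures come out identical.
my ($title2, $event2) = $data_string =~ m|([^>]*)</FONT></b>([^]]*)|;

print "same captures\n" if $title2 eq $title and $event2 eq $event;
```

Note that $event carries the leading space and the whole of the Ref # text
along with it, which is probably not what was intended.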

Against my better judgement I could offer

 my @stuff = $data =~ />\s*([^<>]+)\s*</g;

which will return all the text between the HTML tags, but this will fall down if
you have something like <i>...</i> in the middle of one of the fields, which
will result in the text being broken into multiple segments. Better all round to
use a proper parser.
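
Here is a rough illustration of that weakness. The HTML is invented for the
purpose (an assumption about the general shape, nothing more), with an
<i>...</i> dropped into the middle of one field.

```perl
use strict;
use warnings;

# Hypothetical sample; the field "An important event" contains inline markup.
my $data = '<b><FONT>Title One</FONT></b><td>An <i>important</i> event</td>';

# Capture every run of text that sits between a '>' and the next '<'.
my @stuff = $data =~ />\s*([^<>]+)\s*</g;

print scalar(@stuff), " segments\n";   # 4 segments
print "[$_]\n" for @stuff;
# [Title One]
# [An ]
# [important]
# [event]
```

The one field comes back as three separate pieces. Trailing whitespace is also
kept (see "An "), because the greedy [^<>]+ swallows it before the final \s*
gets a chance to.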

HTH,

Rob

