Omega -1911 wrote:
> Hi Rob & Dani,
>
> Thanks for your help! I will try the suggestion you made, Rob, and as soon
> as I finish typing this I'll try Dani's code. Someone by the name of
> Chen Ken contacted me off-list and provided the following regex, which
> appeared to work. Please let me know what you think:
>
>   my ($title, $event) = $data_string =~
>       m|([^>]*)(?:</FONT></b>)([^\]]*)([^<]*)|;
Hello Dave
You will need help to use HTML::TreeBuilder as it's fairly complex, and to help
you we need fuller information on the HTML you're processing. Can you publish a
bigger chunk? Or, better still, the URL where it is coming from?
The regex doesn't look right at all. The (?: .. ) around the closing font and
bold tags has no effect, and the ] in the character class needn't be escaped,
because it comes first after the ^. Apart from that, it will grab everything
from EVENT up to the end of the Ref # value into $event, and the closing ] into
$3, which is then discarded. Not good at all.
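To make that concrete, here is a rough sketch run against a made-up line of HTML. The sample string is only my guess at the layout, since I haven't seen your real data, and @original and @simpler are names invented for the illustration. It checks that dropping the (?: .. ) and the backslash changes nothing, and shows how much ends up in the second capture.

```perl
use strict;
use warnings;

# Hypothetical sample line; the real HTML has not been posted.
my $data_string =
    '<b><FONT SIZE=2>Some Title</FONT></b> Some event text [Ref # 1234]<BR>';

# The regex as posted, keeping all three captures so we can inspect them.
my @original = $data_string =~ m|([^>]*)(?:</FONT></b>)([^\]]*)([^<]*)|;

# The same regex without the redundant (?: .. ) and without escaping the ].
my @simpler  = $data_string =~ m|([^>]*)</FONT></b>([^]]*)([^<]*)|;

print "title: '$original[0]'\n";
print "event: '$original[1]'\n";   # runs on through the Ref # value
print "third: '$original[2]'\n";   # just the closing ]

print "both forms capture the same text\n" if "@original" eq "@simpler";
```

With this sample the second capture is ' Some event text [Ref # 1234', leading space included, which is unlikely to be what you want in $event.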
Against my better judgement I could offer

  my @stuff = $data =~ />\s*([^<>]+)\s*</g;

which will return all the text between the HTML tags. It will fall down,
though, if you have something like <i>...</i> in the middle of one of the
fields, as the text of that field will then be broken into multiple segments.
Better all round to use a proper parser.
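For what it's worth, this small sketch shows both behaviours. The two sample strings and their <td> layout are invented for the illustration and are not taken from your page.

```perl
use strict;
use warnings;

# Hypothetical samples; the real markup may look quite different.
my $clean  = '<td>Title</td><td>Event text</td>';
my $nested = '<td>Title</td><td>An <i>important</i> event</td>';

# Capture each run of text that sits between a '>' and the next '<'.
my @clean_fields  = $clean  =~ />\s*([^<>]+)\s*</g;
my @nested_fields = $nested =~ />\s*([^<>]+)\s*</g;

print scalar(@clean_fields),  " fields: ", join(' | ', @clean_fields),  "\n";
print scalar(@nested_fields), " fields: ", join(' | ', @nested_fields), "\n";
```

The first string yields two fields, but the second yields four, because the <i> tags split "An important event" into three pieces.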
HTH,
Rob