Hello,
I am new to Perl, and I'm trying to use the HTML::TokeParser module to
extract text from an HTML file. Here's the Perl code:
use HTML::TokeParser;

my $p = HTML::TokeParser->new("test.htm");
while ($p->get_tag('b')) {
    print $p->get_text(), "\n";
}
This seems to work only for bold tags that are not nested with other tags.
For the following HTML:
<html>
<body>
<h1>Head 1</h1>
<b>Bolded</b>
<p><b><u>Bolded and underlined</u></b></p>
<p>New line</p>
</body>
</html>
I get only "Bolded" printed, not "Bolded and underlined" as I expected.
What could be going on?
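For reference, here is a self-contained sketch of the same loop with the
HTML inlined, so it runs without test.htm. It rests on an assumption taken
from my reading of the HTML::TokeParser documentation: get_text() with no
argument returns only the text up to the next tag of any kind (here <u>),
while get_text('/b') collects text up to the closing </b>. The helper name
bold_texts is made up for this example.

```perl
use strict;
use warnings;
use HTML::TokeParser;

# The sample document, inlined so that no test.htm is needed.
my $html = <<'HTML';
<html>
<body>
<h1>Head 1</h1>
<b>Bolded</b>
<p><b><u>Bolded and underlined</u></b></p>
<p>New line</p>
</body>
</html>
HTML

# Hypothetical helper: collect the text of every <b> element,
# including text that sits inside nested tags such as <u>.
sub bold_texts {
    my ($doc) = @_;

    # A reference to a scalar is parsed as a literal document,
    # not treated as a file name.
    my $p = HTML::TokeParser->new(\$doc)
        or die "Cannot create parser";

    my @texts;
    while ($p->get_tag('b')) {
        # With an end tag as argument, get_text() reads up to that tag
        # instead of stopping at the first tag it meets.
        push @texts, $p->get_text('/b');
    }
    return @texts;
}

print "$_\n" for bold_texts($html);
```

If that assumption holds, the original loop would print an empty line for
the second <b>, because the token right after it is <u> rather than text.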
Thanks!