Re: HTML::TokeParser/Parsing problem

Ovid Mon, 25 Nov 2002 09:06:57 -0800

--- sulfericacid <[EMAIL PROTECTED]> wrote:
> use HTML::TokeParser;
> my $p = HTML::TokeParser->new(\$content);
> 
> my %meta;
> while (my $token = $p->get_token) {
>         next unless $token->[1] eq 'meta' && $token->[0] eq 'S';
>         $meta{$token->[2]->{name}} = $token->[2]{content};
> }
> 
> print "$_: $meta{$_}<br>\n" foreach (keys %meta);


You might also find HTML::TokeParser::Simple a bit easier to use.  The
above can be written as:

  use HTML::TokeParser::Simple;
  my $parser = HTML::TokeParser::Simple->new( \$content );

  my %meta;
  while (my $token = $parser->get_token ) {
    next unless $token->is_start_tag( 'meta' );
    my $attr = $token->return_attr;

    # skip meta tags without a name attribute (content-type or refreshes)
    if ( exists $attr->{name} ) {
      $meta{ $attr->{name} } = $attr->{content};
    }
  }

Note that there is a bit more error checking in that version.  I assume
that you did not want to deal with meta tags that replace headers
(http-equivs and refreshes).
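
To see what that check buys you, here is a small self-contained sketch.  The
sample $content is made up for illustration (it is not from the original
post), and I am assuming the http-equiv tag carries no name attribute, which
is the usual case.  It should collect only the keywords and description
entries:

```perl
use strict;
use warnings;
use HTML::TokeParser::Simple;

# Hypothetical sample document, invented for this example.
my $content = <<'HTML';
<html><head>
<meta http-equiv="Content-Type" content="text/html; charset=iso-8859-1">
<meta name="keywords" content="perl, html, parsing">
<meta name="description" content="A sample page">
</head><body></body></html>
HTML

my $parser = HTML::TokeParser::Simple->new( \$content );

my %meta;
while ( my $token = $parser->get_token ) {
    next unless $token->is_start_tag('meta');
    my $attr = $token->return_attr;

    # The http-equiv tag has no name attribute, so it is skipped here
    # instead of being stored under an undefined key.
    next unless exists $attr->{name};
    $meta{ $attr->{name} } = $attr->{content};
}

print "$_: $meta{$_}\n" for sort keys %meta;
```

Without the exists test, the http-equiv tag would end up stored under an
undefined key (with a warning if warnings are enabled).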

The nice thing about HTML::TokeParser::Simple is that it is a drop-in
replacement.  If you use HTML::TokeParser extensively in your code, you
can switch to the Simple version and change only the bits you need to.
The rest of the code should still work fine.
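
As a rough sketch of what "drop-in" means here: the tokens you get back are
still array references laid out the way HTML::TokeParser lays them out, so
the old index-based tests and the new method calls can live side by side.
The sample $content and the two hash names are made up for illustration:

```perl
use strict;
use warnings;
use HTML::TokeParser::Simple;

# Hypothetical sample input, invented for this example.
my $content = '<meta name="author" content="someone"><p>text</p>';

my $p = HTML::TokeParser::Simple->new( \$content );

my ( %old_style, %new_style );
while ( my $token = $p->get_token ) {

    # Existing HTML::TokeParser-style code: index into the token array.
    if ( $token->[0] eq 'S' && $token->[1] eq 'meta' ) {
        $old_style{ $token->[2]{name} } = $token->[2]{content};
    }

    # The same test written with the Simple methods.
    if ( $token->is_start_tag('meta') ) {
        my $attr = $token->return_attr;
        $new_style{ $attr->{name} } = $attr->{content};
    }
}

print "old: $old_style{author}, new: $new_style{author}\n";
```

Both hashes should end up with the same contents, which is why you can
convert a large script one loop at a time.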

Cheers,
Ovid

=====
"Ovid" on http://www.perlmonks.org/
Web Programming with Perl:  http://users.easystreet.com/ovid/cgi_course/
Silence Is Evil: http://users.easystreet.com/ovid/philosophy/decency.txt
