Small modification below (see the commented-out line in start_tag).
> -----Original Message-----
> From: Toby Stuart [mailto:[EMAIL PROTECTED]
> Sent: Thursday, February 19, 2004 2:37 PM
> To: 'Christian Wattengård'; [EMAIL PROTECTED]
> Subject: RE: Extracting data from html structure.
>
>
> > -----Original Message-----
> > From: Christian Wattengård [mailto:[EMAIL PROTECTED]
> > Sent: Wednesday, February 18, 2004 4:41 AM
> > To: [EMAIL PROTECTED]
> > Subject: Extracting data from html structure.
> >
> >
> > I have the following html structure:
> > ------------------------------------------------------------------------
>
> [long HTML snipped]
>
> > ------------------------------------------------------------------------
> > And I want to extract from it the checkbox values and their respective
> > channel names (contained in the link beside the checkbox).
> > I have checked a lot of modules on CPAN but I haven't found one that does
> > it just the way I want it to yet. Actually, I haven't found any that I can
> > get to work at all.
> >
> > Any tips?
> >
> > Christian...
> >
>
> I snipped the HTML you provided cause it was sooooo long. Try and trim it
> down next time.
> Anyhow, I think the code below does what you want.
>
>
> use strict;
> use warnings;
>
> use HTML::Parser;
>
>
> my $HTML = <<EOF;
> <table border=0 cellpadding=0 cellspacing=0 width=156>
> <tr>
> <td colspan=2 bgcolor=#CDC9C0><b><font
> face=verdana,arial,helvetica,sans-serif size=-2
> color=#666666> Norske</font></b></td>
> </tr>
> <tr>
> <td width=78 valign=top><font class=link-00-ul-l size=1>
> <input type="checkbox" name=kanal_id[] value=1 CHECKED>
> <a href="index.html?kanal_id=1&dag=0&fra_tid=0&til_tid=24&kategori_id=">NRK 1</a><br>
> <input type="checkbox" name=kanal_id[] value=3 >
> <a href="index.html?kanal_id=3&dag=0&fra_tid=0&til_tid=24&kategori_id=">TV 2</a><br>
> <input type="checkbox" name=kanal_id[] value=5 >
> <a href="index.html?kanal_id=5&dag=0&fra_tid=0&til_tid=24&kategori_id=">TVNorge
> </a><br>
> </font></td>
> </tr>
> </table>
> EOF
>
>
>
> my $current_tag; # i'm not happy with using this.
> # is there a better way? anyone?
>
> my $p = HTML::Parser->new(
> api_version => 3,
> start_h => [ \&start_tag, 'tagname,attr' ],
> text_h => [ \&text, 'text' ]
> );
>
> $p->parse($HTML);
> $p->eof;
>
> sub start_tag
> {
> my $name = shift;
> my $attrs = shift;
# my $text = shift; # removed
>
> $current_tag = $name;
>
> if ($name eq 'input' and ($attrs->{'type'} || '') eq 'checkbox')
> {
> print $attrs->{'value'}, "=";
> }
> }
>
> sub text
> {
> my $text = shift;
> if (defined $current_tag and $current_tag eq 'a')
> {
> print "$text\n";
> }
>
> }
>
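On the "is there a better way?" question about $current_tag: one option is to keep the state in a closure and add an end_h handler, so the parser knows exactly when the link closes. The sketch below is untested against the full page. The sub name extract_channels and the [value, name] return shape are my own invention, not anything HTML::Parser provides. It assumes every checkbox is followed by its link before the next checkbox.

```perl
use strict;
use warnings;
use HTML::Parser;

# Hypothetical helper (not from the original post): returns a list of
# [value, channel name] pairs, one per checkbox that is followed by a link.
sub extract_channels {
    my ($html) = @_;
    my (@channels, $value, $in_link);
    my $name = '';

    my $p = HTML::Parser->new(
        api_version => 3,
        start_h => [ sub {
            my ($tag, $attr) = @_;
            if ($tag eq 'input' and lc($attr->{type} || '') eq 'checkbox') {
                $value = $attr->{value};   # remember until the link closes
            }
            elsif ($tag eq 'a' and defined $value) {
                $in_link = 1;              # start collecting link text
                $name    = '';
            }
        }, 'tagname,attr' ],
        # Text may arrive in several chunks, so append rather than assign.
        text_h => [ sub { $name .= $_[0] if $in_link }, 'dtext' ],
        end_h => [ sub {
            my ($tag) = @_;
            return unless $tag eq 'a' and $in_link;
            $name =~ s/\s+/ /g;            # collapse wrapped whitespace
            $name =~ s/^ | $//g;           # trim both ends
            push @channels, [ $value, $name ];
            ($value, $in_link) = (undef, 0);
        }, 'tagname' ],
    );
    $p->parse($html);
    $p->eof;
    return @channels;
}

my $sample = <<'EOF';
<input type="checkbox" name=kanal_id[] value=1 CHECKED>
<a href="index.html?kanal_id=1">NRK
1</a><br>
<input type="checkbox" name=kanal_id[] value=3 >
<a href="index.html?kanal_id=3">TV 2</a><br>
EOF

print "$_->[0]=$_->[1]\n" for extract_channels($sample);
```

Returning the pairs instead of printing inside the handlers also makes it easier to feed the result into whatever comes next.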
>
>
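If the callback style feels awkward, HTML::TokeParser (it ships in the same distribution as HTML::Parser) lets you pull tags in a plain loop, with no shared state variable at all. This is only a sketch: channels_by_value is a made-up name, and it assumes the link for a checkbox is the next <a> in the document. If a checkbox has no link, it would pick up a later one instead.

```perl
use strict;
use warnings;
use HTML::TokeParser;

# Hypothetical helper: maps each checkbox value to the text of the next link.
sub channels_by_value {
    my ($html) = @_;
    my $p = HTML::TokeParser->new(\$html)
        or die "Cannot create parser";
    my %channel;
    while (my $input = $p->get_tag('input')) {
        my $attr = $input->[1];               # hash ref of attributes
        next unless lc($attr->{type} || '') eq 'checkbox';
        next unless defined $attr->{value};
        $p->get_tag('a') or last;             # skip ahead to the link
        # get_trimmed_text collapses whitespace and trims the ends.
        $channel{ $attr->{value} } = $p->get_trimmed_text('/a');
    }
    return %channel;
}

my $sample = <<'EOF';
<input type="checkbox" name=kanal_id[] value=1 CHECKED>
<a href="index.html?kanal_id=1">NRK
1</a><br>
<input type="checkbox" name=kanal_id[] value=3 >
<a href="index.html?kanal_id=3">TV 2</a><br>
EOF

my %channel = channels_by_value($sample);
print "$_=$channel{$_}\n" for sort { $a <=> $b } keys %channel;
```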
> --
> To unsubscribe, e-mail: [EMAIL PROTECTED]
> For additional commands, e-mail: [EMAIL PROTECTED]
> <http://learn.perl.org/> <http://learn.perl.org/first-response>
>
>