--- Rasoul Hajikhani <[EMAIL PROTECTED]> wrote:
> Hi there,
> I am trying to match an expression that would perform different tasks
> depending on the returned value:
> 
>       #if (arguments begin with "<A HREF=")
>       if ($args =~ /^\<A HREF=.*/i)
>       {
>               # do this
>       }
>       else
>       {
>               # do this
>       }
> 
> but it always fails to return any thing. Can some one tell me what am I
> doing wrong? Appreciate all the help...
> -r

Parsing HTML with a regular expression is difficult and error-prone, and I
would strongly recommend against it.  The following snippet only works for
a very small test case:

    foreach my $args ( <DATA> ) {
        if ($args =~ /^<\s*a\s*href\s*=/i) {
            print "HREF: $args";
        } else  {
            print "Not an HREF: $args";
        }
    }
    __DATA__
    <a href="test.cgi">
    <a hREf = "something_else.htm">
    <a name="bob">
    <a    href   =   '#bob'>

Knowing how your data gets into the system is at least as important as
knowing how it leaves the system.  Knowing your data source allows you to
craft a better solution to the problem.  For example, consider your regex:

    /^\<A HREF=.*/i

What is the source of the data?  Is it generated by another process, or
could humans affect it?  There are several places where you can insert
whitespace into that anchor tag, still have valid HTML, and cause your
regex to fail.  Here's an example which will break your code *and* mine,
because the tag is spread over several lines:

    <a
     href=
     "somefile.html"
    >

That's annoying, but some of the documents I get have HTML formatted like
that.  Also, you don't need the dot star (.*) at the end.  You don't use
what it matches, and forcing the regex engine to match it is wasteful.
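
To illustrate the point about the trailing dot star, here is a minimal
sketch comparing the two patterns.  The helper names are made up for this
example; both subs should give the same yes/no answer for any input, since
.* can always match zero or more characters:

```perl
use strict;
use warnings;

# Hypothetical helpers, named only for this illustration.
# Returns 1 if the string starts with "<A HREF=" (case-insensitive), else 0.
sub matches_with_dot_star    { return $_[0] =~ /^<A HREF=.*/i ? 1 : 0 }
sub matches_without_dot_star { return $_[0] =~ /^<A HREF=/i   ? 1 : 0 }

for my $line ('<A HREF="test.cgi">', '<a name="bob">') {
    printf "%-22s with: %d  without: %d\n", $line,
        matches_with_dot_star($line), matches_without_dot_star($line);
}
```

The shorter pattern is enough when all you need is a true/false test.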

I would recommend learning to use HTML::TokeParser or a similar module to
parse HTML.  If you are only extracting links, try HTML::LinkExtor.
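
As a rough sketch of the parser approach (not a complete solution), the
following uses HTML::TokeParser to pull the href out of each anchor tag.
The sub name extract_hrefs and the sample HTML are assumptions made up for
this example, and it assumes the HTML::Parser distribution is installed:

```perl
use strict;
use warnings;
use HTML::TokeParser;

# Hypothetical helper: returns the href value of every <a> tag that has one.
sub extract_hrefs {
    my ($html) = @_;
    my $parser = HTML::TokeParser->new(\$html)
        or die "Cannot create parser";
    my @hrefs;
    # get_tag('a') skips ahead to the next <a ...> start tag.
    while (my $tag = $parser->get_tag('a')) {
        my $attr = $tag->[1];    # hash ref of the tag's attributes
        push @hrefs, $attr->{href} if defined $attr->{href};
    }
    return @hrefs;
}

# Sample input, including the multi-line anchor that defeats the regexes.
my $html = <<'END_HTML';
<a
 href=
 "somefile.html"
>
<a name="bob">
<a    href   =   '#bob'>
END_HTML

print "$_\n" for extract_hrefs($html);
```

Because the parser works on the whole document rather than one line at a
time, whitespace and line breaks inside the tag should not matter here.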

Cheers,
Curtis "Ovid" Poe

=====
Senior Programmer
Onsite! Technology (http://www.onsitetech.com/)
"Ovid" on http://www.perlmonks.org/
