Hi.  I'm still very much a Perl beginner, but I'm reading "Mastering
Regular Expressions" by Friedl, so I expect to be able to at least help the
group out with regexes when I'm a few more chapters in.  ;o)

I've just created a script which, when run like this:

tags.pl filename.html

produces results like this:

#snipped for brevity
<title>, 1 occurrences
<td>, 81 occurrences
</form>, 1 occurrences
</tr>, 23 occurrences
</font>, 10 occurrences
</title>, 1 occurrences
</strong>, 1 occurrences
<applet>, 1 occurrences
<script>, 6 occurrences
<hr>, 2 occurrences
<h3>, 1 occurrences
<img>, 3 occurrences
<h1>, 1 occurrences
<body>, 1 occurrences
</td>, 81 occurrences
<head>, 1 occurrences
</option>, 10 occurrences



Is there an easy way to have this print out the matching opening and
closing tags on one line:
Example:  <td> 81, </td> 81

If not, is there an easy way to sort the hash before printing, either by
value (so that <td> and </td> will hopefully not be too far apart in the
output) or by key, disregarding the optional "/"?
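
For illustration, here is a minimal sketch of the two sort orders described
above. It is only one possible approach, not a definitive answer. The sample
hash and the helper bare_name() are hypothetical stand-ins (the real script
fills %files from the input file), and the tie-break on length simply relies
on "td" being one character shorter than "/td".

```perl
#!/usr/bin/perl
use warnings;
use strict;

# Hypothetical sample data standing in for the %files hash built by tags.pl.
my %files = ('td' => 81, '/td' => 81, 'title' => 1, '/title' => 1, 'hr' => 2);

# Hypothetical helper: return the tag name without its optional leading "/".
sub bare_name {
   my ($tag) = @_;
   (my $name = $tag) =~ s{^/}{};
   return $name;
}

# Sort by key, disregarding the "/". When the names tie, the shorter key
# (the opening tag) comes first.
my @by_name = sort {
   bare_name($a) cmp bare_name($b)
      or length($a) <=> length($b)
} keys %files;

# Sort by value, highest count first, falling back to the name order above.
my @by_count = sort {
   $files{$b} <=> $files{$a}
      or bare_name($a) cmp bare_name($b)
      or length($a) <=> length($b)
} keys %files;

print "By name:\n";
print "<$_>, $files{$_}\n" foreach @by_name;

print "By count:\n";
print "<$_>, $files{$_}\n" foreach @by_count;
```

Sorting the keys rather than the hash itself is the usual idiom, since a Perl
hash has no defined order of its own.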

As always, if my beginner script is "functional but less-than-elegant" in
any regard, please feel free to educate me.

Thank you,
Shawn


tags.pl:
=======================================================================================================================
#!/usr/bin/perl
use warnings;
use strict;

my %files;
my $inFile = $ARGV[0];
my @tags;

open (IN, "<$inFile") || die "It blew up:\n$!\nCould not open file $inFile.\n\n";

while (<IN>){


   @tags = split(">");

   foreach (@tags){

      $_ = "$_>";
      #table tags only
      #$files{lc $1} += 1 if $_ =~ /<(\/?t(d|r|able))[^>]*>/gi;

      #all tags
      $files{lc $1} += 1 if $_ =~ /<(\/?\w+)[^>]*>/gi;
   }

}

my @values = values %files;
my @keys = keys %files;

while (@keys){
   print "<" . pop(@keys) . ">, " , pop(@values), " occurrences\n";
}

=======================================================================================================================
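
As a rough sketch of the "opening and closing tags on one line" output asked
about earlier: one option is to collect the distinct tag names with the "/"
stripped, then look up both counts for each name. The sample hash and the
names %names and @lines are assumptions for illustration only. In the real
script, %files would already be populated by the loop above.

```perl
#!/usr/bin/perl
use warnings;
use strict;

# Hypothetical sample counts standing in for the %files hash built by tags.pl.
my %files = ('td' => 81, '/td' => 81, 'img' => 3, 'title' => 1, '/title' => 1);

# Collect the distinct tag names with any leading "/" removed.
my %names;
foreach my $tag (keys %files) {
   (my $name = $tag) =~ s{^/}{};
   $names{$name} = 1;
}

# Build one output line per tag name; a side that never appeared counts as 0.
my @lines;
foreach my $name (sort keys %names) {
   my $open  = $files{$name}    || 0;
   my $close = $files{"/$name"} || 0;
   push @lines, "<$name> $open, </$name> $close";
}

print "$_\n" foreach @lines;
```

Tags that are never closed, such as <img>, show up with a closing count of 0,
which also makes unbalanced tags easy to spot.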



