Hi,
I'm having trouble finding an easy way to remove lines from a data file using
a regular expression. I can do it by reading the file in
a line at a time and deciding whether to discard each line or write it out. My data
looks something like this:
ENQ:SIMS RE:ELLIOTT,DONALD
ELLIOTT,DONALD,LAWRENCE
- DOB 1963SEP30 SEX:M
223 OREGAN CR SCORE:27
BUSINESS: 306-975-8315
RELATED EVENTS
GO0024158 1997APR18 COMPLAINANT CRIMINAL ACTIVIT
<<TAG:REPORT:GO 1997 0024158>>
GO0006897 1987FEB26 REG OWNER SEIZED VEHICLES
<<TAG:REPORT:GO 1987 0006897>>
AC0040436 2002MAY21 REG OWNER FAIL TO ST/REMAI
<<TAG:REPORT:AC 2002 0040436>>
AC0000072 1994JAN04 DRIVER IN NON FAT INJ ACC
<<TAG:REPORT:AC 1994 0000072>>
----------------------------------------------------------------------
<<TAG:REPORT:DATA Complicated multi-line
tags are possible. This really complicates
my parsing >>
MORE MATCHING PERSONS ON FILE
What I need to do is remove all of the 'tags' from the file.
My best attempt so far has been
$file_with_no_tags =~ s/<<TAG:.+>>//sig;
which removes everything from the first '<<TAG:' to the last '>>'.
Is there a better way? (Actually, any way that works would be better.)
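For what it's worth, here is a minimal sketch of one possible fix, assuming the whole file has already been read into $file_with_no_tags. The '.+' in the attempt above is greedy, so it runs to the last '>>'; writing '.+?' makes it stop at the first '>>' after each '<<TAG:'. The sample data below is a cut-down stand-in for the real file, and the trailing '\n?' (which also drops the line break left behind by each tag) is an assumption about the output wanted.

```perl
use strict;
use warnings;

# Hypothetical sample standing in for the slurped file contents.
my $file_with_no_tags = join "\n",
    'GO0024158 1997APR18 COMPLAINANT CRIMINAL ACTIVIT',
    '<<TAG:REPORT:GO 1997 0024158>>',
    'GO0006897 1987FEB26 REG OWNER SEIZED VEHICLES',
    '<<TAG:REPORT:DATA Complicated multi-line',
    'tags are possible. This really complicates',
    'my parsing >>',
    'MORE MATCHING PERSONS ON FILE',
    '';

# .+? is non-greedy: it stops at the first '>>' after each '<<TAG:'.
# /s lets '.' match newlines, so multi-line tags are covered too.
# \n? removes the line break that followed the tag (an assumption
# about the desired output; drop it to keep the blank lines).
$file_with_no_tags =~ s/<<TAG:.+?>>\n?//sig;

print $file_with_no_tags;
```

With the sample above, only the two GO lines and the MORE MATCHING PERSONS line should remain.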
At a different point in my program I need to collect all of the tags.
This is the code I use for that:
my %tag_hash;
my @lines = split /\n/,$src;
my ($in_tag, $long_tag);
$in_tag = 'FALSE';
foreach my $line (@lines) {
    if ($line =~ /<<TAG.+>>/ims) { # tag is contained in one line
        my ($label,$tagname,$tagval) = split /:/,$line,3;
        chop $tagval; #remove trailing >
        chop $tagval; #remove trailing >
        $tag_hash{$tagname} = $tagval;
    }
    elsif ($line =~ /<<TAG/i) { # start of a multi-line tag
        $in_tag = 'TRUE';
        $long_tag = $line;
    }
    elsif ($in_tag eq 'TRUE' and $line =~ />>/i) { # end of a multi-line tag
        $in_tag = 'FALSE';
        $long_tag = "$long_tag\n$line";
        my ($label,$tagname,$tagval) = split /:/,$long_tag,3;
        chop $tagval; #remove trailing >
        chop $tagval; #remove trailing >
        $tag_hash{$tagname} = $tagval;
    }
    elsif ($in_tag eq 'TRUE') { # middle of a multi-line tag
        $long_tag = "$long_tag\n$line";
    }
}
This strikes me as being a little long for something this simple in Perl.
Can anyone point me in a better/shorter/more easily understood direction?
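One shorter direction, as a rough sketch only: a single global match over the whole string can pick up single-line and multi-line tags alike, so the line-by-line state tracking goes away. The sample $src below is a made-up excerpt of the data. Note one deliberate difference from the loop above: since every tag in the sample is named REPORT, this sketch stores a list of values per tag name instead of overwriting; that is an assumption about what is wanted, not something the original code does.

```perl
use strict;
use warnings;

# Hypothetical sample standing in for the real $src.
my $src = join "\n",
    'GO0024158 1997APR18 COMPLAINANT CRIMINAL ACTIVIT',
    '<<TAG:REPORT:GO 1997 0024158>>',
    'GO0006897 1987FEB26 REG OWNER SEIZED VEHICLES',
    '<<TAG:REPORT:GO 1987 0006897>>',
    '<<TAG:REPORT:DATA Complicated multi-line',
    'tags are possible. This really complicates',
    'my parsing >>',
    'MORE MATCHING PERSONS ON FILE',
    '';

my %tag_hash;

# In scalar context, /g resumes after the previous match on each pass.
# $1 is the tag name (everything up to the next ':'),
# $2 is the value (non-greedy, up to the first '>>'); /s lets '.'
# cross newlines so multi-line tags are captured whole.
while ($src =~ /<<TAG:([^:>]+):(.*?)>>/sig) {
    my ($tagname, $tagval) = ($1, $2);
    push @{ $tag_hash{$tagname} }, $tagval;   # keep every value
}

for my $name (sort keys %tag_hash) {
    print "$name => [$_]\n" for @{ $tag_hash{$name} };
}
```

If only the last value per name is needed, as in the original loop, the push line can be replaced with a plain assignment to $tag_hash{$tagname}.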
Don