[EMAIL PROTECTED] wrote:
>
> hi,
Hello,
> I'm pasting some more of the lines I need to parse. I guess
> I'm just learning regex, and especially learning how to
> ask the correct questions! heh, don't ask about regex without
> showing enough of the stuff you want to parse :)
>
> there are these three basic entries (skakkebaek is spelled awfully; only "skak"
> matches them all :-) :
>
> Hits Cited Author Cited Work Volume Page Year
>
>______________________________________________________________________________________________
>
> [_][667] 279 ...Skakkebaek NE ENVIRON HEALTH PERSP 104 741 1996
> [_] 1 SKAKKEBAEK NE EARLY DETECTION TEST 26 1981
> [_] 3 SKAKKEBAEK NE EARLY DETECTION TEST 1981
>
> then there are these freaks:
> this one contains NE in the name and NE in GENE, so GENE is truncated if care is not
> taken.
>
> [_][718] 18 ...Skakkebaek NE GENE CHROMOSOME CANC 20 412 1997
>
> here the journal name starts with 7
>
> [_] 3 SKAKKEBAEK NE 7 WORLD C FERT STER 1971
>
> here the journal name ends with a digit, thus entangling it with the following page
> number 101.
>
> [_] 1 SKAKKEBAEK NE ENV HLTH PERSPECT S2 101 1 1993
>
> here is my matching routine - works with all but the last freak!:
>
> [snip code]
This works with the data above:
use strict;
use warnings;

while ( <DATA> ) {
    chomp;
    # Columns are separated by two or more whitespace characters.
    my @field = split /\s{2,}/;
    # Drop the leading "[_]" or "[_][667]" field.
    shift @field if $field[0] =~ /]$/;
    # The hit count is the number at the end of the next field.
    (my $citations) = (shift @field) =~ /(\d+)$/;
    # Drop the author field.
    shift @field if $field[0] =~ /skak.*ne$/i;
    my $journal = shift @field;
    # Take the remaining columns from the right; missing ones become ''.
    my $year = pop @field || '';
    my $page = pop @field || '';
    my $volume = pop @field || '';
    print "Citation: $citations\nJournal: $journal\nVolume: $volume\nPage: $page\nYear: $year\n\n";
}
__DATA__
[_][667]  279  ...Skakkebaek NE  ENVIRON HEALTH PERSP  104  741  1996
[_]  1  SKAKKEBAEK NE  EARLY DETECTION TEST  26  1981
[_]  3  SKAKKEBAEK NE  EARLY DETECTION TEST  1981
[_][718]  18  ...Skakkebaek NE  GENE CHROMOSOME CANC  20  412  1997
[_]  3  SKAKKEBAEK NE  7 WORLD C FERT STER  1971
[_]  1  SKAKKEBAEK NE  ENV HLTH PERSPECT S2  101  1  1993
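
The reason this copes with the last entry is that split /\s{2,}/ breaks only on runs of two or more whitespace characters, so the single spaces inside "ENV HLTH PERSPECT S2" stay within one field and the trailing S2 is never confused with the numbers after it. Here is a minimal sketch of just that step. It assumes the columns in your real data are separated by at least two spaces; the sample line and its spacing are my own guess at the layout, not taken from your file.

```perl
use strict;
use warnings;

# Hypothetical sample line: assumes columns are separated by 2+ spaces,
# while words inside a column are separated by a single space.
my $line = '[_]  1  SKAKKEBAEK NE  ENV HLTH PERSPECT S2  101  1  1993';

# Split only on runs of two or more whitespace characters.
my @field = split /\s{2,}/, $line;

# Show each field with its index so the column boundaries are visible.
printf "%d: %s\n", $_, $field[$_] for 0 .. $#field;
```

Under that assumption the journal name comes back as a single field, followed by three separate numeric fields. If some of your lines have only one space between two columns, this approach will merge them and you would need something stricter.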
John
--
use Perl;
program
fulfillment