Pedro Soto wrote:
> Hi,
> I am trying to write script to retrieve info from a file that looks like:
>
> col1 col2 col3
> A 5 10
> A 5 10
> A 5 11
> A 6 8
> A 7 9
> B 5 8
> B 6 9
> what i need is to get for each (non redundant) value from column 1, the
> corresponding non redundant values from column 2 and 3. e.g:
> For A (col 1), I want 5 -10, 5-11 and 6-8. For B: 5-8 and 6-9.
> I wrote a script to get rid of the redundant values using hashes and
> subroutines and it worked. However I still need to compare the elements from
> col2 and col3 with other values. To do this I want to sort the data, but I
> am struggling to sort the hash. It prints what I want but only if ask it to
> print within the subroutine (line 29). I do not know how to return a hash
> with the sorted values. I hope someone could help me out with this. The code
> is below:
>
>
> #! usr/local/bin/perl
>
> use warnings;
> use strict;
> my %db_del;
> my %std_dup;
> open(IN,"file.csv") || die;
> while (<IN>) {
> my @temp=split/,/;
> push (@{$db_del{$temp[0]}}, $temp[1]."\t".$temp[2]);
> }
> &NONRE(%db_del,%std_dup);
>
> foreach my $e(%db_dup) {
> foreach my $l (@{$db_dup{$e}}) {
> print "$e,$l,$std_dup{$l}\n"; #does not print $std_dup{$l}
> }}
>
> ########sub##############
> sub NONRE {
> my %hash;
> my %seen;
> my @uniq;
> my %st;
> %hash = @_;
> foreach my $k (sort keys%hash) {
> foreach my $item(@{$hash{$k}}) {
> push(@uniq,$item) unless $seen{$item}++;
> }
> foreach my $item(@uniq) {
> my @stend =split/\t/,$item;
> $st{$stend[0]}= $stend[1];
> }
> @{$hash{$k}}= sort {$a <=> $b} keys%st;
> foreach my $f(keys%hash){
> foreach my $l(@{$hash{$f}}) {
> print "$f,$l,$st{$l} ok\n";# it prints OK
> }
> }
> }
> @uniq =();
> %seen =();
> return(%hash,%st);
> }
I think this doesn't do what you want, because the hash %st is keyed by the
values from column 2, so pairs like (5,10) and (5,11) cannot both exist in %st:
the second assignment overwrites the first. But you do pass in a hash called
%std_dup, so you may have meant to use that for something like this.
You can pass single hashes to and from subroutines as a simple list. So you
successfully passed in %db_del, for instance, but if you need to keep two or
more hashes separate you must pass them by reference.
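To illustrate the difference, here is a minimal sketch. The subroutine names
count_flat and count_by_ref and the sample hashes are made up for this example
and are not part of your script.

```perl
use strict;
use warnings;

# Hypothetical helper: receives two hashes passed as plain lists.
# Both are flattened into @_, so the boundary between them is lost.
sub count_flat {
    my %merged = @_;
    return scalar(keys %merged);
}

# Hypothetical helper: receives two hash references, which stay separate.
sub count_by_ref {
    my ($first, $second) = @_;
    return (scalar(keys %$first), scalar(keys %$second));
}

my %colours = (red => 1, green => 2);
my %sizes   = (small => 1, large => 2, huge => 3);

my $flat_count = count_flat(%colours, %sizes);
my ($n_colours, $n_sizes) = count_by_ref(\%colours, \%sizes);

print "flat: $flat_count keys in one merged hash\n";
print "by reference: $n_colours and $n_sizes keys\n";
```

In the flat call the subroutine cannot tell where %colours ends and %sizes
begins, while the references arrive as two separate scalars.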
Having said that, I don't see any reason to pass in %std_dup, as it seems to be
only a return value. Remember that Perl doesn't copy a subroutine's return
values back into the arguments of the call. It is possible to modify the
elements of the @_ array, which will alter the variables that were passed, but
that isn't recommended unless you know what you're doing. Collect the return
values from a subroutine with a simple assignment, like this:
my $return = subroutine($p1, $p2);
and if you need to pass back two hashes, you could write
return \%hash, \%st;
and then make the call like this:
{
  my ($r1, $r2) = NONRE(%db_del);
  %db_del = %$r1;
  %std_dup = %$r2;
}
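Putting those pieces together, a self-contained sketch of a subroutine that
returns two hashes by reference might look like this. The name dedupe and the
sample data are only assumptions for illustration; it is not a rewrite of your
NONRE.

```perl
use strict;
use warnings;

# Hypothetical sub: takes a reference to a hash of arrays and returns
# references to two new hashes: the de-duplicated lists (in first-seen
# order) and, per key, a count of how often each record appeared.
sub dedupe {
    my ($records) = @_;
    my (%unique, %count);
    foreach my $key (keys %$records) {
        my %seen;
        $unique{$key} = [ grep { !$seen{$_}++ } @{ $records->{$key} } ];
        $count{$key}  = \%seen;   # %seen is a fresh hash on every iteration
    }
    return \%unique, \%count;
}

my %data = (A => ['5,10', '5,10', '5,11'], B => ['5,8']);
my ($unique, $count) = dedupe(\%data);

my $times = $count->{A}{'5,10'};
print "A: @{ $unique->{A} }\n";
print "5,10 seen $times times\n";
```

The caller can either keep working with the references or copy them into
ordinary hashes with %$unique and %$count.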
Finally, the program below does what I think you want (removes duplicate records
and prints the rest in sorted order) but you haven't said enough to be sure.
HTH,
Rob
use strict;
use warnings;

my %db_del;

open IN, '<', 'file.csv' or die $!;
while (<IN>) {
  chomp;
  my ($key, $f1, $f2) = split /,/;
  # Keying on the ($f1, $f2) pair means a repeated record simply
  # overwrites itself, which removes the duplicates.
  $db_del{$key}{$f1,$f2} = [$f1, $f2];
}
close IN;

foreach my $key (sort keys %db_del) {
  # Sort numerically by column 2, then by column 3.
  my @vals = sort {
    $a->[0] <=> $b->[0] or $a->[1] <=> $b->[1]
  } values %{$db_del{$key}};
  foreach my $val (@vals) {
    print join ',', $key, @$val;
    print "\n";
  }
}
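
A note on the subscript {$f1,$f2} in the program above: Perl joins a list used
as a hash subscript into a single string key using the $; variable (the
subscript separator), so identical pairs always land on the same key. A small
sketch with made-up data, separate from the program:

```perl
use strict;
use warnings;

my %pairs;

# Each assignment uses a composite key; the repeated pair (5,10)
# is stored under the same key both times.
foreach my $record ([5, 10], [5, 10], [5, 11]) {
    my ($f1, $f2) = @$record;
    $pairs{$f1, $f2} = [$f1, $f2];
}

my $n_unique = keys %pairs;

# The composite key is the fields joined with $;
my $same_key = exists $pairs{ join($;, 5, 11) } ? 1 : 0;

print "$n_unique unique pairs\n";
```

This only works safely when the field values themselves cannot contain the
separator character, which holds for numeric columns like these.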