On Thu, May 31, 2012 at 11:37 AM, nathalie <[email protected]> wrote:
>
>
> Hi
> I have this format of file: (see attached example)
> 1 3206102-3207048 3411782-3411981 3660632-3661428
> 2 4481796-4482748 4483180-4483486
>
>
> and I would like to change it to this
> 1 3206102-3207048
> 1 3411782-3411981
> 1 3660632-3661428
> 2 4481796-4482748
> 2 4483180-4483486 .....
>
>
> I have tried with this script to create an array for each line, and to
> print the first element (1 or 2) with the rest of the line, but the output
> doesn't seem to be right. Could you please advise?
> #!/software/bin/perl
> use warnings;
> use strict;
> my $file="example.txt";
> my $in;
> open( $in , '<' , $file ) or die( $! );
> #open( $out, ">>txtout");
>
>
> while (<$in>){
> next if /^#/;
> my @lines=split(/\t/);
> chomp;
> for (@lines) { print $lines[0],"\t",$_,"\n"; };
> }
>
>
> output
> 1 1 i don't want this
> 1 3206102-3207048
> 1 3411782-3411981
> 1 3660632-3661428
> 1 i don't want this
> 1
> 1
> 1
> 1
> 1
> 1
> 1
> 1
> 1
> 1
> 1
> 1
> 1
> 1
> 1
> 1
>
> 1 1
> 1 4334680-4340171
> 1 4341990-4342161
> 1 4342282-4342905
> 1
> 1
> 1
> 1
> 1
> 1
> 1
> 1
> 1
> 1
> 1
> 1
> 1
> 1
> 1
> 1
> 1
>
> 1 1
> 1 4481796-4482748
> 1 4483180-4483486
> 1
> 1
> 1
> 1
> 1
> 1
> 1
> 1
> 1
> 1
> 1
> 1
> 1
> 1
> 1
> 1
> 1
> 1
>
> 1 1
> 1 4797994-4798062
> 1 4798535-4798566
> 1 4818664-4818729
> 1 4820348-4820395
> 1 4822391-4822461
> 1 4827081-4827154
> 1 4829467-4829568
> 1 4831036-4831212
> 1 4835043-4835096
>
> many thanks
> Nathalie
>
Hi Nathalie,
Instead of using the split function I would personally go for a regular
expression as it allows for a lot more control over what you want to find.
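A quick sketch of what I mean, on a made-up line with trailing tabs (I am assuming your file has them, judging by the empty fields in your output; the sample line is not taken from your attachment). split hands back every tab-separated field, including the row number and, because chomp has not run yet, the newline, while a global match returns only the ranges:

```perl
use strict;
use warnings;

# Hypothetical sample line: a row number, three ranges and trailing tabs.
# The trailing tabs are an assumption (e.g. left over from a spreadsheet export).
my $line = "1\t3206102-3207048\t3411782-3411981\t3660632-3661428\t\t\t\n";

# split before chomp: the row number, the empty fields and the
# newline-only last field all come back as list elements.
my @fields = split /\t/, $line;

# A global match in list context returns only the start-end ranges.
my @ranges = $line =~ /(\d+-\d+)/g;

print scalar(@fields), " fields from split\n";
print scalar(@ranges), " ranges from the regex\n";
```

On this sample line split yields 7 fields and the regex yields 3 ranges, which is why looping over the split result prints the row number against itself and against a run of empty fields.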
Here is my solution...
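#!/usr/local/bin/perl
use strict;
use warnings;
use Data::Dumper;

my %results;
open( my $fh, '<', 'temp.txt' ) or die "Cannot open temp.txt: $!";
while ( my $line = <$fh> ) {
    chomp $line;
    # The first field is the row number; it may have more than one digit.
    # Lines that do not start with a number (comments, blanks) are skipped.
    my ($rownum) = $line =~ /^(\d+)/ or next;
    # Collect every start-end range on the line, whatever its length.
    my @othernumbers = $line =~ /(\d+-\d+)/g;
    # Append rather than assign, in case a row number occurs on several lines.
    push @{ $results{$rownum} }, @othernumbers;
}
close $fh;

# Sort the keys so the dump comes out in a predictable order.
$Data::Dumper::Sortkeys = 1;
print Dumper \%results;

With the two sample lines from your mail in temp.txt, this should print:

$VAR1 = {
          '1' => [
                   '3206102-3207048',
                   '3411782-3411981',
                   '3660632-3661428'
                 ],
          '2' => [
                   '4481796-4482748',
                   '4483180-4483486'
                 ]
        };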
I believe this gives you the data you were after. Of course you could just
print the values directly, without the temporary variables, but I assume
that you want to do something more with the found values than just dump
them on your screen.
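If printing directly is all you need, a minimal sketch could look like the one below. It produces the one-pair-per-line layout from your mail. The helper name expand_line is something I made up for illustration, and the two demo lines are copied from your example rather than read from a file, so treat it as a starting point and not as tested against your real data:

```perl
#!/usr/local/bin/perl
use strict;
use warnings;

# Turn one input line into a list of "row<TAB>range" strings.
# expand_line is a hypothetical helper name, not part of the script above.
sub expand_line {
    my ($line) = @_;
    # Row number at the start of the line; return an empty list if there is none.
    my ($rownum) = $line =~ /^(\d+)/ or return;
    # Every start-end range found on the line.
    my @ranges = $line =~ /(\d+-\d+)/g;
    return map { "$rownum\t$_" } @ranges;
}

# Demo on the two sample lines from the original question.
for my $line ( "1\t3206102-3207048\t3411782-3411981\t3660632-3661428",
               "2\t4481796-4482748\t4483180-4483486" ) {
    print "$_\n" for expand_line($line);
}
```

To run it over a file, replace the demo loop with a while loop over the filehandle and call expand_line on each line read.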
Regards,
Rob