Hello Folks,
This is my first post here.
I am trying to emulate the Linux 'sort' command in Perl. I found the
following code on the Internet to sort a text file:
# cat sort.pl
my $column_number = 2; # Sorting by 3rd column since 0-origin based
my $prev = "";
for (
    map  { $_->[0] }
    sort { $a->[1] cmp $b->[1] }
    map  { [$_, (split)[$column_number]] }
    <>
) {
    print unless $_ eq $prev;
    $prev = $_;
}
Suppose I want to sort a text file with the following rows and
columns:
# cat test.out
jhvXgF U13GWt 3OvMCf VMkAWj
4ewejk pFnjd4 ie0hZF pPipQJ
4ewejk 4sqprx ie0hZF cqtexi
FT9mWp d4fgMB gvZRJU XRRu0N
hnzI2c GXAXWF 6xKH7A 3dLh18
When I sort it by the 3rd column using the 'sort' command, I get the
following output:
# sort -u -k 3 test.out
jhvXgF U13GWt 3OvMCf VMkAWj
hnzI2c GXAXWF 6xKH7A 3dLh18
FT9mWp d4fgMB gvZRJU XRRu0N
4ewejk 4sqprx ie0hZF cqtexi
4ewejk pFnjd4 ie0hZF pPipQJ
However, when I sort the same text file by the 3rd column using the Perl
code above, I get the following:
jhvXgF U13GWt 3OvMCf VMkAWj
hnzI2c GXAXWF 6xKH7A 3dLh18
FT9mWp d4fgMB gvZRJU XRRu0N
4ewejk pFnjd4 ie0hZF pPipQJ
4ewejk 4sqprx ie0hZF cqtexi
The difference is in the order of the last two rows, which is visible in
the 2nd column. I think the reason is that 'ie0hZF' appears twice in the
3rd column and the corresponding values in the 1st column are also the
same ('4ewejk'), so the discrepancy shows up in the 2nd column.
Can anybody help me fix the bug in the above code?
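One possible reading of the discrepancy: with 'sort -k 3' and no end position, the key runs from the 3rd field to the end of the line, so rows that tie on the 3rd column are ordered by the 4th ('cqtexi' before 'pPipQJ'). The Perl code compares the 3rd column only. The following is a minimal sketch under that assumption. The helper names key_from_column and sort_unique_from_column are hypothetical, the comparison is Perl's plain string 'cmp' rather than locale collation, and leading blanks inside fields are not modeled.

```perl
use strict;
use warnings;

# Hypothetical helper: the text from column $col (0-based) through the end
# of the line, i.e. the key "sort -k 3" compares when $col is 2.
sub key_from_column {
    my ($col, $line) = @_;
    my @fields = split ' ', $line;          # split on whitespace, drops "\n"
    return join ' ', @fields[ $col .. $#fields ];
}

# Hypothetical helper: sort lines by that key and, like -u, keep only the
# first line for each distinct key.
sub sort_unique_from_column {
    my ($col, @lines) = @_;
    my @pairs = sort { $a->[1] cmp $b->[1] }
                map  { [ $_, key_from_column($col, $_) ] } @lines;
    my %seen;
    return map { $_->[0] } grep { !$seen{ $_->[1] }++ } @pairs;
}

# The rows of test.out from the post; pass <> instead to read a file.
my @rows = (
    "jhvXgF U13GWt 3OvMCf VMkAWj\n",
    "4ewejk pFnjd4 ie0hZF pPipQJ\n",
    "4ewejk 4sqprx ie0hZF cqtexi\n",
    "FT9mWp d4fgMB gvZRJU XRRu0N\n",
    "hnzI2c GXAXWF 6xKH7A 3dLh18\n",
);

my @sorted = sort_unique_from_column(2, @rows);
print @sorted;
```

For these five rows the sketch prints the '4sqprx' row before the 'pFnjd4' row, the same order as the 'sort -u -k 3' output in the post. Note that duplicates are dropped by key here, whereas the original loop drops only identical adjacent lines.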
Also, as I am a beginner in Perl, I couldn't understand the code
completely, so once the bug is fixed, if someone could explain it line by
line, I would be grateful.
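As a reading aid, here is a sketch of the same sort.pl logic unrolled into named steps, one per stage of the original map/sort/map chain. The intermediate names (@pairs, @sorted, @ordered) are introduced only for illustration, and the rows are the ones from test.out so the example runs on its own. It keeps the original behaviour, including comparing the 3rd column only.

```perl
use strict;
use warnings;

my $column_number = 2;    # columns count from 0, so 2 is the 3rd column

# Same rows as test.out in the post; use <> here to read a file instead.
my @lines = (
    "jhvXgF U13GWt 3OvMCf VMkAWj\n",
    "4ewejk pFnjd4 ie0hZF pPipQJ\n",
    "4ewejk 4sqprx ie0hZF cqtexi\n",
    "FT9mWp d4fgMB gvZRJU XRRu0N\n",
    "hnzI2c GXAXWF 6xKH7A 3dLh18\n",
);

# Step 1 (the last "map" in the original): pair each line with its key.
# "split" with no arguments splits $_ on whitespace; [ ... ] builds a
# two-element array reference: [ whole line, 3rd column ].
my @pairs = map { [ $_, (split)[$column_number] ] } @lines;

# Step 2: order the pairs by comparing the keys as strings ("cmp").
# Only the 3rd column is compared, so ties are not broken by later columns.
my @sorted = sort { $a->[1] cmp $b->[1] } @pairs;

# Step 3 (the first "map"): throw the keys away, keeping only the lines.
my @ordered = map { $_->[0] } @sorted;

# Step 4 (the loop body): print each line unless it is identical to the
# line printed just before it, which removes adjacent duplicate lines.
my $prev = "";
for (@ordered) {
    print unless $_ eq $prev;
    $prev = $_;
}
```

This pair-up, sort, strip pattern is commonly called a Schwartzian transform; it computes each key once instead of re-splitting lines on every comparison.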
Cheers,
Parag