Konstantin Harasov created PIG-5038:
---------------------------------------

             Summary: Pig e2e test failed with Sort check failed (TEST: Limit_2)
                 Key: PIG-5038
                 URL: https://issues.apache.org/jira/browse/PIG-5038
             Project: Pig
          Issue Type: Bug
            Reporter: Konstantin Harasov
             Fix For: 0.17.0


{noformat}
error: Going to run sort check command: sort -cs -t     -k 1,3 
./out/pigtest/../..-1475241304-nightly.conf/Limit_2.out/out_original
/bin/sort: 
./out/pigtest/../..-1475241304-nightly.conf/Limit_2.out/out_original:27: 
disorder:       18
Sort check failed
INFO: TestDriver::runTestGroup() at 706:Test Limit_2 FAILED at 1475241624
Ending test Limit_2 at 1475241624
{noformat}


The test failed because of difference in sorting in Pig {{(ORDER BY $0,$1,$2)}} 
and {{sort -t  $'\t'-k 1,3}} in bash.
The problem is that empty fields are sorted/processed differently 
in Pig using {{ORDER BY}} and bash using {{sort}}.

See example for file studentnulltab10k.

*Pig*:

{code:linenumbers=true}
                
                
                
                0.12
                1.04
                1.15
                1.25
                1.27
                1.31
                1.59
                1.61
                1.62
                1.76
                1.95
                2.09
                2.35
                2.66
                3.04
                3.23
                3.31
                3.39
                3.46
                3.54
                3.65
                3.75
                3.97
        18      
        18      0.41
{code}

*bash: sort -t  $'\t'-k 1,3*

{code:linenumbers=true}
                
                
                
                0.12
                1.04
                1.15
                1.25
                1.27
                1.31
                1.59
                1.61
                1.62
                1.76
        18      
        18      0.41
        18      0.54
        18      1.78
        18      2.46
        18      2.54
        19      0.07
        19      0.27
        19      0.39
        19      2.27
        19      2.50
        19      2.60
        19      2.89
        19      3.87
                1.95
{code}


*bash: sort -t  $'\t'-k 1,2*

{code:linenumbers=true}
                
                
                
                0.12
                1.04
                1.15
                1.25
                1.27
                1.31
                1.59
                1.61
                1.62
                1.76
                1.95
                2.09
                2.35
                2.66
                3.04
                3.23
                3.31
                3.39
                3.46
                3.54
                3.65
                3.75
                3.97
        18      
        18      0.41
{code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

Reply via email to