[ https://issues.apache.org/jira/browse/PIG-5048?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15612202#comment-15612202 ]
Nandor Kollar commented on PIG-5048:
------------------------------------
Do we need this UnlimitedNullTuple class at all? The only place where it is
used is in POForEach:
{code}
if (inp.returnStatus == POStatus.STATUS_EOP) {
    if (parentPlan != null && parentPlan.endOfAllInput &&
            !endOfAllInputProcessed && endOfAllInputProcessing) {
        // continue pull one more output
        inp = new Result(POStatus.STATUS_OK, new UnlimitedNullTuple());
    } else {
        return inp;
    }
}
{code}
As far as I understand, this is used to allow a UDF to produce a last record
in close. Does close here mean the cleanup phase of map tasks? What if we used
RESULT_EMPTY from PhysicalOperator instead of an UnlimitedNullTuple with OK
status? The description of STATUS_NULL says 'This is represented as 'null'
with STATUS_OK', and it seems this is what we need instead of
UnlimitedNullTuple. [~daijy], could you please review my second patch and help
me understand why UnlimitedNullTuple was required? I'd like to add a test case
where the UDF produces a last record in close to ensure that my patch doesn't
break it, but I don't know when this happens.
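To make the question concrete, here is a minimal, self-contained sketch of the difference between handing the expression a placeholder tuple whose size() is unimplemented and handing it a plain null. None of this is Pig code: SketchTuple, PlaceholderTuple and EndOfInputSketch are hypothetical stand-ins, and the sketch assumes the expression checks for null before reading its input. Whether POForEach and its expression plans actually tolerate a null input here is exactly the open question.

```java
// Hypothetical stand-in for a tuple type; not org.apache.pig.data.Tuple.
interface SketchTuple {
    int size();
}

// Mimics a placeholder tuple whose size() is left unimplemented.
final class PlaceholderTuple implements SketchTuple {
    @Override
    public int size() {
        throw new UnsupportedOperationException("Unimplemented");
    }
}

final class EndOfInputSketch {
    // Stand-in for a UDTF-style expression that is first in the projection
    // and therefore reads its input directly.
    static String evaluate(SketchTuple input, boolean endOfAllInput) {
        if (input == null) {
            // Assumed behaviour: with no input, only flush once all input is consumed.
            return endOfAllInput ? "flushed" : "skipped";
        }
        // A non-null input is inspected, which calls size().
        return "fields=" + input.size();
    }

    // End-of-input signalled with a placeholder tuple.
    static String runWithPlaceholder() {
        try {
            return evaluate(new PlaceholderTuple(), true);
        } catch (UnsupportedOperationException e) {
            return "failed: " + e.getMessage();
        }
    }

    // End-of-input signalled with a plain null.
    static String runWithNull() {
        return evaluate(null, true);
    }

    public static void main(String[] args) {
        System.out.println(runWithPlaceholder()); // prints "failed: Unimplemented"
        System.out.println(runWithNull());        // prints "flushed"
    }
}
```

In this toy model the null variant works only because evaluate() checks for null up front; a real test case would need to confirm that every expression that can be first in the projection does the same.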
> HiveUDTF fail if it is the first expression in projection
> ---------------------------------------------------------
>
> Key: PIG-5048
> URL: https://issues.apache.org/jira/browse/PIG-5048
> Project: Pig
> Issue Type: Bug
> Components: impl
> Reporter: Daniel Dai
> Assignee: Nandor Kollar
> Fix For: 0.17.0, 0.16.1
>
> Attachments: PIG-5048.patch
>
>
> The following script fails:
> {code}
> define explode HiveUDTF('explode');
> A = load 'bag.txt' as (a0:{(b0:chararray)});
> B = foreach A generate explode(a0);
> dump B;
> {code}
> Message: Unimplemented at
> org.apache.pig.data.UnlimitedNullTuple.size(UnlimitedNullTuple.java:31)
> If it is not the first projection, the script passes:
> {code}
> define explode HiveUDTF('explode');
> A = load 'bag.txt' as (a0:{(b0:chararray)});
> B = foreach A generate a0, explode(a0);
> dump B;
> {code}
> Thanks [~nkollar] for reporting it!