[
https://issues.apache.org/jira/browse/PIG-4269?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14387849#comment-14387849
]
liyunzhang_intel commented on PIG-4269:
---------------------------------------
Hi [~mohitsabharwal]:
testAccumWithSort and testAccumAfterNestedOp fail in Spark mode but pass in
MR mode because POSort is not generated in the MR plan but is generated in
the Spark plan. Currently, we cannot remove POSort from the Spark plan because
we need POSort to generate SortConverter, which implements the sort feature in
Spark. The detailed call flow behind the error "Caught error from UDF:
org.apache.pig.test.utils.AccumulatorBagCount exec() should not be called. " is:
org.apache.pig.backend.hadoop.executionengine.util.AccumulatorOptimizerUtil#addAccumulatorSpark
org.apache.pig.backend.hadoop.executionengine.util.AccumulatorOptimizerUtil#check
If
org.apache.pig.backend.hadoop.executionengine.util.AccumulatorOptimizerUtil#check
meets POSort, it returns false, which makes
AccumulatorOptimizerUtil.java#foundUDF false, so
po_foreach#setAccumulative() (see
https://github.com/kellyzly/pig/blob/spark/src/org/apache/pig/backend/hadoop/executionengine/util/AccumulatorOptimizerUtil.java#L129)
is not executed and po_foreach#isAccumulative() returns false. When
po_foreach#isAccumulative() is false, org.apache.pig.EvalFunc#exec is
executed (see
https://github.com/kellyzly/pig/blob/spark/src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/expressionOperators/POUserFunc.java#L361).
org.apache.pig.test.utils.AccumulatorBagCount#exec() then throws the exception
"exec() should not be called.".
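
To make the flow above easier to follow, here is a minimal, self-contained
sketch. It is not Pig code: the class AccumulatorFlowSketch, the Op enum, and
the ForEach and BagCountUdf classes are hypothetical stand-ins, and check() is
reduced to the single rule described above (a sort operator in the inner plan
makes it return false). It only illustrates why a plan containing a sort ends
up calling exec() instead of the accumulative path.

```java
import java.util.Arrays;
import java.util.List;

public class AccumulatorFlowSketch {
    // Hypothetical stand-ins for physical operators; these are not Pig's classes.
    public enum Op { PROJECT, SORT, USER_FUNC }

    // Simplified stand-in for POForEach: it only tracks the accumulative flag.
    static class ForEach {
        private boolean accumulative = false;
        void setAccumulative() { accumulative = true; }
        boolean isAccumulative() { return accumulative; }
    }

    // Simplified stand-in for the test UDF: exec() must never be reached.
    static class BagCountUdf {
        private int count = 0;
        void accumulate(int n) { count += n; }
        int getValue() { return count; }
        int exec(int n) {
            throw new IllegalStateException("exec() should not be called.");
        }
    }

    // Assumed rule: any sort operator in the inner plan rejects the plan;
    // otherwise the plan qualifies only if it contains a UDF.
    static boolean check(List<Op> innerPlan) {
        for (Op op : innerPlan) {
            if (op == Op.SORT) {
                return false;
            }
        }
        return innerPlan.contains(Op.USER_FUNC);
    }

    // The foreach is switched to accumulative mode only when check() passes.
    static void addAccumulator(ForEach forEach, List<Op> innerPlan) {
        boolean foundUDF = check(innerPlan);
        if (foundUDF) {
            forEach.setAccumulative();
        }
    }

    // Accumulative mode uses accumulate()/getValue(); otherwise exec() is called.
    static int evaluate(ForEach forEach, BagCountUdf udf, int bagSize) {
        if (forEach.isAccumulative()) {
            udf.accumulate(bagSize);
            return udf.getValue();
        }
        return udf.exec(bagSize);
    }

    // Runs the whole flow for one inner plan and reports the outcome as text.
    public static String run(List<Op> innerPlan) {
        ForEach forEach = new ForEach();
        addAccumulator(forEach, innerPlan);
        try {
            return "count=" + evaluate(forEach, new BagCountUdf(), 3);
        } catch (IllegalStateException e) {
            return "error: " + e.getMessage();
        }
    }

    public static void main(String[] args) {
        // Plan without a sort (the MR case described above).
        System.out.println(run(Arrays.asList(Op.PROJECT, Op.USER_FUNC)));
        // Plan with a sort (the Spark case described above).
        System.out.println(run(Arrays.asList(Op.SORT, Op.USER_FUNC)));
    }
}
```

In this sketch the first plan prints "count=3" and the second prints
"error: exec() should not be called.", which mirrors the difference between
the MR plan and the Spark plan.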
> Enable unit test "TestAccumulator" for spark
> --------------------------------------------
>
> Key: PIG-4269
> URL: https://issues.apache.org/jira/browse/PIG-4269
> Project: Pig
> Issue Type: Sub-task
> Components: spark
> Reporter: liyunzhang_intel
> Assignee: liyunzhang_intel
> Fix For: spark-branch
>
> Attachments: PIG-4269.patch, PIG-4269_1.patch, PIG-4269_Jekins.png,
> TEST-org.apache.pig.test.TestAccumulator.txt
>
>
> error log is attached