Hi,
I wanted to test Pig's HiveUDF support, but the example from HiveUDTF fails
with the trunk version.
I tried the 'explode' HiveUDTF on a file containing a simple bag:
{(A), (B), (C), (D)}
When I ran this script (as the example in the HiveUDTF source code
suggests):
define explode HiveUDTF('explode');
A = load 'bag.txt' as (a0:{(b0:chararray)});
B = foreach A generate flatten(explode(a0));
dump B;
It failed with this error message:
java.lang.Exception: org.apache.pig.backend.executionengine.ExecException: ERROR 0: Exception while executing [POCast (Name: Cast[bag:{(chararray)}] - scope-2 Operator Key: scope-2) children: [[POProject (Name: Project[bytearray][0] - scope-1 Operator Key: scope-1) children: null at []]] at [a0[-1,-1]]]: java.lang.RuntimeException: Unimplemented
at org.apache.hadoop.mapred.LocalJobRunner$Job.runTasks(LocalJobRunner.java:489)
at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:549)
Caused by: org.apache.pig.backend.executionengine.ExecException: ERROR 0: Exception while executing [POCast (Name: Cast[bag:{(chararray)}] - scope-2 Operator Key: scope-2) children: [[POProject (Name: Project[bytearray][0] - scope-1 Operator Key: scope-1) children: null at []]] at [a0[-1,-1]]]: java.lang.RuntimeException: Unimplemented
at org.apache.pig.backend.hadoop.executionengine.physicalLayer.PhysicalOperator.getNext(PhysicalOperator.java:364)
at org.apache.pig.backend.hadoop.executionengine.physicalLayer.expressionOperators.POUserFunc.processInput(POUserFunc.java:216)
at org.apache.pig.backend.hadoop.executionengine.physicalLayer.expressionOperators.POUserFunc.getNext(POUserFunc.java:270)
at org.apache.pig.backend.hadoop.executionengine.physicalLayer.expressionOperators.POUserFunc.getNextDataBag(POUserFunc.java:370)
at org.apache.pig.backend.hadoop.executionengine.physicalLayer.PhysicalOperator.getNext(PhysicalOperator.java:335)
at org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.POForEach.processPlan(POForEach.java:404)
at org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.POForEach.getNextTuple(POForEach.java:321)
at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigGenericMapBase.runPipeline(PigGenericMapBase.java:280)
at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigGenericMapBase.cleanup(PigGenericMapBase.java:123)
at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:148)
at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:787)
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:341)
at org.apache.hadoop.mapred.LocalJobRunner$Job$MapTaskRunnable.run(LocalJobRunner.java:270)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
at java.util.concurrent.FutureTask.run(FutureTask.java:262)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:745)
*Caused by: java.lang.RuntimeException: Unimplemented*
at org.apache.pig.data.UnlimitedNullTuple.size(UnlimitedNullTuple.java:31)
at org.apache.pig.backend.hadoop.executionengine.physicalLayer.expressionOperators.POProject.getNext(POProject.java:165)
at org.apache.pig.backend.hadoop.executionengine.physicalLayer.expressionOperators.POProject.getNextDataByteArray(POProject.java:323)
at org.apache.pig.backend.hadoop.executionengine.physicalLayer.expressionOperators.POCast.getNextDataBag(POCast.java:1833)
at org.apache.pig.backend.hadoop.executionengine.physicalLayer.PhysicalOperator.getNext(PhysicalOperator.java:335)
... 17 more
It looks to me like this is an unimplemented feature, or am I missing
something?
Thanks,
Nandor