[
https://issues.apache.org/jira/browse/PIG-4815?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15515725#comment-15515725
]
liyunzhang_intel commented on PIG-4815:
---------------------------------------
[~szita]: I left some comments on Review Board. The Spark part LGTM apart from a
few code style problems; let's wait for Daniel or another committer to review
the MR part.
For 4.2: in MR mode, the test only checks whether the MR plan contains two
POLoads and whether one of them is temporary and the other is not. In Spark
mode, the Spark plan is not as complex as the MR plan, so it is OK to compare
the numbers of POForeach, POCast and POLoad operators with what we expect.
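To make the Spark-side check concrete, here is a minimal sketch of counting operators in an XML explain output. It assumes each physical operator is written as an XML element named after the operator; the class name PlanOperatorCount, the method countOperators and that element naming are hypothetical and not taken from the patch.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;

// Hypothetical helper, not part of Pig: counts how often an element name
// occurs anywhere in an XML plan dump.
final class PlanOperatorCount {
    static int countOperators(String xml, String tagName) throws Exception {
        // Parse the XML string into a DOM document.
        Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
        // getElementsByTagName searches the whole tree, not just direct children.
        return doc.getElementsByTagName(tagName).getLength();
    }
}
```

A test could then assert the expected count for each operator name it cares about, using whatever element names the XML printer actually emits.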
The Spark plan and the MR plan for
org.apache.pig.test.TestPigServer#testExplainXmlComplex look like this:
Spark plan:
{code}
Spark node scope-20
e: Store(fakefile:org.apache.pig.builtin.PigStorage) - scope-19
|
|---POJoinGroupSpark[tuple] - scope-15
|
|---d: New For Each(true)[bag] - scope-14
| |
| Project[bag][1] - scope-12
|
|---POJoinGroupSpark[tuple] - scope-8
|
|---b: New For Each(false,false,false)[bag] - scope-7
| |
| Project[bytearray][0] - scope-1
| |
| Project[bytearray][1] - scope-3
| |
| Project[bytearray][2] - scope-5
|
|---a: Load(hdfs://zly1.sh.intel.com:8020/user/root/SkewedJoinInput1.txt:org.apache.pig.builtin.PigStorage) - scope-0--------
{code}
MR plan:
{code}
#--------------------------------------------------
# Map Reduce Plan
#--------------------------------------------------
MapReduce node scope-20
Map Plan
c: Local Rearrange[tuple]{bytearray}(false) - scope-10
| |
| Project[bytearray][0] - scope-11
|
|---b: New For Each(false,false,false)[bag] - scope-7
| |
| Project[bytearray][0] - scope-1
| |
| Project[bytearray][1] - scope-3
| |
| Project[bytearray][2] - scope-5
|
|---a: Load(hdfs://zly1.sh.intel.com:8020/user/root/SkewedJoinInput1.txt:org.apache.pig.builtin.PigStorage) - scope-0--------
Reduce Plan
Store(hdfs://zly1.sh.intel.com:8020/tmp/temp1364619035/tmp-960829474:org.apache.pig.impl.io.InterStorage) - scope-21
|
|---d: New For Each(true)[bag] - scope-14
| |
| Project[bag][1] - scope-12
|
|---c: Package(Packager)[tuple]{bytearray} - scope-9--------
Global sort: false
----------------
MapReduce node scope-23
Map Plan
e: Local Rearrange[tuple]{bytearray}(false) - scope-17
| |
| Project[bytearray][2] - scope-18
|
|---Load(hdfs://zly1.sh.intel.com:8020/tmp/temp1364619035/tmp-960829474:org.apache.pig.impl.io.InterStorage) - scope-22--------
Reduce Plan
e: Store(fakefile:org.apache.pig.builtin.PigStorage) - scope-19
|
|---e: Package(Packager)[tuple]{bytearray} - scope-16--------
Global sort: false
{code}
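For comparison, the MR-side condition (two loads, one temporary and one not) can be sketched against the text plan above. This is only an illustration: the class MrLoadCheck is hypothetical, and treating a load whose loader is InterStorage as the temporary one is an assumption based on the dump above, not the way the actual test decides it.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of Pig: scans a text plan dump for Load(...)
// entries and classifies them by loader class.
final class MrLoadCheck {
    // Captures the "location:loaderClass" part inside Load(...).
    private static final Pattern LOAD = Pattern.compile("\\bLoad\\(([^)]*)\\)");

    static boolean hasOneTempAndOneRealLoad(String planText) {
        int temp = 0;
        int real = 0;
        Matcher m = LOAD.matcher(planText);
        while (m.find()) {
            // Assumption: intermediate data between MR jobs is read with InterStorage.
            if (m.group(1).endsWith("InterStorage")) {
                temp++;
            } else {
                real++;
            }
        }
        return temp == 1 && real == 1;
    }
}
```

Applied to the MR plan above, this would see the PigStorage load at scope-0 and the InterStorage load at scope-22.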
> Add xml format support for 'explain' in spark engine
> -----------------------------------------------------
>
> Key: PIG-4815
> URL: https://issues.apache.org/jira/browse/PIG-4815
> Project: Pig
> Issue Type: Task
> Components: spark
> Reporter: Prateek Vaishnav
> Assignee: Adam Szita
> Fix For: spark-branch
>
> Attachments: PIG-4815.patch
>
>