Rohini Palaniswamy created PIG-3878:
---------------------------------------
Summary: Improve parallelism of union and join
Key: PIG-3878
URL: https://issues.apache.org/jira/browse/PIG-3878
Project: Pig
Issue Type: Sub-task
Reporter: Rohini Palaniswamy
Assignee: Rohini Palaniswamy
Fix For: tez-branch
Currently, if the user has not specified a PARALLEL clause, parallelism defaults
to 1, which is bad for performance. MR does not have this issue because, for
each job, the number of mappers is determined by the input splits and the number
of reducers by InputSizeReducerEstimator. Automatic reducer parallelism (ARP)
for Tez in general will be handled in separate JIRAs. As a quick workaround for
joins and unions, the parallelism of the reduce task can be set to the sum of
the join tasks until ARP is put in and better estimation is done.
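
A minimal sketch of the proposed workaround, for illustration only. The class
name, method name, and the "non-positive means no PARALLEL clause" convention
are hypothetical and are not taken from the Pig or Tez code base. It assumes
the task count of each input to the join or union is already known.

```java
import java.util.Arrays;
import java.util.List;

public class ParallelismSketch {

    /**
     * Hypothetical helper: picks the parallelism for the reduce side of a
     * join or union.
     *
     * @param requestedParallel value of the PARALLEL clause, or a
     *                          non-positive number if none was given
     *                          (assumed convention)
     * @param inputTaskCounts   number of tasks in each input to the
     *                          join or union
     */
    static int estimateParallelism(int requestedParallel,
                                   List<Integer> inputTaskCounts) {
        // An explicit PARALLEL clause always wins.
        if (requestedParallel > 0) {
            return requestedParallel;
        }
        // Otherwise use the sum of the input task counts instead of
        // falling back to 1.
        int sum = 0;
        for (int count : inputTaskCounts) {
            sum += count;
        }
        // Never go below a single task.
        return Math.max(1, sum);
    }

    public static void main(String[] args) {
        // No PARALLEL clause, inputs with 4 and 3 tasks.
        System.out.println(estimateParallelism(-1, Arrays.asList(4, 3))); // prints 7
        // PARALLEL 10 was specified, so it is used as-is.
        System.out.println(estimateParallelism(10, Arrays.asList(4, 3))); // prints 10
    }
}
```

The sum is only a rough stand-in and may overestimate the tasks needed when
the inputs are small; it is meant to hold until ARP provides a better estimate.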