Re: The location of the map execution

Hassen Riahi Sun, 04 Mar 2012 06:23:54 -0800

Thanks for the suggestion!

The map is executed where the data is located when using the FairScheduler.


Hassen

Sorry, I meant: have you set the mapred.jobtracker.taskScheduler
property in your mapred-site.xml file? If not, you're using the
default FIFO scheduler. The default scheduler doesn't do data-local
scheduling, but the fair scheduler and capacity scheduler do. You want
to set mapred.jobtracker.taskScheduler to either
org.apache.hadoop.mapred.FairScheduler (for the fair scheduler) or
org.apache.hadoop.mapred.CapacityTaskScheduler (for the capacity
scheduler) and then restart the JobTracker. You can read about the two
schedulers here:

http://hadoop.apache.org/common/docs/current/fair_scheduler.html
http://hadoop.apache.org/common/docs/current/capacity_scheduler.html
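
For reference, a minimal sketch of the mapred-site.xml entry described
above. It assumes the fair scheduler jar is already on the JobTracker's
classpath, which may not hold for every installation; for the capacity
scheduler, the value would be the CapacityTaskScheduler class named
above instead.

```xml
<configuration>
  <!-- Assumption: edited on the JobTracker node; restart the JobTracker afterwards -->
  <property>
    <name>mapred.jobtracker.taskScheduler</name>
    <value>org.apache.hadoop.mapred.FairScheduler</value>
  </property>
</configuration>
```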

-Joey

On Sat, Mar 3, 2012 at 6:32 PM, Hassen Riahi <[email protected]> wrote:

The jobtracker is running on another machine (node C).

Hassen

Which scheduler are you using?

-Joey

On Mar 3, 2012, at 18:52, Hassen Riahi <[email protected]> wrote:
Hi all,
We tried using MapReduce to execute a simple map job that reads a text
file stored in HDFS and then writes the output.
The file to read is very small. It was not split, and it was written
entirely to a single datanode (node A). This node is also configured
as a tasktracker node.
While we were expecting the map to be executed on node A (since the
input is stored there), the log files show that the map was
executed on another tasktracker (node B) of the cluster.
Am I missing something?

Thanks for the help!
Hassen




--
Joseph Echeverria
Cloudera, Inc.
443.305.9434
