Hi Harsh,

Now I know that the number of map and reduce tasks run simultaneously is set by the administrator in mapred-site.xml, with a default value of 2. But I can't get the point about the number of slots. As I understand it so far, the number of slots is decided by the hardware, and the administrator cannot change it. Is that right?
Tan Jun

From: Harsh J
Date: 2011-12-12 12:22
To: mapreduce-user
Subject: Re: About slots of tasktracker and munber of map taskers

Hi Tan,

On 12-Dec-2011, at 8:48 AM, Tan Jun wrote:

> Hi,
> I don't really understand the meaning of the sentences in "The Definitive Guide" (page 155):
>
> Tasktrackers have a fixed number of slots for map tasks and for reduce tasks: for example, a tasktracker may be able to run two map tasks and two reduce tasks simultaneously. (The precise number depends on the number of cores and the amount of memory on the tasktracker; see “Memory” on page 254.)
>
> Does that mean the number of slots is fixed and the number of maps run simultaneously is set by user?

Not by the user, but by the administrator.

Each tasktracker is configured in production with a 'task slot' upper limit - say, 8 maps and 4 reducers for a 12-core machine. This is not auto-configured (unless you use auto cluster setup+configuration tools that determine it for you [0]), and has to be set when configuring Hadoop daemons.

The book means to imply that you need to set these, based on the memory and CPU configuration of your machines.

By default, tasktrackers have limits of 2+2. See http://wiki.apache.org/hadoop/LimitingTaskSlotUsage

[0] - http://www.cloudera.com/products-services/tools/ is one.
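
For illustration, the per-tasktracker limits described above could be set in mapred-site.xml roughly as follows. This is only a sketch: it assumes the Hadoop 1.x (MRv1) property names, and it reuses the 8 maps + 4 reducers example for a 12-core machine. The right values depend on the cores and memory of each machine.

```xml
<?xml version="1.0"?>
<!-- mapred-site.xml on each tasktracker node (assumed Hadoop 1.x / MRv1 property names) -->
<configuration>
  <!-- Maximum number of map tasks this tasktracker runs at once (default: 2) -->
  <property>
    <name>mapred.tasktracker.map.tasks.maximum</name>
    <value>8</value>
  </property>
  <!-- Maximum number of reduce tasks this tasktracker runs at once (default: 2) -->
  <property>
    <name>mapred.tasktracker.reduce.tasks.maximum</name>
    <value>4</value>
  </property>
</configuration>
```

The tasktracker reads these values when it starts, so it has to be restarted after a change. In other words, the slot count is a setting the administrator chooses to suit the hardware, not something the hardware fixes by itself.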
