Re: Multi-threaded map task

Bertrand Dechoux Mon, 14 Jan 2013 00:07:04 -0800

Well... It all depends on where your bottleneck is. If this is critical,
benchmark your use case. Multi-threading is not always useful. You would
also rather avoid locally shared mutable state, because it can become a
pain to manage. But that doesn't mean you can't do multi-threading...
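
To illustrate the point about shared mutable state, here is a minimal sketch
in plain Java (no Hadoop classes involved). The names LocalStateSketch,
countSlice and countWords are hypothetical and exist only for this example.
Each worker counts words into a map that it alone owns, and the partial
results are merged on the calling thread once the workers are done, so no
locking is needed. It is only one possible way to structure this, not how
Hadoop does it internally.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical example class: word counting without shared mutable state.
public class LocalStateSketch {

    // Counts words in one slice of records, using a map owned by this task only.
    static Map<String, Integer> countSlice(List<String> records) {
        Map<String, Integer> counts = new HashMap<>();
        for (String record : records) {
            for (String word : record.trim().split("\\s+")) {
                if (!word.isEmpty()) {
                    counts.merge(word, 1, Integer::sum);
                }
            }
        }
        return counts;
    }

    // Splits the records into contiguous slices, counts each slice on the
    // pool, then merges the partial maps on the calling thread.
    static Map<String, Integer> countWords(List<String> records, int threads)
            throws Exception {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            int chunk = (records.size() + threads - 1) / threads; // ceiling division
            List<Future<Map<String, Integer>>> parts = new ArrayList<>();
            for (int start = 0; start < records.size(); start += chunk) {
                int end = Math.min(start + chunk, records.size());
                List<String> slice = records.subList(start, end);
                Callable<Map<String, Integer>> task = () -> countSlice(slice);
                parts.add(pool.submit(task));
            }
            Map<String, Integer> total = new HashMap<>();
            for (Future<Map<String, Integer>> part : parts) {
                // get() blocks until that worker has finished its slice.
                part.get().forEach((word, n) -> total.merge(word, n, Integer::sum));
            }
            return total;
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) throws Exception {
        List<String> records = Arrays.asList("a b", "b c", "a a");
        Map<String, Integer> counts = countWords(records, 2);
        System.out.println("count of 'a': " + counts.get("a"));
    }
}
```

Whether this beats a single thread depends on how expensive the per-record
work is compared with the cost of splitting and merging, which is exactly
what a benchmark on your own use case would tell you.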


You only need to browse the type hierarchy a bit to find MultithreadedMapper:
http://hadoop.apache.org/docs/r1.0.4/api/org/apache/hadoop/mapreduce/lib/map/MultithreadedMapper.html

Regards

Bertrand

On Mon, Jan 14, 2013 at 8:22 AM, Mark Olimpiati <[email protected]> wrote:

> Thanks for the reply Nitin, but I don't see what the bottleneck is in having
> it distributed with multi-threaded maps?
>
> I see your point in that each map is processing different splits, but my
> question is: if each map task had 2 threads processing the same split,
> multiplexing or running in parallel if there are enough cores, wouldn't
> that be faster?
>
> Mark
>
>
> On Sun, Jan 13, 2013 at 10:34 PM, Nitin Pawar <[email protected]> wrote:
>
> > That's because it's a distributed processing framework over a network
> > On Jan 14, 2013 11:27 AM, "Mark Olimpiati" <[email protected]> wrote:
> >
> > > > Hi, this is a simple question, but why weren't map or reduce tasks
> > > > programmed to be multi-threaded? i.e. instead of spawning 6 map tasks
> > > > for 6 cores, run one map task with 6 parallel threads.
> > > >
> > > > In fact I tried this myself, but it turns out that threading is not
> > > > helping as it would in regular Java programs for some reason... any
> > > > feedback on this topic?
> > >
> > > Thanks,
> > > Mark
> > >
> >
>



-- 
Bertrand Dechoux
