Dmitriy,

You can set the child JVM options separately for map and reduce tasks.
Please refer to this link:
http://hadoop.apache.org/common/docs/r1.0.1/mapred_tutorial.html#Task+Execution+%26+Environment

<property>
  <name>mapred.map.child.java.opts</name>
  <value>
    -Xmx512M -Djava.library.path=/home/mycompany/lib
    -verbose:gc -Xloggc:/tmp/@taskid@.gc
    -Dcom.sun.management.jmxremote.authenticate=false
    -Dcom.sun.management.jmxremote.ssl=false
  </value>
</property>

<property>
  <name>mapred.reduce.child.java.opts</name>
  <value>
    -Xmx1024M -Djava.library.path=/home/mycompany/lib
    -verbose:gc -Xloggc:/tmp/@taskid@.gc
    -Dcom.sun.management.jmxremote.authenticate=false
    -Dcom.sun.management.jmxremote.ssl=false
  </value>
</property>

On Tue, Mar 27, 2012 at 8:08 PM, Dmitriy Lyubimov wrote:

> Thank you, George. I assume you are referring to setenv.sh on the data
> nodes to set library paths for task tracker, right?
> On Mar 27, 2012 7:19 PM, "George Datskos" wrote:
>
>> Dmitriy,
>>
>> I just double-checked, and the caveat I stated earlier is incorrect. So,
>> "-Djava.library.path" set in the client's {mapred.child.java.opts} should
>> just append to the "-Djava.library.path" that each TaskTracker has when
>> creating the library path for each child (M/R) task. So that's even better
>> I guess.
>>
>>
>> George
>>
>>
>> On 2012/03/28 11:06, George Datskos wrote:
>>
>>> Dmitriy,
>>>
>>> To deal with different servers having various shared libraries in
>>> different locations, you can simply make sure the _TaskTracker_'s
>>> -Djava.library.path is set correctly on each server. That library path
>>> should be passed along to each child (M/R) task. (in *addition* to the
>>> {mapred.child.java.opts} that you specify on the client-side configuration
>>> options)
>>>
>>> One caveat: on the client-side, don't include "-Djava.library.path" or
>>> that path will be passed along to all of the child tasks, overriding
>>> the site-specific one you set on the TaskTracker.
>>>
>>>
>>> George
>>>
>>>
>>> On 2012/03/28 10:43, Dmitriy Lyubimov wrote:
>>>
>>>> Hello,
>>>>
>>>> I have a couple of questions regarding mapreduce configurations.
>>>>
>>>> We install various platforms on data nodes that require a mixed set of
>>>> native libraries.
>>>>
>>>> Part of the problem is that in the general case, these software platforms
>>>> may be installed into different locations in the backend. (we try to
>>>> unify it, but still). What it means, it may require a site-specific
>>>> -Djava.library.path setting.
>>>>
>>>> I configured individual jvm options (mapred.child.java.opts) on each
>>>> node to include a specific set of paths. However, I encountered 2
>>>> problems:
>>>>
>>>> #1: my setting doesn't go into effect unless I also declare it final
>>>> in the data node. It's just being overridden by default -Xmx200 value
>>>> from the driver EVEN when I don't set it on the driver at all (and
>>>> there seems to be no way to unset it).
>>>>
>>>> However, using "final" spec at the backend creates a problem if some
>>>> of numerous jobs we run wishes to override the setting still. The
>>>> ideal behavior is if I don't set it in the driver, then backend value
>>>> kicks in, otherwise it's driver's value. But I did not find a way to
>>>> do that for this particular setting for some reason. Could somebody
>>>> clarify the best workaround? thank you.
>>>>
>>>> #2. Ideal behavior would actually be to merge driver-specific and
>>>> backend-specific settings. E.g. backend may need to configure specific
>>>> software package locations while client may wish sometimes to set heap
>>>> etc. Is there a best practice to achieve this effect?
>>>>
>>>> Thank you very much in advance.
>>>> -Dmitriy
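
For reference, the "final" declaration described in problem #1 of the original message would look roughly like the fragment below in a node's mapred-site.xml. This is only a sketch: the heap size and the /opt/example/native/lib path are placeholders and do not come from this thread.

```xml
<!-- mapred-site.xml on one TaskTracker node (illustrative values only) -->
<property>
  <name>mapred.child.java.opts</name>
  <!-- Placeholder heap size and library path; substitute the node's real locations -->
  <value>-Xmx512M -Djava.library.path=/opt/example/native/lib</value>
  <!-- Marks the property final so job-submitted configuration cannot override it -->
  <final>true</final>
</property>
```

As noted in the original message, the cost of this approach is that no individual job can then override the value.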
