Check this article from Cloudera on different ways of distributing a jar file to the job.
http://www.cloudera.com/blog/2011/01/how-to-include-third-party-libraries-in-your-map-reduce-job/

Praveen

On Wed, Dec 28, 2011 at 5:40 AM, Eyal Golan wrote:
> Hello,
> Another newbie question.
> Suppose I want to use an external library (jar) in the mapper / reducer
> classes (commons-lang, Google's Guava, etc.).
> In our environment, I added the jars to a specific folder and added them
> to HADOOP_CLASSPATH.
> However, when running a mapper that uses one of the jars, it could not find
> the classes in that jar.
>
> I thought that it might be our environment (I am not managing our
> cluster).
>
> Then I read about DistributedCache.
> Should I use it with methods such as addArchiveToClassPath,
> addFileToClassPath, addCacheArchive to use jar libraries?
> If so, which method is more appropriate?
>
> If not, how do we load jar libraries to each VM?
>
> Thanks,
>
> Eyal Golan
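
For the question quoted above, one commonly used option is the generic -libjars option, which ships the listed jars with the job and puts them on the task classpath. HADOOP_CLASSPATH by itself only affects the client JVM that submits the job, which may be why the mapper could not find the classes. Below is a minimal sketch of how such a command line is assembled. It assumes the driver implements Tool and is launched through ToolRunner (otherwise -libjars is not parsed); the class name LibJarsCommand, the jar paths, and com.example.MyDriver are hypothetical names used only for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper: assembles a "hadoop jar" command line that uses -libjars.
public class LibJarsCommand {

    // Generic options such as -libjars must come before the job's own arguments.
    public static List<String> build(String jobJar, String driverClass,
                                     List<String> libJars, List<String> jobArgs) {
        List<String> cmd = new ArrayList<>();
        cmd.add("hadoop");
        cmd.add("jar");
        cmd.add(jobJar);
        cmd.add(driverClass);
        if (!libJars.isEmpty()) {
            cmd.add("-libjars");
            // -libjars takes a single comma-separated list with no spaces.
            cmd.add(String.join(",", libJars));
        }
        cmd.addAll(jobArgs);
        return cmd;
    }

    public static void main(String[] args) {
        // All paths and names below are placeholders.
        List<String> cmd = build(
                "myjob.jar",
                "com.example.MyDriver",
                Arrays.asList("/opt/libs/commons-lang.jar", "/opt/libs/guava.jar"),
                Arrays.asList("input", "output"));
        System.out.println(String.join(" ", cmd));
    }
}
```

If the jars already sit in HDFS, calling DistributedCache.addFileToClassPath(path, conf) from the driver before submitting the job is another route; bundling the jars in a lib/ directory inside the job jar is a third.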
