I'm writing a plugin for Nutch 2.  When built, it creates a job file which
is then executed via Hadoop.

My plugin uses a third-party library that requires a File object pointing
to a directory of files, and that directory is packaged inside the job file.

I looked at DistributedCache, but I'm not sure how to use it to get a File
object.
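
To illustrate the problem, here is a minimal sketch in plain Java (no Hadoop
involved; the class name RelativeFileDemo and the helper resolve() are made up
for this example). It only shows that a relative File is resolved against the
working directory of whichever JVM runs the code, which I assume is why
new File("dir") never looks inside the job file.

```java
import java.io.File;

public class RelativeFileDemo {

    // Hypothetical helper: turn a relative path into the absolute File
    // that the JVM would actually look for on the local filesystem.
    static File resolve(String relative) {
        // getAbsoluteFile() resolves against the "user.dir" system property,
        // i.e. the working directory of the current JVM process.
        return new File(relative).getAbsoluteFile();
    }

    public static void main(String[] args) {
        File dir = resolve("dir");
        System.out.println("Working directory: " + System.getProperty("user.dir"));
        System.out.println("new File(\"dir\") points at: " + dir);
        // Whether the directory exists depends entirely on the local machine,
        // not on what is packaged inside the job file.
        System.out.println("Exists locally: " + dir.exists());
    }
}
```

So what I need is a way to get the packaged directory onto the local disk of
the machine running the task, and then build the File from that local path.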

On Thu, Oct 11, 2012 at 10:26 AM, Harsh J <[email protected]> wrote:

> Hi Bai,
>
> What exactly do you mean by a 'job file' and have you considered using
> DistributedCache, as detailed at
> http://hadoop.apache.org/docs/stable/mapred_tutorial.html#DistributedCache
> ?
>
> On Thu, Oct 11, 2012 at 7:44 PM, Bai Shen <[email protected]> wrote:
> > I'm trying to reference a directory inside my job file from my code.  I
> > have a third party library that I need to pass a File object to which
> > references the directory in the job file.
> >
> > How do I go about doing this?  If I just do new File("dir") it looks for
> > the directory on the client machine that I'm calling the job from instead
> > of the directory in the actual job file itself.
> >
> > Thanks.
>
>
>
> --
> Harsh J
>
