I'm trying to use the DistributedCache but having an issue resolving the
symlinks to my files.
My Driver class writes some HashMaps to files and adds them to the
DistributedCache like this:
Path tPath = new Path("/data/cache/fd", UUID.randomUUID().toString());
os = new ObjectOutputStream(fs.create(tPath));
os.writeObject(myHashMap);
os.close();
URI uri = new URI(tPath.toString() + "#" + "q_map");
DistributedCache.addCacheFile(uri, config);
DistributedCache.createSymlink(config);
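For reference, the part of that URI after the "#" is the fragment, which DistributedCache uses as the symlink name. The following is only a stand-alone sketch using plain java.net.URI (no Hadoop classes); the class name CacheUriSketch and the method withSymlinkName are hypothetical and exist only to show how such a URI splits into a path and a link name:

```java
import java.net.URI;
import java.net.URISyntaxException;

class CacheUriSketch {
    // Builds "<path>#<linkName>", the same shape as the URI passed to addCacheFile().
    // Hypothetical helper, not part of Hadoop.
    static URI withSymlinkName(String path, String linkName) throws URISyntaxException {
        return new URI(path + "#" + linkName);
    }

    public static void main(String[] args) throws URISyntaxException {
        URI uri = withSymlinkName(
                "/data/cache/fd/6dc9d5c0-98be-4105-bd59-b344924dd989", "q_map");
        // The path part identifies the cached file.
        System.out.println("path:     " + uri.getPath());
        // The fragment part is the requested link name.
        System.out.println("fragment: " + uri.getFragment());
    }
}
```

The fragment here is just "q_map", with no directory component, so the link name by itself says nothing about which directory the link ends up in.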
But what Path do I need to open to read the files through the symlinks?
I tried "q_map" and "work/q_map", but neither works.
The files are definitely there because I can set a config var to the path and
read the files in my reducer. For example, in my Driver class I set a variable
via
config.set("q_map", tPath.toString());
And then in my Reducer's setup() I do something like
Path q_map_path = new Path(conf.get("q_map"));
if (fs.exists(q_map_path)) {
    HashMap<String,String> qMap = loadmap(conf, q_map_path);
}
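The loadmap helper is not shown in this post; presumably it deserializes what the driver wrote with writeObject(). As a rough sketch under that assumption, here is a local round trip using only java.io and java.nio.file (note that Path below is java.nio.file.Path, not Hadoop's Path). The names MapFileSketch, saveMap, and loadMap are hypothetical:

```java
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.HashMap;

class MapFileSketch {
    // Hypothetical counterpart of the driver's writeObject() call, on a local file.
    static void saveMap(HashMap<String, String> map, Path file) throws IOException {
        try (ObjectOutputStream out = new ObjectOutputStream(Files.newOutputStream(file))) {
            out.writeObject(map);
        }
    }

    // Hypothetical stand-in for loadmap(): reads the serialized map back.
    @SuppressWarnings("unchecked")
    static HashMap<String, String> loadMap(Path file)
            throws IOException, ClassNotFoundException {
        try (ObjectInputStream in = new ObjectInputStream(Files.newInputStream(file))) {
            return (HashMap<String, String>) in.readObject();
        }
    }

    public static void main(String[] args) throws Exception {
        Path file = Files.createTempFile("q_map", ".ser");
        HashMap<String, String> original = new HashMap<>();
        original.put("k1", "v1");
        saveMap(original, file);
        // Compare the map read back with the one that was written.
        System.out.println(loadMap(file).equals(original));
        Files.delete(file);
    }
}
```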
I tried to resolve the path to the symlinks via ${mapred.local.dir}/work but
that doesn't work either.
In the STDOUT of my mapper attempt I see:
2012-05-29 03:59:54,369 - INFO [main:TaskRunner@759] -
Creating symlink:
/tmp/hadoop-mapred/mapred/local/taskTracker/distcache/-3168904771265144450_-884848596_406879224/varuna010/data/cache/fd/6dc9d5c0-98be-4105-bd59-b344924dd989
<-
/tmp/hadoop-mapred/mapred/local/taskTracker/root/jobcache/job_201205250826_0020/attempt_201205250826_0020_m_000000_0/work/q_map
So the symlink is apparently being created, but I also see this output:
mapred.local.dir: /tmp/hadoop-mapred/mapred/local/taskTracker/root/jobcache/job_201205250826_0020/attempt_201205250826_0020_m_000000_0
job.local.dir: /tmp/hadoop-mapred/mapred/local/taskTracker/root/jobcache/job_201205250826_0020/work
mapred.task.id: attempt_201205250826_0020_m_000000_0
Path [work/q_map] does not exist
Path [/tmp/hadoop-mapred/mapred/local/taskTracker/root/jobcache/job_201205250826_0020/attempt_201205250826_0020_m_000000_0/work/q_map] does not exist
Which is from this code in my mapper's setup() method:
try {
    // Print the directories this task attempt reports.
    System.out.printf("mapred.local.dir: %s\n", conf.get("mapred.local.dir"));
    System.out.printf(" job.local.dir: %s\n", conf.get("job.local.dir"));
    System.out.printf(" mapred.task.id: %s\n", conf.get("mapred.task.id"));
    fs = FileSystem.get(conf);
    // Two candidate locations for the symlink: relative and absolute.
    Path symlink = new Path("work/q_map");
    Path fullpath = new Path(conf.get("mapred.local.dir") + "/work/q_map");
    // Report whether each candidate exists according to fs.
    System.out.printf("Path [%s] ", symlink.toString());
    if (fs.exists(symlink)) {
        System.out.println("exists");
    } else {
        System.out.println("does not exist");
    }
    System.out.printf("Path [%s] ", fullpath.toString());
    if (fs.exists(fullpath)) {
        System.out.println("exists");
    } else {
        System.out.println("does not exist");
    }
} catch (IOException e1) {
    e1.printStackTrace();
}
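The TaskRunner log line earlier reads as "target <- link": the cached file under distcache is the target and work/q_map is the link, and both paths are on the task node's local disk. The following is only a stand-alone sketch of that relationship using java.nio.file (no Hadoop classes). The temp directory, the class name LocalSymlinkSketch, and the method makeLink are hypothetical, and it assumes the platform permits creating symlinks:

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;

class LocalSymlinkSketch {
    // Creates a target file in dir and a symlink named linkName pointing at it.
    // Hypothetical helper that mimics the "target <- link" layout from the log.
    static Path makeLink(Path dir, String linkName) throws IOException {
        Path target = Files.createTempFile(dir, "cached-", ".bin");
        Path link = dir.resolve(linkName);
        return Files.createSymbolicLink(link, target);
    }

    public static void main(String[] args) throws IOException {
        Path dir = Files.createTempDirectory("work");
        Path link = makeLink(dir, "q_map");
        // These checks go straight to the local filesystem.
        System.out.println("is symlink: " + Files.isSymbolicLink(link));
        System.out.println("exists:     " + Files.exists(link));
        System.out.println("resolves to " + link.toRealPath());
    }
}
```

This sketch queries the local filesystem directly, whereas FileSystem.get(conf) in the mapper code returns whichever filesystem the job configuration names as its default. That difference may be worth ruling out when a path that the log says was created is reported as missing.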
Regards,
Alan