Thanks Jeremy. I tried your first suggestion and the mappers ran to
completion. But then the reducers failed with another exception related to
pipes. I believe it may be due to permission issues again. I tried setting a
few additional config parameters, but that didn't do the job. Please find
below the command I used and the error log from the JobTracker web UI:
hadoop jar /usr/lib/hadoop-0.20/contrib/streaming/hadoop-streaming-0.20.2-cdh3u0.jar \
  -D hadoop.tmp.dir=/home/streaming/tmp/hadoop/ \
  -D dfs.data.dir=/home/streaming/tmp \
  -D mapred.local.dir=/home/streaming/tmp/local \
  -D mapred.system.dir=/home/streaming/tmp/system \
  -D mapred.temp.dir=/home/streaming/tmp/temp \
  -input /userdata/bejoy/apps/wc/input \
  -output /userdata/bejoy/apps/wc/output \
  -mapper /home/streaming/WcStreamMap.py \
  -reducer /home/streaming/WcStreamReduce.py
java.lang.RuntimeException: PipeMapRed.waitOutputThreads(): subprocess failed with code 127
    at org.apache.hadoop.streaming.PipeMapRed.waitOutputThreads(PipeMapRed.java:362)
    at org.apache.hadoop.streaming.PipeMapRed.mapRedFinished(PipeMapRed.java:572)
    at org.apache.hadoop.streaming.PipeReducer.close(PipeReducer.java:137)
    at org.apache.hadoop.mapred.ReduceTask.runOldReducer(ReduceTask.java:478)
    at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:416)
    at org.apache.hadoop.mapred.Child$4.run(Child.java:268)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:396)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1115)
    at org.apache.hadoop.mapred.Child.main(Child.java:262)
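One detail that may matter (my own reading of the log, not confirmed): exit
code 127 is the shell's "command not found" status, so the task may be failing
to launch WcStreamReduce.py itself rather than dying inside it, e.g. if the
interpreter line is missing or mangled. A trimmed-down sketch of the kind of
reducer I'm running, with the interpreter line spelled out:

```python
#!/usr/bin/env python
# The interpreter line above must name a Python that exists on the task
# nodes; if it is missing or carries Windows line endings, the task shell
# cannot launch the script and exits with status 127, as in the log.
import sys

def reduce_counts(lines):
    # Sum counts from sorted "word<TAB>count" lines, the form in which
    # Hadoop Streaming hands mapper output to the reducer.
    current, total = None, 0
    for line in lines:
        word, count = line.rstrip("\n").split("\t", 1)
        if word != current:
            if current is not None:
                yield current, total
            current, total = word, 0
        total += int(count)
    if current is not None:
        yield current, total

# The real job iterates over sys.stdin; a small in-memory sample is used
# here so the sketch runs standalone.
for word, total in reduce_counts(["hello\t1\n", "hello\t1\n", "world\t1\n"]):
    sys.stdout.write("%s\t%d\n" % (word, total))
```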
The folder permissions at the time of job execution were as follows:
cloudera@cloudera-vm:~$ ls -l /home/streaming/
drwxrwxrwx 5 root root 4096 2011-09-12 05:59 tmp
-rwxrwxrwx 1 root root 707 2011-09-11 23:42 WcStreamMap.py
-rwxrwxrwx 1 root root 1077 2011-09-11 23:42 WcStreamReduce.py
cloudera@cloudera-vm:~$ ls -l /home/streaming/tmp/
drwxrwxrwx 2 root root 4096 2011-09-12 06:12 hadoop
drwxrwxrwx 2 root root 4096 2011-09-12 05:58 local
drwxrwxrwx 2 root root 4096 2011-09-12 05:59 system
drwxrwxrwx 2 root root 4096 2011-09-12 05:59 temp
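The listings above only cover /home/streaming itself, so I also put together a
small check (paths are just my local ones) to verify that every ancestor
directory grants world execute (traverse) permission, since a single closed
parent directory can produce "Permission denied" even on a chmod 777 file:

```python
# Walk up from a path and report whether each ancestor directory grants
# world execute (traverse) permission; one closed parent is enough to
# produce "error=13, Permission denied" even on a chmod 777 file.
import os
import stat

def world_traversable_ancestors(path):
    # Yield (directory, world-can-traverse) pairs up to the filesystem root.
    d = os.path.dirname(os.path.abspath(path))
    while True:
        yield d, bool(os.stat(d).st_mode & stat.S_IXOTH)
        parent = os.path.dirname(d)
        if parent == d:
            break
        d = parent

# Example run against my script location (guarded so the sketch is
# harmless on machines where the path does not exist).
target = "/home/streaming/WcStreamMap.py"
if os.path.exists(target):
    for directory, ok in world_traversable_ancestors(target):
        print(directory, "ok" if ok else "NOT world-traversable")
```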
Am I missing something here?
I haven't been working with Linux for long, so I couldn't try your second
suggestion of setting up the Linux task controller.
Thanks a lot
Regards
Bejoy.K.S
On Mon, Sep 12, 2011 at 6:20 AM, Jeremy Lewi <[email protected]> wrote:
> I would suggest you try putting your mapper/reducer py files in a directory
> that is world-readable at every level, e.g. /tmp/test. I had similar
> problems when I was using streaming, and I believe my workaround was to put
> the mappers/reducers outside my home directory. The other, more involved
> alternative is to set up the Linux task controller so you can run your MR
> jobs as the user who submits the jobs.
>
> J
>
>
> On Mon, Sep 12, 2011 at 2:18 AM, Bejoy KS <[email protected]> wrote:
>
>> Hi
>> I wanted to try out Hadoop Streaming and got the sample Python code
>> for the mapper and reducer. I copied both into my lfs and tried running the
>> streaming job as mentioned in the documentation.
>> Here is the command I used to run the job:
>>
>> hadoop jar /usr/lib/hadoop-0.20/contrib/streaming/hadoop-streaming-0.20.2-cdh3u0.jar \
>>   -input /userdata/bejoy/apps/wc/input \
>>   -output /userdata/bejoy/apps/wc/output \
>>   -mapper /home/cloudera/bejoy/apps/inputs/wc/WcStreamMap.py \
>>   -reducer /home/cloudera/bejoy/apps/inputs/wc/WcStreamReduce.py
>>
>> Here, other than the input and output, everything else is at lfs
>> locations. However, the job is failing. The error log from the JobTracker
>> URL is as follows:
>> java.lang.RuntimeException: Error in configuring object
>>     at org.apache.hadoop.util.ReflectionUtils.setJobConf(ReflectionUtils.java:93)
>>     at org.apache.hadoop.util.ReflectionUtils.setConf(ReflectionUtils.java:64)
>>     at org.apache.hadoop.util.ReflectionUtils.newInstance(ReflectionUtils.java:117)
>>     at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:386)
>>     at org.apache.hadoop.mapred.MapTask.run(MapTask.java:324)
>>     at org.apache.hadoop.mapred.Child$4.run(Child.java:268)
>>     at java.security.AccessController.doPrivileged(Native Method)
>>     at javax.security.auth.Subject.doAs(Subject.java:396)
>>     at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1115)
>>     at org.apache.hadoop.mapred.Child.main(Child.java:262)
>> Caused by: java.lang.reflect.InvocationTargetException
>>     at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>>     at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
>>     at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>>     at java.lang.reflect.Method.invoke(Method.java:597)
>>     at org.apache.hadoop.util.ReflectionUtils.setJobConf(ReflectionUtils.java:88)
>>     ... 9 more
>> Caused by: java.lang.RuntimeException: Error in configuring object
>>     at org.apache.hadoop.util.ReflectionUtils.setJobConf(ReflectionUtils.java:93)
>>     at org.apache.hadoop.util.ReflectionUtils.setConf(ReflectionUtils.java:64)
>>     at org.apache.hadoop.util.ReflectionUtils.newInstance(ReflectionUtils.java:117)
>>     at org.apache.hadoop.mapred.MapRunner.configure(MapRunner.java:34)
>>     ... 14 more
>> Caused by: java.lang.reflect.InvocationTargetException
>>     at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>>     at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
>>     at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>>     at java.lang.reflect.Method.invoke(Method.java:597)
>>     at org.apache.hadoop.util.ReflectionUtils.setJobConf(ReflectionUtils.java:88)
>>     ... 17 more
>> Caused by: java.lang.RuntimeException: configuration exception
>>     at org.apache.hadoop.streaming.PipeMapRed.configure(PipeMapRed.java:230)
>>     at org.apache.hadoop.streaming.PipeMapper.configure(PipeMapper.java:66)
>>     ... 22 more
>> Caused by: java.io.IOException: Cannot run program "/home/cloudera/bejoy/apps/inputs/wc/WcStreamMap.py": java.io.IOException: error=13, Permission denied
>>     at java.lang.ProcessBuilder.start(ProcessBuilder.java:460)
>>     at org.apache.hadoop.streaming.PipeMapRed.configure(PipeMapRed.java:214)
>>     ... 23 more
>> Caused by: java.io.IOException: java.io.IOException: error=13, Permission denied
>>     at java.lang.UNIXProcess.<init>(UNIXProcess.java:148)
>>     at java.lang.ProcessImpl.start(ProcessImpl.java:65)
>>     at java.lang.ProcessBuilder.start(ProcessBuilder.java:453)
>>     ... 24 more
>>
>> On seeing the error I checked the permissions of the mapper and reducer,
>> and issued a chmod 777 on both as well. Still no luck.
>>
>> The permissions of the files are as follows:
>> cloudera@cloudera-vm:~$ ls -l /home/cloudera/bejoy/apps/inputs/wc/
>> -rwxrwxrwx 1 cloudera cloudera 707 2011-09-11 23:42 WcStreamMap.py
>> -rwxrwxrwx 1 cloudera cloudera 1077 2011-09-11 23:42 WcStreamReduce.py
>>
>> I'm testing this on the Cloudera Demo VM, so the Hadoop setup is in
>> pseudo-distributed mode. Any help would be highly appreciated.
>>
>> Thank You
>>
>> Regards
>> Bejoy.K.S