The transient Map->Reduce files do not go to the DFS, but rather onto
the local filesystem directories specified by the "mapred.local.dir"
parameter. If you expand that configuration to span multiple disks,
similar to "dfs.data.dir" (which your DataNode may already be using),
MapReduce will get more space and disks to do its work.
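
For example (just a sketch; the mount points /data1 and /data2 are
hypothetical, substitute your own disks), mapred-site.xml could list
several local directories, comma-separated, much like dfs.data.dir:

  <property>
    <name>mapred.local.dir</name>
    <!-- hypothetical mounts; one directory per physical disk -->
    <value>/data1/mapred/local,/data2/mapred/local</value>
  </property>

The intermediate (spill/merge) files get spread across all the
directories listed there, so each extra disk adds both space and I/O
bandwidth for the shuffle.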

See this very recent conversation for more information:
http://search-hadoop.com/m/DWbsZ1m0Ttx

On Thu, Apr 26, 2012 at 1:24 AM, Nuthalapati, Ramesh
<[email protected]> wrote:
> Harsh -
>
> Even if that's the case, my free tmp space is more than the DFS used,
> isn't it?
>
> Configured Capacity: 116258406400 (108.27 GB)
> Present Capacity: 110155911168 (102.59 GB)
> DFS Remaining: 101976682496 (94.97 GB)
> DFS Used: 8179228672 (7.62 GB)
> DFS Used%: 7.43%
> Under replicated blocks: 0
> Blocks with corrupt replicas: 0
> Missing blocks: 0
>
> -------------------------------------------------
> Datanodes available: 1 (1 total, 0 dead)
>
> Name: 172.17.7.83:50010
> Decommission Status : Normal
> Configured Capacity: 116258406400 (108.27 GB)
> DFS Used: 8179228672 (7.62 GB)
> Non DFS Used: 6102495232 (5.68 GB)
> DFS Remaining: 101976682496(94.97 GB)
> DFS Used%: 7.04%
> DFS Remaining%: 87.72%
> Last contact: Wed Apr 25 12:52:19 PDT 2012
>
> Thanks !
>
> -----Original Message-----
> From: Harsh J [mailto:[email protected]]
> Sent: Wednesday, April 25, 2012 3:42 PM
> To: [email protected]
> Subject: Re: No Space left on device
>
> Ramesh,
>
> That explains it then.
>
> Going from Map to Reduce requires disk storage worth at least the amount of 
> data you're gonna be sending between them. If you're running your 'cluster' 
> on a single machine, the answer to your question is yes.
>
> On Thu, Apr 26, 2012 at 1:01 AM, Nuthalapati, Ramesh 
> <[email protected]> wrote:
>> I have a lot of space available:
>>
>> Filesystem            Size  Used Avail Use% Mounted on
>> /dev/mapper/sysvg-opt
>>                       14G  1.2G   12G   9% /opt
>>
>> My input files are around 10G. Is there a requirement that the hadoop tmp
>> dir should be a certain % of the input size, or something like that?
>>
>> Thanks !
>>
>> -----Original Message-----
>> From: Harsh J [mailto:[email protected]]
>> Sent: Wednesday, April 25, 2012 3:19 PM
>> To: [email protected]
>> Subject: Re: No Space left on device
>>
>> This is from your mapred.local.dir (which by default may reuse 
>> hadoop.tmp.dir).
>>
>> Do you see free space available when you do the following?:
>> df -h /opt/hadoop
>>
>> On Thu, Apr 26, 2012 at 12:43 AM, Nuthalapati, Ramesh 
>> <[email protected]> wrote:
>>> Strangely, I see the tmp folder has enough space. What else could be the
>>> problem? How much should my tmp space be?
>>>
>>>
>>> Error: java.io.IOException: No space left on device
>>>        at java.io.FileOutputStream.writeBytes(Native Method)
>>>        at java.io.FileOutputStream.write(FileOutputStream.java:260)
>>>        at org.apache.hadoop.fs.RawLocalFileSystem$LocalFSFileOutputStream.write(RawLocalFileSystem.java:190)
>>>        at java.io.BufferedOutputStream.flushBuffer(BufferedOutputStream.java:65)
>>>        at java.io.BufferedOutputStream.write(BufferedOutputStream.java:109)
>>>        at org.apache.hadoop.fs.FSDataOutputStream$PositionCache.write(FSDataOutputStream.java:49)
>>>        at java.io.DataOutputStream.write(DataOutputStream.java:90)
>>>        at org.apache.hadoop.mapred.IFileOutputStream.write(IFileOutputStream.java:84)
>>>        at org.apache.hadoop.fs.FSDataOutputStream$PositionCache.write(FSDataOutputStream.java:49)
>>>        at java.io.DataOutputStream.write(DataOutputStream.java:90)
>>>        at org.apache.hadoop.mapred.IFile$Writer.append(IFile.java:218)
>>>        at org.apache.hadoop.mapred.Merger.writeFile(Merger.java:157)
>>>        at org.apache.hadoop.mapred.ReduceTask$ReduceCopier$LocalFSMerger.run(ReduceTask.java:2454)
>>>
>>> java.io.IOException: Task: attempt_201204240741_0003_r_000000_1 - The reduce copier failed
>>>        at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:380)
>>>        at org.apache.hadoop.mapred.Child.main(Child.java:170)
>>> Caused by: org.apache.hadoop.util.DiskChecker$DiskErrorException: Could not find any valid local directory for file:/opt/hadoop/tmp/hadoop-hadoop/mapred/local/taskTracker/jobcache/job_201204240741_0003/attempt_201204240741_0003_r_000000_1/output/map_122.out
>>>        at org.apache.hadoop.fs.LocalDirAllocator$AllocatorPerContext.getLocalPathForWrite(LocalDirAllocator.java:343)
>>>        at org.apache.hadoop.fs.LocalDirAllocator.getLocalPathForWrite(LocalDirAllocator.java:124)
>>>        at org.apache.hadoop.mapred.ReduceTask$ReduceCopier$LocalFSMerger.run(ReduceTask.java:2434)
>>>
>>>
>>
>>
>>
>> --
>> Harsh J
>
>
>
> --
> Harsh J



-- 
Harsh J
