[jira] Commented: (HADOOP-6664) fs.inmemory.size.mb not listed in conf. Cluster setup page gives wrong advice.

Edward Capriolo (JIRA) Tue, 20 Apr 2010 13:03:14 -0700

    [ https://issues.apache.org/jira/browse/HADOOP-6664?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12859044#action_12859044 ]


Edward Capriolo commented on HADOOP-6664:
-----------------------------------------

If I understand correctly, the docs under "current" are based on the current 
stable release, 0.20.2.  Current stable does not use fs.inmemory.size.mb.

Yet http://hadoop.apache.org/common/docs/current/cluster_setup.html still 
lists it under "Real-World Cluster Configurations":

{noformat}
conf/core-site.xml      fs.inmemory.size.mb     200      Larger amount of 
memory allocated for the in-memory file-system used to merge map-outputs at the 
reduces. 
{noformat}

As for io.sort.factor and io.sort.mb, both appear in mapred-default.xml:
{noformat}
[edw...@ec src]$ grep -R "io.sort.factor" */*.xml
mapred/mapred-default.xml:  <name>io.sort.factor</name>
{noformat}

They should be in core-default.xml (only), or in both core-default.xml and 
mapred-default.xml.

Think about the end user. An end user might read a blog that states, 
"io.sort.factor is a magic tunable; set it to XXXX for awesome performance". 
Which file should the end user put this variable in?

{noformat}
grep -R "io.sort.factor" */*.xml    
mapred/mapred-default.xml:  <name>io.sort.factor</name>
{noformat}

The end user thinks, "Since I found this variable in mapred-default.xml, it 
makes sense that I should override it in mapred-site.xml."

The user puts the variable in the wrong place, because the end user has no 
(easy) way of knowing that SequenceFile uses io.sort.factor or io.sort.mb. 
Does that make sense?
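
For reference, the grep check above can be extended to cover all three 
properties at once and report which default file declares each one. This is 
only a minimal sketch: it assumes the source-tree layout seen in the grep 
output (core/, hdfs/ and mapred/ directories, each holding a *-default.xml), 
and the function name where_declared is made up for illustration.

```shell
#!/bin/sh
# where_declared SRC_DIR PROPERTY...
# Hypothetical helper (not part of Hadoop): for each property, print the
# *-default.xml files under SRC_DIR that declare it, or "not found".
where_declared() {
    src_dir=$1
    shift
    for prop in "$@"; do
        # -F matches the name literally (dots are not wildcards);
        # -l prints only the names of the files that match.
        files=$(grep -lF "<name>$prop</name>" "$src_dir"/*/*-default.xml 2>/dev/null)
        if [ -n "$files" ]; then
            # Unquoted $files joins multiple file names onto one line.
            printf '%s: %s\n' "$prop" "$(echo $files)"
        else
            printf '%s: not found\n' "$prop"
        fi
    done
}
```

Run from the src directory as 
"where_declared . io.sort.factor io.sort.mb fs.inmemory.size.mb". Going by the 
grep results in this issue, it should point at mapred/mapred-default.xml for 
the two io.sort properties and report fs.inmemory.size.mb as not found.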


> fs.inmemory.size.mb not listed in conf. Cluster setup page gives wrong advice.
> ------------------------------------------------------------------------------
>
>                 Key: HADOOP-6664
>                 URL: https://issues.apache.org/jira/browse/HADOOP-6664
>             Project: Hadoop Common
>          Issue Type: Task
>          Components: conf, documentation
>    Affects Versions: 0.20.2
>            Reporter: Edward Capriolo
>
> http://hadoop.apache.org/common/docs/current/cluster_setup.html
> fs.inmemory.size.mb does not appear in any xml file
> {noformat}
> grep "fs.inmemory.size.mb" ./mapred/mapred-default.xml 
> [edw...@ec src]$ grep "fs.inmemory.size.mb" ./hdfs/hdfs-default.xml 
> [edw...@ec src]$ grep "fs.inmemory.size.mb" ./core/core-default.xml 
> {noformat}
> http://hadoop.apache.org/common/docs/current/cluster_setup.html
> Documentation error:
> Real-World Cluster Configurations
> {noformat}
> conf/core-site.xml    io.sort.factor          100     More streams merged at 
> once while sorting files.
> conf/core-site.xml    io.sort.mb      200     Higher memory-limit while 
> sorting data.
> {noformat}
> core --- io.sort.factor -- should be mapred
> core --- io.sort.mb -- should be mapred

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
