[
https://issues.apache.org/jira/browse/HADOOP-6664?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12859044#action_12859044
]
Edward Capriolo commented on HADOOP-6664:
-----------------------------------------
If I understand correctly, the docs under "current" are based on the current
stable release, 0.20.2. Current stable does not use fs.inmemory.size.mb, yet
http://hadoop.apache.org/common/docs/current/cluster_setup.html still lists it
under "Real-World Cluster Configurations":
{noformat}
conf/core-site.xml    fs.inmemory.size.mb    200    Larger amount of memory allocated for the
                                                    in-memory file-system used to merge
                                                    map-outputs at the reduces.
{noformat}
As for io.sort.factor and io.sort.mb, they both appear in mapred-default.xml:
{noformat}
[edw...@ec src]$ grep -R "io.sort.factor" */*.xml
mapred/mapred-default.xml: <name>io.sort.factor</name>
{noformat}
They should be in core-default.xml (only), or in both core-default.xml and
mapred-default.xml.
Think about the end user. An end user might read a blog that states,
"io.sort.factor is a magic tunable; set it to XXXX for awesome performance".
Which file should the end user put this variable in?
{noformat}
grep -R "io.sort.factor" */*.xml
mapred/mapred-default.xml: <name>io.sort.factor</name>
{noformat}
The end user thinks, "Since I found this variable in mapred-default.xml, it
makes sense that I should override it in mapred-site.xml."
The user puts the variable in the wrong place, because the end user has no
(easy) way of knowing that SequenceFile uses io.sort.factor or io.sort.mb. Does
that make sense?
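The lookup the end user performs with grep can be sketched in Python as below.
This is only an illustration: the helper name files_defining and the src_root
argument are hypothetical, and the only assumption about layout is the one the
grep above already makes (files matching */*-default.xml under the source
root). It reports where a property is declared, not which classes read it, and
that is exactly the gap described above.

```python
import xml.etree.ElementTree as ET
from pathlib import Path


def files_defining(prop, src_root):
    """Return the */*-default.xml files under src_root that declare `prop`.

    Hypothetical helper: it only inspects <name> elements, the same
    information `grep -R "<prop>" */*.xml` would surface.
    """
    hits = []
    for path in sorted(Path(src_root).glob("*/*-default.xml")):
        root = ET.parse(path).getroot()
        # Collect every property name declared in this defaults file.
        names = {n.text.strip() for n in root.iter("name") if n.text}
        if prop in names:
            hits.append(path)
    return hits


# Example (src_root is whatever directory holds core/, hdfs/, mapred/):
#   files_defining("io.sort.factor", "src")
#   files_defining("fs.inmemory.size.mb", "src")
```

An empty result means the property is not declared in any defaults file, which
is what the grep in the issue description shows for fs.inmemory.size.mb.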
> fs.inmemory.size.mb not listed in conf. Cluster setup page gives wrong advice.
> ------------------------------------------------------------------------------
>
> Key: HADOOP-6664
> URL: https://issues.apache.org/jira/browse/HADOOP-6664
> Project: Hadoop Common
> Issue Type: Task
> Components: conf, documentation
> Affects Versions: 0.20.2
> Reporter: Edward Capriolo
>
> http://hadoop.apache.org/common/docs/current/cluster_setup.html
> fs.inmemory.size.mb does not appear in any xml file
> {noformat}
> grep "fs.inmemory.size.mb" ./mapred/mapred-default.xml
> [edw...@ec src]$ grep "fs.inmemory.size.mb" ./hdfs/hdfs-default.xml
> [edw...@ec src]$ grep "fs.inmemory.size.mb" ./core/core-default.xml
> {noformat}
> http://hadoop.apache.org/common/docs/current/cluster_setup.html
> Documentation error:
> Real-World Cluster Configurations
> {noformat}
> conf/core-site.xml    io.sort.factor    100    More streams merged at once while sorting files.
> conf/core-site.xml    io.sort.mb        200    Higher memory-limit while sorting data.
> {noformat}
> core --- io.sort.factor -- should be mapred
> core --- io.sort.mb -- should be mapred