[
https://issues.apache.org/jira/browse/HADOOP-6857?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14172743#comment-14172743
]
Byron Wong commented on HADOOP-6857:
------------------------------------
*Scenario 1*: "/test" is a snapshottable directory with a file "a" that has 41
bytes, replication factor 3.
We run {{hadoop fs -du /test}}:
{code}
41 123 /test/a
{code}
which is consistent with what we get when we run {{hadoop fs -du -s /test}}:
{code}
41 123 /test
{code}
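The two numeric columns above are consistent with the second column being the file length multiplied by the replication factor (41 x 3 = 123). A minimal Python sketch of that arithmetic follows; the function name {{raw_usage}} is hypothetical and is not part of FsShell.

```python
def raw_usage(length_bytes: int, replication: int) -> int:
    """Raw disk usage of one file: logical length times replication factor."""
    return length_bytes * replication


# File "a" from Scenario 1: 41 bytes, replication factor 3.
print(raw_usage(41, 3))
```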
After we create a snapshot "ss1" and rerun both -du commands, we still get
the same results as above.
Let's say we now run {{hadoop fs -mv /test/a /test/b}}.
Now, when we run {{hadoop fs -du /test}}, we get:
{code}
41 123 /test/b
{code}
which is inconsistent with what we see when we run {{hadoop fs -du -s /test}}:
{code}
41 246 /test
{code}
If we repeat this process (i.e. create a snapshot, then rename /test/b back to
/test/a), the deviation between the two commands keeps growing.
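The inconsistency can be checked mechanically by summing the per-entry lines and comparing them with the -s summary line. The sketch below assumes the three whitespace-separated columns shown in the outputs above (length, raw usage, path); the helper names are hypothetical, and the real output format may differ between Hadoop versions.

```python
def parse_du_line(line):
    """Split one 'length raw_usage path' line into (int, int, str)."""
    length, raw, path = line.split(None, 2)
    return int(length), int(raw), path


def summary_matches(du_lines, summary_line):
    """Return True if the per-entry totals equal the -s summary in both columns."""
    entries = [parse_du_line(line) for line in du_lines]
    total_len = sum(entry[0] for entry in entries)
    total_raw = sum(entry[1] for entry in entries)
    summary_len, summary_raw, _ = parse_du_line(summary_line)
    return (total_len, total_raw) == (summary_len, summary_raw)


# Outputs copied from Scenario 1: before the rename, then after snapshot + rename.
before = summary_matches(["41 123 /test/a"], "41 123 /test")
after = summary_matches(["41 123 /test/b"], "41 246 /test")
print(before, after)
```

With the values from this comment, the check passes for the first pair of outputs and fails for the second, where the summary reports 246 against a per-entry total of 123.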
> FsShell should report raw disk usage including replication factor
> -----------------------------------------------------------------
>
> Key: HADOOP-6857
> URL: https://issues.apache.org/jira/browse/HADOOP-6857
> Project: Hadoop Common
> Issue Type: Improvement
> Components: fs
> Reporter: Alex Kozlov
> Assignee: Byron Wong
> Attachments: HADOOP-6857.patch, show-space-consumed.txt
>
>
> Currently FsShell reports HDFS usage with the "hadoop fs -dus <path>" command.
> Since the replication level is set per file, it would be nice to add raw disk
> usage including the replication factor (maybe "hadoop fs -dus -raw <path>"?).
> This would allow resource usage to be assessed more accurately. -- Alex K