We use snapshots in the cluster, but I've not seen any snapshot folders 
underneath the folder in question. I'd need to verify with the application team 
if snapshots for this folder are available anywhere.
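
A minimal sketch of how that check could look, assuming the standard HDFS snapshot commands are available to the user running it. The function name check_snapshots is only illustrative. The .snapshot directory is hidden from ordinary listings and exists only on the snapshottable directory itself, so if an ancestor of the folder is the snapshottable one, nothing would show up underneath the folder.

```shell
# Hypothetical helper: look for snapshots that could affect a directory.
check_snapshots() {
  dir=$1
  # List every snapshottable directory visible to the current user,
  # which also reveals snapshottable ancestors of "$dir".
  hdfs lsSnapshottableDir
  # .snapshot never appears in a normal -ls; it has to be addressed explicitly.
  # This fails with "No such file or directory" if "$dir" is not snapshottable.
  hdfs dfs -ls "$dir/.snapshot"
}
```

Usage would be check_snapshots /randomFolder, with the fake path replaced by the real one.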

________________________________
From: Arpit Agarwal <[email protected]>
Sent: Friday, November 8, 2019 11:41:31 AM
To: David M <[email protected]>
Cc: [email protected] <[email protected]>
Subject: Re: HDFS du Utility Inconsistencies?

Got any snapshots?

On Fri, Nov 8, 2019, 09:38 David M <[email protected]> wrote:

All,

I'm working on a cluster running Hadoop 2.7.3. For one folder in particular, 
the command hdfs dfs -du gives me inconsistent results. If I query the folder 
with -s for a summary, it reports 10.6 GB. Without -s, the sizes of the folders 
underneath add up to less than 1 GB.

I've verified that this persists over time and that the results are the same 
whether I run the command as the hdfs user or as any other user. We are on an 
HDP cluster, so we use Ranger for HDFS security and Kerberos for 
authentication. We see a similar discrepancy with -count, where both the sizes 
and the counts differ. We have not seen this behavior in any other folder.

Sample output is below. I've replaced the real path with a fake one to protect 
the data on the cluster. Does anyone know what would cause this behavior? 
Thanks!

$ hdfs dfs -du -h /randomFolder
119.9 M  /randomFolder/bug
1.0 M    /randomFolder/commitment
86.8 K   /randomFolder/customfield
31.3 M   /randomFolder/epic
10.3 M   /randomFolder/feature
4.0 M    /randomFolder/insprintbug
372.9 K  /randomFolder/project
15.1 K   /randomFolder/projectstatus
330.9 M  /randomFolder/story
256.3 M  /randomFolder/subtask
74.7 K   /randomFolder/subtemplate
89.6 M   /randomFolder/task
7.4 M    /randomFolder/techdebt
117.7 K  /randomFolder/template
617.9 K  /randomFolder/tempomember
8.2 K    /randomFolder/tempoteam
1.4 M    /randomFolder/tempoworklog

$ hdfs dfs -du -h -s /randomFolder
10.6 G  /randomFolder
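
To put a number on the gap, the per-folder sizes can be summed mechanically. The sketch below is only illustrative: sum_du_h is a made-up helper name, and it assumes the single-size-column layout shown above ("<number> <unit> <path>", or "<number> <path>" for sizes below 1 K). Other Hadoop versions may print additional columns.

```shell
# Sum the sizes in `hdfs dfs -du -h` output read from stdin; print total bytes.
sum_du_h() {
  awk '
    BEGIN { mult["K"] = 1024; mult["M"] = 1024^2; mult["G"] = 1024^3; mult["T"] = 1024^4 }
    # Only lines that start with a number are size lines; prompts are skipped.
    NF >= 2 && $1 ~ /^[0-9.]+$/ {
      if ($2 in mult) total += $1 * mult[$2]   # size with a unit suffix
      else            total += $1              # plain byte count
    }
    END { printf "%.0f\n", total }
  '
}
```

Piping the first listing through it (hdfs dfs -du -h /randomFolder | sum_du_h) gives a total to set against the -s figure. Adding the values above by hand gives roughly 853 MB, against the 10.6 G summary. Running -du without -h and summing the first column would avoid the rounding in the human-readable values.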



David McGinnis

