Do you have a heap dump? Without one, it's not easy to definitively attribute the GC issues to DirectoryScanner.
That said, I did notice DirectoryScanner holding a global lock for quite a few seconds periodically, but that's unrelated to GC.

On Thu, Nov 29, 2018 at 12:56 AM Yen-Onn Hiu <[email protected]> wrote:

> hi all,
>
> I am on hadoop hdfs version of Hadoop 2.6.0-cdh5.8.0. I discovered that
> the DirectoryScanner is keep causing java GC and slow down the hadoop
> nodes. Digging into the log file I discovered this:
>
> 2018-11-29 13:34:37,995 INFO
> org.apache.hadoop.hdfs.server.datanode.DirectoryScanner: BlockPool
> BP-1850109981-192.168.1.1-1413178082983 Total blocks: 3896197, missing
> metadata files:214, missing block files:214, missing blocks in memory:103,
> mismatched blocks:0
>
> Reading from internet posting, there are postings saying this is from
> DirectoryScanner which will be executed in every 6 hours. This directory
> scanning caused GC hiccup in all nodes and caused performance issues on the
> cluster.
>
> Question: when I am doing the hdfs dfsadmin -report. It does not say that
> I have any corrupted files. Also, I did the hdfs fsck / onto the directory
> and it does not yield any problems. How can I know what is the missing
> block files, missing blocks in memory and missing metadata files?
>
>
> Thanks!
>
> --
> Hiu Yen Onn
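
For anyone who wants to track these counters across scanner runs, the summary line quoted above can be pulled apart with a small script. This is only a rough sketch: the regular expression is written against the exact wording of the one log line in this thread, and the function name `parse_scanner_summary` and the dictionary keys are my own hypothetical choices, not anything Hadoop provides. Other Hadoop versions may format the line differently.

```python
import re
from typing import Dict, Optional

# Pattern for the DirectoryScanner summary line as it appears in the quoted
# log. Assumption: the wording matches this thread's sample; other Hadoop
# versions may phrase the line differently.
SUMMARY_RE = re.compile(
    r"BlockPool\s+(?P<block_pool>\S+)\s+"
    r"Total blocks:\s*(?P<total_blocks>\d+),\s*"
    r"missing metadata files:\s*(?P<missing_metadata_files>\d+),\s*"
    r"missing block files:\s*(?P<missing_block_files>\d+),\s*"
    r"missing blocks in memory:\s*(?P<missing_blocks_in_memory>\d+),\s*"
    r"mismatched blocks:\s*(?P<mismatched_blocks>\d+)"
)


def parse_scanner_summary(line: str) -> Optional[Dict[str, object]]:
    """Return the counters from one summary line, or None if it does not match."""
    match = SUMMARY_RE.search(line)
    if match is None:
        return None
    fields = match.groupdict()
    # The block pool ID stays a string; every other field is a count.
    result: Dict[str, object] = {"block_pool": fields.pop("block_pool")}
    result.update({name: int(value) for name, value in fields.items()})
    return result


if __name__ == "__main__":
    # The log line from this thread, joined back onto a single line.
    sample = (
        "2018-11-29 13:34:37,995 INFO "
        "org.apache.hadoop.hdfs.server.datanode.DirectoryScanner: BlockPool "
        "BP-1850109981-192.168.1.1-1413178082983 Total blocks: 3896197, "
        "missing metadata files:214, missing block files:214, "
        "missing blocks in memory:103, mismatched blocks:0"
    )
    print(parse_scanner_summary(sample))
```

Running this over the DataNode log after each scan would show whether the mismatch counts stay flat or grow. It does not identify which blocks are affected.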
