Thanks, Bob.  Comments inline.

On Mon, Sep 13, 2021 at 8:06 PM Bob Metelsky <[email protected]> wrote:

> I think there are some fsck queries you can run where it will show the
> full path and then (MISSING) you can find that with google pretty easy.
>
> Think of it… the namenode has to keep track where all the blocks are
> something like hostname/path that’s the job of the nn
>

The command "hdfs fsck /path/to/file.txt -files -blocks -locations" shows
the datanode IP address/port but not the local filesystem path on that
datanode.
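
For what it's worth, here is a minimal Python sketch of pulling the block ID and
datanode addresses out of one block line of that fsck output. The sample line
is made up for illustration (IDs, addresses, and storage IDs are hypothetical),
and the exact layout may differ between Hadoop versions, so treat the regular
expressions as assumptions rather than a stable format. The helper names
parse_block_line and block_file_name are mine, not part of any Hadoop API.

```python
import re

# Hypothetical sample of one block line from
# "hdfs fsck /path/to/file.txt -files -blocks -locations".
# All IDs and addresses are made up; the layout varies between Hadoop versions.
SAMPLE_LINE = (
    "0. BP-1111111111-10.0.0.1-1600000000000:blk_1073741825_1001 len=1234 "
    "Live_repl=2  [DatanodeInfoWithStorage[10.0.0.2:9866,DS-aaaa,DISK], "
    "DatanodeInfoWithStorage[10.0.0.3:9866,DS-bbbb,DISK]]"
)

# Block pool ID, block ID, and generation stamp.
BLOCK_RE = re.compile(r"(BP-[\w.\-]+):blk_(-?\d+)_(\d+)")
# Host and port of each datanode that holds a replica.
DATANODE_RE = re.compile(r"DatanodeInfoWithStorage\[([^:\],]+):(\d+)")


def parse_block_line(line):
    """Return (block_pool, block_id, [(host, port), ...]), or None if no block is found."""
    match = BLOCK_RE.search(line)
    if match is None:
        return None
    block_pool = match.group(1)
    block_id = int(match.group(2))
    datanodes = [(host, int(port)) for host, port in DATANODE_RE.findall(line)]
    return block_pool, block_id, datanodes


def block_file_name(block_id):
    """Name of the block's data file on a datanode's local disk."""
    return f"blk_{block_id}"


if __name__ == "__main__":
    pool, block_id, datanodes = parse_block_line(SAMPLE_LINE)
    print(pool, block_file_name(block_id), datanodes)
```

With the block ID in hand, searching each reported datanode's
dfs.datanode.data.dir directories for a file named blk_<id> (for example with
find) should turn up the local path, since the fsck output itself does not
include it.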

Are you sure the NameNode actually knows those paths?  The code suggests
that it doesn't:
https://github.com/apache/hadoop/blob/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/NameNode.java#L203-L208

 * The NameNode controls two critical tables:
 *   1)  filename{@literal ->}blocksequence (namespace)
 *   2)  block{@literal ->}machinelist ("inodes")
 *
 * The first table is stored on disk and is very precious.
 * The second table is rebuilt every time the NameNode comes up.
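
As I read that comment, the two tables look roughly like the toy Python model
below. This is only an illustration with hypothetical names and values, not how
the NameNode is actually implemented. The point is that the second table maps a
block to machines, and nothing in either table records a local filesystem path.

```python
# Toy model of the two tables described in the comment above.
# All names and values are hypothetical.

# Table 1 (namespace): file name -> ordered sequence of block IDs.
# This is the table that is stored on disk.
namespace = {
    "/path/to/file.txt": ["blk_1", "blk_2"],
}

# Table 2: block ID -> machines holding a replica.
# This is the table that is rebuilt every time the NameNode comes up.
block_locations = {
    "blk_1": ["10.0.0.2:9866", "10.0.0.3:9866"],
    "blk_2": ["10.0.0.3:9866"],
}


def locate(path):
    """Return, for each block of the file, the machines that hold a replica."""
    return [(block, block_locations.get(block, [])) for block in namespace[path]]


if __name__ == "__main__":
    for block, machines in locate("/path/to/file.txt"):
        print(block, machines)
```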


> If you can (I would) let all the missing block rebalance, during that time
> try to identify the missing file paths. Once you find that, introduce one
> drive at a time and see…
>

I'll try this on a non-production setup first, and then apply it to the
production cluster if it works.  Thanks.
