Just pinging to see if anyone has any insight here?

On Mon, May 13, 2019 at 10:31 PM Lars Francke <[email protected]> wrote:
> Hi,
>
> I'm working with a few clusters of 100+ nodes and I've been wondering how
> exactly failover, as well as a cold start, works with respect to
> block reports.
>
> I sometimes see failover times of 15-45 minutes, spent waiting in safe mode
> for all blocks to report in.
>
> Datanodes usually send a report every six hours, I believe, so there must
> be something else going on.
>
> How are Datanodes informed of the new Namenode?
> How do they know that they should send a full block report (assuming this
> is what happens)?
> -> I assume the answer to both lies in Heartbeats?
>
> Are there any guidelines on how long recovery should take, and are there
> any options that can be used to decrease the time?
>
> Thank you!
>
