On Thu, Mar 04, 2021 at 01:09:55PM +0100, Michael Ströder wrote:
> On 3/4/21 12:20 PM, Ondřej Kuzník wrote:
>> If it takes 1 second to replicate a change and a previous change
>> happened x seconds before this one there's going to be a window of 1
>> second where you see an x second CSN difference between the provider and
>> consumer. In no way does it mean the consumer is x seconds behind.
>
> I'm talking about the contextCSN difference being visible for several
> *hours* while the changes have been already successfully replicated.
> Replication delay is very short, syncrepl type is refreshAndPersist.

Don't think I've ever seen this outside slapcat (only checkpoints affect
the on-disk version). Please submit a bug if you can replicate this.

>> If there's an acceptable delay of n seconds, you better wait for that
>> amount of time before raising an alarm,
>
> And what's an appropriate value for n? 86400? ;-]

Depends where in the galaxy you place your replicas :)

>> See the logic in syncmonitor[0]
>
> Ideally I'd like to query cn=monitor whether slapd thinks replication is
> in a healthy state.

Consumer will never think its replication is slow/broken (unless it gets
an actual error and you can already see that in cn=monitor). Provider
might want to expose some information but that's not implemented yet and
will not be able to spot many issues if other providers exist.

-- 
Ondřej Kuzník
Senior Software Engineer
Symas Corporation                       http://www.symas.com
Packaged, certified, and supported LDAP solutions powered by OpenLDAP
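
To illustrate the "wait n seconds before raising an alarm" point, here is a
minimal sketch of one way such a check could look. It is not the logic from
syncmonitor, and it does not talk to slapd: the names parse_csn and
ReplicationWatch are hypothetical, and the caller is assumed to have already
read the contextCSN values from the provider and the consumer. It only
assumes the usual CSN layout (timestamp#count#sid#mod, with the SID in hex).
The idea is to remember when each provider CSN was first observed and to
report a SID only if the consumer still has not reached that CSN after the
grace period, rather than comparing the timestamps inside the CSNs.

```python
from datetime import datetime, timezone


def parse_csn(csn):
    """Split 'YYYYmmddHHMMSS.uuuuuuZ#count#sid#mod' into (sid, sort_key)."""
    stamp, count, sid, mod = csn.split("#")
    when = datetime.strptime(stamp, "%Y%m%d%H%M%S.%fZ").replace(tzinfo=timezone.utc)
    return int(sid, 16), (when, int(count, 16), int(mod, 16))


class ReplicationWatch:
    """Hypothetical helper: flags a SID only after a grace period has passed."""

    def __init__(self, grace_seconds):
        self.grace = grace_seconds
        self.pending = {}  # sid -> [(first_seen, sort_key), ...], oldest first

    def observe(self, provider_csns, consumer_csns, now):
        """Return the SIDs whose provider CSN was first seen at least
        grace_seconds ago and has still not been reached by the consumer."""
        consumer = dict(parse_csn(c) for c in consumer_csns)
        overdue = set()
        for sid, key in (parse_csn(c) for c in provider_csns):
            queue = self.pending.setdefault(sid, [])
            # Record the time at which we first noticed this provider CSN.
            if not queue or queue[-1][1] < key:
                queue.append((now, key))
            # Forget every CSN the consumer has already caught up with.
            have = consumer.get(sid)
            queue[:] = [(t, k) for t, k in queue if have is None or have < k]
            if queue and now - queue[0][0] >= self.grace:
                overdue.add(sid)
        return overdue
```

With a grace period of 5 seconds, a consumer whose contextCSN is an hour
older than the provider's is not reported on the first poll. It is only
reported once the same gap has been observed for 5 seconds, and the report
clears as soon as the consumer's CSN catches up.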
