Mircea Baciu wrote:
> Hi,
> 
> I have an issue with a consumer replication starting to fail until OpenLDAP 
> is restarted.
> 
> My setup consists of a pair of on-prem MirrorMode replicated providers (only
> one is active at a given time using a virtual IP managed by Keepalived), and
> one off-site (AWS) consumer. The providers use a dedicated port (LDAPS on
> 1636) for their own replication, as well as for the consumer to connect to
> them, so the consumer has access to both servers, regardless of where the
> providers' virtual IP is residing.
> 
> All the connections happen over LDAPS, and the syncrepl configs have the 
> tls_reqcert=allow option.
> 
> The providers are always in sync and I'm able to make one or the other the
> "active" one with ease. The consumer does the initial sync and stays in sync
> for a while, but I often (almost daily) find it out of sync. I see error
> messages on both the consumer and provider side:

Sounds like an issue in the TLS layer. You should increase the debug level on
both provider and consumer to see if there are any TLS-specific error messages
being generated. If you have cn=monitor configured you can set the debuglevel
using ldapmodify, so no need to restart the servers for it to take effect.
That'll let you see the problem as it's occurring.
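
As a rough sketch of the cn=monitor approach, the LDIF below adds two log
switches to the Log subsystem entry of back-monitor. It assumes the monitor
database is enabled and that the identity you bind as has write access to it;
the file name and the ldapmodify connection options in the comment are only
placeholders. Which switches actually surface the TLS errors is also an
assumption: Conns and Sync are a reasonable starting point, and Trace is far
more verbose if those show nothing.

```ldif
# loglevel.ldif (hypothetical file name)
# Apply with something like:
#   ldapmodify -H ldapi:/// -Y EXTERNAL -f loglevel.ldif
# Adjust the URI and bind options to match your own setup.
dn: cn=Log,cn=Monitor
changetype: modify
add: managedInfo
managedInfo: Conns
managedInfo: Sync
```

Run it on the consumer and on both providers. Once you have captured a failure,
the same entry can be modified again with "delete: managedInfo" and the same
values to return to the previous logging.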
> 
> On the consumer (every minute):
> Sep 20 08:19:31 <consumer> slapd[1440]: slap_client_connect: 
> URI=ldaps://<provider1>:1636/ 
> DN="uid=replication,ou=sysaccounts,dc=example,dc=com"
> ldap_sasl_bind_s failed (-1)
> Sep 20 08:19:31 <consumer> slapd[1440]: do_syncrepl: rid=001 rc -1 retrying
> Sep 20 08:19:31 <consumer> slapd[1440]: slap_client_connect: 
> URI=ldaps://<provider2>:1636/ 
> DN="uid=replication,ou=sysaccounts,dc=example,dc=com"
> ldap_sasl_bind_s failed (-1)
> Sep 20 08:19:31 <consumer> slapd[1440]: do_syncrepl: rid=002 rc -1 retrying
> Sep 20 08:20:31 <consumer> slapd[1440]: slap_client_connect: 
> URI=ldaps://<provider1>:1636/ 
> DN="uid=replication,ou=sysaccounts,dc=example,dc=com"
> ldap_sasl_bind_s failed (-1)
> Sep 20 08:20:31 <consumer> slapd[1440]: do_syncrepl: rid=001 rc -1 retrying
> Sep 20 08:20:31 <consumer> slapd[1440]: slap_client_connect: 
> URI=ldaps://<provider2>:1636/ 
> DN="uid=replication,ou=sysaccounts,dc=example,dc=com"
> ldap_sasl_bind_s failed (-1)
> Sep 20 08:20:31 <consumer> slapd[1440]: do_syncrepl: rid=002 rc -1 retrying
> 
> On the provider (every minute):
> Sep 20 08:19:31 <provider1> slapd[1057]: conn=11242 fd=14 ACCEPT from 
> IP=<consumer IP>:45438 (IP=<provider1 IP>:1636)
> Sep 20 08:19:31 <provider1> slapd[1057]: conn=11242 fd=14 TLS established 
> tls_ssf=256 ssf=256
> Sep 20 08:19:31 <provider1> slapd[1057]: conn=11242 fd=14 closed (connection 
> lost)
> Sep 20 08:20:31 <provider1> slapd[1057]: conn=11243 fd=14 ACCEPT from 
> IP=<consumer IP>:45458 (IP=<provider1 IP>:1636)
> Sep 20 08:20:31 <provider1> slapd[1057]: conn=11243 fd=14 TLS established 
> tls_ssf=256 ssf=256
> Sep 20 08:20:31 <provider1> slapd[1057]: conn=11243 fd=14 closed (connection 
> lost)
> 
> Sep 20 08:19:31 <provider2> slapd[1051]: conn=215893 fd=18 ACCEPT from 
> IP=<consumer IP>:41706 (IP=<provider2 IP>:1636)
> Sep 20 08:19:31 <provider2> slapd[1051]: conn=215893 fd=18 TLS established 
> tls_ssf=256 ssf=256
> Sep 20 08:19:31 <provider2> slapd[1051]: conn=215893 fd=18 closed (connection 
> lost)
> Sep 20 08:20:31 <provider2> slapd[1051]: conn=215898 fd=18 ACCEPT from 
> IP=<consumer IP>:41726 (IP=<provider2 IP>:1636)
> Sep 20 08:20:31 <provider2> slapd[1051]: conn=215898 fd=18 TLS established 
> tls_ssf=256 ssf=256
> Sep 20 08:20:31 <provider2> slapd[1051]: conn=215898 fd=18 closed (connection 
> lost)
> 
> There must be something wrong on the consumer side since when the issue 
> starts, the consumer is not able to connect to either provider.
> 
> Once I restart the consumer, it quickly resyncs and works just fine, for a 
> while.
> 
> The providers are OpenLDAP 2.4.44 (openldap-2.4.44-24.el7_9.x86_64), running 
> on RHEL 7.
> The consumer is OpenLDAP 2.4.44 (openldap-2.4.44-24.el7_9.x86_64), running on 
> CentOS 7.
> 
> The consumer syncrepl config is:
> olcSyncrepl: {0}rid=001
>   provider=ldaps://<provider1>:1636/
>   searchbase="dc=example,dc=com"
>   type=refreshAndPersist
>   retry="60 +"
>   timeout=1
>   bindmethod=simple
>   binddn="uid=replication,ou=SysAccounts,dc=example,dc=com"
>   credentials=<credentials>
>   tls_reqcert=allow
> olcSyncrepl: {1}rid=002
>   provider=ldaps://<provider2>:1636/
>   searchbase="dc=example,dc=com"
>   type=refreshAndPersist
>   retry="60 +"
>   timeout=1
>   bindmethod=simple
>   binddn="uid=replication,ou=SysAccounts,dc=example,dc=com"
>   credentials=<credentials>
>   tls_reqcert=allow
> 
> The "uid=replication,ou=SysAccounts,dc=example,dc=com" DN has full read-only 
> permissions for the entire "dc=example,dc=com" tree.
> 
> Any idea on what might be my issue here?
> 
> Thank you,
> Mircea
> --
> Mircea Baciu | Senior Unix Systems Administrator
> Simmons University | 300 The Fenway | Boston, MA 02115 | 617-521-2194


-- 
  -- Howard Chu
  CTO, Symas Corp.           http://www.symas.com
  Director, Highland Sun     http://highlandsun.com/hyc/
  Chief Architect, OpenLDAP  http://www.openldap.org/project/
