OK, then... either I'm missing something obvious, or no one has any idea about this. Should I create a bug report based on my findings here?

Thanks!

Ildefonso Camargo

On Tue, Apr 19, 2011 at 2:12 PM, Jose Ildefonso Camargo Tolosa <[email protected]> wrote:
> Greetings,
>
> Any comments on this? Can anybody help me verify this possible bug?
>
> Ildefonso.
>
> On Sun, Apr 17, 2011 at 2:24 PM, Jose Ildefonso Camargo Tolosa
> <[email protected]> wrote:
>> Greetings,
>>
>> At first, I was going to create a bug report, but decided to send to
>> the list first. I tried this with both 2.4.23 (Debian package) and
>> 2.4.25, compiled from source, bdb 4.8.
>>
>> After a couple of entries just disappeared on one multi-master setup I
>> had, I decided to investigate further, and found this (there are two
>> cases, for the same procedure):
>>
>> 1. Configure two LDAP servers in a multi-master setup.
>> 2. Make sure they replicate correctly (of course).
>> 3. Shut down one of the two LDAP servers.
>> 4. Create a new entry (say, ou1) on the LDAP server that is left up.
>> 5. Shut down the last LDAP server.
>> 6. Start the *other* LDAP server, the one where you didn't create the entry.
>> 7. Create another entry, say ou2, so that each server has a new
>> entry that is *not* on the other server.
>> 8. Shut down the LDAP server (both servers down now).
>> 9. Start both LDAP servers.
>>
>> Result (case 1): one of the two newly created entries is missing on
>> *one* of the servers, and only one of the entries is missing on the
>> other server.
>>
>> Result (case 2): one entry is missing on *both* servers.
>>
>> Both servers have NTP and the same timezone (i.e., time is synchronized).
>>
>> I'm *not* replicating cn=config (I shouldn't, because I have different
>> SSL certificates on each server).
>>
>> Now, more details:
>>
>> slapd with -d 16384 gives me this on the server that misses both
>> entries. On this server I created the entry dn
>> ou=ou2,dc=st-andes,dc=com (and the server decided to delete it! And,
>> for some reason, it didn't detect the new ou1 entry created on the
>> other server):
>>
>> http://www.st-andes.com/openldap/case1/log-server2-case1.txt
>>
>> The other server (the one that kept one entry and lost the other): on
>> this server I created the entry ou=ou1,dc=st-andes,dc=com, and it says
>> it was changed by peer...:
>>
>> http://www.st-andes.com/openldap/case1/log-server1-case1.txt
>>
>> Now, I'm seeing here that it is using server ID 000... but in
>> cn=config.ldif I have:
>>
>> olcServerID: 1 ldap://ldap.ildetech.com:389/
>> olcServerID: 2 ldap://ldap2.ildetech.com:389/
>>
>> And the syncrepl:
>>
>> olcSyncRepl: rid=001 provider=ldap://ldap.ildetech.com:389
>> binddn="cn=admin,dc=st-andes,dc=com" bindmethod=simple
>> credentials="secret" searchbase="dc=st-andes,dc=com"
>> type=refreshAndPersist retry="3 5 5 +" timeout=7 starttls=critical
>> olcSyncRepl: rid=002 provider=ldap://ldap2.ildetech.com:389
>> binddn="cn=admin,dc=st-andes,dc=com" bindmethod=simple
>> credentials="secret" searchbase="dc=st-andes,dc=com"
>> type=refreshAndPersist retry="3 5 5 +" timeout=7 starttls=critical
>> olcMirrorMode: TRUE
>>
>> And, as you can see on the command line, I have the URL specified in
>> the -h parameter, but it seems to be ignoring it! Or, should I
>> specify the *whole* set of URLs that I put in the -h parameter?
>> (ldap://ldap2.ildetech.com:389 ldap://127.0.0.1:389/ ldaps:///
>> ldapi:///)
>>
>> So, I decided to change the config:
>>
>> On server 1 (kirara):
>>
>> olcServerID: 1
>>
>> and
>>
>> olcSyncRepl: rid=002 provider=ldap://ldap2.ildetech.com:389
>> binddn="cn=admin,dc=st-andes,dc=com" bindmethod=simple
>> credentials="secret" searchbase="dc=st-andes,dc=com"
>> type=refreshAndPersist retry="3 5 5 +" timeout=7 starttls=critical
>> olcMirrorMode: TRUE
>>
>> On server 2 (happy):
>>
>> olcServerID: 2
>>
>> and
>>
>> olcSyncRepl: rid=002 provider=ldap://ldap2.ildetech.com:389
>> binddn="cn=admin,dc=st-andes,dc=com" bindmethod=simple
>> credentials="secret" searchbase="dc=st-andes,dc=com"
>> type=refreshAndPersist retry="3 5 5 +" timeout=7 starttls=critical
>> olcMirrorMode: TRUE
>>
>> With this new setup, and following the same procedure, I get one
>> missing entry on *both* servers (at least the servers get to a
>> consistent state), but I still have a missing entry. The logs for
>> this setup:
>>
>> Server 2 (ID 2, where I created entry ou2 while the other server was
>> down); this server decided, wrongly, to delete entry ou2:
>>
>> http://www.st-andes.com/openldap/case2/log-server2-case2.txt
>>
>> And the other server (where I created ou1):
>>
>> http://www.st-andes.com/openldap/case2/log-server1-case2.txt
>>
>> This one never saw the other entry, ou2.
>>
>> For both cases, the syncprov module was using the default configuration:
>>
>> dn: olcOverlay={0}syncprov
>> objectClass: olcOverlayConfig
>> objectClass: olcSyncProvConfig
>> olcOverlay: {0}syncprov
>> structuralObjectClass: olcSyncProvConfig
>> entryUUID: 24354488-e5bf-102f-9e6a-ad3cba95f7f1
>> creatorsName: cn=config
>> createTimestamp: 20110318152128Z
>> entryCSN: 20110318152128.935227Z#000000#000#000000
>> modifiersName: cn=config
>> modifyTimestamp: 20110318152128Z
>>
>> What do you think?
>>
>> Thanks in advance!
>>
>> Ildefonso Camargo
>>
>
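
For anyone wanting to check the "000 server id" observation against their own data: the server ID can be read directly off a CSN such as the entryCSN quoted above. What follows is only a minimal Python sketch, assuming the usual CSN layout of timestamp#count#sid#mod with hexadecimal count, sid and mod fields. The names parse_csn and newest_by_sid are made up for illustration and are not part of any OpenLDAP tool.

```python
from typing import Dict, Iterable, NamedTuple


class CSN(NamedTuple):
    """Fields of a CSN, assuming the layout timestamp#count#sid#mod."""
    timestamp: str  # e.g. "20110318152128.935227Z"
    count: int      # change counter within the same timestamp
    sid: int        # server ID that stamped the change
    mod: int        # modification counter


def parse_csn(text: str) -> CSN:
    """Split a CSN string into its four '#'-separated fields."""
    parts = text.strip().split("#")
    if len(parts) != 4:
        raise ValueError(f"unexpected CSN format: {text!r}")
    timestamp, count, sid, mod = parts
    # Assumption: count, sid and mod are hexadecimal strings.
    return CSN(timestamp, int(count, 16), int(sid, 16), int(mod, 16))


def newest_by_sid(csns: Iterable[str]) -> Dict[int, CSN]:
    """Group CSNs by server ID and keep the most recent one per ID."""
    newest: Dict[int, CSN] = {}
    for text in csns:
        csn = parse_csn(text)
        key = (csn.timestamp, csn.count, csn.mod)
        current = newest.get(csn.sid)
        if current is None or key > (current.timestamp, current.count, current.mod):
            newest[csn.sid] = csn
    return newest


if __name__ == "__main__":
    # The entryCSN from the syncprov entry quoted above.
    csn = parse_csn("20110318152128.935227Z#000000#000#000000")
    print("server ID:", csn.sid)
```

Feeding it the entryCSN values of ou1 and ou2 as seen on each server (for example from an ldapsearch that requests entryCSN) would show whether both entries were stamped with server ID 0 rather than 1 and 2, which is what the logs above seem to suggest.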
