On 04/09/2015 07:51 AM, Martin Kosek wrote:
On 04/09/2015 05:59 AM, Alexander Frolushkin wrote:
-----Original Message-----
From: thierry bordaz [mailto:[email protected]]
Sent: Wednesday, April 08, 2015 6:36 PM
To: Alexander Frolushkin (SIB)
Cc: 'Ludwig Krispenz'; Martin Kosek; [email protected]
Subject: Re: [Freeipa-users] Accident upgrade 3.3 to 4.1
On 04/08/2015 02:19 PM, Alexander Frolushkin wrote:
On one of the accidentally upgraded servers I see the following errors in the dirsrv logs:
[08/Apr/2015:13:24:12 +0300] connection - conn=1095 fd=131 Incoming BER Element
was too long, max allowable is 209715200 bytes. Change the nsslapd-maxbersize
attribute in cn=config to increase.
[08/Apr/2015:13:24:12 +0300] connection - conn=1094 fd=124 Incoming BER Element
was too long, max allowable is 209715200 bytes. Change the nsslapd-maxbersize
attribute in cn=config to increase.
[08/Apr/2015:13:24:12 +0300] connection - conn=1096 fd=124 Incoming BER Element
was too long, max allowable is 209715200 bytes. Change the nsslapd-maxbersize
attribute in cn=config to increase.
[08/Apr/2015:13:24:12 +0300] connection - conn=1097 fd=131 Incoming BER Element
was too long, max allowable is 209715200 bytes. Change the nsslapd-maxbersize
attribute in cn=config to increase.
This message is logged when a received message is too large. But here the maximum
size is 200 MB.
I cannot imagine such a large message.
Since they were all logged in the same second, it could be a transient error. Have you
seen other messages like these?
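A possible way to see how a 200 MB limit can be hit without anyone sending 200 MB: a BER element announces its own length in its first few octets. The sketch below assumes the server compares that declared length against nsslapd-maxbersize (an assumption, not something confirmed in this thread) and that the tag fits in one octet. The header bytes are hypothetical.

```python
MAX_BER_SIZE = 209715200  # limit quoted in the error log, in bytes


def declared_ber_length(header: bytes) -> int:
    """Return the content length declared by a BER header.

    Assumes a single tag octet followed by the length octets.
    """
    if len(header) < 2:
        raise ValueError("need a tag octet and at least one length octet")
    first = header[1]
    if first < 0x80:
        return first  # short form: the octet is the length itself
    count = first & 0x7F  # long form: number of length octets that follow
    if count == 0:
        raise ValueError("indefinite length is not handled in this sketch")
    if len(header) < 2 + count:
        raise ValueError("truncated length octets")
    return int.from_bytes(header[2:2 + count], "big")


# The limit in the log is exactly 200 MiB.
assert MAX_BER_SIZE == 200 * 1024 * 1024

# Hypothetical six-byte header: tag 0x30, long form with four length octets.
bogus = bytes([0x30, 0x84, 0x7F, 0xFF, 0xFF, 0xFF])
print(declared_ber_length(bogus) > MAX_BER_SIZE)
```

Under those assumptions, six bytes of non-LDAP data would be enough to exceed the limit.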
Yes, it is still there.
[08/Apr/2015:14:55:01 +0300] connection - conn=1125 fd=130 Incoming BER Element
was too long, max allowable is 209715200 bytes. Change the nsslapd-maxbersize
attribute in cn=config to increase.
[08/Apr/2015:14:55:01 +0300] connection - conn=1124 fd=126 Incoming BER Element
was too long, max allowable is 209715200 bytes. Change the nsslapd-maxbersize
attribute in cn=config to increase.
[08/Apr/2015:14:55:01 +0300] connection - conn=1126 fd=126 Incoming BER Element
was too long, max allowable is 209715200 bytes. Change the nsslapd-maxbersize
attribute in cn=config to increase.
Those logs mean the connection (e.g. conn=1125) got closed.
Would you grep for conn=1125 in the access log?
[08/Apr/2015:14:55:00 +0300] conn=1125 fd=130 slot=130 connection from
10.99.111.42 to 10.163.129.91
[08/Apr/2015:14:55:00 +0300] conn=1125 op=0 SRCH base="" scope=0
filter="(objectClass=*)" attrs="subschemaSubentry dsservicename namingContexts
defaultnamingcontext schemanamingcontext configurationnamingcontext
rootdomainnamingcontext supportedControl supportedLDAPVersion
supportedldappolicies supportedSASLMechanisms dnshostname ldapservicename servername
supportedcapabilities"
[08/Apr/2015:14:55:00 +0300] conn=1125 op=0 RESULT err=0 tag=101 nentries=1
etime=0
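The check on the excerpt above can also be scripted. This is a minimal sketch, not an official tool: it collects all records for one connection, extracts the client address, and looks for a closure record. It assumes closure records contain the word "closed"; the helper names are made up for illustration, and the attribute list in the sample is shortened.

```python
import re
from typing import List, Optional

# Access log records quoted above, with wrapped lines joined
# (the attrs list is shortened here).
ACCESS_LOG = """\
[08/Apr/2015:14:55:00 +0300] conn=1125 fd=130 slot=130 connection from 10.99.111.42 to 10.163.129.91
[08/Apr/2015:14:55:00 +0300] conn=1125 op=0 SRCH base="" scope=0 filter="(objectClass=*)" attrs="subschemaSubentry ..."
[08/Apr/2015:14:55:00 +0300] conn=1125 op=0 RESULT err=0 tag=101 nentries=1 etime=0
"""


def records_for(conn_id: int, log_text: str) -> List[str]:
    """Return the log lines that belong to the given connection number."""
    pattern = re.compile(rf"\bconn={conn_id}\b")
    return [line for line in log_text.splitlines() if pattern.search(line)]


def client_address(records: List[str]) -> Optional[str]:
    """Return the client IP from the 'connection from' record, if present."""
    for line in records:
        match = re.search(r"connection from (\S+) to (\S+)", line)
        if match:
            return match.group(1)
    return None


def has_closure(records: List[str]) -> bool:
    """Assumption: a closure record contains the word 'closed'."""
    return any(" closed" in line for line in records)


records = records_for(1125, ACCESS_LOG)
print(client_address(records), has_closure(records))
```

On the three records quoted here, this reports the client 10.99.111.42 and no closure record.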
No closure record?
Possibly the next operation (op=1) triggered the error and the closure of the
connection.
Do you know whether some kind of keep-alive mechanism exists that would ping
the instance with op=0 and could then send some dummy data?
Looking for periodicity in the 'Incoming BER Element' events could help
to identify who opened that connection.
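The periodicity check could be sketched as follows. It groups the 'Incoming BER Element' records by timestamp and prints the gap between consecutive bursts. The sample holds only the first line of each record quoted in this thread; the function names are made up for illustration.

```python
import re
from datetime import datetime

# First line of each error log record quoted earlier in the thread.
ERROR_LOG = """\
[08/Apr/2015:13:24:12 +0300] connection - conn=1095 fd=131 Incoming BER Element
[08/Apr/2015:13:24:12 +0300] connection - conn=1094 fd=124 Incoming BER Element
[08/Apr/2015:13:24:12 +0300] connection - conn=1096 fd=124 Incoming BER Element
[08/Apr/2015:13:24:12 +0300] connection - conn=1097 fd=131 Incoming BER Element
[08/Apr/2015:14:55:01 +0300] connection - conn=1125 fd=130 Incoming BER Element
[08/Apr/2015:14:55:01 +0300] connection - conn=1124 fd=126 Incoming BER Element
[08/Apr/2015:14:55:01 +0300] connection - conn=1126 fd=126 Incoming BER Element
"""

STAMP = re.compile(r"^\[([^\]]+)\]")


def burst_times(log_text):
    """Return the distinct timestamps of 'Incoming BER Element' records, sorted."""
    times = set()
    for line in log_text.splitlines():
        if "Incoming BER Element" not in line:
            continue
        match = STAMP.match(line)
        if match:
            times.add(datetime.strptime(match.group(1), "%d/%b/%Y:%H:%M:%S %z"))
    return sorted(times)


def gaps_in_seconds(times):
    """Return the number of seconds between consecutive bursts."""
    return [int((b - a).total_seconds()) for a, b in zip(times, times[1:])]


print(gaps_in_seconds(burst_times(ERROR_LOG)))
```

Two bursts give a single gap, which is not enough to conclude anything about a period. Running the same check over the whole error log would show whether the gap repeats.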
[08/Apr/2015:14:55:26 +0300] attrlist_replace - attr_replace (nsslapd-referral,
ldap://cnt-rhidm01.unix.ad.com:389/o%3Dipaca) failed.
[08/Apr/2015:14:55:26 +0300] attrlist_replace - attr_replace (nsslapd-referral,
ldap://cnt-rhidm01.unix.ad.com:389/o%3Dipaca) failed.
[08/Apr/2015:14:55:26 +0300] attrlist_replace - attr_replace (nsslapd-referral,
ldap://cnt-rhidm01.unix.ad.com:389/o%3Dipaca) failed.
[08/Apr/2015:13:25:11 +0300] attrlist_replace - attr_replace (nsslapd-referral,
ldap://sib-rhidm01.unix.ad.com:389/o%3Dipaca) failed.
[08/Apr/2015:13:25:11 +0300] attrlist_replace - attr_replace (nsslapd-referral,
ldap://sib-rhidm01.unix.ad.com:389/o%3Dipaca) failed.
[08/Apr/2015:13:25:11 +0300] attrlist_replace - attr_replace (nsslapd-referral,
ldap://sib-rhidm01.unix.ad.com:389/o%3Dipaca) failed.
[08/Apr/2015:13:25:15 +0300] attrlist_replace - attr_replace (nsslapd-referral,
ldap://vlg-rhidm02.unix.ad.com:389/o%3Dipaca) failed.
[08/Apr/2015:13:25:15 +0300] attrlist_replace - attr_replace (nsslapd-referral,
ldap://vlg-rhidm02.unix.ad.com:389/o%3Dipaca) failed.
Here it is likely triggered by a RUV containing duplicated values (multiple replica
installs?). You may have to clean the RUV after the upgrade, using
ipa-replica-manage list-ruv and ipa-replica-manage clean-ruv.
Do you mean we need to upgrade all IPA 3.3.3 servers to 4.1 first? Or can this
be cleaned right now on the remaining servers?
BTW:
# ipa-replica-manage list-ruv
Directory Manager password:
sib-rhidm03.unix.ad.com:389: 5
dv-rhidm01.unix.ad.com:389: 17
sib-rhidm02.unix.ad.com:389: 3
sib-rhidm01.unix.ad.com:389: 4
url-rhidm01.unix.ad.com:389: 6
url-rhidm02.unix.ad.com:389: 7
....
nw-rhidm01.unix.ad.com:389: 19
This message is harmless. It means that some nsds50ruv values in the RUV
have identical referrals.
This should not occur, but replication is smart enough to just log this warning
and continue working.
I would not recommend a cleanup right now, just a clarification of the status.
Would you send all the RUV values returned by 'list-ruv'? (There is no
duplicate in the excerpt above.)
Here is the full command output from the IPA 4.1 server:
# ipa-replica-manage list-ruv
Directory Manager password:
nw-rhidm01.unix.ad.com:389: 19
dv-rhidm02.unix.ad.com:389: 18
vlg-rhidm03.unix.ad.com:389: 12
sib-rhidm01.unix.ad.com:389: 4
dv-rhidm01.unix.ad.com:389: 17
url-rhidm01.unix.ad.com:389: 6
url-rhidm02.unix.ad.com:389: 7
cnt-rhidm01.unix.ad.com:389: 14
sib-rhidm03.unix.ad.com:389: 5
vlg-rhidm02.unix.ad.com:389: 13
msk-rhidm-03.unix.ad.com:389: 10
msk-rhidm-01.unix.ad.com:389: 9
vlg-rhidm01.unix.ad.com:389: 8
cnt-rhidm02.unix.ad.com:389: 15
sib-rhidm02.unix.ad.com:389: 3
msk-rhidm-02.unix.ad.com:389: 11
I'm planning to upgrade all the remaining IPA 3.3.3 servers to IPA 4.1.
Ok, that should help.
Am I understanding correctly that the messages above do not mean something is
terribly wrong in IPA for now?
If you are asking about the attrlist_replace warnings, they should be benign,
caused by the uncleaned RUVs as Thierry indicated. Although the list above
looks OK, without duplicate RUVs.
I agree, those warnings mean something needs to be cleaned, not that
things are broken.
Replication should work fine.
Thierry, does this need to be checked on every IPA server, or are RUVs also
replicated?
I am unsure whether the list-ruv command is hiding something. The following
command will dump the RUV of the local instance:
ldapsearch -D "cn=directory manager" -W -b "$SUFFIX" \
'(&(nsuniqueid=ffffffff-ffffffff-ffffffff-ffffffff)(objectclass=nstombstone))'
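To compare what the search returns with what list-ruv prints, the nsds50ruv values can be reduced to the same "host:port: id" form. The sketch below assumes the usual "{replica <id> ldap://<host>:<port>} <csn> <csn>" layout of those values. The sample strings are hypothetical: the hostnames and ids are taken from this thread, and the CSNs are left as placeholders rather than invented.

```python
import re

# Matches the leading "{replica <id> ldap://<host>:<port>}" part of a value.
RUV_ELEMENT = re.compile(r"\{replica (\d+) ldap://([^:}\s]+):(\d+)\}")


def ruv_to_list(values):
    """Convert nsds50ruv values into (host:port, replica id) pairs."""
    pairs = []
    for value in values:
        match = RUV_ELEMENT.match(value)
        if match:  # the {replicageneration} value does not match and is skipped
            rid, host, port = match.groups()
            pairs.append((f"{host}:{port}", int(rid)))
    return pairs


# Hypothetical sample; "<...>" marks placeholders, not real data.
SAMPLE = [
    "{replicageneration} <generation-id>",
    "{replica 4 ldap://sib-rhidm01.unix.ad.com:389} <min-csn> <max-csn>",
    "{replica 14 ldap://cnt-rhidm01.unix.ad.com:389} <min-csn> <max-csn>",
]

for server, rid in ruv_to_list(SAMPLE):
    print(f"{server}: {rid}")
```

If a pair shows up here but not in the list-ruv output for the same instance, that would suggest list-ruv is filtering something.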
The 'attrlist_replace' message means that the local instance received a
RUV from a remote instance and that the remote RUV contained a duplicated
referral.
If you want to know which servers need to be cleaned, run
list-ruv (or the ldapsearch command) on each instance.
I would expect to see duplicates in the RUV of some instances, for example:
nw-rhidm01.unix.ad.com:389: 19
dv-rhidm02.unix.ad.com:389: 18
vlg-rhidm03.unix.ad.com:389: 12
sib-rhidm01.unix.ad.com:389: 4
dv-rhidm01.unix.ad.com:389: 17
url-rhidm01.unix.ad.com:389: 6
url-rhidm02.unix.ad.com:389: 7
*cnt-rhidm01.unix.ad.com:389: 14*
*cnt-rhidm01.unix.ad.com:389: 24*
sib-rhidm03.unix.ad.com:389: 5
vlg-rhidm02.unix.ad.com:389: 13
msk-rhidm-03.unix.ad.com:389: 10
msk-rhidm-01.unix.ad.com:389: 9
vlg-rhidm01.unix.ad.com:389: 8
cnt-rhidm02.unix.ad.com:389: 15
sib-rhidm02.unix.ad.com:389: 3
msk-rhidm-02.unix.ad.com:389: 11
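Spotting such a duplicate by eye across many servers is tedious. A small sketch like the following could scan saved list-ruv output for any server that appears with more than one replica id. The function name is made up for illustration, and the input is the example listing above.

```python
from collections import defaultdict

# The example listing above, as list-ruv would print it.
LIST_RUV = """\
nw-rhidm01.unix.ad.com:389: 19
dv-rhidm02.unix.ad.com:389: 18
vlg-rhidm03.unix.ad.com:389: 12
sib-rhidm01.unix.ad.com:389: 4
dv-rhidm01.unix.ad.com:389: 17
url-rhidm01.unix.ad.com:389: 6
url-rhidm02.unix.ad.com:389: 7
cnt-rhidm01.unix.ad.com:389: 14
cnt-rhidm01.unix.ad.com:389: 24
sib-rhidm03.unix.ad.com:389: 5
vlg-rhidm02.unix.ad.com:389: 13
msk-rhidm-03.unix.ad.com:389: 10
msk-rhidm-01.unix.ad.com:389: 9
vlg-rhidm01.unix.ad.com:389: 8
cnt-rhidm02.unix.ad.com:389: 15
sib-rhidm02.unix.ad.com:389: 3
msk-rhidm-02.unix.ad.com:389: 11
"""


def duplicated_servers(output):
    """Map each server listed more than once to all of its replica ids."""
    ids = defaultdict(list)
    for line in output.splitlines():
        server, sep, rid = line.strip().rpartition(": ")
        if not sep or not rid.isdigit():
            continue  # skips the password prompt and any other noise
        ids[server].append(int(rid))
    return {server: rids for server, rids in ids.items() if len(rids) > 1}


print(duplicated_servers(LIST_RUV))
```

An empty result for an instance means its RUV shows no duplicated referral, which matches the output posted from the IPA 4.1 server.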
Martin