Hi, I seem to have similar issues, but since 6.1 proper is now out, Im starting again from scratch, I need to improve disk layouts etc anyway.
regards ________________________________________ From: [email protected] [[email protected]] on behalf of Sigbjorn Lie [[email protected]] Sent: Sunday, 22 May 2011 10:16 p.m. To: Rich Megginson Cc: [email protected] Subject: Re: [Freeipa-users] IPA Startup issues On 05/17/2011 07:24 PM, Rich Megginson wrote: > On 05/17/2011 06:40 AM, Sigbjorn Lie wrote: >> On 05/16/2011 04:56 PM, Rich Megginson wrote: >>> On 05/16/2011 08:43 AM, Sigbjorn Lie wrote: >>>> On 05/16/2011 03:52 PM, Simo Sorce wrote: >>>>> On Sat, 2011-05-14 at 16:46 +0200, Sigbjorn Lie wrote: >>>>>> I've noticed that if the machine running IPA is very busy at >>>>>> startup, >>>>>> the IPA services will not be online when the machine is started. >>>>>> >>>>>> I noticed this is as my test virtualization host has had it's >>>>>> power cord >>>>>> knocked out a few times. When I restart the host machine, all the >>>>>> virtual machines is started at the same time, causing (a lot) higher >>>>>> than normal latency for each virtual machine. >>>>>> >>>>>> This causes the IPA daemons to start, while during the startup >>>>>> one or >>>>>> several IPA daemons fails due to dependencies of other daemons >>>>>> which is >>>>>> not started yet, and all the IPA daemons is stopped as not all >>>>>> the IPA >>>>>> daemons started successfully. I've noticed that the default >>>>>> behavior of >>>>>> the ipactl command is to shut down all the IPA daemons, if any of >>>>>> the >>>>>> IPA daemons should fail during startup. >>>>>> >>>>>> This can be seen in the logs of the individual services, as some is >>>>>> started successfully, just to receive a shutdown signal shortly >>>>>> after. >>>>>> It seem to be the pki-ca which shut down my IPA services this >>>>>> morning. >>>>>> >>>>>> When rebooting the virtual machine running the IPA daemons during >>>>>> normal >>>>>> load of the host machine, all the IPA daemons start successfully. >>>>>> Logging on to the IPA server and manually starting the IPA >>>>>> daemons after >>>>>> the load of the host machine has decreased also works. >>>>>> >>>>>> I suggest changing the startup scripts to allow (a lot) longer >>>>>> startup >>>>>> times for the IPA daemons prior to failing them. >>>>> At the moment we just run service<name> start and wait until it is >>>>> done. If the pki-cad service timeouts and returns an error I think we >>>>> need to open a bug against the dogtag component as that is the cause. >>>>> >>>>> Can you open a bug in the freeipa trac with logs showing that >>>>> service is >>>>> responsible for the failure ? >>>> >>>> I haven't been able to figure out which service that failed IPA >>>> yet. A lot of log files scattered around. As you can see from the >>>> slapd errors file, the slapd daemon was available for almost 3 >>>> minutes before receiving the shutdown signal. I notice now that the >>>> PKI daemon failed 8 seconds after slapd had shut down, so I was >>>> wrong in blaming the PKI daemon. >>>> >>>> See below for a list of log files I've been trough. They all have >>>> on thing in common, the daemons starts when the host machine is >>>> started, at approx 06:34, then receives a shutdown signal around >>>> 06:37. Some time later when the host has calmed down, I'm logging >>>> in and manually starting IPA using "ipactl start", and all the >>>> daemons start without any problem. And they keep running after my >>>> manual intervention. >>>> >>>> I wish I could be more specific, but I'm unsure where else to look. >>>> Suggestions? >>>> >>>> >>>> /var/log/krb5kdc.log >>>> /var/log/pki-ca/catalina.out >>>> /var/log/dirsrv/slapd-IX-TEST-COM/errors >>>> /var/log/dirsrv/slapd-PKI-IPA/errors >>>> /var/log/httpd/error_log >>>> /var/log/messages (named log) >>>> >>>> slapd errors: >>>> >>>> [14/May/2011:06:33:52 +0200] - 389-Directory/1.2.8.rc1 >>>> B2011.062.1416 starting up >>>> [14/May/2011:06:33:54 +0200] - Detected Disorderly Shutdown last >>>> time Directory Server was running, recovering database. >>> 1) Disorderly Shutdown means a) crash b) kill -9 or similar - >>> neither of which should be happening - is this the replica install >>> or the first master install? >> >> >> >> First master install. >> > What is in the slapd errors log before [14/May/2011:06:33:52 +0200] - > 389-Directory/1.2.8.rc1 B2011.062.1416 starting up? Hi, Rich, there is nothing above that line. Previous entry was from last time the server started. Yesterday I rebooted my host platform, graceful shutdown this time, and the same problem occurred again when the host, and all the virtual machines started. I had a look in my boot.log file, see below for output. As you can see the "Starting pki-ca" return an "OK", but the next line says: "Failed to start CA Service" "Shutting down". Looking at the timestamps, it looks like the dirsrv instance is shut down before the pki-ca is given a chance to start, or am I looking at the incorrect log files? I have included my boot.log, and the PKI-CA dirsrv log, and the pki-ca debug log. /var/log/boot.log: Starting Directory Service Starting dirsrv: IX-TEST-COM... [ OK ] PKI-IPA... [ OK ] Starting KDC Service Starting Kerberos 5 KDC: [ OK ] Starting KPASSWD Service Starting ipa_kpasswd: [ OK ] Starting DNS Service Starting named: [ OK ] Starting HTTP Service Starting httpd: [ OK ] Starting CA Service Starting pki-ca: [ OK ] Failed to start CA Service Shutting down Stopping Kerberos 5 KDC: [ OK ] Shutting down ipa_kpasswd: [ OK ] Stopping named: [ OK ] Stopping httpd: [ OK ] Stopping pki-ca: [FAILED] Shutting down dirsrv: IX-TEST-COM... [ OK ] PKI-IPA... [ OK ] Aborting ipactl /var/log/dirsrv/slapd-PKI-IPA/errors: [21/May/2011:18:47:41 +0200] - 389-Directory/1.2.8.2 B2011.104.2252 starting up [21/May/2011:18:47:48 +0200] - slapd started. Listening on All Interfaces port 7389 for LDAP requests [21/May/2011:18:47:49 +0200] - Listening on All Interfaces port 7390 for LDAPS requests [21/May/2011:18:50:00 +0200] - slapd shutting down - signaling operation threads [21/May/2011:18:50:01 +0200] - slapd shutting down - closing down internal subsystems and plugins [21/May/2011:18:50:01 +0200] - Waiting for 4 database threads to stop [21/May/2011:18:50:02 +0200] - All database threads now stopped [21/May/2011:18:50:02 +0200] - slapd stopped. /var/lib/pki-ca/logs/debug: [21/May/2011:18:50:15][main]: ============================================ [21/May/2011:18:50:15][main]: ===== DEBUG SUBSYSTEM INITIALIZED ======= [21/May/2011:18:50:15][main]: ============================================ [21/May/2011:18:50:15][main]: CMSEngine: done init id=debug [21/May/2011:18:50:15][main]: CMSEngine: initialized debug [21/May/2011:18:50:15][main]: CMSEngine: initSubsystem id=log [21/May/2011:18:50:15][main]: CMSEngine: ready to init id=log [21/May/2011:18:50:16][main]: LogFile: log event type selected: AUDIT_LOG_STARTUP [21/May/2011:18:50:16][main]: LogFile: log event type selected: AUDIT_LOG_SHUTDOWN [21/May/2011:18:50:16][main]: LogFile: log event type selected: ROLE_ASSUME [21/May/2011:18:50:16][main]: LogFile: log event type selected: CONFIG_CERT_POLICY [21/May/2011:18:50:16][main]: LogFile: log event type selected: CONFIG_CERT_PROFILE [21/May/2011:18:50:16][main]: LogFile: log event type selected: CONFIG_CRL_PROFILE [21/May/2011:18:50:16][main]: LogFile: log event type selected: CONFIG_OCSP_PROFILE [21/May/2011:18:50:16][main]: LogFile: log event type selected: CONFIG_AUTH [21/May/2011:18:50:16][main]: LogFile: log event type selected: CONFIG_ROLE [21/May/2011:18:50:16][main]: LogFile: log event type selected: CONFIG_ACL [21/May/2011:18:50:16][main]: LogFile: log event type selected: CONFIG_SIGNED_AUDIT [21/May/2011:18:50:16][main]: LogFile: log event type selected: CONFIG_ENCRYPTION [21/May/2011:18:50:16][main]: LogFile: log event type selected: CONFIG_TRUSTED_PUBLIC_KEY [21/May/2011:18:50:16][main]: LogFile: log event type selected: CONFIG_DRM [21/May/2011:18:50:16][main]: LogFile: log event type selected: SELFTESTS_EXECUTION [21/May/2011:18:50:16][main]: LogFile: log event type selected: AUDIT_LOG_DELETE [21/May/2011:18:50:16][main]: LogFile: log event type selected: LOG_PATH_CHANGE [21/May/2011:18:50:16][main]: LogFile: log event type selected: PRIVATE_KEY_ARCHIVE_REQUEST [21/May/2011:18:50:16][main]: LogFile: log event type selected: PRIVATE_KEY_ARCHIVE_REQUEST_PROCESSED [21/May/2011:18:50:16][main]: LogFile: log event type selected: PRIVATE_KEY_EXPORT_REQUEST_PROCESSED_SUCCESS [21/May/2011:18:50:16][main]: LogFile: log event type selected: PRIVATE_KEY_EXPORT_REQUEST_PROCESSED_FAILURE [21/May/2011:18:50:16][main]: LogFile: log event type selected: KEY_RECOVERY_REQUEST [21/May/2011:18:50:16][main]: LogFile: log event type selected: KEY_RECOVERY_REQUEST_ASYNC [21/May/2011:18:50:16][main]: LogFile: log event type selected: KEY_RECOVERY_AGENT_LOGIN [21/May/2011:18:50:16][main]: LogFile: log event type selected: KEY_RECOVERY_REQUEST_PROCESSED [21/May/2011:18:50:16][main]: LogFile: log event type selected: KEY_RECOVERY_REQUEST_PROCESSED_ASYNC [21/May/2011:18:50:16][main]: LogFile: log event type selected: KEY_GEN_ASYMMETRIC [21/May/2011:18:50:16][main]: LogFile: log event type selected: NON_PROFILE_CERT_REQUEST [21/May/2011:18:50:16][main]: LogFile: log event type selected: PROFILE_CERT_REQUEST [21/May/2011:18:50:16][main]: LogFile: log event type selected: CERT_REQUEST_PROCESSED [21/May/2011:18:50:16][main]: LogFile: log event type selected: CERT_STATUS_CHANGE_REQUEST [21/May/2011:18:50:16][main]: LogFile: log event type selected: CERT_STATUS_CHANGE_REQUEST_PROCESSED [21/May/2011:18:50:16][main]: LogFile: log event type selected: AUTHZ_SUCCESS [21/May/2011:18:50:16][main]: LogFile: log event type selected: AUTHZ_FAIL [21/May/2011:18:50:16][main]: LogFile: log event type selected: INTER_BOUNDARY [21/May/2011:18:50:16][main]: LogFile: log event type selected: AUTH_FAIL [21/May/2011:18:50:16][main]: LogFile: log event type selected: AUTH_SUCCESS [21/May/2011:18:50:16][main]: LogFile: log event type selected: CERT_PROFILE_APPROVAL [21/May/2011:18:50:16][main]: LogFile: log event type selected: PROOF_OF_POSSESSION [21/May/2011:18:50:16][main]: LogFile: log event type selected: CRL_RETRIEVAL [21/May/2011:18:50:16][main]: LogFile: log event type selected: CRL_VALIDATION [21/May/2011:18:50:16][main]: LogFile: log event type selected: CMC_SIGNED_REQUEST_SIG_VERIFY [21/May/2011:18:50:16][main]: LogFile: log event type selected: SERVER_SIDE_KEYGEN_REQUEST_PROCESSED_FAILURE [21/May/2011:18:50:16][main]: LogFile: log event type selected: SERVER_SIDE_KEYGEN_REQUEST_PROCESSED_SUCCESS [21/May/2011:18:50:16][main]: LogFile: log event type selected: SERVER_SIDE_KEYGEN_REQUEST [21/May/2011:18:50:16][main]: LogFile: log event type selected: COMPUTE_SESSION_KEY_REQUEST [21/May/2011:18:50:16][main]: LogFile: log event type selected: COMPUTE_SESSION_KEY_REQUEST_PROCESSED_SUCCESS [21/May/2011:18:50:16][main]: LogFile: log event type selected: COMPUTE_SESSION_KEY_REQUEST_PROCESSED_FAILURE [21/May/2011:18:50:16][main]: LogFile: log event type selected: DIVERSIFY_KEY_REQUEST [21/May/2011:18:50:16][main]: LogFile: log event type selected: DIVERSIFY_KEY_REQUEST_PROCESSED_SUCCESS [21/May/2011:18:50:16][main]: LogFile: log event type selected: DIVERSIFY_KEY_REQUEST_PROCESSED_FAILURE [21/May/2011:18:50:16][main]: LogFile: log event type selected: ENCRYPT_DATA_REQUEST [21/May/2011:18:50:16][main]: LogFile: log event type selected: ENCRYPT_DATA_REQUEST_PROCESSED_SUCCESS [21/May/2011:18:50:16][main]: LogFile: log event type selected: ENCRYPT_DATA_REQUEST_PROCESSED_FAILURE [21/May/2011:18:50:16][main]: LogFile: log event type selected: OCSP_ADD_CA_REQUEST [21/May/2011:18:50:16][main]: LogFile: log event type selected: OCSP_ADD_CA_REQUEST_PROCESSED [21/May/2011:18:50:16][main]: LogFile: log event type selected: OCSP_REMOVE_CA_REQUEST [21/May/2011:18:50:16][main]: LogFile: log event type selected: OCSP_REMOVE_CA_REQUEST_PROCESSED_SUCCESS [21/May/2011:18:50:16][main]: LogFile: log event type selected: OCSP_REMOVE_CA_REQUEST_PROCESSED_FAILURE [21/May/2011:18:50:16][main]: LogFile: log event type selected: COMPUTE_RANDOM_DATA_REQUEST [21/May/2011:18:50:16][main]: LogFile: log event type selected: COMPUTE_RANDOM_DATA_REQUEST_PROCESSED_SUCCESS [21/May/2011:18:50:16][main]: LogFile: log event type selected: COMPUTE_RANDOM_DATA_REQUEST_PROCESSED_FAILURE [21/May/2011:18:50:16][main]: LogFile: log event type selected: CIMC_CERT_VERIFICATION [21/May/2011:18:50:16][main]: CMSEngine: done init id=log [21/May/2011:18:50:16][main]: CMSEngine: initialized log [21/May/2011:18:50:16][main]: CMSEngine: initSubsystem id=os [21/May/2011:18:50:16][main]: CMSEngine: ready to init id=os [21/May/2011:18:50:16][main]: CMSEngine: done init id=os [21/May/2011:18:50:16][main]: CMSEngine: initialized os [21/May/2011:18:50:16][main]: CMSEngine: initSubsystem id=jss [21/May/2011:18:50:16][main]: CMSEngine: ready to init id=jss [21/May/2011:18:50:16][main]: JSSSubsystem: initSSL(): setting ssl cipher rsa_rc4_40_md5 [21/May/2011:18:50:16][main]: JSSSubsystem: initSSL(): setting ssl cipher rsa_rc2_40_md5 [21/May/2011:18:50:16][main]: JSSSubsystem: initSSL(): setting ssl cipher rsa_des_sha [21/May/2011:18:50:16][main]: JSSSubsystem: initSSL(): setting ssl cipher rsa_rc4_128_md5 [21/May/2011:18:50:16][main]: JSSSubsystem: initSSL(): setting ssl cipher rsa_3des_sha [21/May/2011:18:50:16][main]: JSSSubsystem: initSSL(): setting ssl cipher rsa_fips_des_sha [21/May/2011:18:50:16][main]: JSSSubsystem: initSSL(): setting ssl cipher rsa_fips_3des_sha [21/May/2011:18:50:16][main]: JSSSubsystem: initSSL(): setting ssl cipher fortezza [21/May/2011:18:50:16][main]: JSSSubsystem: initSSL(): setting ssl cipher fortezza_rc4_128_sha [21/May/2011:18:50:16][main]: JSSSubsystem: initSSL(): setting ssl cipher rsa_null_md5 [21/May/2011:18:50:16][main]: CMSEngine: done init id=jss [21/May/2011:18:50:16][main]: CMSEngine: initialized jss [21/May/2011:18:50:16][main]: CMSEngine: initSubsystem id=dbs [21/May/2011:18:50:16][main]: CMSEngine: ready to init id=dbs [21/May/2011:18:50:16][main]: LdapBoundConnFactory: init [21/May/2011:18:50:16][main]: LdapBoundConnFactory:doCloning true [21/May/2011:18:50:16][main]: LdapAuthInfo: init() [21/May/2011:18:50:16][main]: LdapAuthInfo: init begins [21/May/2011:18:50:16][main]: LdapAuthInfo: init: prompt is Internal LDAP Database [21/May/2011:18:50:16][main]: LdapAuthInfo: init: try getting from memory cache [21/May/2011:18:50:16][main]: LdapAuthInfo: init: password not in memory [21/May/2011:18:50:16][main]: LdapAuthInfo: getPasswordFromStore: try to get it from password store [21/May/2011:18:50:16][main]: CMSEngine: getPasswordStore(): password store initialized before. [21/May/2011:18:50:16][main]: CMSEngine: getPasswordStore(): password store initialized. [21/May/2011:18:50:16][main]: LdapAuthInfo: getPasswordFromStore: about to get from passwored store: Internal LDAP Database [21/May/2011:18:50:16][main]: LdapAuthInfo: getPasswordFromStore: password store available [21/May/2011:18:50:16][main]: LdapAuthInfo: getPasswordFromStore: password for Internal LDAP Database not found, trying internaldb [21/May/2011:18:50:16][main]: LdapAuthInfo: password ok: store in memory cache [21/May/2011:18:50:16][main]: LdapAuthInfo: init ends [21/May/2011:18:50:16][main]: init: before makeConnection errorIfDown is true [21/May/2011:18:50:16][main]: makeConnection: errorIfDown true [21/May/2011:18:50:16][main]: CMS:Caught EBaseException Internal Database Error encountered: Could not connect to LDAP server host ipa01.ix.test.com port 7389 Error netscape.ldap.LDAPException: failed to conn ect to server ldap://ipa01.ix.test.com:7389 (91) at com.netscape.cmscore.dbs.DBSubsystem.init(DBSubsystem.java:674) at com.netscape.cmscore.apps.CMSEngine.initSubsystem(CMSEngine.java:837) at com.netscape.cmscore.apps.CMSEngine.initSubsystems(CMSEngine.java:766) at com.netscape.cmscore.apps.CMSEngine.init(CMSEngine.java:302) at com.netscape.certsrv.apps.CMS.init(CMS.java:153) at com.netscape.certsrv.apps.CMS.start(CMS.java:1530) at com.netscape.cms.servlet.base.CMSStartServlet.init(CMSStartServlet.java:85) at org.apache.catalina.core.StandardWrapper.loadServlet(StandardWrapper.java:1173) at org.apache.catalina.core.StandardWrapper.load(StandardWrapper.java:993) at org.apache.catalina.core.StandardContext.loadOnStartup(StandardContext.java:4187) at org.apache.catalina.core.StandardContext.start(StandardContext.java:4496) at org.apache.catalina.core.ContainerBase.addChildInternal(ContainerBase.java:791) at org.apache.catalina.core.ContainerBase.addChild(ContainerBase.java:771) at org.apache.catalina.core.StandardHost.addChild(StandardHost.java:526) at org.apache.catalina.startup.HostConfig.deployDirectory(HostConfig.java:1041) at org.apache.catalina.startup.HostConfig.deployDirectories(HostConfig.java:964) at org.apache.catalina.startup.HostConfig.deployApps(HostConfig.java:502) at org.apache.catalina.startup.HostConfig.start(HostConfig.java:1277) at org.apache.catalina.startup.HostConfig.lifecycleEvent(HostConfig.java:321) at org.apache.catalina.util.LifecycleSupport.fireLifecycleEvent(LifecycleSupport.java:119) at org.apache.catalina.core.ContainerBase.start(ContainerBase.java:1053) at org.apache.catalina.core.StandardHost.start(StandardHost.java:722) at org.apache.catalina.core.ContainerBase.start(ContainerBase.java:1045) at org.apache.catalina.core.StandardEngine.start(StandardEngine.java:443) at org.apache.catalina.core.StandardService.start(StandardService.java:516) at org.apache.catalina.core.StandardServer.start(StandardServer.java:710) at org.apache.catalina.startup.Catalina.start(Catalina.java:593) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:616) at org.apache.catalina.startup.Bootstrap.start(Bootstrap.java:289) at org.apache.catalina.startup.Bootstrap.main(Bootstrap.java:414) [21/May/2011:18:50:16][main]: CMSEngine.shutdown() _______________________________________________ Freeipa-users mailing list [email protected] https://www.redhat.com/mailman/listinfo/freeipa-users _______________________________________________ Freeipa-users mailing list [email protected] https://www.redhat.com/mailman/listinfo/freeipa-users
