I've been seeing intermittent lookup failures for ' nova.clouds.archive.ubuntu.com'. Looking at the logs, I see these messages during the time when the lookup fails:
Jan 23 05:47:55 n422 pdns_recursor[8739]: [414333] nova.clouds.archive.ubuntu.com: Wants DNSSEC processing, auth data in query for A Jan 23 05:47:55 n422 pdns_recursor[8739]: [414333] nova.clouds.archive.ubuntu.com: Looking for CNAME cache hit of ' nova.clouds.archive.ubuntu.com|CNAME' Jan 23 05:47:55 n422 pdns_recursor[8739]: [414333] nova.clouds.archive.ubuntu.com: No CNAME cache hit of ' nova.clouds.archive.ubuntu.com|CNAME' found Jan 23 05:47:55 n422 pdns_recursor[8739]: [414333] nova.clouds.archive.ubuntu.com: No cache hit for ' nova.clouds.archive.ubuntu.com|A', trying to find an appropriate NS record Jan 23 05:47:55 n422 pdns_recursor[8739]: [414333] : got TA for '.' Jan 23 05:47:55 n422 pdns_recursor[8739]: [414333] : setting cut state for . to Secure Jan 23 05:47:55 n422 pdns_recursor[8739]: [414333] nova.clouds.archive.ubuntu.com: Checking if we have NS in cache for ' nova.clouds.archive.ubuntu.com' Jan 23 05:47:55 n422 pdns_recursor[8739]: [414333] nova.clouds.archive.ubuntu.com: no valid/useful NS in cache for ' nova.clouds.archive.ubuntu.com' Jan 23 05:47:55 n422 pdns_recursor[8739]: [414333] nova.clouds.archive.ubuntu.com: Checking if we have NS in cache for ' clouds.archive.ubuntu.com' Jan 23 05:47:55 n422 pdns_recursor[8739]: [414333] nova.clouds.archive.ubuntu.com: NS (with ip, or non-glue) in cache for ' clouds.archive.ubuntu.com' -> 'piru.canonical.com' Jan 23 05:47:55 n422 pdns_recursor[8739]: [414333] nova.clouds.archive.ubuntu.com: within bailiwick: 0, not in cache / did not look at cache Jan 23 05:47:55 n422 pdns_recursor[8739]: [414333] nova.clouds.archive.ubuntu.com: We have NS in cache for ' clouds.archive.ubuntu.com' (flawedNSSet=0) Jan 23 05:47:55 n422 pdns_recursor[8739]: [414333] nova.clouds.archive.ubuntu.com: initial validation status for nova.clouds.archive.ubuntu.com is Indeterminate Jan 23 05:47:55 n422 pdns_recursor[8739]: [414333] nova.clouds.archive.ubuntu.com: Cache consultations done, have 1 NS to contact Jan 23 05:47:55 n422 pdns_recursor[8739]: [414333] nova.clouds.archive.ubuntu.com.: Nameservers: piru.canonical.com.(38.62ms) Jan 23 05:47:55 n422 pdns_recursor[8739]: [414333] nova.clouds.archive.ubuntu.com: Trying to resolve NS 'piru.canonical.com' (1/1) Jan 23 05:47:55 n422 pdns_recursor[8739]: [414333] piru.canonical.com: Wants DNSSEC processing, NO auth data in query for A Jan 23 05:47:55 n422 pdns_recursor[8739]: [414333] piru.canonical.com: Looking for CNAME cache hit of 'piru.canonical.com|CNAME' Jan 23 05:47:55 n422 pdns_recursor[8739]: [414333] piru.canonical.com: No CNAME cache hit of 'piru.canonical.com|CNAME' found Jan 23 05:47:55 n422 pdns_recursor[8739]: [414333] piru.canonical.com: Found cache hit for A: 91.189.95.68[ttl=1340] Jan 23 05:47:55 n422 pdns_recursor[8739]: [414333] piru.canonical.com: updating validation state with cache content for piru.canonical.com to Indeterminate Jan 23 05:47:55 n422 pdns_recursor[8739]: [414333] nova.clouds.archive.ubuntu.com: Resolved 'clouds.archive.ubuntu.com' NS piru.canonical.com to: 91.189.95.68 Jan 23 05:47:55 n422 pdns_recursor[8739]: [414333] nova.clouds.archive.ubuntu.com: Trying IP 91.189.95.68:53, asking ' nova.clouds.archive.ubuntu.com|A' Jan 23 05:47:55 n422 pdns_recursor[8739]: [414333] nova.clouds.archive.ubuntu.com: timeout resolving after 5535.35msec Jan 23 05:47:55 n422 pdns_recursor[8739]: [414333] nova.clouds.archive.ubuntu.com: Failed to resolve via any of the 1 offered NS at level 'clouds.archive.ubuntu.com' Jan 23 05:47:55 n422 pdns_recursor[8739]: [414333] nova.clouds.archive.ubuntu.com: failed (res=-1) My recursor config has 'network-timeout' set to 5500ms (instead of the default 1500ms), so I can understand the "timeout resolving after 5535.35msec" message, but I was surprised by the timestamps of the messages all being the same. Are these messages flushed to syslog all at once or is the timeout really happening immediately? Aside from the timing question, can anyone offer any insight about this sequence of events that lead up to the timeout? Would this "just" indicate a network or transport layer issue or is there anything in the events leading up to the timeout that look suspicious? This has been happening about 10-15 times per day and started happening on Jan 18th. I see similar timeouts to 3 other sites, but this one most frequently. Lastly, is there any way to retry here? I'd rather lower the network timeout and attempt a retry if possible. This is version 4.1.0 Thanks -- Brian Talley
_______________________________________________ Pdns-users mailing list [email protected] https://mailman.powerdns.com/mailman/listinfo/pdns-users
