On 2025-10-22 at 06:41:46 UTC-0400 (Wed, 22 Oct 2025 12:41:46 +0200)
Fourhundred Thecat via Postfix-users <[email protected]>
is rumored to have said:

hello,

I have 2 nameservers in /etc/resolv.conf

when the first one is unreachable policyd-spf does not fail over to the secondary, but instead times out after 45s:

The primary fix for this should be to use nameservers that don't regularly become unreachable. Timeout values and failover to second or third resolvers are normally controlled by the system resolver, not by individual programs.

10:51:00 postfix:25/smtpd: connect from mail-4325.protonmail.ch[185.70.43.25]
10:51:00 postfix:25/smtpd: Anonymous TLS connection established from mail-4325.protonmail.ch[185.70.43.25]: TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (2048 bits) server-digest SHA256
10:51:45 policyd-spf: prepend Received-SPF: Temperror (mailfrom) identity=mailfrom; client-ip=185.70.43.25; helo=mail-4325.protonmail.ch; [email protected]; [email protected]
10:51:45 postfix:25/smtpd: 30F11116: client=mail-4325.protonmail.ch[185.70.43.25]
10:51:45 postfix/cleanup: 30F11116: message-id=<aZUuZ3LpGZRq7o13iKqnK2j2KrRl36S8a6zc59NSow20Q5_yGQfYVOMKheGrlnIl44w2sUzBSTzyEylH934WNdcvRSLA8vWKAla_YwZYIHQ=@protonmail.ch>
10:51:45 opendkim: 30F11116: s=protonmail3 d=protonmail.ch SSL
10:51:45 postfix/qmgr: 30F11116: from=<[email protected]>, size=2234, nrcpt=1 (queue active)
10:51:45 postfix:25/smtpd: disconnect from mail-4325.protonmail.ch[185.70.43.25] ehlo=2 starttls=1 mail=1 rcpt=1 data=1 quit=1 commands=7

in contrast, when I was troubleshooting with dig, everything worked fine because dig does failover to the second ns.

Also, dig does not use the system resolver but instead has its own internal DNS client.
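Because dig takes its own path, a test that goes through the libc resolver is a closer comparison. The following is only a sketch: time_lookup is a made-up helper name, it does an address lookup rather than the TXT lookup SPF needs, and whether policyd-spf's DNS library follows the same path as getaddrinfo is an assumption that would need checking.

```python
import socket
import time


def time_lookup(host, resolve=socket.getaddrinfo):
    """Time one name lookup; return (elapsed_seconds, outcome).

    `resolve` defaults to socket.getaddrinfo, which goes through the
    system resolver. It is a parameter only so the helper can be
    exercised without network access.
    """
    start = time.monotonic()
    try:
        resolve(host, None)
        outcome = "ok"
    except OSError as exc:  # socket.gaierror is a subclass of OSError
        outcome = f"failed: {exc}"
    return time.monotonic() - start, outcome
```

Running time_lookup("protonmail.ch") while the first nameserver is down, and again while it is up, would show how long the system resolver itself spends before answering or giving up.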

is this expected behaviour?

It seems normal to me. 45s is a long time for a DNS lookup, but a reasonable timeout.

I would think that you could accommodate this problematic nameserver by setting the timeout and retries in resolv.conf to something less hopeful than the defaults. If you shorten the timeout and reduce the retries, the resolver will waste less time on the flaky nameserver.
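To make that concrete, here is a small sketch that reads resolv.conf-style text and estimates how long a resolver might wait before giving up. The addresses and the "timeout:2 attempts:1" values are invented for illustration, the function names are hypothetical, and the multiplication is only a rough estimate: real resolvers may apply backoff, and it assumes the program doing the lookup honors these options at all.

```python
def parse_resolv_conf(text):
    """Return (nameservers, options) from resolv.conf-style text."""
    nameservers = []
    # Defaults documented in resolv.conf(5): timeout 5s, 2 attempts.
    options = {"timeout": 5, "attempts": 2}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line[0] in "#;":
            continue  # skip blank lines and comments
        fields = line.split()
        if fields[0] == "nameserver" and len(fields) > 1:
            nameservers.append(fields[1])
        elif fields[0] == "options":
            for opt in fields[1:]:
                name, _, value = opt.partition(":")
                if name in options and value.isdigit():
                    options[name] = int(value)
    return nameservers, options


def rough_wait_seconds(nameservers, options):
    """Rough estimate: every server is tried once per attempt."""
    return len(nameservers) * options["timeout"] * options["attempts"]


# Hypothetical resolv.conf with a shortened timeout and a single attempt.
SAMPLE = """\
nameserver 192.0.2.1
nameserver 192.0.2.2
options timeout:2 attempts:1
"""

servers, opts = parse_resolv_conf(SAMPLE)
print(servers, opts, rough_wait_seconds(servers, opts))
```

With the defaults and two nameservers the same estimate comes to 20 seconds, so shortening either value reduces the time spent on a dead first nameserver before the second one is tried.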

also, where does the timeout 45s come from? is this hardcoded somewhere in the python scripts (python3-spf 2.0.12t-3),

The log implies that, as it is logged by policyd-spf, not any postfix component.

or can this be configured in postfix?

Since policyd-spf is logging the timeout, it logically must be in policyd-spf.

this setting is not relevant, I assume:
# postconf | grep spf
policy-spf_time_limit = 3600s

It's certainly not relevant if policyd-spf is actually timing out at 45s.



--
 Bill Cole
 [email protected] or [email protected]
(AKA @[email protected] and many *@billmail.scconsult.com addresses)
 Not Currently Available For Hire
_______________________________________________
Postfix-users mailing list -- [email protected]
To unsubscribe send an email to [email protected]
