On 2025-10-22 at 06:41:46 UTC-0400 (Wed, 22 Oct 2025 12:41:46 +0200)
Fourhundred Thecat via Postfix-users <[email protected]>
is rumored to have said:
> hello,
> I have 2 nameservers in /etc/resolv.conf
> when the first one is unreachable policyd-spf does not fail over to
> the secondary, but instead times out after 45s:

The primary fix for this should be to use nameservers that don't
regularly become unreachable. Timeout values and failover to second or
third resolvers are normally controlled by the system resolver, not by
individual programs.

> 10:51:00 postfix:25/smtpd: connect from
> mail-4325.protonmail.ch[185.70.43.25]
> 10:51:00 postfix:25/smtpd: Anonymous TLS connection established from
> mail-4325.protonmail.ch[185.70.43.25]: TLSv1.3 with cipher
> TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519
> server-signature RSA-PSS (2048 bits) server-digest SHA256
> 10:51:45 policyd-spf: prepend Received-SPF: Temperror
> (mailfrom) identity=mailfrom; client-ip=185.70.43.25;
> helo=mail-4325.protonmail.ch; [email protected];
> [email protected]
> 10:51:45 postfix:25/smtpd: 30F11116:
> client=mail-4325.protonmail.ch[185.70.43.25]
> 10:51:45 postfix/cleanup: 30F11116:
> message-id=<aZUuZ3LpGZRq7o13iKqnK2j2KrRl36S8a6zc59NSow20Q5_yGQfYVOMKheGrlnIl44w2sUzBSTzyEylH934WNdcvRSLA8vWKAla_YwZYIHQ=@protonmail.ch>
> 10:51:45 opendkim: 30F11116: s=protonmail3 d=protonmail.ch
> SSL
> 10:51:45 postfix/qmgr: 30F11116: from=<[email protected]>,
> size=2234, nrcpt=1 (queue active)
> 10:51:45 postfix:25/smtpd: disconnect from
> mail-4325.protonmail.ch[185.70.43.25] ehlo=2 starttls=1 mail=1 rcpt=1
> data=1 quit=1 commands=7
> in contrast, when I was troubleshooting with dig, everything worked
> fine because dig does failover to the second ns.

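Also, dig does not use the system resolver library. It reads the server
list from /etc/resolv.conf but sends its queries with its own internal
DNS client, so its failover behavior says little about other programs.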

> is this expected behaviour?

It seems normal to me. 45s is a long time for a DNS lookup, but a
reasonable timeout.
I would think that you could accommodate this problematic nameserver by
setting the "timeout" and "attempts" options in resolv.conf to something
less hopeful than the defaults. If you shorten the timeout and reduce the
number of attempts, the resolver will waste less time on the flaky
nameserver.
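
A minimal sketch of such a resolv.conf, for illustration only: the two
addresses are placeholders from the documentation ranges, and the values
2 and 2 are arbitrary examples, not recommendations. On glibc the
defaults are timeout:5 and attempts:2. Whether the DNS library used by
policyd-spf honors these options is an assumption here, not something
the log confirms.

```
# /etc/resolv.conf (illustrative values only)
# placeholder for the unreliable primary
nameserver 192.0.2.53
# placeholder for the secondary
nameserver 198.51.100.53
# wait 2 seconds per query before trying the next server, 2 rounds in total
options timeout:2 attempts:2
```

Note that some systems regenerate /etc/resolv.conf automatically, in
which case the options would have to be set in whatever tool manages the
file.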

> also, where does the timeout 45s come from? is this hardcoded
> somewhere in the python scripts (python3-spf 2.0.12t-3),

The log implies that: the Temperror is logged by policyd-spf, not by
any Postfix component.

> or can this be configured in postfix?

Since policyd-spf is logging the timeout, it logically must be in
policyd-spf.

> this setting is not relevant, I assume:
> # postconf | grep spf
> policy-spf_time_limit = 3600s

It's certainly not relevant if policyd-spf is actually timing out at
45s.
--
Bill Cole
[email protected] or [email protected]
(AKA @[email protected] and many *@billmail.scconsult.com
addresses)
Not Currently Available For Hire