It sounds like you are saying that there is only one source. That is very bad
practice, as the chrony and ntp notes state. You should have at least three
sources, precisely for cases like yours. If one source dies, you have
backups.
So the default configuration on your end should be 3-5 sources. It probably
does not matter if the other sources are of lower quality than the primary one;
they are there as a backup.
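
As a rough illustration of the above, a chrony.conf fragment with three
sources might look like the following. The hostnames are placeholders, not
real servers; substitute whatever time servers you actually have access to.

```
# chrony.conf fragment (sketch; hostnames are hypothetical placeholders)
server ntp1.example.com iburst
server ntp2.example.com iburst
server ntp3.example.com iburst
```

With several sources configured, losing any one of them still leaves chronyd
with others to synchronise against.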
William G. Unruh __| Canadian Institute for|____ Tel: +1(604)822-3273
Physics&Astronomy _|___ Advanced Research _|____ Fax: +1(604)822-5324
UBC, Vancouver,BC _|_ Program in Cosmology |____ [email protected]
Canada V6T 1Z1 ____|____ and Gravity ______|_ theory.physics.ubc.ca/
On Fri, 19 Apr 2024, Chris Knox wrote:
Bryan, thanks for the answer. Yes, now that we have the scars, we're
monitoring chronyd's health carefully. But my question goes a bit beyond that.
If chronyd is configured and running, that implies that the owner of the
system wants the time to be correct. If the configured time source is not
reachable, it seems at least as important as chronyd being out by a half-second
and worth logging to syslog where someone who is not a time synchronization
expert will notice something being amiss. In effect, chronyd will tell me if
it has a figurative hangnail, but will suffer in silence if it is starving.
I would suggest the Chrony authors add a default configuration to call for help
if the time source is unreachable. Is this an appropriate venue to ask for
that enhancement?
--
Chris Knox
IT Infrastructure Engineer
tel 1.602.308.5438
18615 N. Claret Dr., Scottsdale, AZ 85255
[email protected]
-----Original Message-----
From: Bryan Christianson <[email protected]>
Sent: Thursday, April 18, 2024 4:28 PM
To: [email protected]
Subject: Re: [chrony-users] Silent Failure -- Enhancement Request
You could monitor the Reach field of chronyc and check that it has a value of
377, raising an appropriate alarm for your system on failure.
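
A minimal sketch of such a check, in Python, might look like the following.
It assumes the usual human-readable layout of "chronyc sources" (a mode
character, a state character, then name, stratum, poll and an octal Reach
column); the function names and the syslog message are made up for
illustration. It alarms only when Reach is 0 (none of the last eight polls
answered) rather than on anything other than 377, since Reach takes several
polls to climb to 377 after a restart. Adjust the threshold to taste.

```python
import subprocess
import syslog


def parse_sources(text):
    """Return (name, reach) pairs from `chronyc sources` output.

    Assumes source rows start with a mode character (^ server, = peer,
    # reference clock) and that Reach is the fourth column after the
    two-character mode/state prefix, printed in octal.
    """
    sources = []
    for line in text.splitlines():
        if len(line) < 2 or line[0] not in "^=#":
            continue  # header or blank line
        fields = line[2:].split()
        if len(fields) < 4:
            continue  # e.g. the "=====" separator row
        try:
            reach = int(fields[3], 8)  # Reach is an 8-bit register shown in octal
        except ValueError:
            continue
        sources.append((fields[0], reach))
    return sources


def unreachable_sources(text):
    """Names of sources whose last eight polls all went unanswered."""
    return [name for name, reach in parse_sources(text) if reach == 0]


if __name__ == "__main__":
    out = subprocess.run(
        ["chronyc", "sources"], capture_output=True, text=True, check=True
    ).stdout
    for name in unreachable_sources(out):
        # Hypothetical message text; pick whatever your log monitoring expects.
        syslog.syslog(syslog.LOG_WARNING, f"chrony source {name} unreachable (Reach=0)")
```

Run from cron or a systemd timer, this would put a warning in syslog for each
source that has stopped answering.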
On 19 Apr 2024, at 10:45, Chris Knox <[email protected]> wrote:
We recently moved a bunch of systems out of a data center and shut it down.
Time sync was an overlooked item in the move. As a result, the time server was
not reachable, but it did not become apparent until servers started drifting
enough to create issues. Looking in the syslog of the various systems, the
only entries I see are when chronyd could again hit the time server. Over the
previous weeks, chronyd suffered in silence until we were able to establish a
valid time server, at which point log entries came fast and furious because the
time was more than .5 seconds out. Yet there was no message for all that time
(several months) during which one server drifted out of sync by more than 30
seconds. Is there a configuration in chrony.conf to complain when the time
server is not reachable? If there is, why isn’t that the default behavior?
--
Chris
Bryan Christianson
[email protected]