On Fri, Aug 21, 2015 at 07:05:38PM +0100, Nuno Gonçalves wrote:
> Thank you so much. I've sent you the hostnames of a beaglebone white
> and a beaglebone black that are set with your public key.
>
> I took a few days while I confirmed what I could. In fact I've tried
> several kernel versions and I see that up to 3.15.10-bone8 the clock
> is working properly. After 3.16 and until the most recent 4.2 RC, it
> is malfunctioning.
>
> If you have a minute to login at the machines and are able to narrow
> in any way which part of the kernel clocks is at fault it might be
> easier have it fixed at the Linux level.

I did some tests with the adjtimex utility while chronyd was only
reporting the measured offset and not controlling the clock. It looks
like there is a problem with large frequency offsets. There seems to be
a maximum offset the kernel is willing to apply to the clock (about
1000 ppm), and any attempts to slow down or speed up the clock beyond
that are ignored. This breaks the chrony control loop badly.

As a workaround, I set the maxslewrate option to 100 ppm and started
chronyd with "0.0 1.0" in the driftfile to avoid large frequency
adjustments, and it seems it was able to settle down.

I'm not sure where the problem could be. Looking at the log for the
linux/kernel/time directory between 3.15 and 3.16, I don't see anything
suspicious. It could be something in the arch-specific code. You could
try adding the nohz option to the kernel command line; if that makes a
difference, it would suggest a problem with the internal timekeeping
loop.

I think this is a serious bug that needs to be fixed. If you file any
bug reports, please let us know.

Thanks,

-- 
Miroslav Lichvar
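
For reference, the workaround described in the message could look
roughly like the fragment below. This is only a sketch: the driftfile
path is an assumption (it is not given in the message), and only the
maxslewrate value and the "0.0 1.0" drift file contents come from the
text.

```
# chrony.conf fragment (file location varies by distribution)

# Limit the rate at which chronyd slews the clock to 100 ppm.
maxslewrate 100

# Assumed path; keep whatever driftfile path the setup already uses.
driftfile /var/lib/chrony/drift

# Before starting chronyd, the drift file itself was seeded with the
# single line below (frequency in ppm, then estimated error in ppm):
#   0.0 1.0
```

Seeding the drift file with a zero frequency keeps chronyd from
starting out with a large stored frequency correction, which is the
situation the message says the affected kernels handle badly.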
