On Tue, Mar 16, 2021 at 03:05:23PM +0100, Alexander Bluhm wrote: > Hi, > > I am running ntpd as a client with three upstream servers. Some > of them are not synchonized and report a time that is off by several > seconds. > > The ntpd client code corrects both T1 and T4 with the current offset > returned by adjtime(2) from the kernel. T1 is local time when the > NTP packet is sent and T4 when the response is received. If between > these events a NTP reply from another server is received, it may > change the kernel offset with adjtime(2). Then the calulation of > the client offset is done with different bases, the result is wrong > and the system time starts moving around. > > So instead of correcting T1 and T4 individually at different events, > I correct their sum once. This fixes my setup. > > Error handling missing if there is no timestamp in the response. > As this should not happen in our kernel, fatal() is fine. > > I have tested this bugfix together with more cleanup and refactoring, > but I will send a separate diff for that. > > ok?
Makes sense to me. OK claudio@ > bluhm > > Index: usr.sbin/ntpd/client.c > =================================================================== > RCS file: /data/mirror/openbsd/cvs/src/usr.sbin/ntpd/client.c,v > retrieving revision 1.114 > diff -u -p -r1.114 client.c > --- usr.sbin/ntpd/client.c 11 Sep 2020 07:09:41 -0000 1.114 > +++ usr.sbin/ntpd/client.c 16 Mar 2021 12:30:23 -0000 > @@ -208,7 +208,7 @@ client_query(struct ntp_peer *p) > > p->query->msg.xmttime.int_partl = arc4random(); > p->query->msg.xmttime.fractionl = arc4random(); > - p->query->xmttime = gettime_corrected(); > + p->query->xmttime = gettime(); > > if (ntp_sendmsg(p->query->fd, NULL, &p->query->msg) == -1) { > p->senderrors++; > @@ -295,7 +295,6 @@ client_dispatch(struct ntp_peer *p, u_in > somsg.msg_control = cmsgbuf.buf; > somsg.msg_controllen = sizeof(cmsgbuf.buf); > > - T4 = getoffset(); > if ((size = recvmsg(p->query->fd, &somsg, 0)) == -1) { > if (errno == EHOSTUNREACH || errno == EHOSTDOWN || > errno == ENETUNREACH || errno == ENETDOWN || > @@ -325,10 +324,12 @@ client_dispatch(struct ntp_peer *p, u_in > if (cmsg->cmsg_level == SOL_SOCKET && > cmsg->cmsg_type == SCM_TIMESTAMP) { > memcpy(&tv, CMSG_DATA(cmsg), sizeof(tv)); > - T4 += gettime_from_timeval(&tv); > + T4 = gettime_from_timeval(&tv); > break; > } > } > + if (cmsg == NULL) > + fatal("SCM_TIMESTAMP"); > > ntp_getmsg((struct sockaddr *)&p->addr->ss, buf, size, &msg); > > @@ -384,7 +385,7 @@ client_dispatch(struct ntp_peer *p, u_in > return (0); > } > > - p->reply[p->shift].offset = ((T2 - T1) + (T3 - T4)) / 2; > + p->reply[p->shift].offset = ((T2 - T1) + (T3 - T4)) / 2 - getoffset(); > p->reply[p->shift].delay = (T4 - T1) - (T3 - T2); > p->reply[p->shift].status.stratum = msg.stratum; > if (p->reply[p->shift].delay < 0) { > -- :wq Claudio