On 2022-02-04 18:01:34 +0100, Vincent Lefevre wrote: > So it seems that there is some randomness in the reproducibility, > but with the patched "at" and many others tests to try to make > the new job appear at the end (which I couldn't succeed, but this > seems to be useless anyway), I couldn't reproduce the bug at all > (but why???).
After adding some logging to the unpatched "at", i.e. with the remaining "now = time(NULL);": Feb 04 18:42:36 zira atd[72429]: now = 1643996556 Feb 04 18:42:36 zira atd[72429]: run_time = 1643996580 Feb 04 18:42:36 zira atd[72429]: run_time = 1644876000 Feb 04 18:42:36 zira atd[72429]: run_time = 1644912000 Feb 04 18:42:36 zira atd[72429]: next_invocation = 1643996580 Feb 04 18:43:00 zira atd[72429]: now = 1643996579 Feb 04 18:43:00 zira atd[72429]: run_time = 1643996580 Feb 04 18:43:00 zira atd[72429]: run_time = 1644876000 Feb 04 18:43:00 zira atd[72429]: run_time = 1644912000 Feb 04 18:43:00 zira atd[72429]: next_invocation = 1643996580 There are 2 causes: 1. time(NULL) and clock_gettime(CLOCK_REALTIME,...) may have a one-second shift. So, at 18:43:00, "now" was still seen at 18:42:59; thus the job that should have run at 18:43:00 was still seen as being in the future. Not a real issue, except inefficiency, thanks to the loop, which should take care of that. 2. Due to the above minor issue, the timer was set at the present time 18:43:00 (1643996580). But for some reason, the pause() that follows it is not interrupted. This is the real issue. This also makes the patched "at" unreliable as this yields a race condition: the timer may still expire before pause() is called, though this is normally rather unlikely, I think. -- Vincent Lefèvre <vinc...@vinc17.net> - Web: <https://www.vinc17.net/> 100% accessible validated (X)HTML - Blog: <https://www.vinc17.net/blog/> Work: CR INRIA - computer arithmetic / AriC project (LIP, ENS-Lyon)