On Mon, Apr 1, 2024 at 12:19 AM Paul Eggert <egg...@cs.ucla.edu> wrote:
>
> On 2024-03-31 18:07, NightStrike wrote:
> > I don't quite understand your animosity here.
>
> I don't see any animosity in Bruno's comments. Clearly the system you're
> talking about has a severe performance bug, and the question is whether
> it's worth our limited time to port to buggy systems like that. Since we
> don't have easy access to these systems, you can't expect us to fix the
> problems ourselves - you'll need to pitch in if you want a workaround
> installed.
>
> That being said, does the attached patch (which I have neither tested or
> installed) fix the problem for you? If not, perhaps you can adjust the
> patch so that it does work.

Thanks Paul! I'll try it as soon as I get the system back online. We've been trying various ideas in #gnu for the past couple of hours, and it's currently hung. I have to wait for a RAID verify after every attempt :(

I ran the conftest directly instead of through configure. Notably, when I filled it with fprintf(stderr...) around every syscall, it worked correctly and finished after 1366 iterations. When I ran it unmodified, it segfaulted and left the system in the aforementioned hung state. So this leads me to believe that the problem is due to the test running too fast, assuming the printfs slowed it down enough to keep the RCU from getting stuck.

For your patch, it'll be interesting to see if a SIGALRM gets through, because currently no signals get through (or at least the SIGINT via Ctrl-C doesn't, and I can't run kill from another shell because all disk access is blocked at that point, so I can't launch the kill program).

I'm also curious to try your suggestion from another time this came up:
https://lists.gnu.org/r/bug-tar/2019-10/msg00003.html
(with additional info here: https://www.cs.rug.nl/~jurjen/ApprenticesNotes/tstcg_tar.html)
but I haven't yet compared that file to the current one to see if you changed it or just extracted it unmodified.

I don't have strace on this system, so unfortunately I can't directly apply your debugging method from that thread. This is why I tried the printfs, but then the problem Heisenbug'd away. I can try building strace, if it doesn't have too many dependencies. Maybe I should have done that first :)