https://gcc.gnu.org/bugzilla/show_bug.cgi?id=119796
Jakub Jelinek <jakub at gcc dot gnu.org> changed:

           What    |Removed |Added
----------------------------------------------------------------------------
                 CC|        |jakub at gcc dot gnu.org

--- Comment #6 from Jakub Jelinek <jakub at gcc dot gnu.org> ---
I think libat_lock_n/libat_unlock_n should just count how many of the locks they will need, and if there is a wrap-around, first lock the mutexes at the start of the array and then fall through into the current code, which would lock those at the end of the array, and break; instead of the h = 0; wrap-around.  That way the locks are always acquired in ascending index order, so two threads locking overlapping wrapped ranges cannot deadlock.

But there is another problem: I don't see it taking the position of ptr within WATCH_SIZE into account.  Currently libat_lock_n/libat_unlock_n lock a single mutex for n in [0, WATCH_SIZE], 2 mutexes for n in [WATCH_SIZE + 1, 2 * WATCH_SIZE], and so on, regardless of how many cachelines the range actually crosses.  One could call it with ((uintptr_t) ptr) % WATCH_SIZE == 0 (in that case what the code does right now is reasonable; I hope nothing calls it for n == 0), or with ((uintptr_t) ptr) % WATCH_SIZE == 32, or with ((uintptr_t) ptr) % WATCH_SIZE == 63.  I'd think that for the second case only sizes <= 32 should result in a single lock, then [33, 96] in 2 locks, etc., and for the third case [0, 1] in a single lock, then [2, 65] in 2 locks, ...