https://gcc.gnu.org/bugzilla/show_bug.cgi?id=84419
--- Comment #5 from Alexander Nesterovskiy <alexander.nesterovskiy at intel dot com> ---
Yes, it looks like the problem is with unaligned access: the reproducer does not fail when the loop starts at i=0.
Your patch seems to work: there are no runfails for the reproducer or for 445, 521, 527 and 554 (tested on the SPEC train workload).
I'll report back once the other benchmarks have finished.