https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109821
Bug ID: 109821
Summary: vect: Different output with -O2 -ftree-loop-vectorize compared to -O2
Product: gcc
Version: 12.0
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: c++
Assignee: unassigned at gcc dot gnu.org
Reporter: yinyuefengyi at gmail dot com
Target Milestone: ---

This test code aims to generate special copy patterns that differ from what memcpy or memmove would produce. It gives different results with -O2 -ftree-loop-vectorize than with plain -O2. Is this a vectorizer bug, namely a missing check that the gap op - src is larger than the vector mode size (that is, vectorize only if op - src > 16)?

copy.cpp:

#include <stdio.h>
#include <cstdint>
#include <stdlib.h>

#define UNALIGNED_LOAD64(_p) (*reinterpret_cast<const uint64_t *>(_p))
#define UNALIGNED_STORE64(_p, _val) (*reinterpret_cast<uint64_t *>(_p) = (_val))

__attribute__((__noinline__))
static void IncrementalCopyFastPath(const char* src, char* op, int len) {
  while (op - src < 8) {
    UNALIGNED_STORE64(op, UNALIGNED_LOAD64(src));
    len -= op - src;
    op += op - src;
  }
  while (len > 0) {
    UNALIGNED_STORE64(op, UNALIGNED_LOAD64(src));
    src += 8;
    op += 8;
    len -= 8;
  }
}

int main ()
{
  char src[] = "123456789abcdefghijklmnopqrstu";
  char *op = src+12;
  char * dst = op;
  IncrementalCopyFastPath (src, op, 36);
  int i = 0;
  while (i < 36) {printf("%x ", *(dst+i)), i++;}
  printf("\n");
  return 0;
}

$ gcc copy.cpp -O2 -o a.out.good
$ ./a.out.good
30 31 32 33 34 35 36 37 38 39 61 62 30 31 32 33 34 35 36 37 38 39 61 62 30 31 32 33 34 35 36 37 38 39 61 62

$ gcc copy.cpp -O2 -ftree-loop-vectorize -o a.out.bad
$ ./a.out.bad
30 31 32 33 34 35 36 37 38 39 61 62 63 64 65 66 34 35 36 37 38 39 61 62 63 64 65 66 73 74 75 76 38 39 61 62

gimple after t.vect:

IncrementalCopyFastPath.constprop (const char * src, char * op)
{
  ...
  <bb 2> [local count: 118111600]:
  _4 = src_8(D) + 8;
  if (_4 != op_9(D))   // <= the check should be op_9 > src_8 + 16 here?
    goto <bb 16>; [80.00%]
  else
    goto <bb 10>; [20.00%]
  ...
}
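
One thing that may be worth ruling out before blaming the vectorizer: the UNALIGNED_* macros dereference a uint64_t pointer derived from a char pointer, and since op = src + 12 the two addresses cannot both be 8-byte aligned. Accessing a uint64_t through a misaligned pointer is undefined behavior in C++, so the compiler may be entitled to assume that op - src is a multiple of 8. Whether that is what happens here is an assumption on my part; the report does not confirm it. Below is a minimal sketch of the same copy loop with memcpy-based accesses, which carry no alignment or aliasing assumptions. LoadU64, StoreU64, IncrementalCopyMemcpy and RunPatternCopy are hypothetical names, not from the report, and the 64-byte buffer is chosen only so that the 40 bytes written from offset 12 stay in bounds.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>
#include <string>

// Hypothetical helpers (not from the report): 64-bit load/store through
// memcpy, which is well-defined for any alignment of p.
static inline uint64_t LoadU64(const char* p) {
  uint64_t v;
  std::memcpy(&v, p, sizeof v);
  return v;
}

static inline void StoreU64(char* p, uint64_t v) {
  std::memcpy(p, &v, sizeof v);
}

// Same control flow as IncrementalCopyFastPath in the report; only the
// memory accesses differ.
__attribute__((__noinline__))
static void IncrementalCopyMemcpy(const char* src, char* op, int len) {
  // The first loop makes no progress when op == src, so require op > src.
  assert(op > src);
  while (op - src < 8) {
    StoreU64(op, LoadU64(src));
    len -= op - src;
    op += op - src;
  }
  while (len > 0) {
    StoreU64(op, LoadU64(src));
    src += 8;
    op += 8;
    len -= 8;
  }
}

// Hypothetical driver: same string and offsets as the report, but with a
// buffer large enough for every 8-byte store the loop performs.
std::string RunPatternCopy() {
  char buf[64] = "123456789abcdefghijklmnopqrstu";
  char* dst = buf + 12;
  IncrementalCopyMemcpy(buf, dst, 36);
  return std::string(dst, 36);
}
```

Calling RunPatternCopy() from main and printing the result, once with -O2 and once with -O2 -ftree-loop-vectorize, would show whether the difference persists when the accesses are well-defined. With a gap of 12 the returned 36 bytes should be the first 12 source bytes repeated three times.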