https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112722
Bug ID: 112722
Summary: RISC-V: ICE on tree-vect-slp.cc:8029 for -march=rv64gc_zve64d
Product: gcc
Version: 14.0
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: c
Assignee: unassigned at gcc dot gnu.org
Reporter: juzhe.zhong at rivai dot ai
Target Milestone: ---

Hi, Richards.

Recent testing exposed an ICE for RVV. Here is the test case:

#define VECTOR_BITS 512
#define N (VECTOR_BITS * 11 / 64 + 4)

#define add(A, B) ((A) + (B))

#define DEF(OP)                                                         \
  void __attribute__ ((noipa))                                          \
  f_##OP (double *restrict a, double *restrict b, double x)             \
  {                                                                     \
    for (int i = 0; i < N; i += 2)                                      \
      {                                                                 \
        a[i] = b[i] < 100 ? OP (b[i], x) : b[i];                        \
        a[i + 1] = b[i + 1] < 70 ? OP (b[i + 1], x) : b[i + 1];         \
      }                                                                 \
  }

#define TEST(OP)                                                        \
  {                                                                     \
    f_##OP (a, b, 10);                                                  \
    _Pragma("GCC novector")                                             \
    for (int i = 0; i < N; ++i)                                         \
      {                                                                 \
        int bval = (i % 17) * 10;                                       \
        int truev = OP (bval, 10);                                      \
        if (a[i] != (bval < (i & 1 ? 70 : 100) ? truev : bval))         \
          __builtin_abort ();                                           \
        asm volatile ("" ::: "memory");                                 \
      }                                                                 \
  }

#define FOR_EACH_OP(T)                                                  \
  T (add)                                                               \

FOR_EACH_OP (DEF)

Compile options:

-march=rv64gc_zve64d_zvfh_zfh -mabi=lp64d -fdiagnostics-plain-output -flto
-ffat-lto-objects -ftree-vectorize -fno-tree-loop-distribute-patterns
-fno-vect-cost-model -fno-common -O3 -fdump-tree-vect-details

https://godbolt.org/z/GT4bW4Tno

The ICE comes from this line at tree-vect-slp.cc:8028:

  unsigned int partial_nelts = nelts / nvectors;

Here nelts = 2 and nvectors = 4, so partial_nelts = 0, and that causes the ICE.

nelts = 2 looks reasonable, since slp_instances.length () = 2.
nvectors is calculated by can_duplicate_and_interleave_p.

I dug into can_duplicate_and_interleave_p, but the code is hard for me to
understand. Here is an abridged excerpt of can_duplicate_and_interleave_p:

  for (;;)
    {
      ...
      if (int_mode_for_size (elt_bits, 1).exists (&int_mode))
        {
          if (vector_type
              && VECTOR_MODE_P (TYPE_MODE (vector_type))
              && known_eq (GET_MODE_SIZE (TYPE_MODE (vector_type)),
                           GET_MODE_SIZE (base_vector_mode))
              && multiple_p (GET_MODE_NUNITS (TYPE_MODE (vector_type)),
                             2, &half_nelts))
            {
              ...
            }
        }
      ...
      nvectors *= 2;
    }

In the first iteration of the for loop, int_mode = TImode. RVV has no vector
TI mode, so nvectors becomes 2.

In the second iteration, int_mode = DImode. RVV does have a vector DI mode,
RVVM1DImode, with nunits = poly (1,1). That does not satisfy the condition
"multiple_p (GET_MODE_NUNITS (TYPE_MODE (vector_type)), 2, &half_nelts)",
so the loop continues and nvectors becomes 4.

In the third (and last) iteration, int_mode = SImode. RVV has RVVM1SImode,
which has nunits = (2,2), so the function returns true.

So nvectors = 4, *nvectors_out is set to 4, and then we hit the ICE.

I am struggling to fix this ICE for RVV and have not found an appropriate
approach.

Do we need to work around this in the RISC-V backend (disable vectorization
for all poly (1,1) modes)? I believe that may fix the issue, but I am not
sure. Or should it be fixed in the middle-end?

Thanks.
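
For illustration, the iteration trace and the final division can be replayed with a small standalone C model. This is only a sketch, not GCC code: struct poly2 and all model_* functions are hypothetical names, the mode table encodes only what this report states for rv64gc_zve64d, and the stop condition (elt_bits <= 8) is a simplification assumed here so that the model always terminates.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for a two-coefficient poly_int:
   value = c0 + c1 * x, where x >= 0 is only known at run time.  */
struct poly2 { unsigned c0, c1; };

/* Hypothetical mode table, taken from the description in this report:
   no vector TImode, RVVM1DImode has nunits (1,1), RVVM1SImode has
   nunits (2,2).  Returns false when no vector mode is modelled.  */
static bool
model_vector_nunits (unsigned elt_bits, struct poly2 *nunits)
{
  switch (elt_bits)
    {
    case 64: *nunits = (struct poly2) {1, 1}; return true;
    case 32: *nunits = (struct poly2) {2, 2}; return true;
    default: return false;
    }
}

/* A poly value is a known multiple of 2 only if every coefficient is.  */
static bool
model_multiple_of_2 (struct poly2 p)
{
  return p.c0 % 2 == 0 && p.c1 % 2 == 0;
}

/* Simplified model of the loop: halve the element width and double
   nvectors until a vector mode with an even nunits is found.  */
static bool
model_can_duplicate_and_interleave (unsigned elt_bits, unsigned *nvectors_out)
{
  unsigned nvectors = 1;
  for (;;)
    {
      struct poly2 nunits;
      if (model_vector_nunits (elt_bits, &nunits)
          && model_multiple_of_2 (nunits))
        {
          *nvectors_out = nvectors;
          return true;
        }
      if (elt_bits <= 8)        /* Assumed stop condition for the model.  */
        return false;
      elt_bits /= 2;
      nvectors *= 2;
    }
}

/* The division from tree-vect-slp.cc:8028, as quoted in the report.  */
static unsigned
model_partial_nelts (unsigned nelts, unsigned nvectors)
{
  assert (nvectors != 0);
  return nelts / nvectors;
}
```

Starting at the TImode width, model_can_duplicate_and_interleave (128, &nv) stores nv = 4, and model_partial_nelts (2, nv) then returns 0, matching the values reported above.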