https://gcc.gnu.org/bugzilla/show_bug.cgi?id=118175

            Bug ID: 118175
           Summary: Unable to do auto vectorization for
                    rv32imafc_zve32f_zvl128b for matrix like c code
           Product: gcc
           Version: 15.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: target
          Assignee: unassigned at gcc dot gnu.org
          Reporter: fanghuaqi at vip dot qq.com
  Target Milestone: ---

Hi there,

I tried to compile the following code:

#include <stdio.h>

float *matA, *matB, *matC;
int nthreads;

typedef struct{
   int id;
   int rowsA;
   int colsA;
   int colsB;
} tArgs;

void * CalculaProdutoMatriz(void *arg) {
   int i, j, k;
   tArgs *args = (tArgs*) arg;
   int rowsA = args->rowsA;
   int colsA = args->colsA;
   int colsB = args->colsB;

   for(i = args->id; i < rowsA; i += 1) {
      for(j = 0; j < colsB; j++) {
         matC[i * colsB + j] = 0;
         for(k = 0; k < colsA; k++) {
            matC[i * colsB + j] += matA[i * colsA + k] * matB[k * colsB + j];
         }
      }
   }
   return 0;
}

I used the following compiler options:

-march=rv32imafc_zve32f_zvl128b -mabi=ilp32f --param=vsetvl-strategy=optim
-Ofast -ftree-vectorize -mrvv-max-lmul=m8 -funroll-all-loops

I expected the latest GCC 15 to auto-vectorize this loop nest, but it does
not. Am I missing some compiler options?

I also tried clang 20 with the options -march=rv32imafc_zve32f_zvl128b
-mabi=ilp32f -O3 -funroll-loops, and it does generate some vector instructions.

GCC also vectorizes this code for Arm Cortex-R52 with NEON enabled, using the
options: -mcpu=cortex-r52 -O3 -ftree-vectorize -funroll-all-loops

You can also reproduce this on Compiler Explorer: https://godbolt.org/z/dY1r88d1c
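
As a possible experiment (not something I have verified on this target), the
kernel can be rewritten so that the inner loop accumulates into a local scalar
and the matrices are passed as restrict-qualified parameters instead of global
pointers. The function name matmul_local_acc and its parameter names are
hypothetical and not part of the original test case. Whether this changes
GCC's vectorization decision is an assumption that would need checking, for
example by adding -fopt-info-vec-missed to the options above.

```c
/* Hypothetical variant of the kernel from this report.
 * Assumption: a, b and c never overlap, which is what restrict promises.
 * The result is the same product as the original loop nest (up to
 * floating-point reassociation the compiler may apply under -Ofast). */
void matmul_local_acc(const float *restrict a, const float *restrict b,
                      float *restrict c, int first_row,
                      int rows_a, int cols_a, int cols_b)
{
    for (int i = first_row; i < rows_a; i++) {
        for (int j = 0; j < cols_b; j++) {
            /* Keep the running sum in a local so the inner loop does not
             * store to c on every iteration. */
            float acc = 0.0f;
            for (int k = 0; k < cols_a; k++)
                acc += a[i * cols_a + k] * b[k * cols_b + j];
            c[i * cols_b + j] = acc;
        }
    }
}
```

Calling matmul_local_acc(matA, matB, matC, args->id, rowsA, colsA, colsB)
would correspond to the body of CalculaProdutoMatriz, provided the three
matrices really do not alias.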
