https://gcc.gnu.org/bugzilla/show_bug.cgi?id=117383

            Bug ID: 117383
           Summary: gcc relies on RISC-V vcompress instruction undefined
                    behaviour
           Product: gcc
           Version: 15.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: target
          Assignee: unassigned at gcc dot gnu.org
          Reporter: anton at ozlabs dot org
  Target Milestone: ---

I think gcc is relying on undefined behaviour of the RISC-V vcompress
instruction. This thread explains how vcompress is different, in that its tail
starts right after the last mask-selected element:

https://github.com/riscvarchive/riscv-v-spec/issues/796
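
The difference can be illustrated with a minimal scalar sketch in C. It is not
taken from the spec or from QEMU: model_vcompress and its parameters are
hypothetical names, it covers only 8-bit elements within vl, and the all-1s
fill stands in for one behaviour a tail-agnostic implementation may choose.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical scalar model of vcompress.vm for 8-bit elements.
   vd      : destination contents (vl elements), updated in place
   vs2     : source elements
   mask    : one byte per element, non-zero means "selected"
   vl      : number of elements considered
   ta_all_1s : non-zero models a tail-agnostic implementation that writes
               all 1s to the tail; zero models tail-undisturbed (tu).
   Returns the number of elements packed at the start of vd. */
static size_t model_vcompress(int8_t *vd, const int8_t *vs2,
                              const uint8_t *mask, size_t vl,
                              int ta_all_1s)
{
  /* The destination may not overlap the source. */
  assert(vd != vs2);

  size_t n = 0;
  for (size_t i = 0; i < vl; i++)
    if (mask[i])
      vd[n++] = vs2[i];

  /* Everything after the packed elements is tail, even below vl. */
  if (ta_all_1s)
    memset(vd + n, 0xff, vl - n);

  return n;
}
```

With a source of 10..17, a mask selecting elements 0, 2 and 5, and a
destination that initially holds 1..8, this model leaves
10, 12, 15, 4, 5, 6, 7, 8 under tu and 10, 12, 15, -1, -1, -1, -1, -1 under ta
with the all-1s fill. Code that expects the old destination values to survive
past the packed elements therefore needs tu.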

I just fixed a bug in QEMU that prevented the all 1s tail agnostic option
(rvv_ta_all_1s) from poisoning these bits:

https://lists.nongnu.org/archive/html/qemu-riscv/2024-10/msg00561.html

With that fix, I see problems with the test case below until I change the
preceding vsetvli from ta to tu. I think 9aabf81f40f0 ("RISC-V: Optimize
permutation codegen with compress") is one place where we need to set tail
undisturbed.

Build with:

gcc -march=rv64gcv -mabi=lp64d -mrvv-vector-bits=zvl -O3

QEMU without all 1s tail agnostic poisoning:

-1
-2
-3
-5
-7
-9
-10
-11
-12
-14
-15
-17
-19
-21
-22
-23
-26
-28
-30
-31
-37
-38
-41
-46
-47
-53
-54
-55
-60
-61
-62
-63
52
53
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43

QEMU with all 1s tail agnostic poisoning:

-1
-2
-3
-5
-7
-9
-10
-11
-12
-14
-15
-17
-19
-21
-22
-23
-26
-28
-30
-31
-37
-38
-41
-46
-47
-53
-54
-55
-60
-61
-62
-63
52
53
-1
-1
-1
-1
-1
-1
-1
-1
-1
-1
-1
-1
-1
-1
-1
-1
-1
-1
-1
-1
-1
-1
-1
-1
-1
-1
-1
-1
-1
-1

Not sure where the 52/53 values are coming from either.


#include <stdio.h>
#include <stdint.h>

typedef int8_t vnx64i __attribute__ ((vector_size (64)));
#define MASK_64                                                               \
  1, 2, 3, 5, 7, 9, 10, 11, 12, 14, 15, 17, 19, 21, 22, 23, 26, 28, 30, 31,   \
    37, 38, 41, 46, 47, 53, 54, 55, 60, 61, 62, 63, 76, 77, 78, 79, 80, 81,   \
    82, 83, 84, 85, 86, 87, 88, 89, 90, 91, 92, 93, 94, 95, 96, 97, 98, 99,   \
    100, 101, 102, 103, 104, 105, 106, 107
void __attribute__ ((noinline, noclone))
test_1 (int8_t *x, int8_t *y, int8_t *out)
{
  vnx64i v1 = *(vnx64i*)x;
  vnx64i v2 = *(vnx64i*)y;
  vnx64i v3 = __builtin_shufflevector (v1, v2, MASK_64);
  *(vnx64i*)out = v3;
}

int main(void)
{
  int8_t x[64];
  int8_t y[64];
  int8_t out[64];

  for (int i = 0; i < 64; i++) {
    x[i] = -i;
    y[i] = i;
  }

  test_1(x, y, out);

  for (int i = 0; i < 64; i++) {
    printf("%d\n", out[i]);
  }
}
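
As a cross-check on what the test case should print, here is a minimal scalar
reference for the shuffle. It assumes the usual __builtin_shufflevector
numbering, where indices 0..63 pick from the first vector and 64..127 from the
second. ref_shuffle and mask_64 are hypothetical names used only for this
sketch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Same index list as MASK_64 in the test case. */
static const int mask_64[64] = {
  1, 2, 3, 5, 7, 9, 10, 11, 12, 14, 15, 17, 19, 21, 22, 23, 26, 28, 30, 31,
  37, 38, 41, 46, 47, 53, 54, 55, 60, 61, 62, 63, 76, 77, 78, 79, 80, 81,
  82, 83, 84, 85, 86, 87, 88, 89, 90, 91, 92, 93, 94, 95, 96, 97, 98, 99,
  100, 101, 102, 103, 104, 105, 106, 107
};

/* Scalar reference for a two-input shuffle of n-element int8_t vectors:
   index k < n selects a[k], index k >= n selects b[k - n]. */
static void ref_shuffle(const int8_t *a, const int8_t *b,
                        const int *idx, size_t n, int8_t *out)
{
  for (size_t i = 0; i < n; i++) {
    assert(idx[i] >= 0 && (size_t)idx[i] < 2 * n);
    size_t k = (size_t)idx[i];
    out[i] = k < n ? a[k] : b[k - n];
  }
}
```

Called with the same x, y and index list as test_1, this reference gives
-1 ... -63 for the first 32 elements and 12, 13, 14, ... 43 for the last 32.
So the elements at positions 32 and 33 should be 12 and 13, where both QEMU
runs above print 52 and 53.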
