https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104353

            Bug ID: 104353
           Summary: ppc64le: Apparent reliance on undefined behavior of
                    xvcvdpsxws
           Product: gcc
           Version: 11.2.0
               URL: https://github.com/numpy/numpy/issues/20964#issuecomment-1027865665
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: c
          Assignee: unassigned at gcc dot gnu.org
          Reporter: ckk at kvr dot at
  Target Milestone: ---

Created attachment 52331
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=52331&action=edit
Minimal test case for reproduction

I ran into a strange numpy error on ppc64le that only occurred inside a ppc64le
QEMU instance. In short, casting an array of i doubles (each 1.0) to ints
(each expected to be 1) worked as expected on native hardware, but produced
the following bogus results when running inside a VM:

i = 1:   1
i = 2:   1 1
i = 3:   1 1 1
i = 4:   0 0 0 0
i = 5:   0 0 0 0 1
i = 6:   0 0 0 0 1 1
i = 7:   0 0 0 0 1 1 1
i = 8:   0 0 0 0 0 0 0 0
i = 9:   0 0 0 0 0 0 0 0 1
...


Guided by the numpy folks, a SIMD issue was suspected, and I managed to create
a minimal test case (attached here) with which this could be reproduced. It
only occurs with -O3.
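
For readers without access to the attachment, the pattern being exercised can
be sketched roughly as follows. This is not the attached test case: the names
cast_to_int, check_length and MAX_N are made up for illustration, and whether
this particular loop gets vectorized (and so reproduces the problem) depends
on the compiler version and flags.

```c
#include <stdio.h>

#define MAX_N 16  /* hypothetical upper bound, chosen for this sketch */

/* Element-wise double -> int conversion. A loop of this shape is a
   candidate for auto-vectorization when built with -O3. */
static void cast_to_int(const double *src, int *dst, int n)
{
    for (int k = 0; k < n; k++)
        dst[k] = (int)src[k];
}

/* Fill n doubles with 1.0, convert them, print the row, and return the
   number of elements that did not come out as 1 (-1 if n is out of range). */
int check_length(int n)
{
    double src[MAX_N];
    int dst[MAX_N];
    int bad = 0;

    if (n < 0 || n > MAX_N)
        return -1;

    for (int k = 0; k < n; k++)
        src[k] = 1.0;

    cast_to_int(src, dst, n);

    printf("i = %d:  ", n);
    for (int k = 0; k < n; k++) {
        printf(" %d", dst[k]);
        if (dst[k] != 1)
            bad++;
    }
    printf("\n");
    return bad;
}
```

Calling check_length(i) for i = 1, 2, ... should print rows of ones and return
0 each time on a correct build; the output quoted above corresponds to rows
where it would return a non-zero count.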

I then filed an issue with QEMU, where it was quickly rejected. This led to
further analysis by the numpy folks, who discovered that GCC is apparently
relying on undefined behavior of the xvcvdpsxws instruction. This happened to
work on native hardware because the hardware happens to exhibit that behavior.

I'm only summarizing here; there's a great analysis in detail, and a much
better test case, on the GitHub issue, which I have linked in the URL as I'd
prefer not to reproduce the author's work here.
