https://gcc.gnu.org/bugzilla/show_bug.cgi?id=122814

            Bug ID: 122814
           Summary: ffmpeg miscompiled with -O3 -march=x86-64-v4 on
                    Windows
           Product: gcc
           Version: 15.2.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: tree-optimization
          Assignee: unassigned at gcc dot gnu.org
          Reporter: kasper93 at gmail dot com
  Target Milestone: ---

Hello,

I noticed a failure in the ffmpeg FATE tests on Windows when building with -O3
-march=znver4. It was a pain to minimize because the bug needs a specific code
structure, hence the dummy fflush() call; any call to a function from another
compilation unit works just as well. Checking the values explicitly also makes
the bug go away, so rather than saving the result in some roundabout way, the
test case just prints to stdout. You can compare the output with -O0 or with
any other compiler.

To reproduce, build with `-O3 -march=x86-64-v4` and run the code. The values
are wrong on the 2nd iteration of the `for (int y = 0; y < h; y++)` loop. The
original reproducer seems to fail only with -march=znver4, but that is likely
just a cost model difference, and the minimized version is closer to the bug.

I tested this with the MSYS2 MINGW64 gcc build. Since it seems to reproduce
only on Windows, I presume this might be a stack alignment issue or something
similar. Regardless, we cannot fail like that, as it's not uncommon for people
to build things with -march=native or the like.

```
#include <stdio.h>

typedef struct {
  unsigned char *data[4];
  int linesize[4];
} SwsImg;

typedef struct {
  unsigned char *in[4];
} SwsOpExec;

int h = 6;
SwsOpExec exec;
unsigned char data0[1000], data1[1000], data2[1000];
SwsImg in = {{data0, data1, data2}, {8, 8, 8}};

void handle_tail(SwsOpExec *exec, SwsImg *in_base) {
  SwsImg img = *in_base;
  for (int i = 0; img.data[i]; i++)
    fflush(0);
  for (int y = 0; y < h; y++) {
    for (int i = 0; i < 128; i++)
      printf("%d", exec->in[2][i]);
    for (int i = 0; i < 4; i++)
      exec->in[i] += img.linesize[i];
  }
}

void op_pass_run(SwsImg *in_base) {
  for (int i = 0; i < 4; i++)
    exec.in[i] = in_base->data[i];
  handle_tail(&exec, in_base);
}

int main() {
  for (unsigned i = 0; i < sizeof(data2); i++)
    data2[i] = i;
  op_pass_run(&in);
}
```
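
For comparison, here is a minimal sketch of what the output should look like, derived only from reading the test case: main() fills data2[k] with (unsigned char)k, and plane 2 advances by its linesize of 8 per row, so row y should print data2[8*y] through data2[8*y + 127] with no separators. The helper names (expected_byte, format_expected_row, print_expected) are made up for illustration and are not part of ffmpeg or the test case.

```c
#include <stdio.h>
#include <stddef.h>

/* Constants copied from the test case: 8-byte line stride for plane 2,
   128 bytes printed per row. */
enum { LINESIZE = 8, ROW_BYTES = 128 };

/* Byte the test case should print for row y, column i.
   data2[k] holds (unsigned char)k and plane 2 advances LINESIZE per row. */
static unsigned char expected_byte(int y, int i) {
  return (unsigned char)(y * LINESIZE + i);
}

/* Write the expected digits for one row into buf. Returns the number of
   characters written (excluding the terminator), or -1 if buf is too small. */
static int format_expected_row(int y, char *buf, size_t cap) {
  size_t used = 0;
  for (int i = 0; i < ROW_BYTES; i++) {
    int n = snprintf(buf + used, cap - used, "%d", expected_byte(y, i));
    if (n < 0 || (size_t)n >= cap - used)
      return -1;
    used += (size_t)n;
  }
  return (int)used;
}

/* Print the expected stream for h rows, matching the unseparated "%d"
   output of the test case. Returns 0 on success, -1 on failure. */
static int print_expected(int h) {
  char row[ROW_BYTES * 3 + 1]; /* each byte prints as at most 3 digits */
  for (int y = 0; y < h; y++) {
    if (format_expected_row(y, row, sizeof row) < 0)
      return -1;
    fputs(row, stdout);
  }
  return 0;
}
```

Calling print_expected(6) from a small main() and diffing its output against the -O3 build should show where the two diverge; per the description above, that would be in the second row (y = 1), which should begin with 8, 9, 10, 11.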

Thanks,
Kacper
