https://gcc.gnu.org/bugzilla/show_bug.cgi?id=117939

            Bug ID: 117939
           Summary: [15 Regression] nvptx: CRC test cases execution test
                    FAILs
           Product: gcc
           Version: 15.0
            Status: UNCONFIRMED
          Keywords: testsuite-fail, wrong-code
          Severity: normal
          Priority: P3
         Component: target
          Assignee: unassigned at gcc dot gnu.org
          Reporter: tschwinge at gcc dot gnu.org
                CC: law at gcc dot gnu.org
  Target Milestone: ---
            Target: nvptx

For '--target=nvptx-none', there's one existing test case where a recent
regression at first wasn't obviously related to the CRC optimization work:

    PASS: gcc.c-torture/execute/pr84524.c   -O0  (test for excess errors)
    PASS: gcc.c-torture/execute/pr84524.c   -O0  execution test
    PASS: gcc.c-torture/execute/pr84524.c   -O1  (test for excess errors)
    PASS: gcc.c-torture/execute/pr84524.c   -O1  execution test
    PASS: gcc.c-torture/execute/pr84524.c   -O2  (test for excess errors)
    [-PASS:-]{+FAIL:+} gcc.c-torture/execute/pr84524.c   -O2  execution test
    PASS: gcc.c-torture/execute/pr84524.c   -O3 -fomit-frame-pointer -funroll-loops -fpeel-loops -ftracer -finline-functions  (test for excess errors)
    PASS: gcc.c-torture/execute/pr84524.c   -O3 -fomit-frame-pointer -funroll-loops -fpeel-loops -ftracer -finline-functions  execution test
    PASS: gcc.c-torture/execute/pr84524.c   -O3 -g  (test for excess errors)
    PASS: gcc.c-torture/execute/pr84524.c   -O3 -g  execution test
    PASS: gcc.c-torture/execute/pr84524.c   -Os  (test for excess errors)
    [-PASS:-]{+FAIL:+} gcc.c-torture/execute/pr84524.c   -Os  execution test

For '-O2' and '-Os' (but not '-O3'), the execution test now FAILs with:

    nvptx-run: error getting kernel result: an illegal memory access was encountered (CUDA_ERROR_ILLEGAL_ADDRESS, 700)

I still have to look into what's going on.  (For now, let's assume it's
specific to the nvptx target.)  I can get '-O2' and '-Os' back to PASS with
'-fno-optimize-crc', hence "[15 Regression]".

However, a different observation: with plain '-O3' (without
'-fno-optimize-crc'), earlier code transformations (yet to be determined)
apparently preclude the CRC optimization.  I can't tell whether that's
profitable or not.  Jeff: would Mariam et al. maybe want to have a look at that
test case, assuming that this '-O3' behavior also reproduces on the CRC work's
usual architectures, and unless the behavior is obvious to explain anyway?  See
the 'diff' of the '-O2' vs. '-O3' CRC pass dumps for nvptx:

    --- O2/pr84524.c.169t.crc       2024-12-06 19:30:08.307780439 +0100
    +++ O3/pr84524.c.169t.crc       2024-12-06 19:37:03.142273827 +0100
    @@ -1,83 +1,213 @@

     ;; Function foo (foo, funcdef_no=0, decl_uid=1853, cgraph_uid=1, symbol_order=0)

    -
    -foo function maybe contains CRC calculation.
    -Loop iteration number is 7.
    -Bit forward.
    -The loop with 4 header BB index calculates CRC!
     __attribute__((noipa, noinline, noclone, no_icf))
     void foo (short unsigned int * x)
     {
    [...]

I see big GIMPLE IR changes starting with 'pr84524.c.112t.cunrolli'; maybe
that's responsible, maybe not.
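
For illustration only, here is a minimal sketch of the general shape of a
"bit forward" (MSB-first) bitwise CRC loop, that is, the kind of
shift/conditional-XOR loop that the dump output above reports on.  This is not
taken from 'pr84524.c'; the function name 'crc16_bitwise', the polynomial
0x1021, and the initial value 0 are assumptions chosen for the example.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical illustration, not taken from pr84524.c: an MSB-first
   ("bit forward") CRC-16 over a byte buffer, with assumed polynomial
   0x1021 and assumed initial value 0.  */
uint16_t
crc16_bitwise (const unsigned char *data, size_t len)
{
  uint16_t crc = 0;

  for (size_t i = 0; i < len; i++)
    {
      /* Fold the next input byte into the top 8 bits of the remainder.  */
      crc ^= (uint16_t) (data[i] << 8);

      /* One shift/conditional-XOR step per input bit.  */
      for (int bit = 0; bit < 8; bit++)
        {
          if (crc & 0x8000)
            crc = (uint16_t) ((crc << 1) ^ 0x1021);
          else
            crc = (uint16_t) (crc << 1);
        }
    }

  return crc;
}
```

With these parameters (commonly known as CRC-16/XMODEM), the nine ASCII bytes
"123456789" give 0x31C3, which makes for a quick sanity check when comparing
builds with and without '-fno-optimize-crc'.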

Separately, a few dozen of the new CRC test cases likewise FAIL their execution
test with the same 'CUDA_ERROR_ILLEGAL_ADDRESS' as above.  For now I assume
it's all the same underlying issue.
