https://gcc.gnu.org/bugzilla/show_bug.cgi?id=85231

            Bug ID: 85231
           Summary: [og7, openacc, nvptx] Too much shared memory claimed
                    for long vector length
           Product: gcc
           Version: 8.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: target
          Assignee: unassigned at gcc dot gnu.org
          Reporter: vries at gcc dot gnu.org
  Target Milestone: ---

On the og7 branch, with the current state of the vector-length patches applied,
for vector-length-128-1.c we generate:
...
.shared .align 8 .u8 __oacc_bcast[64];

   ...

cvta.shared.u64 %r55,__oacc_bcast;
st.u64 [%r55],%r36;
st.u64 [%r55+8],%r37;
st.u64 [%r55+16],%r38;
st.u64 [%r55+24],%r39;

   ...

cvta.shared.u64 %r54,__oacc_bcast;
ld.u64 %r36,[%r54];
ld.u64 %r37,[%r54+8];
ld.u64 %r38,[%r54+16];
ld.u64 %r39,[%r54+24];
...

It seems we claim 64 bytes for __oacc_bcast, double the 32 bytes that the
four 8-byte stores and loads above actually use.

This patch (already applicable on og7 branch) fixes that:
...
diff --git a/gcc/config/nvptx/nvptx.c b/gcc/config/nvptx/nvptx.c
index ba8d3bec1d7..3cf110cd1ed 100644
--- a/gcc/config/nvptx/nvptx.c
+++ b/gcc/config/nvptx/nvptx.c
@@ -4058,7 +4058,8 @@ nvptx_shared_propagate (bool pre_p, bool is_call, basic_block block,
       emit_insn_after (init, insn);

       unsigned int psize = ROUND_UP (data.offset, oacc_bcast_align);
-      unsigned int pnum = (nvptx_mach_vector_length () > PTX_WARP_SIZE
+      unsigned int pnum = ((nvptx_mach_vector_length () > PTX_WARP_SIZE
+                           && nvptx_mach_max_workers () > 1)
                           ? nvptx_mach_max_workers () + 1
                           : 1);
...
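
To make the arithmetic concrete, here is a minimal stand-alone sketch of the
partition count before and after the patch. It is not the code in nvptx.c.
The parameters vector_length and max_workers stand in for
nvptx_mach_vector_length () and nvptx_mach_max_workers (), the helper names
are hypothetical, and two things are assumptions rather than facts stated in
this report: that the claimed size is psize * pnum, and that the test case
runs with a maximum of one worker (both are consistent with the 64 vs. 32
figures above).

```c
#include <assert.h>

#define PTX_WARP_SIZE 32
#define ROUND_UP(x, a) ((((x) + (a) - 1) / (a)) * (a))

/* Partition count as computed before the patch.  */
unsigned int
pnum_before (unsigned int vector_length, unsigned int max_workers)
{
  return vector_length > PTX_WARP_SIZE ? max_workers + 1 : 1;
}

/* Partition count with the patch: the extra partitions are only
   requested when there is more than one worker.  */
unsigned int
pnum_after (unsigned int vector_length, unsigned int max_workers)
{
  return (vector_length > PTX_WARP_SIZE && max_workers > 1)
	 ? max_workers + 1
	 : 1;
}

/* Assumed total size of the broadcast buffer: one aligned partition
   of data_offset bytes, repeated pnum times.  */
unsigned int
bcast_bytes (unsigned int data_offset, unsigned int align, unsigned int pnum)
{
  return ROUND_UP (data_offset, align) * pnum;
}

/* Numbers from the report: four 8-byte slots (32 bytes), 8-byte
   alignment, vector length 128, and (assumed) one worker.  */
void
check_report_numbers (void)
{
  assert (bcast_bytes (32, 8, pnum_before (128, 1)) == 64);
  assert (bcast_bytes (32, 8, pnum_after (128, 1)) == 32);
}
```

Under these assumptions, the unpatched expression yields two partitions
(64 bytes) for a single worker, while the patched one yields a single
partition (32 bytes). With more than one worker, both versions give the same
result.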
