https://gcc.gnu.org/bugzilla/show_bug.cgi?id=85231
Bug ID: 85231
Summary: [og7, openacc, nvptx] Too much shared memory claimed for long vector length
Product: gcc
Version: 8.0
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: target
Assignee: unassigned at gcc dot gnu.org
Reporter: vries at gcc dot gnu.org
Target Milestone: ---

On the og7 branch, with the current state of the vector-length patches applied, for vector-length-128-1.c we generate:
...
.shared .align 8 .u8 __oacc_bcast[64];
...
        cvta.shared.u64 %r55,__oacc_bcast;
        st.u64 [%r55],%r36;
        st.u64 [%r55+8],%r37;
        st.u64 [%r55+16],%r38;
        st.u64 [%r55+24],%r39;
...
        cvta.shared.u64 %r54,__oacc_bcast;
        ld.u64 %r36,[%r54];
        ld.u64 %r37,[%r54+8];
        ld.u64 %r38,[%r54+16];
        ld.u64 %r39,[%r54+24];
...

It seems we claim double (64) of what we need (32).

This patch (already applicable on og7 branch) fixes that:
...
diff --git a/gcc/config/nvptx/nvptx.c b/gcc/config/nvptx/nvptx.c
index ba8d3bec1d7..3cf110cd1ed 100644
--- a/gcc/config/nvptx/nvptx.c
+++ b/gcc/config/nvptx/nvptx.c
@@ -4058,7 +4058,8 @@ nvptx_shared_propagate (bool pre_p, bool is_call, basic_block block,
       emit_insn_after (init, insn);

       unsigned int psize = ROUND_UP (data.offset, oacc_bcast_align);
-      unsigned int pnum = (nvptx_mach_vector_length () > PTX_WARP_SIZE
+      unsigned int pnum = ((nvptx_mach_vector_length () > PTX_WARP_SIZE
+                            && nvptx_mach_max_workers () > 1)
                            ? nvptx_mach_max_workers () + 1
                            : 1);
...