Hi!
On 2022-12-19T21:40:07+0100, Thomas Schwinge <[email protected]> wrote:
> As I have reported to Nvidia in 2022-12-01 'NVIDIA Incident Report (3891704):
> ptxas: Duplicate declaration error: "cannot be resolved by a '.static'"',
> 'ptxas' has an inscrutable error mode for duplicate declarations:
>
> ptxas softstack-decl-1.o, line 11; error : '.extern' variable
> '__nvptx_stacks' cannot be resolved by a '.static'
> ptxas fatal : Ptx assembly aborted due to errors
> nvptx-as: ptxas returned 255 exit status
>
> ptxas uniform-simt-decl-1.o, line 12; error : '.extern' variable
> '__nvptx_uni' cannot be resolved by a '.static'
> ptxas fatal : Ptx assembly aborted due to errors
> nvptx-as: ptxas returned 255 exit status
>
> This is inscrutable, because (a) what is "cannot be resolved by a '.static'"
> supposed to tell me (there is no '.static' in PTX?), and (b) why aren't
> repeated declarations just verified to match the first, but otherwise a no-op
> (like in other programming languages)?
Since my report, this had its 'Status changed [...] to "Closed - Fixed"'
(2023-01-28), with comment:
| [...] fix should be available in a later release.
| The compiler was modified to allow duplicate declaration of extern symbol.
| You will not see an error for this case.
| The documentation is also being changed to reflect this new change.
I've not yet verified the CUDA/'ptxas'-level fix, but I suggest retracting
my proposed GCC-level change:
> --- a/gcc/config/nvptx/nvptx.cc
> +++ b/gcc/config/nvptx/nvptx.cc
> +static bool have_softstack_decl;
> +static bool have_unisimt_decl;
> @@ -2571,6 +2573,13 @@ nvptx_assemble_undefined_decl (FILE *file, const char *name, const_tree decl)
> TREE_TYPE (decl), size ? tree_to_shwi (size) : 0,
> DECL_ALIGN (decl), true);
> nvptx_assemble_decl_end ();
> +
> + static tree softstack_id = get_identifier ("__nvptx_stacks");
> + static tree unisimt_id = get_identifier ("__nvptx_uni");
> + if (DECL_NAME (decl) == softstack_id)
> + have_softstack_decl = true;
> + else if (DECL_NAME (decl) == unisimt_id)
> + have_unisimt_decl = true;
> }
> @@ -6002,7 +6011,7 @@ nvptx_file_end (void)
> write_shared_buffer (asm_out_file, gang_private_shared_sym,
> gang_private_shared_align, gang_private_shared_size);
>
> - if (need_softstack_decl)
> + if (need_softstack_decl && !have_softstack_decl)
> {
> write_var_marker (asm_out_file, false, true, "__nvptx_stacks");
> /* 32 is the maximum number of warps in a block. Even though it's an
> @@ -6011,7 +6020,8 @@ nvptx_file_end (void)
> fprintf (asm_out_file, ".extern .shared .u%d __nvptx_stacks[32];\n",
> POINTER_SIZE);
> }
> - if (need_unisimt_decl)
> +
> + if (need_unisimt_decl && !have_unisimt_decl)
> {
> write_var_marker (asm_out_file, false, true, "__nvptx_uni");
> fprintf (asm_out_file, ".extern .shared .u32 __nvptx_uni[32];\n");
..., and suggest that we instead fix up duplicate declarations in the
nvptx-tools 'as', and once GCC depends on an nvptx-tools version with that
addressed, we still change the test cases from "compile" to "assemble" as
proposed:
> --- a/gcc/testsuite/gcc.target/nvptx/softstack-decl-1.c
> +++ b/gcc/testsuite/gcc.target/nvptx/softstack-decl-1.c
> @@ -1,4 +1,4 @@
> -/* { dg-do compile } */
> +/* { dg-do assemble } */
> /* { dg-options {-save-temps -O0 -msoft-stack} } */
>
> extern void *__nvptx_stacks[32] __attribute__((shared,nocommon));
> --- a/gcc/testsuite/gcc.target/nvptx/uniform-simt-decl-1.c
> +++ b/gcc/testsuite/gcc.target/nvptx/uniform-simt-decl-1.c
> @@ -1,4 +1,4 @@
> -/* { dg-do compile } */
> +/* { dg-do assemble } */
> /* { dg-options {-save-temps -O0 -muniform-simt} } */
>
> extern unsigned __nvptx_uni[32] __attribute__((shared,nocommon));
..., but (obviously) without the following changes:
> --- a/gcc/testsuite/gcc.target/nvptx/softstack-decl-1.c
> +++ b/gcc/testsuite/gcc.target/nvptx/softstack-decl-1.c
> -/* The implicit (via 'need_softstack_decl') and explicit declarations of
> - '__nvptx_stacks' are both emitted:
> - { dg-final { scan-assembler-times {(?n)\.extern .* __nvptx_stacks\[32\];} 2 } }
> +/* Of the implicit (via 'need_softstack_decl') and explicit declarations of
> + '__nvptx_stacks', only one is emitted:
> + { dg-final { scan-assembler-times {(?n)\.extern .* __nvptx_stacks\[32\];} 1 } }
> --- a/gcc/testsuite/gcc.target/nvptx/uniform-simt-decl-1.c
> +++ b/gcc/testsuite/gcc.target/nvptx/uniform-simt-decl-1.c
> -/* The implicit (via 'need_unisimt_decl') and explicit declarations of
> - '__nvptx_uni' are both emitted:
> - { dg-final { scan-assembler-times {(?n)\.extern .* __nvptx_uni\[32\];} 2 } }
> +/* Of the implicit (via 'need_unisimt_decl') and explicit declarations of
> + '__nvptx_uni', only one is emitted:
> + { dg-final { scan-assembler-times {(?n)\.extern .* __nvptx_uni\[32\];} 1 } }
Regards
Thomas