Re: Handling of main() function for freestanding

2022-10-13 Thread Arsen Arsenović via Gcc
Hi,

On Friday, 7 October 2022 15:51:31 CEST Jason Merrill wrote:
> > * gcc.dg/noreturn-4.c: Likewise.
> 
> I'd be inclined to drop this test.
That seems like an odd choice, why do that over using another function 
for the test case? (there's nothing specific to main in this test, and 
it doesn't even need to link, so using any ol' function should be okay; 
see attachment)

The attached patch is also v2 of the original builtin-main one submitted 
earlier.  Tested on x86_64-pc-linux-gnu.  This revision excludes the 
mentioned pedwarns unless hosted.

Thanks,
-- 
Arsen Arsenović
From 27a2cf85b1c3eb901413fd135918af0377bd1459 Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Arsen=20Arsenovi=C4=87?= 
Date: Tue, 20 Sep 2022 19:17:31 +0200
Subject: [PATCH v2] c-family: Implement new `int main' semantics in
 freestanding

From now on, by default, (specifically) `int main' in freestanding will
implicitly return 0, as it does for hosted modes. The old behaviour is
still accessible via -fno-builtin-main.

gcc/c-family/ChangeLog:

	* c-common.cc (disable_builtin_function): Support special value
	`main' that, in freestanding, allows disabling special casing
	placed around `main'.
	* c-common.h: Add flag_builtin_main.
	(want_builtin_main_p): New function, true iff hosted OR
	builtin_main are set.

gcc/c/ChangeLog:

	* c-decl.cc (grokdeclarator): Consider flag_builtin_main when
	deciding whether to emit warnings.
	(finish_function): Consider flag_builtin_main and the noreturn
	flag when deciding whether to emit an implicit zero return.
	* c-objc-common.cc (c_missing_noreturn_ok_p): Consider missing
	noreturn okay only when hosted or when builtin_main is enabled.

gcc/cp/ChangeLog:

	* cp-tree.h (DECL_MAIN_P): Consider flag_builtin_main when
	deciding whether this function is to be the special function
	main.
	* decl.cc (grokfndecl): Only pedwarn on hosted.
	(finish_function): Do not inject extra return of marked
	noreturn.

gcc/ChangeLog:

	* doc/invoke.texi: Document -fno-builtin-main.

gcc/testsuite/ChangeLog:

	* gcc.dg/noreturn-4.c: Don't use `main', but a generic function
	name instead.
	* g++.dg/freestanding-main-implicitly-returns.C: New test.
	* g++.dg/no-builtin-main.C: New test.
	* gcc.dg/freestanding-main-implicitly-returns.c: New test.
	* gcc.dg/no-builtin-main.c: New test.
---
 gcc/c-family/c-common.cc  |  6 ++
 gcc/c-family/c-common.h   | 10 ++
 gcc/c/c-decl.cc   |  4 ++--
 gcc/c/c-objc-common.cc|  9 ++---
 gcc/cp/cp-tree.h  | 12 ++-
 gcc/cp/decl.cc|  6 --
 gcc/doc/invoke.texi   | 20 ++-
 .../freestanding-main-implicitly-returns.C|  5 +
 gcc/testsuite/g++.dg/no-builtin-main.C|  5 +
 .../freestanding-main-implicitly-returns.c|  5 +
 gcc/testsuite/gcc.dg/no-builtin-main.c|  5 +
 gcc/testsuite/gcc.dg/noreturn-4.c |  6 +++---
 12 files changed, 73 insertions(+), 20 deletions(-)
 create mode 100644 gcc/testsuite/g++.dg/freestanding-main-implicitly-returns.C
 create mode 100644 gcc/testsuite/g++.dg/no-builtin-main.C
 create mode 100644 gcc/testsuite/gcc.dg/freestanding-main-implicitly-returns.c
 create mode 100644 gcc/testsuite/gcc.dg/no-builtin-main.c

diff --git a/gcc/c-family/c-common.cc b/gcc/c-family/c-common.cc
index 9ec9100cc90..f9060cbc171 100644
--- a/gcc/c-family/c-common.cc
+++ b/gcc/c-family/c-common.cc
@@ -232,6 +232,10 @@ int flag_isoc2x;
 
 int flag_hosted = 1;
 
+/* Nonzero means that we want to give main its special meaning */
+
+int flag_builtin_main = 1;
+
 
 /* ObjC language option variables.  */
 
@@ -4879,6 +4883,8 @@ disable_builtin_function (const char *name)
 {
   if (startswith (name, "__builtin_"))
 error ("cannot disable built-in function %qs", name);
+  else if (strcmp("main", name) == 0)
+flag_builtin_main = 0;
   else
 {
   disabled_builtin *new_disabled_builtin = XNEW (disabled_builtin);
diff --git a/gcc/c-family/c-common.h b/gcc/c-family/c-common.h
index 62ab4ba437b..44537cc6977 100644
--- a/gcc/c-family/c-common.h
+++ b/gcc/c-family/c-common.h
@@ -689,6 +689,16 @@ extern int flag_isoc2x;
 
 extern int flag_hosted;
 
+/* Nonzero means that we want to give main its special meaning */
+
+extern int flag_builtin_main;
+
+/* Returns false if both flag_hosted and flag_builtin_main are zero, true
+   otherwise. */
+inline bool builtin_main_p() {
+  return flag_hosted || flag_builtin_main;
+}
+
 /* ObjC language option variables.  */
 
 
diff --git a/gcc/c/c-decl.cc b/gcc/c/c-decl.cc
index 193e268f04e..891e36b30b6 100644
--- a/gcc/c/c-decl.cc
+++ b/gcc/c/c-decl.cc
@@ -10442,9 +10442,9 @@ finish_function (location_t end_loc)
   if (DECL_RESULT (fndecl) && DECL_RESULT (fndecl) != error_mark_node)
 DECL_CONTEXT (DECL_RESULT (fndecl)) = fndecl;
 
-  if (MAIN_NAME_P (DECL_NAME (fndecl)) && flag_hosted
+  if (MA

Re: [PATCH RESEND 0/1] RFC: P1689R5 support

2022-10-13 Thread David Malcolm via Gcc
On Mon, 2022-10-10 at 16:21 -0400, Jason Merrill wrote:
> On 10/4/22 11:11, Ben Boeckel wrote:
> > This patch adds initial support for ISO C++'s [P1689R5][], a format
> > for
> > describing C++ module requirements and provisions based on the
> > source
> > code. This is required because compiling C++ with modules is not
> > embarrassingly parallel and need to be ordered to ensure that
> > `import
> > some_module;` can be satisfied in time by making sure that the TU
> > with
> > `export import some_module;` is compiled first.
> > 
> > [P1689R5]: https://isocpp.org/files/papers/P1689R5.html
> > 
> > I'd like feedback on the approach taken here with respect to the
> > user-visible flags. I'll also note that header units are not
> > supported
> > at this time because the current `-E` behavior with respect to
> > `import
> > ;` is to search for an appropriate `.gcm` file which
> > is not
> > something such a "scan" can support. A new mode will likely need to
> > be
> > created (e.g., replacing `-E` with `-fc++-module-scanning` or
> > something)
> > where headers are looked up "normally" and processed only as much
> > as
> > scanning requires.
> > 
> > Testing is currently happening in CMake's CI using a prior revision
> > of
> > this patch (the differences are basically the changelog, some
> > style, and
> > `trtbd` instead of `p1689r5` as the format name).
> > 
> > For testing within GCC, I'll work on the following:
> > 
> > - scanning non-module source
> > - scanning module-importing source (`import X;`)
> > - scanning module-exporting source (`export module X;`)
> > - scanning module implementation unit (`module X;`)
> > - flag combinations?
> > 
> > Are there existing tools for handling JSON output for testing
> > purposes?
> 
> David Malcolm would probably know best about JSON wrangling.

Unfortunately our JSON output doesn't make any guarantees about the
ordering of keys within an object, so the precise textual output
changes from run to run.  I've coped with that in my test cases by
limiting myself to simple regexes of fragments of the JSON output.

Martin Liska [CCed] went much further in
4e275dccfc2467b3fe39012a3dd2a80bac257dd0 by adding a run-gcov-pytest
DejaGnu directive, allowing for test cases for gcov to be written in
Python, which can thus test much more interesting assertions about the
generated JSON.
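
The fragment-regex approach described above can be sketched in C++ as follows.
This is only an illustration: the two sample strings are made-up scanner
output (the field names are chosen for the example, not copied from a real
run), and has_all_fragments is a hypothetical helper, not something that
exists in the GCC testsuite.

```cpp
#include <regex>
#include <string>
#include <vector>

// Returns true when every fragment pattern is found somewhere in `json`.
// Each pattern targets one key/value pair, so key order does not matter.
bool has_all_fragments(const std::string& json,
                       const std::vector<std::string>& patterns) {
    for (const std::string& p : patterns) {
        if (!std::regex_search(json, std::regex(p))) {
            return false;
        }
    }
    return true;
}

// Two hypothetical outputs holding the same data with a different key
// order and different whitespace.
const std::string kOutputA =
    R"({"version": 1, "provides": [{"logical-name": "X"}]})";
const std::string kOutputB =
    R"({ "provides":[ {"logical-name":"X"} ],"version":1 })";

// Whitespace-tolerant fragment patterns (\s* around the colon).
const std::vector<std::string> kExpected = {
    R"re("version"\s*:\s*1)re",
    R"re("logical-name"\s*:\s*"X")re",
};
```

Both sample strings satisfy the same pattern list, which is the property a
testsuite check needs when key order is not guaranteed.  The price is that
such a check says nothing about nesting, which is where a Python-based test
can assert more.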

Dave

> 
> > Basically, something that I can add to the test suite that doesn't
> > care
> > about whitespace, but checks the structure (with sensible
> > replacements
> > for absolute paths where relevant)?
> 
> Various tests in g++.dg/debug/dwarf2 handle that sort of thing with
> regexps.
> 
> > For the record, Clang has patches with similar flags and behavior
> > by
> > Chuanqi Xu here:
> > 
> >  https://reviews.llvm.org/D134269
> > 
> > with the same flags (though using my old `trtbd` spelling for the
> > format name).
> > 
> > Thanks,
> > 
> > --Ben
> > 
> > Ben Boeckel (1):
> >    p1689r5: initial support
> > 
> >   gcc/ChangeLog   |   9 ++
> >   gcc/c-family/ChangeLog  |   6 +
> >   gcc/c-family/c-opts.cc  |  40 ++-
> >   gcc/c-family/c.opt  |  12 ++
> >   gcc/cp/ChangeLog    |   5 +
> >   gcc/cp/module.cc    |   3 +-
> >   gcc/doc/invoke.texi |  15 +++
> >   gcc/fortran/ChangeLog   |   5 +
> >   gcc/fortran/cpp.cc  |   4 +-
> >   gcc/genmatch.cc |   2 +-
> >   gcc/input.cc    |   4 +-
> >   libcpp/ChangeLog    |  11 ++
> >   libcpp/include/cpplib.h |  12 +-
> >   libcpp/include/mkdeps.h |  17 ++-
> >   libcpp/init.cc  |  14 ++-
> >   libcpp/mkdeps.cc    | 235
> > ++--
> >   16 files changed, 368 insertions(+), 26 deletions(-)
> > 
> > 
> > base-commit: d812e8cb2a920fd75768e16ca8ded59ad93c172f
> 



Re: Handling of main() function for freestanding

2022-10-13 Thread Jakub Jelinek via Gcc
On Thu, Oct 13, 2022 at 07:03:24PM +0200, Arsen Arsenović wrote:
> @@ -1,10 +1,10 @@
>  /* Check for "noreturn" warning in main. */
>  /* { dg-do compile } */
> -/* { dg-options "-O2 -Wmissing-noreturn -ffreestanding" } */
> +/* { dg-options "-O2 -Wmissing-noreturn" } */
>  extern void exit (int) __attribute__ ((__noreturn__));
>  
> -int
> -main (void) /* { dg-warning "function might be candidate for attribute 
> 'noreturn'" "warn for main" } */
> +void
> +f (void) /* { dg-warning "function might be candidate for attribute 
> 'noreturn'" "warn for main" } */
>  {
>exit (0);
>  }

Don't we have such a test already elsewhere?  If not, then certainly the
"warn for main" part should be removed or replaced...

Jakub



Re: Handling of main() function for freestanding

2022-10-13 Thread Jason Merrill via Gcc

On 10/13/22 13:02, Arsen Arsenović wrote:
> Hi,
>
> On Friday, 7 October 2022 15:51:31 CEST Jason Merrill wrote:
> > > * gcc.dg/noreturn-4.c: Likewise.
> >
> > I'd be inclined to drop this test.
>
> That seems like an odd choice, why do that over using another function
> for the test case? (there's nothing specific to main in this test, and
> it doesn't even need to link, so using any ol' function should be okay;
> see attachment)

It seemed to me that the test was specifically checking that main was
treated like any other function when freestanding.

> The attached patch is also v2 of the original builtin-main one submitted
> earlier.  Tested on x86_64-pc-linux-gnu.  This revision excludes the
> mentioned pedwarns unless hosted.

I was arguing that we don't need the new flag; there shouldn't be any
need to turn it off.


Jason



Re: Handling of main() function for freestanding

2022-10-13 Thread Arsen Arsenović via Gcc
On Thursday, 13 October 2022 19:10:10 CEST Jakub Jelinek wrote:
> Don't we have such a test already elsewhere?  If not, then certain
> "warn for main" part should be removed or replaced...

Whoops, missed that comment.  There is actually an equivalent test that 
I overlooked (noreturn-1.c), so maybe dropping is the right thing to do, 
indeed.

-- 
Arsen Arsenović



Re: Toolchain Infrastructure project statement of support

2022-10-13 Thread Christopher Faylor via Gcc
Re: https://sourceware.org/pipermail/overseers/2022q4/018981.html

On Wed, Oct 12, 2022 at 12:43:09PM -0400, Carlos O'Donell wrote:
>The GNU Toolchain project leadership supports the proposal[1] to move the
>services for the GNU Toolchain to the Linux Foundation IT under the auspices of
>the Toolchain Infrastructure project (GTI) with fiscal sponsorship from the
>OpenSSF and other major donors.

Noted, however, a list of signatories does not automatically confer
authority over any particular project.  Any participation from 
overseers in moving projects to different infrastructure will require
clear approval from the individual projects themselves.

Also, the FSF, being the existing fiscal sponsor to these projects,
surely needs to review the formal agreements before we sunset our
infrastructural offerings to glibc, gcc, binutils, and gdb and hand
control of the projects' infrastructure over to a different entity.

We'd like to assure the communities that, when and if any individual
project formally expresses the decision of their developers to transfer
their services, we'll endeavor to make the move as smooth as possible. 
Those projects that wish to stay will continue to receive the best
services that the overseers can offer, with the ongoing assistance of
Red Hat, the SFC, and, when relevant, the FSF tech team.



Fences/Barriers when mixing C++ atomics and non-atomics

2022-10-13 Thread Vineet Gupta

Hi,

I have a testcase (from real workloads) involving C++ atomics and trying 
to understand the codegen (gcc 12) for RVWMO and x86.
It does mix atomics with non-atomics so not obvious what the behavior is 
intended to be hence some explicit CC of subject matter experts 
(apologies for that in advance).


Test has a non-atomic store followed by an atomic_load(SEQ_CST). I 
assume that unadorned direct access defaults to safest/conservative seq_cst.


   extern int g;
   std::atomic a;

   int bar_noaccessor(int n, int *n2)
   {
    *n2 = g;
    return n + a;
   }

   int bar_seqcst(int n, int *n2)
   {
    *n2 = g;
    return n + a.load(std::memory_order_seq_cst);
   }

On RV (rvwmo), with current gcc 12 we get 2 full fences around the load 
as prescribed by Privileged Spec, Chapter A, Table A.6 (Mappings from 
C/C++ to RISC-V primitives).


   _Z10bar_seqcstiPi:
   .LFB382:
    .cfi_startproc
    lui    a5,%hi(g)
    lw    a5,%lo(g)(a5)
    sw    a5,0(a1)
   *fence    iorw,iorw*
    lui    a5,%hi(a)
    lw    a5,%lo(a)(a5)
   *fence    iorw,iorw*
    addw    a0,a5,a0
    ret


OTOH, for x86 (same default toggles) there's no barriers at all.

   _Z10bar_seqcstiPi:
    endbr64
    movl    g(%rip), %eax
    movl    %eax, (%rsi)
    movl    a(%rip), %eax
    addl    %edi, %eax
    ret


My naive intuition was that x86 TSO would require a fence before 
load(seq_cst) for a prior store, even if that store was non-atomic, to 
ensure the load didn't bubble up ahead of the store.


Perhaps this begs the general question of intermixing non atomic 
accesses with atomics and if that is undefined behavior or some such. I 
skimmed through C++14 specification chapter Atomic Operations library 
but nothing's jumping out on the topic.


Or is it much deeper, related to As-if rule or something.

Thx,
-Vineet


Re: Handling of main() function for freestanding

2022-10-13 Thread Arsen Arsenović via Gcc
On Thursday, 13 October 2022 19:24:41 CEST Jason Merrill wrote:
> I was arguing that we don't need the new flag; there shouldn't be any
> need to turn it off.
At the time, I opted to go with a more conservative route; I haven't 
been around enough to have very strong opinions ;)  I certainly can't 
think of a way always adding a return can go wrong, but figured someone, 
somehow, might rely on this behavior.  Removed the flag and tested on 
x86_64-pc-linux-gnu, v3 attached.

FWIW, there's precedent for treating main specially regardless of 
flag_hosted (e.g. it's always marked extern "C" in the C++ frontend, 
AFAICT).

-- 
Arsen Arsenović
From e60be6bb45fdba8085bde5d1883deeae640e786b Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Arsen=20Arsenovi=C4=87?= 
Date: Thu, 13 Oct 2022 21:46:30 +0200
Subject: [PATCH v3] c-family: Implicitly return zero from main even on
 freestanding

... unless marked noreturn.

This should not get in anyone's way, but should permit the use of main()
in freestanding more easily, especially for writing test cases that
should work both in freestanding and hosted modes.

gcc/c/ChangeLog:

	* c-decl.cc (finish_function): Ignore hosted when deciding
	whether to implicitly return zero, but check noreturn.
	* c-objc-common.cc (c_missing_noreturn_ok_p): Loosen the
	requirements to just MAIN_NAME_P.

gcc/cp/ChangeLog:

	* cp-tree.h (DECL_MAIN_FREESTANDING_P): Move most DECL_MAIN_P
	logic here, so that we can use it when not hosted.
	(DECL_MAIN_P): Implement in terms of DECL_MAIN_FREESTANDING_P.
	* decl.cc (finish_function): Use DECL_MAIN_FREESTANDING_P
	instead of DECL_MAIN_P, to lose the hosted requirement, but
	check noreturn.

gcc/testsuite/ChangeLog:

	* g++.dg/freestanding-main.C: New test.
	* gcc.dg/freestanding-main.c: New test.
---
 gcc/c/c-decl.cc  | 2 +-
 gcc/c/c-objc-common.cc   | 5 ++---
 gcc/cp/cp-tree.h | 8 +---
 gcc/cp/decl.cc   | 3 ++-
 gcc/testsuite/g++.dg/freestanding-main.C | 5 +
 gcc/testsuite/gcc.dg/freestanding-main.c | 5 +
 6 files changed, 20 insertions(+), 8 deletions(-)
 create mode 100644 gcc/testsuite/g++.dg/freestanding-main.C
 create mode 100644 gcc/testsuite/gcc.dg/freestanding-main.c

diff --git a/gcc/c/c-decl.cc b/gcc/c/c-decl.cc
index 193e268f04e..8c655590558 100644
--- a/gcc/c/c-decl.cc
+++ b/gcc/c/c-decl.cc
@@ -10442,7 +10442,7 @@ finish_function (location_t end_loc)
   if (DECL_RESULT (fndecl) && DECL_RESULT (fndecl) != error_mark_node)
 DECL_CONTEXT (DECL_RESULT (fndecl)) = fndecl;
 
-  if (MAIN_NAME_P (DECL_NAME (fndecl)) && flag_hosted
+  if (MAIN_NAME_P (DECL_NAME (fndecl)) && !TREE_THIS_VOLATILE (fndecl)
   && TYPE_MAIN_VARIANT (TREE_TYPE (TREE_TYPE (fndecl)))
   == integer_type_node && flag_isoc99)
 {
diff --git a/gcc/c/c-objc-common.cc b/gcc/c/c-objc-common.cc
index 70e10a98e33..2933414fd45 100644
--- a/gcc/c/c-objc-common.cc
+++ b/gcc/c/c-objc-common.cc
@@ -37,9 +37,8 @@ static bool c_tree_printer (pretty_printer *, text_info *, const char *,
 bool
 c_missing_noreturn_ok_p (tree decl)
 {
-  /* A missing noreturn is not ok for freestanding implementations and
- ok for the `main' function in hosted implementations.  */
-  return flag_hosted && MAIN_NAME_P (DECL_ASSEMBLER_NAME (decl));
+  /* A missing noreturn is ok for the `main' function.  */
+  return MAIN_NAME_P (DECL_ASSEMBLER_NAME (decl));
 }
 
 /* Called from check_global_declaration.  */
diff --git a/gcc/cp/cp-tree.h b/gcc/cp/cp-tree.h
index 3b67be651b9..4c7adfbffd8 100644
--- a/gcc/cp/cp-tree.h
+++ b/gcc/cp/cp-tree.h
@@ -772,11 +772,13 @@ typedef struct ptrmem_cst * ptrmem_cst_t;
 
 /* Returns nonzero iff NODE is a declaration for the global function
`main'.  */
-#define DECL_MAIN_P(NODE)\
+#define DECL_MAIN_FREESTANDING_P(NODE)			\
(DECL_EXTERN_C_FUNCTION_P (NODE)			\
 && DECL_NAME (NODE) != NULL_TREE			\
-&& MAIN_NAME_P (DECL_NAME (NODE))			\
-&& flag_hosted)
+&& MAIN_NAME_P (DECL_NAME (NODE)))
+
+/* Nonzero iff NODE is a declaration for `main', and we are hosted. */
+#define DECL_MAIN_P(NODE) (DECL_MAIN_FREESTANDING_P(NODE) && flag_hosted)
 
 /* Lookup walker marking.  */
 #define LOOKUP_SEEN_P(NODE) TREE_VISITED (NODE)
diff --git a/gcc/cp/decl.cc b/gcc/cp/decl.cc
index 82eb0c2f22a..cfc8cd5afd7 100644
--- a/gcc/cp/decl.cc
+++ b/gcc/cp/decl.cc
@@ -17854,7 +17854,8 @@ finish_function (bool inline_p)
   if (!DECL_CLONED_FUNCTION_P (fndecl))
 {
   /* Make it so that `main' always returns 0 by default.  */
-  if (DECL_MAIN_P (current_function_decl))
+  if (DECL_MAIN_FREESTANDING_P (current_function_decl)
+	  && !TREE_THIS_VOLATILE (current_function_decl))
 	finish_return_stmt (integer_zero_node);
 
   if (use_eh_spec_block (current_function_decl))
diff --git a/gcc/testsuite/g++.dg/freestanding-main.C b/gcc/testsuite/g++.dg/freestanding-main.C
new file mode 100644
index 000..3718cc4508e
--- /dev/null
+++ b/gcc/testsuite/g++

Re: Fences/Barriers when mixing C++ atomics and non-atomics

2022-10-13 Thread Jonathan Wakely via Gcc
On Thu, 13 Oct 2022 at 20:31, Vineet Gupta wrote:
>
> Hi,
>
> I have a testcase (from real workloads) involving C++ atomics and trying
> to understand the codegen (gcc 12) for RVWMO and x86.
> It does mix atomics with non-atomics so not obvious what the behavior is
> intended to be hence some explicit CC of subject matter experts
> (apologies for that in advance).
>
> Test has a non-atomic store

And a non-atomic load of 'g'

> followed by an atomic_load(SEQ_CST). I
> assume that unadorned direct access defaults to safest/conservative seq_cst.

Yes, the two functions below are identical.

>
> extern int g;
> std::atomic a;
>
> int bar_noaccessor(int n, int *n2)
> {
>  *n2 = g;
>  return n + a;
> }
>
> int bar_seqcst(int n, int *n2)
> {
>  *n2 = g;
>  return n + a.load(std::memory_order_seq_cst);
> }
>
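
A self-contained version of the two functions, as a sketch: the element type
`int` for the atomic and the definition of `g` are assumptions added so the
snippet compiles on its own (the quoted testcase does not show them).

```cpp
#include <atomic>

int g = 0;               // defined here only so the sketch links by itself
std::atomic<int> a{0};   // element type `int` is an assumption

int bar_noaccessor(int n, int *n2)
{
    *n2 = g;
    return n + a;        // implicit conversion is a.load() with the seq_cst default
}

int bar_seqcst(int n, int *n2)
{
    *n2 = g;
    return n + a.load(std::memory_order_seq_cst);
}
```

The conversion operator of std::atomic is specified as equivalent to load()
with its default memory order, which is memory_order_seq_cst, so the two
bodies request the same ordering.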


Re: Fences/Barriers when mixing C++ atomics and non-atomics

2022-10-13 Thread Uros Bizjak via Gcc
On Thu, Oct 13, 2022 at 9:31 PM Vineet Gupta  wrote:
>
> Hi,
>
> I have a testcase (from real workloads) involving C++ atomics and trying
> to understand the codegen (gcc 12) for RVWMO and x86.
> It does mix atomics with non-atomics so not obvious what the behavior is
> intended to be hence some explicit CC of subject matter experts
> (apologies for that in advance).
>
> Test has a non-atomic store followed by an atomic_load(SEQ_CST). I
> assume that unadorned direct access defaults to safest/conservative seq_cst.
>
> extern int g;
> std::atomic a;
>
> int bar_noaccessor(int n, int *n2)
> {
>  *n2 = g;
>  return n + a;
> }
>
> int bar_seqcst(int n, int *n2)
> {
>  *n2 = g;
>  return n + a.load(std::memory_order_seq_cst);
> }
>
> On RV (rvwmo), with current gcc 12 we get 2 full fences around the load
> as prescribed by Privileged Spec, Chapter A, Table A.6 (Mappings from
> C/C++ to RISC-V primitives).
>
> _Z10bar_seqcstiPi:
> .LFB382:
>  .cfi_startproc
>  lui    a5,%hi(g)
>  lw    a5,%lo(g)(a5)
>  sw    a5,0(a1)
> *fence    iorw,iorw*
>  lui    a5,%hi(a)
>  lw    a5,%lo(a)(a5)
> *fence    iorw,iorw*
>  addw    a0,a5,a0
>  ret
>
>
> OTOH, for x86 (same default toggles) there's no barriers at all.
>
> _Z10bar_seqcstiPi:
>  endbr64
>  movl    g(%rip), %eax
>  movl    %eax, (%rsi)
>  movl    a(%rip), %eax
>  addl    %edi, %eax
>  ret
>

Regarding x86 memory model, please see Intel® 64 and IA-32 Architectures
Software Developer’s Manual, Volume 3A, section 8.2 [1]

[1] 
https://www.intel.com/content/www/us/en/developer/articles/technical/intel-sdm.html

> My naive intuition was x86 TSO would require a fence before
> load(seq_cst) for a prior store, even if that store was non atomic, so
> ensure load didn't bubble up ahead of store.

As documented in the SDM above, the x86 memory model guarantees that

• Reads are not reordered with other reads.
• Writes are not reordered with older reads.
• Writes to memory are not reordered with other writes, with the
following exceptions:
...
• Reads may be reordered with older writes to different locations but
not with older writes to the same location.
...

Uros.

> Perhaps this begs the general question of intermixing non atomic
> accesses with atomics and if that is undefined behavior or some such. I
> skimmed through C++14 specification chapter Atomic Operations library
> but nothing's jumping out on the topic.
>
> Or is it much deeper, related to As-if rule or something.
>
> Thx,
> -Vineet


Re: Fences/Barriers when mixing C++ atomics and non-atomics

2022-10-13 Thread Hans Boehm via Gcc
The generated code here is correct in both cases. In the RISC-V case, I
believe it is conservative, at a minimum, in that atomics should not imply
IO ordering. We had an earlier discussion, which seemed to have consensus
in favor of that opinion. I believe clang does not enforce IO ordering.

You can think of a "sequentially consistent" load roughly as enforcing two
properties:

1) It behaves as an "acquire" load. Later (in program order) memory
operations do not advance past it. This is implicit for x86. It requires
the trailing fence on RISC-V, which could probably be weakened to r,rw.

2) It ensures that seq_cst operations are fully ordered. This means that,
in addition to (1), and the corresponding fence for stores, every seq_cst
store must be separated from a seq_cst load by at least a w,r fence, so a
seq_cst store followed by a seq_cst load is not reordered. w,r fences are
discouraged on RISC-V, and probably no better than rw,rw, so that's how the
leading fence got there. (Again the io ordering should disappear. It's the
responsibility of IO code to insert that explicitly, rather than paying for
it everywhere.)

x86 does (2) by associating that fence with stores instead of loads, either
by using explicit fences after stores, or by turning stores into xchg.
RISC-V could do the same. And I believe that if the current A extension
were the final word on the architecture, it should. But that convention is
not compatible with the later introduction of an "acquire load", which I
think is essential for performance, at least on larger cores. So I think
the two fence mapping for loads should be maintained for now, as I
suggested in the document I posted to the list.
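
The two conventions can be spelled out at source level as a rough sketch.
The assumptions are mine: std::atomic_thread_fence stands in for the
hardware fences, the function names are made up, and this is an analogy for
where the fences sit, not what GCC emits and not a claim of exact
equivalence to seq_cst operations in the C++ memory model.

```cpp
#include <atomic>

std::atomic<int> a{0};  // hypothetical shared variable

// Load-side convention (the current RISC-V mapping): both fences are
// attached to the load.
int load_with_fences()
{
    // Leading full fence: earlier stores are ordered before the load.
    std::atomic_thread_fence(std::memory_order_seq_cst);
    int v = a.load(std::memory_order_relaxed);
    // Trailing fence: later accesses stay after the load (acquire).
    std::atomic_thread_fence(std::memory_order_acquire);
    return v;
}

// Store-side convention (the x86 style): the full fence follows the
// store, so a later load needs no leading fence of its own.
void store_with_fence(int v)
{
    std::atomic_thread_fence(std::memory_order_release);
    a.store(v, std::memory_order_relaxed);
    std::atomic_thread_fence(std::memory_order_seq_cst);
}
```

Either placement puts a full fence between a seq_cst store and a later
seq_cst load; the difference is which side pays for it.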

Hans



Re: Fences/Barriers when mixing C++ atomics and non-atomics

2022-10-13 Thread Vineet Gupta

Hi Hans,

On 10/13/22 13:54, Hans Boehm wrote:
> The generated code here is correct in both cases. In the RISC-V case,
> I believe it is conservative, at a minimum, in that atomics should not
> imply IO ordering. We had an earlier discussion, which seemed to have
> consensus in favor of that opinion. I believe clang does not enforce
> IO ordering.
>
> You can think of a "sequentially consistent" load roughly as enforcing
> two properties:
>
> 1) It behaves as an "acquire" load. Later (in program order) memory
> operations do not advance past it. This is implicit for x86. It
> requires the trailing fence on RISC-V, which could probably be
> weakened to r,rw.

Acq implies later things won't leak out, but prior things could still
leak-in, meaning prior write could happen after load which contradicts
what user is asking by load(seq_cst) on x86?

> 2) It ensures that seq_cst operations are fully ordered. This means
> that, in addition to (1), and the corresponding fence for stores,
> every seq_cst store must be separated from a seq_cst load by at least
> a w,r fence, so a seq_cst store followed by a seq_cst load is not
> reordered.

This makes sense when both store -> load are seq_cst.
But the question is what happens when that store is non atomic. IOW if
we had a store(relaxed) -> load(seq_cst) would the generated code still
ensure that load had a full barrier to prevent

> w,r fences are discouraged on RISC-V, and probably no better than
> rw,rw, so that's how the leading fence got there. (Again the io
> ordering should disappear. It's the responsibility of IO code to
> insert that explicitly, rather than paying for it everywhere.)

Thanks for explaining the RV semantics.

> x86 does (2) by associating that fence with stores instead of loads,
> either by using explicit fences after stores, or by turning stores
> into xchg.

That makes sense as x86 has ld->ld and ld -> st architecturally ordered,
so any fences ought to be associated with st.

Thx,
-Vineet

> RISC-V could do the same. And I believe that if the current A
> extension were the final word on the architecture, it should. But that
> convention is not compatible with the later introduction of an
> "acquire load", which I think is essential for performance, at least
> on larger cores. So I think the two fence mapping for loads should be
> maintained for now, as I suggested in the document I posted to the list.
>
> Hans




Re: Fences/Barriers when mixing C++ atomics and non-atomics

2022-10-13 Thread Vineet Gupta




On 10/13/22 13:30, Uros Bizjak wrote:
> > OTOH, for x86 (same default toggles) there's no barriers at all.
> >
> > _Z10bar_seqcstiPi:
> >  endbr64
> >  movl    g(%rip), %eax
> >  movl    %eax, (%rsi)
> >  movl    a(%rip), %eax
> >  addl    %edi, %eax
> >  ret
>
> Regarding x86 memory model, please see Intel® 64 and IA-32 Architectures
> Software Developer’s Manual, Volume 3A, section 8.2 [1]
>
> [1] https://www.intel.com/content/www/us/en/developer/articles/technical/intel-sdm.html
>
> > My naive intuition was x86 TSO would require a fence before
> > load(seq_cst) for a prior store, even if that store was non atomic, so
> > ensure load didn't bubble up ahead of store.
>
> As documented in the SDM above, the x86 memory model guarantees that
>
> • Reads are not reordered with other reads.
> • Writes are not reordered with older reads.
> • Writes to memory are not reordered with other writes, with the
> following exceptions:
> ...
> • Reads may be reordered with older writes to different locations but
> not with older writes to the same location.

So my example is the last case where older write is followed by read to 
different location and thus potentially could be reordered.
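
The store-then-load case is usually illustrated with the classic
store-buffering litmus test.  The sketch below is an addition for
illustration, with a made-up helper name (count_both_zero); it uses seq_cst
for both the stores and the loads, which is the only combination for which
the "both threads read 0" outcome is ruled out by the language.

```cpp
#include <atomic>
#include <thread>

// Runs the store-buffering litmus test `iterations` times and returns how
// often both threads read 0, i.e. each load was ordered before the other
// thread's store.
int count_both_zero(int iterations)
{
    int hits = 0;
    for (int i = 0; i < iterations; ++i) {
        std::atomic<int> x{0}, y{0};
        int r1 = -1, r2 = -1;
        std::thread t1([&] {
            x.store(1, std::memory_order_seq_cst);
            r1 = y.load(std::memory_order_seq_cst);
        });
        std::thread t2([&] {
            y.store(1, std::memory_order_seq_cst);
            r2 = x.load(std::memory_order_seq_cst);
        });
        t1.join();
        t2.join();
        // r1 and r2 are read only after both joins, so there is no race.
        if (r1 == 0 && r2 == 0)
            ++hits;
    }
    return hits;
}
```

With every access seq_cst the count stays at zero.  The guarantee covers
seq_cst stores paired with seq_cst loads only; it says nothing about a plain
or relaxed store followed by a seq_cst load, which is the situation in the
testcase, and which fits Hans's description of x86 attaching the fence to
seq_cst stores rather than to loads.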


Re: Handling of main() function for freestanding

2022-10-13 Thread Jason Merrill via Gcc

On 10/13/22 16:14, Arsen Arsenović wrote:
> On Thursday, 13 October 2022 19:24:41 CEST Jason Merrill wrote:
> > I was arguing that we don't need the new flag; there shouldn't be any
> > need to turn it off.
>
> At the time, I opted to go with a more conservative route; I haven't
> been around enough to have very strong opinions ;)  I certainly can't
> think of a way always adding a return can go wrong, but figured someone,
> somehow, might rely on this behavior.  Removed the flag and tested on
> x86_64-pc-linux-gnu, v3 attached.

Thanks!

> FWIW, there's precedent for treating main specially regardless of
> flag_hosted (e.g. it's always marked extern "C" in the C++ frontend,
> AFAICT).
>
> -#define DECL_MAIN_P(NODE)  \
> +#define DECL_MAIN_FREESTANDING_P(NODE) \
> (DECL_EXTERN_C_FUNCTION_P (NODE)\
>  && DECL_NAME (NODE) != NULL_TREE   \
> -&& MAIN_NAME_P (DECL_NAME (NODE))  \
> -&& flag_hosted)
> +&& MAIN_NAME_P (DECL_NAME (NODE)))
> +
> +/* Nonzero iff NODE is a declaration for `main', and we are hosted. */
> +#define DECL_MAIN_P(NODE) (DECL_MAIN_FREESTANDING_P(NODE) && flag_hosted)

I liked in the previous version that you checked the return type of main 
when !flag_hosted, here and in c_missing_noreturn_ok_p.  Let's bring 
that back.


Jason



Re: Fences/Barriers when mixing C++ atomics and non-atomics

2022-10-13 Thread Uros Bizjak via Gcc
On Thu, Oct 13, 2022 at 11:14 PM Vineet Gupta  wrote:
>
>
>
> On 10/13/22 13:30, Uros Bizjak wrote:
>
> OTOH, for x86 (same default toggles) there's no barriers at all.
>
> _Z10bar_seqcstiPi:
>  endbr64
>  movl    g(%rip), %eax
>  movl    %eax, (%rsi)
>  movl    a(%rip), %eax
>  addl    %edi, %eax
>  ret
>
> Regarding x86 memory model, please see Intel® 64 and IA-32 Architectures
> Software Developer’s Manual, Volume 3A, section 8.2 [1]
>
> [1] 
> https://www.intel.com/content/www/us/en/developer/articles/technical/intel-sdm.html
>
> My naive intuition was x86 TSO would require a fence before
> load(seq_cst) for a prior store, even if that store was non atomic, so
> ensure load didn't bubble up ahead of store.
>
> As documented in the SDM above, the x86 memory model guarantees that
>
> • Reads are not reordered with other reads.
> • Writes are not reordered with older reads.
> • Writes to memory are not reordered with other writes, with the
> following exceptions:
> ...
> • Reads may be reordered with older writes to different locations but
> not with older writes to the same location.
>
>
> So my example is the last case where older write is followed by read to 
> different location and thus potentially could be reordered.

Yes, but can this reordering be observed under the above conditions?
There is an additional rule, where:

In the case of I/O operations, both reads and writes always appear in
programmed order.

Uros.


Re: Fences/Barriers when mixing C++ atomics and non-atomics

2022-10-13 Thread Hans Boehm via Gcc
On Thu, Oct 13, 2022 at 2:11 PM Vineet Gupta  wrote:

> Hi Hans,
>
> On 10/13/22 13:54, Hans Boehm wrote:
>
> The generated code here is correct in both cases. In the RISC-V case, I
> believe it is conservative, at a minimum, in that atomics should not imply
> IO ordering. We had an earlier discussion, which seemed to have consensus
> in favor of that opinion. I believe clang does not enforce IO ordering.
>
> You can think of a "sequentially consistent" load roughly as enforcing two
> properties:
>
> 1) It behaves as an "acquire" load. Later (in program order) memory
> operations do not advance past it. This is implicit for x86. It requires
> the trailing fence on RISC-V, which could probably be weakened to r,rw.
>
>
> Acq implies later things won't leak out, but prior things could still
> leak-in, meaning prior write could happen after load which contradicts what
> user is asking by load(seq_cst) on x86 ?
>
Agreed.

>
> 2) It ensures that seq_cst operations are fully ordered. This means that,
> in addition to (1), and the corresponding fence for stores, every seq_cst
> store must be separated from a seq_cst load by at least a w,r fence, so a
> seq_cst store followed by a seq_cst load is not reordered.
>
>
> This makes sense when both store -> load are seq_cst.
> But the question is what happens when that store is non atomic. IOW if we
> had a store(relaxed) -> load(seq_cst) would the generated code still ensure
> that load had a full barrier to prevent
>
That reordering is not observable in conforming C or C++ code. To observe
that reordering, another thread would have to concurrently load from the
same location as the non-atomic store. That's a data race and undefined
behavior, at least in C and C++.
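
As an illustration of when a store/load reordering does become observable
without a data race, here is a minimal sketch of the classic
store-buffering test with all four accesses made seq_cst. The names x, y,
r1, r2 and run_store_buffering_once are made up for this example and are
not part of the testcase in this thread. With seq_cst on both stores and
both loads, the C++ memory model forbids the outcome r1 == 0 && r2 == 0.

```cpp
#include <atomic>
#include <thread>

// Hypothetical shared variables for the litmus test.
std::atomic<int> x{0}, y{0};
int r1 = -1, r2 = -1;  // written by one thread each, read only after join()

// Runs one round; returns true if the outcome is allowed under seq_cst,
// i.e. if the two loads did NOT both observe 0.
bool run_store_buffering_once() {
    x.store(0);
    y.store(0);
    r1 = r2 = -1;

    std::thread t1([] {
        x.store(1, std::memory_order_seq_cst);
        r1 = y.load(std::memory_order_seq_cst);
    });
    std::thread t2([] {
        y.store(1, std::memory_order_seq_cst);
        r2 = x.load(std::memory_order_seq_cst);
    });
    t1.join();
    t2.join();

    return !(r1 == 0 && r2 == 0);
}
```

Both threads touch x and y only through atomic operations, so there is no
data race and the result is well defined.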

Perhaps more importantly here, if the earlier store is a relaxed store,
then the relaxed store is not ordered with respect to a subsequent seq_cst
load, just as it would not be ordered by a subsequent critical section.
You can think of C++ seq_cst as being roughly the minimal ordering to
guarantee that if you only use locks and seq_cst atomics (and avoid data
races as required), everything looks sequentially consistent.
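
A sketch of what that means in practice, assuming the earlier store is
relaxed and the store -> load ordering is actually wanted: the seq_cst
load alone does not provide it, but an explicit seq_cst fence between the
store and the load, in both threads, does. The names p, q, seen_p, seen_q
and run_fenced_once are hypothetical.

```cpp
#include <atomic>
#include <thread>

// Hypothetical shared variables.
std::atomic<int> p{0}, q{0};
int seen_p = -1, seen_q = -1;  // read only after both threads are joined

// Returns true if the two loads did NOT both observe 0. With a seq_cst
// fence between the relaxed store and the relaxed load in each thread,
// the "both 0" outcome is forbidden by the C++ memory model.
bool run_fenced_once() {
    p.store(0);
    q.store(0);
    seen_p = seen_q = -1;

    std::thread t1([] {
        p.store(1, std::memory_order_relaxed);
        std::atomic_thread_fence(std::memory_order_seq_cst);
        seen_q = q.load(std::memory_order_relaxed);
    });
    std::thread t2([] {
        q.store(1, std::memory_order_relaxed);
        std::atomic_thread_fence(std::memory_order_seq_cst);
        seen_p = p.load(std::memory_order_relaxed);
    });
    t1.join();
    t2.join();

    return !(seen_p == 0 && seen_q == 0);
}
```

Without the two fences (relaxed stores followed by seq_cst loads), the
language permits both loads to return 0.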

I think the Linux kernel has made some different decisions here that give
atomics stronger ordering properties than lock-based critical sections.

>
> w,r fences are discouraged on RISC-V, and probably no better than rw,rw,
> so that's how the leading fence got there. (Again the io ordering should
> disappear. It's the responsibility of IO code to insert that explicitly,
> rather than paying for it everywhere.)
>
>
> Thanks for explaining the RV semantics.
>
>
> x86 does (2) by associating that fence with stores instead of loads,
> either by using explicit fences after stores, or by turning stores into
> xchg.
>
>
> That makes sense as x86 has ld->ld and ld -> st architecturally ordered,
> so any fences ought to be associated with st.
>
It also guarantees st->st and ld->st. The decision is arbitrary, except
that we believe that there will be fewer stores than loads that need those
fences.
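
To make the store-side convention concrete, a small sketch (the variable
flag and the three function names are made up for illustration). Per the
description above, on x86 the seq_cst store is where the extra cost is
paid (an xchg, or a store followed by a fence), while the seq_cst load is
expected to be an ordinary load. Exact instruction selection depends on
the compiler and version, so treat the comments as an assumption to check
against real output rather than a statement about any particular GCC.

```cpp
#include <atomic>

std::atomic<int> flag{0};  // hypothetical variable

// seq_cst store: on x86 this is where the fence (or xchg) is attached.
void publish(int v) {
    flag.store(v, std::memory_order_seq_cst);
}

// seq_cst load: on x86 this is expected to need no extra fence.
int observe() {
    return flag.load(std::memory_order_seq_cst);
}

// Unadorned access: the implicit conversion is equivalent to load(),
// whose default memory order is memory_order_seq_cst.
int observe_implicit() {
    return flag;
}
```

Compiling this with -O2 -S and comparing publish() against observe() is a
quick way to see which side of the pair carries the barrier on a given
target.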

>
> Thx,
> -Vineet
>
> RISC-V could do the same. And I believe that if the current A extension
> were the final word on the architecture, it should. But that convention is
> not compatible with the later introduction of an "acquire load", which I
> think is essential for performance, at least on larger cores. So I think
> the two fence mapping for loads should be maintained for now, as I
> suggested in the document I posted to the list.
>
> Hans
>
> On Thu, Oct 13, 2022 at 12:31 PM Vineet Gupta 
> wrote:
>
>> Hi,
>>
>> I have a testcase (from real workloads) involving C++ atomics and trying
>> to understand the codegen (gcc 12) for RVWMO and x86.
>> It does mix atomics with non-atomics so not obvious what the behavior is
>> intended to be hence some explicit CC of subject matter experts
>> (apologies for that in advance).
>>
>> Test has a non-atomic store followed by an atomic_load(SEQ_CST). I
>> assume that unadorned direct access defaults to safest/conservative
>> seq_cst.
>>
>> extern int g;
>> std::atomic<int> a;
>>
>> int bar_noaccessor(int n, int *n2)
>> {
>>  *n2 = g;
>>  return n + a;
>> }
>>
>> int bar_seqcst(int n, int *n2)
>> {
>>  *n2 = g;
>>  return n + a.load(std::memory_order_seq_cst);
>> }
>>
>> On RV (rvwmo), with current gcc 12 we get 2 full fences around the load
>> as prescribed by Privileged Spec, Chapter A, Table A.6 (Mappings from
>> C/C++ to RISC-V primitives).
>>
>> _Z10bar_seqcstiPi:
>> .LFB382:
>>  .cfi_startproc
>>  lui     a5,%hi(g)
>>  lw      a5,%lo(g)(a5)
>>  sw      a5,0(a1)
>> *fence   iorw,iorw*
>>  lui     a5,%hi(a)
>>  lw      a5,%lo(a)(a5)
>> *fence   iorw,iorw*
>>  addw    a0,a5,a0
>>  ret
>>
>>
>> OTOH, for x86 (same default toggles) there's no barriers at all.
>>
>> _Z10bar_seqcstiPi:
>>  endbr64
>>  movl

gcc-10-20221013 is now available

2022-10-13 Thread GCC Administrator via Gcc
Snapshot gcc-10-20221013 is now available on
  https://gcc.gnu.org/pub/gcc/snapshots/10-20221013/
and on various mirrors, see http://gcc.gnu.org/mirrors.html for details.

This snapshot has been generated from the GCC 10 git branch
with the following options: git://gcc.gnu.org/git/gcc.git branch 
releases/gcc-10 revision 21f6023b3b4acc620914e399a1285ca6df65e2d3

You'll find:

 gcc-10-20221013.tar.xz   Complete GCC

  SHA256=410a3282bcdd5ea14542f3ac961b9d1ed15ef746f7a987aa322f1fd573caba70
  SHA1=9f659080374a34a36ab349f79c61e330d0fb68e3

Diffs from 10-20221006 are available in the diffs/ subdirectory.

When a particular snapshot is ready for public consumption the LATEST-10
link is updated and a message is sent to the gcc list.  Please do not use
a snapshot before it has been announced that way.