On 10/23/20 1:47 PM, Martin Liška wrote:
Hey.

Hello.

I deferred the patch for GCC 12. Since the time, I messed up with options
I feel more familiar with the option handling. So ...


This is a follow-up of the discussion that happened in thread about 
no_stack_protector
attribute: https://gcc.gnu.org/pipermail/gcc-patches/2020-May/545916.html

The current optimize attribute works in the following way:
- 1) we take current global_options as base
- 2) maybe_default_options is called for the currently selected optimization 
level, which
      means all rules in default_options_table are executed
- 3) attribute values are applied (via decode_options)

So the step 2) is problematic: in case of -O2 -fno-omit-frame-pointer and 
__attribute__((optimize("-fno-stack-protector")))
ends basically with -O2 -fno-stack-protector because -fno-omit-frame-pointer is 
default:
     /* -O1 and -Og optimizations.  */
     { OPT_LEVELS_1_PLUS, OPT_fomit_frame_pointer, NULL, 1 },

My patch handled and the current optimize attribute really behaves that same as 
appending attribute value
to the command line. So far so good. We should also reflect that in 
documentation entry which is quite
vague right now:

^^^ all these are still valid arguments, plus I'm adding a new test-case that 
tests that.


"""
The optimize attribute is used to specify that a function is to be compiled 
with different optimization options than specified on the command line.
"""

I addressed that with documentation changes, should be more clear to users. 
Moreover, I noticed that we declare 'optimize' attribute
as something not for a production use:

"The optimize attribute should be used for debugging purposes only. It is not 
suitable in production code."

Are we sure about the statement? I know that e.g. glibc uses that.


and we may want to handle -Ox in the attribute in a special way. I guess many 
macro/pragma users expect that

-O2 -ftree-vectorize and __attribute__((optimize(1))) will end with -O1 and not
with -ftree-vectorize -O1 ?

The situation with 'target' attribute is different. When parsing the attribute, 
we intentionally drop all existing target flags:

$ cat -n gcc/config/i386/i386-options.c
...
  1245                if (opt == IX86_FUNCTION_SPECIFIC_ARCH)
  1246                  {
  1247                    /* If arch= is set,  clear all bits in 
x_ix86_isa_flags,
  1248                       except for ISA_64BIT, ABI_64, ABI_X32, and CODE16
  1249                       and all bits in x_ix86_isa_flags2.  */
  1250                    opts->x_ix86_isa_flags &= (OPTION_MASK_ISA_64BIT
  1251                                               | OPTION_MASK_ABI_64
  1252                                               | OPTION_MASK_ABI_X32
  1253                                               | OPTION_MASK_CODE16);
  1254                    opts->x_ix86_isa_flags_explicit &= 
(OPTION_MASK_ISA_64BIT
  1255                                                        | 
OPTION_MASK_ABI_64
  1256                                                        | 
OPTION_MASK_ABI_X32
  1257                                                        | 
OPTION_MASK_CODE16);
  1258                    opts->x_ix86_isa_flags2 = 0;
  1259                    opts->x_ix86_isa_flags2_explicit = 0;
  1260                  }

That seems logical because target attribute is used for e.g. ifunc 
multi-versioning and one needs
to be sure all existing ISA flags are dropped. However, I noticed clang behaves 
differently:

$ cat hreset.c
#pragma GCC target "arch=geode"
#include <immintrin.h>
void foo(unsigned int eax)
{
  _hreset (eax);
}

$ clang hreset.c -mhreset  -c -O2 -m32
$ gcc hreset.c -mhreset  -c -O2 -m32
In file included from 
/home/marxin/bin/gcc/lib64/gcc/x86_64-pc-linux-gnu/12.0.0/include/x86gprintrin.h:97,
                 from 
/home/marxin/bin/gcc/lib64/gcc/x86_64-pc-linux-gnu/12.0.0/include/immintrin.h:27,
                 from hreset.c:2:
hreset.c: In function ‘foo’:
/home/marxin/bin/gcc/lib64/gcc/x86_64-pc-linux-gnu/12.0.0/include/hresetintrin.h:39:1:
 error: inlining failed in call to ‘always_inline’ ‘_hreset’: target specific 
option mismatch
   39 | _hreset (unsigned int __EAX)
      | ^~~~~~~
hreset.c:5:3: note: called from here
    5 |   _hreset (eax);
      |   ^~~~~~~~~~~~~

Anyway, I think the current target attribute handling should be preserved.

Patch can bootstrap on x86_64-linux-gnu and survives regression tests.

Ready to be installed?
Thanks,
Martin


I'm also planning to take a look at the target macro/attribute, I expect 
similar problems:
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97469

Thoughts?
Thanks,
Martin

>From eaa1892cbe32c6fe73de7708aa17be2d3917bceb Mon Sep 17 00:00:00 2001
From: Martin Liska <mli...@suse.cz>
Date: Wed, 2 Jun 2021 08:44:37 +0200
Subject: [PATCH] Append target/optimize attr to the current cmdline.

gcc/c-family/ChangeLog:

	* c-common.c (parse_optimize_options): Combine optimize
	options with what was provided on the command line.

gcc/ChangeLog:

	* toplev.c (toplev::main): Save decoded optimization options.
	* toplev.h (save_opt_decoded_options): New.
	* doc/extend.texi: Be more clear about optimize and target
	attributes.

gcc/testsuite/ChangeLog:

	* gcc.target/i386/avx512er-vrsqrt28ps-3.c: Disable fast math.
	* gcc.target/i386/avx512er-vrsqrt28ps-5.c: Likewise.
	* gcc.target/i386/attr-optimize.c: New test.
---
 gcc/c-family/c-common.c                       | 17 +++++++++++--
 gcc/doc/extend.texi                           |  8 +++++--
 gcc/testsuite/gcc.target/i386/attr-optimize.c | 24 +++++++++++++++++++
 .../gcc.target/i386/avx512er-vrsqrt28ps-3.c   |  2 +-
 .../gcc.target/i386/avx512er-vrsqrt28ps-5.c   |  2 +-
 gcc/toplev.c                                  |  8 +++++++
 gcc/toplev.h                                  |  1 +
 7 files changed, 56 insertions(+), 6 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/i386/attr-optimize.c

diff --git a/gcc/c-family/c-common.c b/gcc/c-family/c-common.c
index 681fcc972f4..f4e56653585 100644
--- a/gcc/c-family/c-common.c
+++ b/gcc/c-family/c-common.c
@@ -5901,9 +5901,22 @@ parse_optimize_options (tree args, bool attr_p)
       j++;
     }
   decoded_options_count = j;
-  /* And apply them.  */
+
+  /* Merge the decoded options with save_decoded_options.  */
+  unsigned save_opt_count = save_opt_decoded_options.length ();
+  unsigned merged_decoded_options_count
+    = save_opt_count + decoded_options_count;
+  cl_decoded_option *merged_decoded_options
+    = XNEWVEC (cl_decoded_option, merged_decoded_options_count);
+
+  for (unsigned i = 0; i < save_opt_count; ++i)
+    merged_decoded_options[i] = save_opt_decoded_options[i];
+  for (unsigned i = 0; i < decoded_options_count; ++i)
+    merged_decoded_options[save_opt_count + i] = decoded_options[i];
+
+   /* And apply them.  */
   decode_options (&global_options, &global_options_set,
-		  decoded_options, decoded_options_count,
+		  merged_decoded_options, merged_decoded_options_count,
 		  input_location, global_dc, NULL);
   free (decoded_options);
 
diff --git a/gcc/doc/extend.texi b/gcc/doc/extend.texi
index 8fc66d626d8..3b501a8be3a 100644
--- a/gcc/doc/extend.texi
+++ b/gcc/doc/extend.texi
@@ -3593,7 +3593,10 @@ take function pointer arguments.
 @cindex @code{optimize} function attribute
 The @code{optimize} attribute is used to specify that a function is to
 be compiled with different optimization options than specified on the
-command line.  Valid arguments are constant non-negative integers and
+command line.  The optimize attribute arguments of a function behave
+as if they were added to the command line options.
+
+Valid arguments are constant non-negative integers and
 strings.  Each numeric argument specifies an optimization @var{level}.
 Each @var{string} argument consists of one or more comma-separated
 substrings.  Each substring that begins with the letter @code{O} refers
@@ -3797,7 +3800,8 @@ This attribute prevents stack protection code for the function.
 Multiple target back ends implement the @code{target} attribute
 to specify that a function is to
 be compiled with different target options than specified on the
-command line.  One or more strings can be provided as arguments.
+command line.  The original target command line options are ignored.
+One or more strings can be provided as arguments.
 Each string consists of one or more comma-separated suffixes to
 the @code{-m} prefix jointly forming the name of a machine-dependent
 option.  @xref{Submodel Options,,Machine-Dependent Options}.
diff --git a/gcc/testsuite/gcc.target/i386/attr-optimize.c b/gcc/testsuite/gcc.target/i386/attr-optimize.c
new file mode 100644
index 00000000000..f5db028f1fd
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/attr-optimize.c
@@ -0,0 +1,24 @@
+/* { dg-do compile } */
+/* { dg-additional-options "-O1 -ftree-slp-vectorize -march=znver1 -fdump-tree-optimized" } */
+
+/* Use -O2, but -ftree-slp-vectorize option should be preserved and used.  */
+#pragma GCC optimize "-O2"
+
+typedef struct {
+  long n[4];
+} secp256k1_fe;
+
+void *a;
+int c;
+static void
+fn1(secp256k1_fe *p1, int p2)
+{
+  p1->n[0] = p1->n[1] = p2;
+}
+void
+fn2()
+{
+  fn1(a, !c);
+}
+
+/* { dg-final { scan-tree-dump { MEM <vector\(2\) long int> \[[^]]*\] = } "optimized" } } */
diff --git a/gcc/testsuite/gcc.target/i386/avx512er-vrsqrt28ps-3.c b/gcc/testsuite/gcc.target/i386/avx512er-vrsqrt28ps-3.c
index 1ba8172d6e3..40aefb50844 100644
--- a/gcc/testsuite/gcc.target/i386/avx512er-vrsqrt28ps-3.c
+++ b/gcc/testsuite/gcc.target/i386/avx512er-vrsqrt28ps-3.c
@@ -8,7 +8,7 @@
 #define MAX 1000
 #define EPS 0.00001
 
-__attribute__ ((noinline, optimize (1)))
+__attribute__ ((noinline, optimize (1, "-fno-fast-math")))
 void static
 compute_rsqrt_ref (float *a, float *r)
 {
diff --git a/gcc/testsuite/gcc.target/i386/avx512er-vrsqrt28ps-5.c b/gcc/testsuite/gcc.target/i386/avx512er-vrsqrt28ps-5.c
index e067a81e562..498f4d50aa5 100644
--- a/gcc/testsuite/gcc.target/i386/avx512er-vrsqrt28ps-5.c
+++ b/gcc/testsuite/gcc.target/i386/avx512er-vrsqrt28ps-5.c
@@ -8,7 +8,7 @@
 #define MAX 1000
 #define EPS 0.00001
 
-__attribute__ ((noinline, optimize (1)))
+__attribute__ ((noinline, optimize (1, "-fno-fast-math")))
 void static
 compute_sqrt_ref (float *a, float *r)
 {
diff --git a/gcc/toplev.c b/gcc/toplev.c
index 43f1f7d345e..3e26ad97ff0 100644
--- a/gcc/toplev.c
+++ b/gcc/toplev.c
@@ -121,6 +121,9 @@ static bool no_backend;
 struct cl_decoded_option *save_decoded_options;
 unsigned int save_decoded_options_count;
 
+/* Vector of saved Optimization decoded command line options.  */
+auto_vec<cl_decoded_option> save_opt_decoded_options;
+
 /* Used to enable -fvar-tracking, -fweb and -frename-registers according
    to optimize in process_options ().  */
 #define AUTODETECT_VALUE 2
@@ -2335,6 +2338,11 @@ toplev::main (int argc, char **argv)
 						&save_decoded_options,
 						&save_decoded_options_count);
 
+  /* Save Optimization decoded options.  */
+  for (unsigned i = 0; i < save_decoded_options_count; ++i)
+    if (cl_options[save_decoded_options[i].opt_index].flags & CL_OPTIMIZATION)
+      save_opt_decoded_options.safe_push (save_decoded_options[i]);
+
   /* Perform language-specific options initialization.  */
   lang_hooks.init_options (save_decoded_options_count, save_decoded_options);
 
diff --git a/gcc/toplev.h b/gcc/toplev.h
index 175944c85b7..9a1d2e6986f 100644
--- a/gcc/toplev.h
+++ b/gcc/toplev.h
@@ -23,6 +23,7 @@ along with GCC; see the file COPYING3.  If not see
 /* Decoded options, and number of such options.  */
 extern struct cl_decoded_option *save_decoded_options;
 extern unsigned int save_decoded_options_count;
+extern auto_vec<cl_decoded_option> save_opt_decoded_options;
 
 class timer;
 
-- 
2.32.0

Reply via email to