Re: [PATCH] x86: Only use general-purpose registers during CPUID check

2020-08-23 Thread Uros Bizjak via Gcc-patches
On Sat, Aug 22, 2020 at 9:09 PM H.J. Lu  wrote:

> > > Compile CPUID check with "-mno-sse -mfpmath=387" to disable SSE, AVX and
> > > AVX512 during CPUID check to avoid vector and mask register operations.
> >
> > -mgeneral-regs-only ?
> >
>
> Here is a patch to add target("general-regs-only") function
> attribute and use it for CPUID check.   OK for master if there
> are no regressions?

Please test it first, then ask for an approval.

Please submit the general-regs-only part as an independent patch. (I
think this is the option linux should use for compilation).

OTOH, wrapping CPUID check in a target attribute is a bad idea. We
should disable spills to mask registers for generic targets by either
raising costs of moves between general and mask registers and/or (as
suggested earlier) introducing TARGET_SPILL_TO_MASK_REGS tuning and
use it in secondary_memory_needed to prevent inter register unit
spills.

So, compiling with -mavx512bw would NOT enable spills by default,
where compiling with -march=skylake-avx512 (or using equivalent
-mtune) would. This is IMO the least surprising approach, and would
avoid changing sources (as you now have to do for several testcases).

Uros.


[committed] wwwdocs: Update reference to RISC-V ISA Specifications

2020-08-23 Thread Gerald Pfeifer
Pushed. Gerald

---
 htdocs/readings.html | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/htdocs/readings.html b/htdocs/readings.html
index b960eb8c..978d566c 100644
--- a/htdocs/readings.html
+++ b/htdocs/readings.html
@@ -263,7 +263,7 @@ names.
  riscv
   Manufacturer: Many (open ISA standard)
   <a href="https://riscv.org">RISC-V Foundation</a>
-  <a href="https://riscv.org/specifications/">ISA Specifications</a>
+  <a href="https://riscv.org/technical/specifications/">ISA Specifications</a>
  
  
  rs6000 (powerpc, powerpcle)
-- 
2.28.0


PING [PATCH] x86: Change CTZ_DEFINED_VALUE_AT_ZERO to return 0/2

2020-08-23 Thread H.J. Lu via Gcc-patches
On Mon, Jul 13, 2020 at 6:42 AM H.J. Lu  wrote:
>
> Change CLZ_DEFINED_VALUE_AT_ZERO/CTZ_DEFINED_VALUE_AT_ZERO to return 0/2
> to enable table-based clz/ctz optimization:
>
>  -- Macro: CLZ_DEFINED_VALUE_AT_ZERO (MODE, VALUE)
>  -- Macro: CTZ_DEFINED_VALUE_AT_ZERO (MODE, VALUE)
>  A C expression that indicates whether the architecture defines a
>  value for 'clz' or 'ctz' with a zero operand.  A result of '0'
>  indicates the value is undefined.  If the value is defined for only
>  the RTL expression, the macro should evaluate to '1'; if the value
>  applies also to the corresponding optab entry (which is normally
>  the case if it expands directly into the corresponding RTL), then
>  the macro should evaluate to '2'.  In the cases where the value is
>  defined, VALUE should be set to this value.
>
> gcc/
>
> PR target/95863
> * config/i386/i386.h (CTZ_DEFINED_VALUE_AT_ZERO): Return 0/2.
> (CLZ_DEFINED_VALUE_AT_ZERO): Likewise.
>
> gcc/testsuite/
>
> PR target/95863
> * gcc.target/i386/pr95863-1.c: New test.
> * gcc.target/i386/pr95863-2.c: Likewise.
> ---
>  gcc/config/i386/i386.h                    |  4 +-
>  gcc/testsuite/gcc.target/i386/pr95863-1.c | 47 +++
>  gcc/testsuite/gcc.target/i386/pr95863-2.c | 27 +
>  3 files changed, 76 insertions(+), 2 deletions(-)
>  create mode 100644 gcc/testsuite/gcc.target/i386/pr95863-1.c
>  create mode 100644 gcc/testsuite/gcc.target/i386/pr95863-2.c
>
> diff --git a/gcc/config/i386/i386.h b/gcc/config/i386/i386.h
> index f4a8f1391fa..1deb59f286f 100644
> --- a/gcc/config/i386/i386.h
> +++ b/gcc/config/i386/i386.h
> @@ -2946,9 +2946,9 @@ extern void debug_dispatch_window (int);
>  /* The value at zero is only defined for the BMI instructions
> LZCNT and TZCNT, not the BSR/BSF insns in the original isa.  */
>  #define CTZ_DEFINED_VALUE_AT_ZERO(MODE, VALUE) \
> -   ((VALUE) = GET_MODE_BITSIZE (MODE), TARGET_BMI ? 1 : 0)
> +   ((VALUE) = GET_MODE_BITSIZE (MODE), TARGET_BMI ? 2 : 0)
>  #define CLZ_DEFINED_VALUE_AT_ZERO(MODE, VALUE) \
> -   ((VALUE) = GET_MODE_BITSIZE (MODE), TARGET_LZCNT ? 1 : 0)
> +   ((VALUE) = GET_MODE_BITSIZE (MODE), TARGET_LZCNT ? 2 : 0)
>
>
>  /* Flags returned by ix86_get_callcvt ().  */
> diff --git a/gcc/testsuite/gcc.target/i386/pr95863-1.c b/gcc/testsuite/gcc.target/i386/pr95863-1.c
> new file mode 100644
> index 000..f3918a1a766
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/i386/pr95863-1.c
> @@ -0,0 +1,47 @@
> +/* { dg-do compile } */
> +/* { dg-options "-O -mbmi" } */
> +
> +int ctz1 (unsigned x)
> +{
> +  static const char table[32] =
> +{
> +  0, 1, 28, 2, 29, 14, 24, 3, 30, 22, 20, 15, 25, 17, 4, 8,
> +  31, 27, 13, 23, 21, 19, 16, 7, 26, 12, 18, 6, 11, 5, 10, 9
> +};
> +
> +  return table[((unsigned)((x & -x) * 0x077CB531U)) >> 27];
> +}
> +
> +int ctz2 (unsigned x)
> +{
> +#define u 0
> +  static short table[64] =
> +{
> +  32, 0, 1,12, 2, 6, u,13, 3, u, 7, u, u, u, u,14,
> +  10, 4, u, u, 8, u, u,25, u, u, u, u, u,21,27,15,
> +  31,11, 5, u, u, u, u, u, 9, u, u,24, u, u,20,26,
> +  30, u, u, u, u,23, u,19,29, u,22,18,28,17,16, u
> +};
> +
> +  x = (x & -x) * 0x0450FBAF;
> +  return table[x >> 26];
> +}
> +
> +int ctz3 (unsigned x)
> +{
> +  static int table[32] =
> +{
> +  0, 1, 2,24, 3,19, 6,25, 22, 4,20,10,16, 7,12,26,
> +  31,23,18, 5,21, 9,15,11,30,17, 8,14,29,13,28,27
> +};
> +
> +  if (x == 0) return 32;
> +  x = (x & -x) * 0x04D7651F;
> +  return table[x >> 27];
> +}
> +
> +/* { dg-final { scan-assembler-times "tzcntl\t" 3 } } */
> +/* { dg-final { scan-assembler-times "andl\t" 1 } } */
> +/* { dg-final { scan-assembler-not "neg" } } */
> +/* { dg-final { scan-assembler-not "imul" } } */
> +/* { dg-final { scan-assembler-not "shr" } } */
> diff --git a/gcc/testsuite/gcc.target/i386/pr95863-2.c b/gcc/testsuite/gcc.target/i386/pr95863-2.c
> new file mode 100644
> index 000..cb56dfc6d94
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/i386/pr95863-2.c
> @@ -0,0 +1,27 @@
> +/* { dg-do compile { target { ! ia32 } } } */
> +/* { dg-options "-O -mbmi" } */
> +
> +static const unsigned long long magic = 0x03f08c5392f756cdULL;
> +
> +static const char table[64] = {
> + 0,  1, 12,  2, 13, 22, 17,  3,
> +14, 33, 23, 36, 18, 58, 28,  4,
> +62, 15, 34, 26, 24, 48, 50, 37,
> +19, 55, 59, 52, 29, 44, 39,  5,
> +63, 11, 21, 16, 32, 35, 57, 27,
> +61, 25, 47, 49, 54, 51, 43, 38,
> +10, 20, 31, 56, 60, 46, 53, 42,
> + 9, 30, 45, 41,  8, 40,  7,  6,
> +};
> +
> +int ctz4 (unsigned long long x)
> +{
> +  unsigned long long lsb = x & -x;
> +  return table[(lsb * magic) >> 58];
> +}
> +
> +/* { dg-final { scan-assembler-times "tzcntq\t" 1 } } */
> +/* { dg-final { scan-assembler-times "andl\t" 1 } } */
> +/* { dg-final { scan-assembler-not "negq" } } */
> +/* { dg-final { scan-assembler-not "imulq" } } */
> +
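
The table-based idiom that the new tests exercise can be sketched as a standalone check. This is only an illustration: the table and multiplier are copied from ctz1 in pr95863-1.c, while the function names table_ctz, reference_ctz and table_ctz_matches_reference are hypothetical and not part of the patch. It compares the lookup against __builtin_ctz for every single-bit input, and shows that the table yields 0 for a zero input, which is the value a result of 32 gives after masking with 31.

```c
#include <stdint.h>

/* De Bruijn lookup table, copied from ctz1 in pr95863-1.c.  */
static const char debruijn_table[32] = {
  0, 1, 28, 2, 29, 14, 24, 3, 30, 22, 20, 15, 25, 17, 4, 8,
  31, 27, 13, 23, 21, 19, 16, 7, 26, 12, 18, 6, 11, 5, 10, 9
};

/* Table-based count-trailing-zeros.  Isolates the lowest set bit,
   multiplies by the de Bruijn constant and uses the top 5 bits as
   the table index.  For x == 0 the index is 0, so the result is 0.  */
int table_ctz (uint32_t x)
{
  return debruijn_table[((x & -x) * 0x077CB531U) >> 27];
}

/* Reference value.  __builtin_ctz is undefined for 0, so that case
   is guarded and mapped to the mode bitsize (32).  */
int reference_ctz (uint32_t x)
{
  return x ? __builtin_ctz (x) : 32;
}

/* Hypothetical self-check: returns 1 if the table agrees with the
   reference for every possible lowest-set-bit position, and if the
   zero case agrees once the reference value is masked with 31.  */
int table_ctz_matches_reference (void)
{
  for (int bit = 0; bit < 32; bit++)
    {
      uint32_t x = (uint32_t) 1 << bit;
      /* Higher bits must not change the answer.  */
      if (table_ctz (x) != reference_ctz (x)
          || table_ctz (x | 0x80000000U) != bit)
        return 0;
    }
  return table_ctz (0) == (reference_ctz (0) & 31);
}
```

The zero case is presumably why pr95863-1.c still expects one "andl": ctz1 has no explicit test for zero, so a value-at-zero of 32 has to be folded back to the table's 0.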

[PATCH] x86: Add target("general-regs-only") function attribute

2020-08-23 Thread H.J. Lu via Gcc-patches
On Sun, Aug 23, 2020 at 10:18:28AM +0200, Uros Bizjak wrote:
> On Sat, Aug 22, 2020 at 9:09 PM H.J. Lu  wrote:
> 
> > > > Compile CPUID check with "-mno-sse -mfpmath=387" to disable SSE, AVX and
> > > > AVX512 during CPUID check to avoid vector and mask register operations.
> > >
> > > -mgeneral-regs-only ?
> > >
> >
> > Here is a patch to add target("general-regs-only") function
> > attribute and use it for CPUID check.   OK for master if there
> > are no regressions?
> 
> Please test it first, then ask for an approval.
> 
> Please submit the general-regs-only part as an independent patch. (I
> think this is the option linux should use for compilation).
> 

Tested on Linux/x86-64.  OK for master?

Thanks.

H.J.
---
gcc/

PR target/96744
* config/i386/i386-options.c (IX86_ATTR_IX86_YES): New.
(IX86_ATTR_IX86_NO): Likewise.
(ix86_opt_type): Add ix86_opt_ix86_yes and ix86_opt_ix86_no.
(ix86_valid_target_attribute_inner_p): Handle general-regs-only,
ix86_opt_ix86_yes and ix86_opt_ix86_no.
(ix86_option_override_internal): Check opts->x_ix86_target_flags
instead of opts->x_ix86_target_flags.
* doc/extend.texi: Document target("general-regs-only") function
attribute.

gcc/testsuite/

PR target/96744
* gcc.target/i386/pr96744-1.c: New test.
* gcc.target/i386/pr96744-2.c: Likewise.
* gcc.target/i386/pr96744-3a.c: Likewise.
* gcc.target/i386/pr96744-3b.c: Likewise.
* gcc.target/i386/pr96744-4.c: Likewise.
* gcc.target/i386/pr96744-5.c: Likewise.
* gcc.target/i386/pr96744-6.c: Likewise.
* gcc.target/i386/pr96744-7.c: Likewise.
* gcc.target/i386/pr96744-8a.c: Likewise.
* gcc.target/i386/pr96744-8b.c: Likewise.
* gcc.target/i386/pr96744-9.c: Likewise.
---
 gcc/config/i386/i386-options.c             | 44 --
 gcc/doc/extend.texi                        |  4 ++
 gcc/testsuite/gcc.target/i386/pr96744-1.c  | 10 +
 gcc/testsuite/gcc.target/i386/pr96744-2.c  | 11 ++
 gcc/testsuite/gcc.target/i386/pr96744-3a.c | 12 ++
 gcc/testsuite/gcc.target/i386/pr96744-3b.c | 16 
 gcc/testsuite/gcc.target/i386/pr96744-4.c  | 11 ++
 gcc/testsuite/gcc.target/i386/pr96744-5.c  | 17 +
 gcc/testsuite/gcc.target/i386/pr96744-6.c  | 11 ++
 gcc/testsuite/gcc.target/i386/pr96744-7.c  | 14 +++
 gcc/testsuite/gcc.target/i386/pr96744-8a.c | 33 
 gcc/testsuite/gcc.target/i386/pr96744-8b.c | 35 +
 gcc/testsuite/gcc.target/i386/pr96744-9.c  | 25 
 13 files changed, 240 insertions(+), 3 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/i386/pr96744-1.c
 create mode 100644 gcc/testsuite/gcc.target/i386/pr96744-2.c
 create mode 100644 gcc/testsuite/gcc.target/i386/pr96744-3a.c
 create mode 100644 gcc/testsuite/gcc.target/i386/pr96744-3b.c
 create mode 100644 gcc/testsuite/gcc.target/i386/pr96744-4.c
 create mode 100644 gcc/testsuite/gcc.target/i386/pr96744-5.c
 create mode 100644 gcc/testsuite/gcc.target/i386/pr96744-6.c
 create mode 100644 gcc/testsuite/gcc.target/i386/pr96744-7.c
 create mode 100644 gcc/testsuite/gcc.target/i386/pr96744-8a.c
 create mode 100644 gcc/testsuite/gcc.target/i386/pr96744-8b.c
 create mode 100644 gcc/testsuite/gcc.target/i386/pr96744-9.c

diff --git a/gcc/config/i386/i386-options.c b/gcc/config/i386/i386-options.c
index 26d1ea18ef1..e0fc68c27bf 100644
--- a/gcc/config/i386/i386-options.c
+++ b/gcc/config/i386/i386-options.c
@@ -922,12 +922,18 @@ ix86_valid_target_attribute_inner_p (tree fndecl, tree args, char *p_strings[],
 #define IX86_ATTR_ENUM(S,O)  { S, sizeof (S)-1, ix86_opt_enum, O, 0 }
 #define IX86_ATTR_YES(S,O,M) { S, sizeof (S)-1, ix86_opt_yes, O, M }
 #define IX86_ATTR_NO(S,O,M)  { S, sizeof (S)-1, ix86_opt_no,  O, M }
+#define IX86_ATTR_IX86_YES(S,O,M) \
+  { S, sizeof (S)-1, ix86_opt_ix86_yes, O, M }
+#define IX86_ATTR_IX86_NO(S,O,M) \
+  { S, sizeof (S)-1, ix86_opt_ix86_no,  O, M }
 
   enum ix86_opt_type
   {
 ix86_opt_unknown,
 ix86_opt_yes,
 ix86_opt_no,
+ix86_opt_ix86_yes,
+ix86_opt_ix86_no,
 ix86_opt_str,
 ix86_opt_enum,
 ix86_opt_isa
@@ -1062,6 +1068,10 @@ ix86_valid_target_attribute_inner_p (tree fndecl, tree args, char *p_strings[],
 IX86_ATTR_YES ("recip",
   OPT_mrecip,
   MASK_RECIP),
+
+IX86_ATTR_IX86_YES ("general-regs-only",
+   OPT_mgeneral_regs_only,
+   OPTION_MASK_GENERAL_REGS_ONLY),
   };
 
   location_t loc
@@ -1175,6 +1185,33 @@ ix86_valid_target_attribute_inner_p (tree fndecl, tree args, char *p_strings[],
opts->x_target_flags &= ~mask;
}
 
+  else if (type == ix86_opt_ix86_yes || type == ix86_opt_ix86_no)
+   {
+ if (mask == OPTION_MASK_GENERAL_REGS_ONLY)
+   {
+ if (type != ix86_opt_ix86_yes)
+   gcc_unreachable ();
+
+ 

Re: [PATCH] x86: Only use general-purpose registers during CPUID check

2020-08-23 Thread H.J. Lu via Gcc-patches
On Sun, Aug 23, 2020 at 10:18:28AM +0200, Uros Bizjak wrote:
> On Sat, Aug 22, 2020 at 9:09 PM H.J. Lu  wrote:
> 
> > > > Compile CPUID check with "-mno-sse -mfpmath=387" to disable SSE, AVX and
> > > > AVX512 during CPUID check to avoid vector and mask register operations.
> > >
> > > -mgeneral-regs-only ?
> > >
> >
> > Here is a patch to add target("general-regs-only") function
> > attribute and use it for CPUID check.   OK for master if there
> > are no regressions?
> 
> Please test it first, then ask for an approval.
> 
> Please submit the general-regs-only part as an independent patch. (I
> think this is the option linux should use for compilation).
> 
> OTOH, wrapping CPUID check in a target attribute is a bad idea. We
> should disable spills to mask registers for generic targets by either
> raising costs of moves between general and mask registers and/or (as
> suggested earlier) introducing TARGET_SPILL_TO_MASK_REGS tuning and
> use it in secondary_memory_needed to prevent inter register unit
> spills.
> 
> So, compiling with -mavx512bw would NOT enable spills by default,
> where compiling with -march=skylake-avx512 (or using equivalent
> -mtune) would. This is IMO the least surprising approach, and would
> avoid changing sources (as you now have to do for several testcases).

We have 2 orthogonal issues here:

1. When mask register spill should be enabled.
2. CPUID check should be done with general registers only.

As shown in GCC testcases, CPUID check may be done with arbitrary ISAs
or -march/-mtune options enabled.  We should either

1. Enable only general registers for CPUID check.  Or
2. Issue an error for CPUID check if non-general registers are used.

H.J.


Re: [PATCH] x86: Add target("general-regs-only") function attribute

2020-08-23 Thread Uros Bizjak via Gcc-patches
On Sun, Aug 23, 2020 at 5:07 PM H.J. Lu  wrote:
>
> On Sun, Aug 23, 2020 at 10:18:28AM +0200, Uros Bizjak wrote:
> > On Sat, Aug 22, 2020 at 9:09 PM H.J. Lu  wrote:
> >
> > > > > > Compile CPUID check with "-mno-sse -mfpmath=387" to disable SSE, AVX and
> > > > > > AVX512 during CPUID check to avoid vector and mask register operations.
> > > >
> > > > -mgeneral-regs-only ?
> > > >
> > >
> > > Here is a patch to add target("general-regs-only") function
> > > attribute and use it for CPUID check.   OK for master if there
> > > are no regressions?
> >
> > Please test it first, then ask for an approval.
> >
> > Please submit the general-regs-only part as an independent patch. (I
> > think this is the option linux should use for compilation).
> >
>
> Tested on Linux/x86-64.  OK for master?
>
> Thanks.
>
> H.J.
> ---
> gcc/
>
> PR target/96744
> * config/i386/i386-options.c (IX86_ATTR_IX86_YES): New.
> (IX86_ATTR_IX86_NO): Likewise.
> (ix86_opt_type): Add ix86_opt_ix86_yes and ix86_opt_ix86_no.
> (ix86_valid_target_attribute_inner_p): Handle general-regs-only,
> ix86_opt_ix86_yes and ix86_opt_ix86_no.
> (ix86_option_override_internal): Check opts->x_ix86_target_flags
> instead of opts->x_ix86_target_flags.
> * doc/extend.texi: Document target("general-regs-only") function
> attribute.
>
> gcc/testsuite/
>
> PR target/96744
> * gcc.target/i386/pr96744-1.c: New test.
> * gcc.target/i386/pr96744-2.c: Likewise.
> * gcc.target/i386/pr96744-3a.c: Likewise.
> * gcc.target/i386/pr96744-3b.c: Likewise.
> * gcc.target/i386/pr96744-4.c: Likewise.
> * gcc.target/i386/pr96744-5.c: Likewise.
> * gcc.target/i386/pr96744-6.c: Likewise.
> * gcc.target/i386/pr96744-7.c: Likewise.
> * gcc.target/i386/pr96744-8a.c: Likewise.
> * gcc.target/i386/pr96744-8b.c: Likewise.
> * gcc.target/i386/pr96744-9.c: Likewise.

OK.

Thanks,
Uros.
> ---
>  gcc/config/i386/i386-options.c             | 44 --
>  gcc/doc/extend.texi                        |  4 ++
>  gcc/testsuite/gcc.target/i386/pr96744-1.c  | 10 +
>  gcc/testsuite/gcc.target/i386/pr96744-2.c  | 11 ++
>  gcc/testsuite/gcc.target/i386/pr96744-3a.c | 12 ++
>  gcc/testsuite/gcc.target/i386/pr96744-3b.c | 16 
>  gcc/testsuite/gcc.target/i386/pr96744-4.c  | 11 ++
>  gcc/testsuite/gcc.target/i386/pr96744-5.c  | 17 +
>  gcc/testsuite/gcc.target/i386/pr96744-6.c  | 11 ++
>  gcc/testsuite/gcc.target/i386/pr96744-7.c  | 14 +++
>  gcc/testsuite/gcc.target/i386/pr96744-8a.c | 33 
>  gcc/testsuite/gcc.target/i386/pr96744-8b.c | 35 +
>  gcc/testsuite/gcc.target/i386/pr96744-9.c  | 25 
>  13 files changed, 240 insertions(+), 3 deletions(-)
>  create mode 100644 gcc/testsuite/gcc.target/i386/pr96744-1.c
>  create mode 100644 gcc/testsuite/gcc.target/i386/pr96744-2.c
>  create mode 100644 gcc/testsuite/gcc.target/i386/pr96744-3a.c
>  create mode 100644 gcc/testsuite/gcc.target/i386/pr96744-3b.c
>  create mode 100644 gcc/testsuite/gcc.target/i386/pr96744-4.c
>  create mode 100644 gcc/testsuite/gcc.target/i386/pr96744-5.c
>  create mode 100644 gcc/testsuite/gcc.target/i386/pr96744-6.c
>  create mode 100644 gcc/testsuite/gcc.target/i386/pr96744-7.c
>  create mode 100644 gcc/testsuite/gcc.target/i386/pr96744-8a.c
>  create mode 100644 gcc/testsuite/gcc.target/i386/pr96744-8b.c
>  create mode 100644 gcc/testsuite/gcc.target/i386/pr96744-9.c
>
> diff --git a/gcc/config/i386/i386-options.c b/gcc/config/i386/i386-options.c
> index 26d1ea18ef1..e0fc68c27bf 100644
> --- a/gcc/config/i386/i386-options.c
> +++ b/gcc/config/i386/i386-options.c
> @@ -922,12 +922,18 @@ ix86_valid_target_attribute_inner_p (tree fndecl, tree args, char *p_strings[],
>  #define IX86_ATTR_ENUM(S,O)  { S, sizeof (S)-1, ix86_opt_enum, O, 0 }
>  #define IX86_ATTR_YES(S,O,M) { S, sizeof (S)-1, ix86_opt_yes, O, M }
>  #define IX86_ATTR_NO(S,O,M)  { S, sizeof (S)-1, ix86_opt_no,  O, M }
> +#define IX86_ATTR_IX86_YES(S,O,M) \
> +  { S, sizeof (S)-1, ix86_opt_ix86_yes, O, M }
> +#define IX86_ATTR_IX86_NO(S,O,M) \
> +  { S, sizeof (S)-1, ix86_opt_ix86_no,  O, M }
>
>enum ix86_opt_type
>{
>  ix86_opt_unknown,
>  ix86_opt_yes,
>  ix86_opt_no,
> +ix86_opt_ix86_yes,
> +ix86_opt_ix86_no,
>  ix86_opt_str,
>  ix86_opt_enum,
>  ix86_opt_isa
> @@ -1062,6 +1068,10 @@ ix86_valid_target_attribute_inner_p (tree fndecl, tree args, char *p_strings[],
>  IX86_ATTR_YES ("recip",
>OPT_mrecip,
>MASK_RECIP),
> +
> +IX86_ATTR_IX86_YES ("general-regs-only",
> +   OPT_mgeneral_regs_only,
> +   OPTION_MASK_GENERAL_REGS_ONLY),
>};
>
>location_t loc
> @@ -1175,6 +1185,33 @@ ix86_valid_target_attribute_inner_p (tree fndecl, tree args, char

Re: [PATCH] x86: Only use general-purpose registers during CPUID check

2020-08-23 Thread Uros Bizjak via Gcc-patches
On Sun, Aug 23, 2020 at 5:23 PM H.J. Lu  wrote:
>
> On Sun, Aug 23, 2020 at 10:18:28AM +0200, Uros Bizjak wrote:
> > On Sat, Aug 22, 2020 at 9:09 PM H.J. Lu  wrote:
> >
> > > > > > Compile CPUID check with "-mno-sse -mfpmath=387" to disable SSE, AVX and
> > > > > > AVX512 during CPUID check to avoid vector and mask register operations.
> > > >
> > > > -mgeneral-regs-only ?
> > > >
> > >
> > > Here is a patch to add target("general-regs-only") function
> > > attribute and use it for CPUID check.   OK for master if there
> > > are no regressions?
> >
> > Please test it first, then ask for an approval.
> >
> > Please submit the general-regs-only part as an independent patch. (I
> > think this is the option linux should use for compilation).
> >
> > OTOH, wrapping CPUID check in a target attribute is a bad idea. We
> > should disable spills to mask registers for generic targets by either
> > raising costs of moves between general and mask registers and/or (as
> > suggested earlier) introducing TARGET_SPILL_TO_MASK_REGS tuning and
> > use it in secondary_memory_needed to prevent inter register unit
> > spills.
> >
> > So, compiling with -mavx512bw would NOT enable spills by default,
> > where compiling with -march=skylake-avx512 (or using equivalent
> > -mtune) would. This is IMO the least surprising approach, and would
> > avoid changing sources (as you now have to do for several testcases).
>
> We have 2 orthogonal issues here:
>
> 1. When mask register spill should be enabled.
> 2. CPUID check should be done with general registers only.
>
> As shown in GCC testcases, CPUID check may be done with arbitrary ISAs
> or -march/-mtune options enabled.  We should either
>
> 1. Enable only general registers for CPUID check.  Or
> 2. Issue an error for CPUID check if non-general registers are used.

We should follow the same approach as with SSE2, where DI/SImode
spills to XMM registers were effectively disabled for a generic
target. So, unless the tuning target is also specified, spills to mask
registers should not be generated. It was my oversight to approve the
patch that enables spills for a generic target, and without the tuning
flag, the patch will be reverted.

Now, we have -mgeneral-regs-only functionality in place, so if a
package wants to enable spills, the correct -mtune (or -march that
implies -mtune) should be used, and it is expected that the detection
code is amended with general-regs-only pragmas.

#pragma GCC push_options
#pragma GCC target("general-regs-only")

void cpuid_check ()
...

#pragma GCC pop_options

>footnote

Nowadays, -march=native is mostly used outside generic target
compilations, so for relevant avx512 targets, we still generate spills
to mask regs. In future, we can review the setting of the tuning flag
for a generic target in the same way as with SSE2 inter-reg moves.

Uros.
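
The pragma pattern described above can be sketched as a small detection helper. This is a sketch under assumptions, not code from the patch: the function name cpu_has_avx512f and the HAVE_GPR_ONLY_PRAGMA macro are hypothetical, and the version guard assumes that target("general-regs-only") is only accepted by GCC 11 or later on x86, so other compilers and targets fall back to a stub that reports "not available".

```c
#if (defined (__x86_64__) || defined (__i386__)) \
    && defined (__GNUC__) && !defined (__clang__) && __GNUC__ >= 11
#define HAVE_GPR_ONLY_PRAGMA 1
/* Everything between push_options and pop_options is compiled with
   general-purpose registers only.  */
#pragma GCC push_options
#pragma GCC target("general-regs-only")
/* Included inside the region so that its inline helpers are built
   with the same restriction.  */
#include <cpuid.h>
#endif

/* Hypothetical detection helper: returns 1 if CPUID reports AVX512F,
   0 otherwise (including on targets where the check is compiled out).  */
int cpu_has_avx512f (void)
{
#ifdef HAVE_GPR_ONLY_PRAGMA
  unsigned int eax, ebx, ecx, edx;

  /* __get_cpuid_count returns 0 when leaf 7 is not supported.  */
  if (!__get_cpuid_count (7, 0, &eax, &ebx, &ecx, &edx))
    return 0;

  /* CPUID.(EAX=7,ECX=0):EBX bit 16 is AVX512F.  */
  return (ebx >> 16) & 1;
#else
  return 0;
#endif
}

#ifdef HAVE_GPR_ONLY_PRAGMA
#pragma GCC pop_options
#endif
```

Code after pop_options is compiled with the command-line options again, so only the detection routine is restricted.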


Re: [PATCH] x86: Only use general-purpose registers during CPUID check

2020-08-23 Thread Florian Weimer
* H. J. Lu via Gcc-patches:

> 2. CPUID check should be done with general registers only.

Is this really the concern here?  Isn't this about instructions, not
registers?  If there's a useful integer register instruction for
post-processing CPUID bits that's not in the baseline ABI, GCC still
shouldn't use it in the check, I assume.


Re: [committed] wwwdocs: Update reference to RISC-V ISA Specifications

2020-08-23 Thread Kito Cheng via Gcc-patches
Hi Gerald:

Thanks for your patch :)

On Sun, Aug 23, 2020 at 6:19 PM Gerald Pfeifer  wrote:
>
> Pushed. Gerald
>
> ---
>  htdocs/readings.html | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/htdocs/readings.html b/htdocs/readings.html
> index b960eb8c..978d566c 100644
> --- a/htdocs/readings.html
> +++ b/htdocs/readings.html
> @@ -263,7 +263,7 @@ names.
>   riscv
>Manufacturer: Many (open ISA standard)
>   <a href="https://riscv.org">RISC-V Foundation</a>
> -  <a href="https://riscv.org/specifications/">ISA Specifications</a>
> +  <a href="https://riscv.org/technical/specifications/">ISA Specifications</a>
>   
>
>   rs6000 (powerpc, powerpcle)
> --
> 2.28.0


Re: [PATCH] SLP: support entire BB.

2020-08-23 Thread Richard Biener via Gcc-patches
On Mon, Aug 10, 2020 at 12:29 PM Martin Liška  wrote:
>
> On 8/3/20 12:29 PM, Richard Biener wrote:
> > You are always passing NULL here so simply avoid this and the following changes.
>
> Are you sure about this?
>
> Note that vect_slp_bb does:
>
> +  if (!vect_find_stmt_data_reference (NULL, stmt, &datarefs,
> + &dataref_groups, current_group))
> +   ++current_group;

Oops, I stand corrected.  Patch is OK without this particular requested change.

Richard.

> Martin