Re: [PATCH] Update config.guess and config.sub

2018-07-05 Thread Palmer Dabbelt

On Thu, 05 Jul 2018 05:00:20 PDT (-0700), sebastian.hu...@embedded-brains.de 
wrote:

* config.guess: Sync with upstream version 2018-06-26.
* config.sub: Sync with upstream version 2018-07-02.
---
 config.guess | 6 +++---
 config.sub   | 8 +++-
 2 files changed, 10 insertions(+), 4 deletions(-)

diff --git a/config.guess b/config.guess
index 883a6713bf0..445c406836e 100755
--- a/config.guess
+++ b/config.guess
@@ -2,7 +2,7 @@
 # Attempt to guess a canonical system name.
 #   Copyright 1992-2018 Free Software Foundation, Inc.

-timestamp='2018-05-19'
+timestamp='2018-06-26'

 # This file is free software; you can redistribute it and/or modify it
 # under the terms of the GNU General Public License as published by
@@ -894,8 +894,8 @@ EOF
# other systems with GNU libc and userland
echo "$UNAME_MACHINE-unknown-`echo "$UNAME_SYSTEM" | sed 's,^[^/]*/,,' | tr "[:upper:]" 
"[:lower:]"``echo "$UNAME_RELEASE"|sed -e 's/[-(].*//'`-$LIBC"
exit ;;
-i*86:Minix:*:*)
-   echo "$UNAME_MACHINE"-pc-minix
+*:Minix:*:*)
+   echo "$UNAME_MACHINE"-unknown-minix
exit ;;
 aarch64:Linux:*:*)
echo "$UNAME_MACHINE"-unknown-linux-"$LIBC"
diff --git a/config.sub b/config.sub
index d1f5b549034..072700fb037 100755
--- a/config.sub
+++ b/config.sub
@@ -2,7 +2,7 @@
 # Configuration validation subroutine script.
 #   Copyright 1992-2018 Free Software Foundation, Inc.

-timestamp='2018-05-24'
+timestamp='2018-07-02'

 # This file is free software; you can redistribute it and/or modify it
 # under the terms of the GNU General Public License as published by
@@ -1125,6 +1125,12 @@ case $basic_machine in
ps2)
basic_machine=i386-ibm
;;
+   riscv)
+   basic_machine=riscv32-unknown
+   ;;
+   riscv-*)
+   basic_machine=`echo "$basic_machine" | sed 's/^riscv/riscv32/'`
+   ;;
rm[46]00)
basic_machine=mips-siemens
;;


I'm not sure what the policy is on getting config stuff approved for commit, 
but just FYI there's another RISC-V related patch to config.sub that changes 
the behavior of "riscv-*" tuples.  I'm assuming we should take both, as it's 
odd to sync half way to the head of config.
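
For readers following the two config.sub changes, here is a minimal C sketch of what the 2018-07-02 hunk quoted above does to a bare `riscv` CPU field. The function name `rewrite_riscv` is hypothetical and exists only for illustration; the real logic is the shell `case`/`sed` in config.sub.

```c
#include <stdio.h>
#include <string.h>

/* Illustrative model of the 2018-07-02 config.sub hunk:
     riscv    -> riscv32-unknown
     riscv-*  -> riscv32-*   (the sed 's/^riscv/riscv32/' arm)
   Any other input is copied through unchanged.  Returns 0 on success,
   -1 if the output buffer is too small.  */
int rewrite_riscv(const char *basic_machine, char *out, size_t out_size)
{
    int n;

    if (strcmp(basic_machine, "riscv") == 0)
        n = snprintf(out, out_size, "riscv32-unknown");
    else if (strncmp(basic_machine, "riscv-", 6) == 0)
        /* Keep the "-vendor" tail and replace only the CPU field.  */
        n = snprintf(out, out_size, "riscv32%s", basic_machine + 5);
    else
        n = snprintf(out, out_size, "%s", basic_machine);

    return (n >= 0 && (size_t)n < out_size) ? 0 : -1;
}
```

Calling `rewrite_riscv("riscv-unknown", buf, sizeof buf)` leaves `riscv32-unknown` in `buf`, while `riscv64-unknown` passes through untouched. The follow-up config commit quoted further down removes this rewrite and accepts `riscv` and `riscv-*` as they are, which is why the two changes interact.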


When I try to build it I see "Unsupported RISC-V target riscv-unknown-elf", so 
there's at least some extra autoconf wizardry that needs to happen in here.  I'm 
actually not sure what the "riscv-*" tuples are supposed to do so I've added 
Liviu as I don't want to misrepresent his desires and get into trouble again 
:).


I'm fine with pretty much anything when it comes to this tuple stuff, so feel 
free to consider it all pre-approved from a RISC-V perspective -- though I 
assume it needs a GCC global maintainer to approve it as well.  My only 
constraint is that it doesn't break anything that currently builds, as I don't 
want to force a flag day on everyone because of this.


Thanks for submitting the patch!

Here's the config commit, for reference:

commit dd5d5dd697df579a5ebd119a88475b446c07c6b0
Author: Ben Elliston 
Date:   Tue Jul 3 21:18:29 2018 +1000

   * config.sub: Do not rewrite riscv -> riscv32.
   * testsuite/config-sub.data: Adjust tests.

diff --git a/ChangeLog b/ChangeLog
index dc19a4b02ba6..db7a24b8a2a3 100644
--- a/ChangeLog
+++ b/ChangeLog
@@ -1,3 +1,9 @@
+2018-07-03  Liviu Ionescu 
+   Ben Elliston  
+
+   * config.sub: Do not rewrite riscv -> riscv32.
+   * testsuite/config-sub.data: Adjust tests.
+
2018-06-26  Sevan Janiyan  
Ben Elliston  

diff --git a/config.sub b/config.sub
index 072700fb037c..c95acc681d1b 100755
--- a/config.sub
+++ b/config.sub
@@ -2,7 +2,7 @@
# Configuration validation subroutine script.
#   Copyright 1992-2018 Free Software Foundation, Inc.

-timestamp='2018-07-02'
+timestamp='2018-07-03'

# This file is free software; you can redistribute it and/or modify it
# under the terms of the GNU General Public License as published by
@@ -625,7 +625,7 @@ case $basic_machine in
| powerpc | powerpc64 | powerpc64le | powerpcle \
| pru \
| pyramid \
-   | riscv32 | riscv64 \
+   | riscv | riscv32 | riscv64 \
| rl78 | rx \
| score \
| sh | sh[1234] | sh[24]a | sh[24]aeb | sh[23]e | sh[234]eb | sheb | 
shbe | shle | sh[1234]le | sh3ele \
@@ -752,7 +752,7 @@ case $basic_machine in
| powerpc-* | powerpc64-* | powerpc64le-* | powerpcle-* \
| pru-* \
| pyramid-* \
-   | riscv32-* | riscv64-* \
+   | riscv-* | riscv32-* | riscv64-* \
| rl78-* | romp-* | rs6000-* | rx-* \
| sh-* | sh[1234]-* | sh[24]a-* | sh[24]aeb-* | sh[23]e-* | sh[34]eb-* 
| sheb-* | shbe-* \
| shle-* | sh[1234]le-* | sh3ele-* | sh64-* | sh64le-* \
@@ -1125,12 +1125,6 @@ case $basic_machine in
ps2)
basic_machine=i386-ibm
;;
-   riscv)
-   basic

Re: FW: [RFC] RISC-V: Support risc-v bfloat16 This patch support bfloat16 in riscv like x86_64 and arm.

2023-06-01 Thread Palmer Dabbelt

On Thu, 01 Jun 2023 09:48:47 PDT (-0700), jeffreya...@gmail.com wrote:



On 6/1/23 01:01, juzhe.zh...@rivai.ai wrote:

I plan to implement BF16 vector support in GCC, but I am still waiting for
the ISA to be ratified, since GCC policy doesn't allow unratified ISAs.

Right.  So those specs need to move along further before we can start
integrating code.



Currently, we are working on INT8, INT16, INT32, INT64, FP16, FP32 and FP64
auto-vectorization.
Adding BF16 to the current vector framework in GCC should be very simple.

In prior architectures I've worked on the bulk of BF16 work was just
adding additional entries to existing iterators.  So I agree, it should
be very simple :-)


We should also have someone who's a bit more plugged in to floating 
point check to make sure the RISC-V bfloat16 semantics match IEEE.  I 
don't see any issues, but I'm not really a FP person so I'm not sure.  
There were certainly a lot of subtleties for the other FP bits, so even if 
the implementation just plumbs straight through IMO it's worth checking.
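
As background for that check, the generic bfloat16 storage format is the top 16 bits of an IEEE 754 binary32 value. The sketch below shows one common conversion; the names `float_to_bf16` and `bf16_to_float` are hypothetical, and the round-to-nearest-even and NaN handling are assumptions for illustration, not a statement of what the RISC-V extension specifies.

```c
#include <stdint.h>
#include <string.h>

/* bfloat16 keeps the sign bit, all 8 exponent bits and the top 7
   fraction bits of a binary32 value.  */
uint16_t float_to_bf16(float f)
{
    uint32_t bits;
    memcpy(&bits, &f, sizeof bits);

    /* NaN: truncate, and force a fraction bit so the result stays NaN.  */
    if ((bits & 0x7fffffffu) > 0x7f800000u)
        return (uint16_t)((bits >> 16) | 0x0040u);

    /* Round to nearest, ties to even, on the 16 discarded bits
       (an assumed rounding choice for this sketch).  */
    uint32_t lsb = (bits >> 16) & 1u;
    bits += 0x7fffu + lsb;
    return (uint16_t)(bits >> 16);
}

/* Widening is exact: shift the 16 bits back into the top half.  */
float bf16_to_float(uint16_t h)
{
    uint32_t bits = (uint32_t)h << 16;
    float f;
    memcpy(&f, &bits, sizeof f);
    return f;
}
```

The subtle cases are the ones this sketch has to pick a behavior for: NaN payloads, rounding of the discarded bits, and overflow to infinity. Those are the places where a plumbed-through implementation could disagree with the specification.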


We have one FP person at Rivos, I can try and rope him in if you want?  
Happy to have someone else do it, though, as he's usually pretty busy ;)


Re: [PATCH] RISC-V: Save and restore FCSR in interrupt functions to avoid program errors.

2023-06-13 Thread Palmer Dabbelt

On Tue, 13 Jun 2023 10:41:00 PDT (-0700), gcc-patches@gcc.gnu.org wrote:



On 6/13/23 00:41, Jin Ma wrote:

gcc/ChangeLog:

* config/riscv/riscv.cc (riscv_compute_frame_info): Allocate frame for 
FCSR.
(riscv_for_each_saved_reg): Save and restore FCSR in interrupt 
functions.
* config/riscv/riscv.md (riscv_frcsr): New patterns.
(riscv_fscsr): Likewise.

gcc/testsuite/ChangeLog:

* gcc.target/riscv/interrupt-fcsr-1.c: New test.
* gcc.target/riscv/interrupt-fcsr-2.c: New test.
* gcc.target/riscv/interrupt-fcsr-3.c: New test.

Looks pretty good.  Just a couple minor updates and I think we can push
this to the trunk.


We should update the C API doc as well, it's a bit vague as to whether 
the CSRs are saved: it just says that any used registers are saved, it's 
not clear if registers includes CSRs.


Unless I'm missing something, we also need to save/restore the V CSRs in 
interrupt functions as well?  They're treated the same way in the C API 
doc, so applying the same logic seems reasonable -- I'm not sure we 
really want to save/restore something like vstart, though...
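
To make the intent concrete, here is a toy C model of the save/restore discipline under discussion. It does not touch real hardware: `sim_fcsr`, `sim_frcsr` and `sim_fscsr` are hypothetical stand-ins for the FCSR and the `frcsr`/`fscsr` instructions, and `handler_body` is an invented example of handler code that disturbs the register.

```c
#include <stdint.h>

/* Simulated FCSR: bits 4:0 are the accrued exception flags, bits 7:5
   the rounding mode.  A plain variable stands in for the real CSR.  */
uint32_t sim_fcsr;

uint32_t sim_frcsr(void)       { return sim_fcsr; }
void     sim_fscsr(uint32_t v) { sim_fcsr = v; }

/* Handler code that does FP work and so disturbs FCSR.  */
void handler_body(void)
{
    sim_fscsr(sim_frcsr() | 0x1fu);  /* pretend all five flags were raised */
}

/* In spirit, what the patch makes the prologue and epilogue of an
   interrupt function do: read FCSR on entry, write it back on exit.  */
void interrupt_handler(void)
{
    uint32_t saved = sim_frcsr();    /* prologue */
    handler_body();
    sim_fscsr(saved);                /* epilogue */
}
```

With `sim_fcsr` set to `0x40` before the call, `interrupt_handler()` returns with `sim_fcsr` still `0x40`, so the interrupted code never sees the flags the handler raised. The same shape would apply to any other CSR the API doc decides must be preserved.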


I opened a PR for the API doc: 
https://github.com/riscv-non-isa/riscv-c-api-doc/pull/42



diff --git a/gcc/config/riscv/riscv.cc b/gcc/config/riscv/riscv.cc
index de30bf4e567..4ef9692b4db 100644
--- a/gcc/config/riscv/riscv.cc
+++ b/gcc/config/riscv/riscv.cc
@@ -4990,7 +4990,8 @@ riscv_compute_frame_info (void)
if (cfun->machine->interrupt_handler_p)
  {
HOST_WIDE_INT step1 = riscv_first_stack_step (frame, frame->total_size);
-  if (! POLY_SMALL_OPERAND_P ((frame->total_size - step1)))
+  if (! POLY_SMALL_OPERAND_P ((frame->total_size - step1))
+ || TARGET_HARD_FLOAT)
interrupt_save_prologue_temp = true;
  }

There's a comment before this IF block indicating when we need to save
the prologue temporary register (specifically in interrupt functions
with large frames).  That comment needs to be updated so that it
mentions interrupt functions on TARGET_HARD_FLOAT.


I think we're also missing Zfinx here: there's no F registers to save, 
but we should still have the same side effects visible in the CSRs.







@@ -5282,6 +5290,29 @@ riscv_for_each_saved_reg (poly_int64 sp_offset, 
riscv_save_restore_fn fn,
}
}

+  if (regno == RISCV_PROLOGUE_TEMP_REGNUM
+ && TARGET_HARD_FLOAT
+ && cfun->machine->interrupt_handler_p
+ && cfun->machine->frame.fmask)
+   {
+ unsigned int fcsr_size = GET_MODE_SIZE (SImode);
+ if (!epilogue)
+   {
+ riscv_save_restore_reg (word_mode, regno, offset, fn);
+ offset -= fcsr_size;
+ emit_insn (gen_riscv_frcsr (gen_rtx_REG (SImode, 
RISCV_PROLOGUE_TEMP_REGNUM)));
+ riscv_save_restore_reg (SImode, RISCV_PROLOGUE_TEMP_REGNUM, 
offset, riscv_save_reg);
+   }
+ else
+   {
+ riscv_save_restore_reg (SImode, RISCV_PROLOGUE_TEMP_REGNUM, 
offset - fcsr_size, riscv_restore_reg);
+ emit_insn (gen_riscv_fscsr (gen_rtx_REG (SImode, 
RISCV_PROLOGUE_TEMP_REGNUM)));
+ riscv_save_restore_reg (word_mode, regno, offset, fn);
+ offset -= fcsr_size;
+   }
+ continue;
+   }

Note there is a macro RISCV_PROLOGUE_TEMP(MODE) which will create the
REG expression for the prologue temporary in the given mode.  That way
you don't have to call gen_rtx_REG directly here.

Jeff


This got snipped, but the tests should only check for the CSR 
save/restore on F/D systems (from looking at them they'd fail on soft 
float targets).


Re: [PATCH] RISC-V: Throw compilation error for unknown sub-extension or supervisor extension

2023-07-12 Thread Palmer Dabbelt

On Wed, 12 Jul 2023 09:02:06 PDT (-0700), jeffreya...@gmail.com wrote:



On 7/11/23 21:30, juzhe.zh...@rivai.ai wrote:

LGTM

OK for the trunk.


I'd like to make sure Kito is OK with this.  IIUC the "pass through 
unknown extensions" behavior is deliberate.  It's not what I would have 
done, but I didn't do it ;)



jeff


Re: [PATCH V7] RISC-V: RISC-V: Support gather_load/scatter RVV auto-vectorization

2023-07-13 Thread Palmer Dabbelt
On Thu, 13 Jul 2023 07:01:26 PDT (-0700), gcc-patches@gcc.gnu.org wrote:
>
>
> On 7/13/23 01:47, Richard Biener wrote:
>> On Thu, Jul 13, 2023 at 1:30 AM 钟居哲  wrote:
>>>
>>> I notice vectorizable_call in Loop Vectorizer.
>>> It's vectorizing CALL function for example like fmax/fmin.
>>>  From my understanding, we dont have RVV instruction for fmax/fmin?

Unless I'm misunderstanding, we do.  The ISA manual says

=== Vector Floating-Point MIN/MAX Instructions

The vector floating-point `vfmin` and `vfmax` instructions have the
same behavior as the corresponding scalar floating-point instructions
in version 2.2 of the RISC-V F/D/Q extension: they perform the 
`minimumNumber`
or `maximumNumber` operation on active elements.


# Floating-point minimum
vfmin.vv vd, vs2, vs1, vm   # Vector-vector
vfmin.vf vd, vs2, rs1, vm   # vector-scalar

# Floating-point maximum
vfmax.vv vd, vs2, vs1, vm   # Vector-vector
vfmax.vf vd, vs2, rs1, vm   # vector-scalar


so we should be able to match at least some loops.
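
As a rough illustration of the kind of loop in question, the sketch below models the NaN handling of `minimumNumber` in scalar C and applies it element-wise. `min_number` and `vec_min` are hypothetical names; in real source code the loop body would normally call `fmin` from `<math.h>`, and the helper is spelled out here only to keep the example self-contained. Signed-zero ordering is deliberately not modeled.

```c
#include <stddef.h>

/* Scalar model of the minimumNumber idea: if exactly one operand is a
   NaN, return the other operand.  (x != x) is true only for a NaN.
   Note: the ordering of -0.0 versus +0.0 is not modeled here.  */
double min_number(double a, double b)
{
    if (a != a)
        return b;
    if (b != b)
        return a;
    return (b < a) ? b : a;
}

/* Element-wise minimum over n elements, the shape a vector-vector
   minimum instruction would cover in one go.  */
void vec_min(double *dst, const double *x, const double *y, size_t n)
{
    for (size_t i = 0; i < n; ++i)
        dst[i] = min_number(x[i], y[i]);
}
```

For inputs `{1, NaN, 3}` and `{2, 5, NaN}` the result is `{1, 5, 3}`: a NaN in one lane loses to the number in the other operand.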

>>
>> There's things like .POPCOUNT which we can vectorize, but sure, it
>> depends on the ISA if there's anything.
> Right.  And RV has some of these -- vcpop, vfirst...  Supporting them
> obviously isn't a requirement for a vector implementation, but they're
> nice to have :-)
>
> Jeff


Re: [PATCH] RISC-V: Remove the redundant expressions in the and3.

2023-07-13 Thread Palmer Dabbelt

On Thu, 13 Jul 2023 19:02:05 PDT (-0700), li...@eswincomputing.com wrote:

When generating the gen_and<mode>3 function based on the and<mode>3
template, it produces the expression emit_insn (gen_rtx_SET (operand0,
gen_rtx_AND (<MODE>mode, operand1, operand2)));, which is identical to the
portion I removed in this patch. Therefore, the redundant portion can be
deleted.

Signed-off-by: Die Li 

gcc/ChangeLog:

* config/riscv/riscv.md: Remove redundant portion in and3.
---
 gcc/config/riscv/riscv.md | 5 -
 1 file changed, 5 deletions(-)

diff --git a/gcc/config/riscv/riscv.md b/gcc/config/riscv/riscv.md
index 7988026d129..c4f8eb9488e 100644
--- a/gcc/config/riscv/riscv.md
+++ b/gcc/config/riscv/riscv.md
@@ -1491,11 +1491,6 @@
  DONE;
}
 }
-  else
-{
-  emit_move_insn (operands[0], gen_rtx_AND (mode, operands[1], 
operands[2]));
-  DONE;
-}
 })

 (define_insn "*and3"


Unless I'm missing something, this will just result in no emitted 
instructions for this "and" pattern?  That seems wrong, it would at 
least have to put the source into the dest -- but 
"arith_operand_or_mode_mask" can contain values that don't just result 
in an extension (like arbitrary register values, for example), so I 
think we need the "and" operation.


Does this pass the regression suite?

Either way, if this branch of the conditional can't trigger we should 
tighten the constraint (or at a bare minimum add a comment as to why).


Re: [PATCH] RISC-V: Remove the redundant expressions in the and3.

2023-07-13 Thread Palmer Dabbelt

On Thu, 13 Jul 2023 19:41:08 PDT (-0700), gcc-patches@gcc.gnu.org wrote:

Expanding without DONE or FAIL still emits the pattern itself, so this
patch LGTM.  I will test it anyway and commit if it passes :)


Ah, thanks, I guess I didn't know that.  This is probably fine then, but 
we might have some code floating around we could toss...



On Fri, Jul 14, 2023 at 10:34 AM Palmer Dabbelt  wrote:


On Thu, 13 Jul 2023 19:02:05 PDT (-0700), li...@eswincomputing.com wrote:
> When generating the gen_and3 function based on the and3
> template, it produces the expression emit_insn (gen_rtx_SET (operand0,
> gen_rtx_AND (, operand1, operand2)));, which is identical to the
> portion I removed in this patch. Therefore, the redundant portion can be
> deleted.
>
> Signed-off-by: Die Li 
>
> gcc/ChangeLog:
>
> * config/riscv/riscv.md: Remove redundant portion in and3.
> ---
>  gcc/config/riscv/riscv.md | 5 -
>  1 file changed, 5 deletions(-)
>
> diff --git a/gcc/config/riscv/riscv.md b/gcc/config/riscv/riscv.md
> index 7988026d129..c4f8eb9488e 100644
> --- a/gcc/config/riscv/riscv.md
> +++ b/gcc/config/riscv/riscv.md
> @@ -1491,11 +1491,6 @@
> DONE;
>   }
>  }
> -  else
> -{
> -  emit_move_insn (operands[0], gen_rtx_AND (mode, operands[1], 
operands[2]));
> -  DONE;
> -}
>  })
>
>  (define_insn "*and3"

Unless I'm missing something, this will just result in no emitted
instructions for this "and" pattern?  That seems wrong, it would at
least have to put the source into the dest -- but
"arith_operand_or_mode_mask" can contain values that don't just result
in an extension (like arbitrary register values, for example), so I
think we need the "and" operation.

Does this pass the regression suite?

Either way, if this branch of the conditional can't trigger we should
tighten the constraint (or at a bare minimum add a comment as to why).


Re: [PATCH] RISC-V: optim const DF +0.0 store to mem [PR/110748]

2023-07-21 Thread Palmer Dabbelt

On Fri, 21 Jul 2023 10:55:52 PDT (-0700), Vineet Gupta wrote:

DF +0.0 is bitwise all zeros so int x0 store to mem can be used to optimize it.

void zd(double *d) { *d = 0.0; }

currently:

| fmv.d.x fa5,zero
| fsd fa5,0(a0)
| ret

With patch

| sd  zero,0(a0)
| ret

This came to light when testing the in-flight f-m-o patch, where an ICE
was getting triggered due to the lack of this pattern, but it turns out this
is an independent optimization in its own right [1]

[1] https://gcc.gnu.org/pipermail/gcc-patches/2023-July/624857.html

Apparently this is a regression in gcc-13, introduced by commit
ef85d150b5963 ("RISC-V: Enable TARGET_SUPPORTS_WIDE_INT") and the fix
thus is a partial revert of that change.


Given that it can ICE, we should probably backport it to 13.
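
The premise of the optimization, that DF +0.0 is bitwise all zeros, can be checked with a few lines of C. This sketch assumes `double` is IEEE 754 binary64 and 8 bytes wide; `double_bits` and `store_zero_as_int` are hypothetical helper names.

```c
#include <stdint.h>
#include <string.h>

/* Return the raw encoding of d (assumes an 8-byte binary64 double).  */
uint64_t double_bits(double d)
{
    uint64_t bits;
    memcpy(&bits, &d, sizeof bits);
    return bits;
}

/* Write an integer zero over the double.  The bytes that land in
   memory are the same ones "*p = 0.0" would store, which is what
   lets an integer store of x0 replace the FP store.  */
void store_zero_as_int(double *p)
{
    uint64_t zero = 0;
    memcpy(p, &zero, sizeof zero);
}
```

Note that this only holds for positive zero: `double_bits(-0.0)` has the sign bit set, so a store of -0.0 cannot take the same shortcut.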


Ran thru full multilib testsuite, there was 1 false failure due to


Did you run the test with autovec?  There's also a 
pmode_reg_or_0_operand, some of those don't appear protected from FP 
values.  So we might need something like


diff --git a/gcc/config/riscv/autovec.md b/gcc/config/riscv/autovec.md
index cd5b19457f8..d8ce9223343 100644
--- a/gcc/config/riscv/autovec.md
+++ b/gcc/config/riscv/autovec.md
@@ -63,7 +63,7 @@ (define_expand "movmisalign"

(define_expand "len_mask_gather_load"
  [(match_operand:VNX1_QHSD 0 "register_operand")
-   (match_operand 1 "pmode_reg_or_0_operand")
+   (match_operand:P 1 "pmode_reg_or_0_operand")
   (match_operand:VNX1_QHSDI 2 "register_operand")
   (match_operand 3 "")
   (match_operand 4 "")

a bunch of times, as there's a ton of them?  I'm not entirely sure if that
could manifest as an actual bug, though...


random string "lw" appearing in lto build assembler output,
which is also fixed in the patch.
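
The false failure is easy to reproduce in miniature. In the sketch below, plain substring search stands in for an unanchored scan-assembler pattern (the real directive uses regular expressions), and `asm_contains` is a hypothetical helper.

```c
#include <string.h>

/* Plain substring search, standing in for an unanchored
   scan-assembler pattern.  Returns 1 on a match, 0 otherwise.  */
int asm_contains(const char *asm_text, const char *needle)
{
    return strstr(asm_text, needle) != NULL;
}
```

Given the made-up assembler text `"\t.section\t.gnu.lto_foo.lwabc\n\tsd\tzero,0(a0)\n"`, `asm_contains(text, "lw")` returns 1 even though no `lw` instruction is present, while `asm_contains(text, "\tlw\t")` returns 0. Anchoring the mnemonic between tabs is what the xtheadfmv-fmv.c change in this patch does.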

gcc/ChangeLog:

* config/riscv/predicates.md (const_0_operand): Add back
  const_double.

gcc/testsuite/ChangeLog:

* gcc.target/riscv/pr110748-1.c: New Test.
* gcc.target/riscv/xtheadfmv-fmv.c: Add '\t' around test
  patterns to avoid random string matches.

Signed-off-by: Vineet Gupta 
---
 gcc/config/riscv/predicates.md |  2 +-
 gcc/testsuite/gcc.target/riscv/pr110748-1.c| 10 ++
 gcc/testsuite/gcc.target/riscv/xtheadfmv-fmv.c |  8 
 3 files changed, 15 insertions(+), 5 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/riscv/pr110748-1.c

diff --git a/gcc/config/riscv/predicates.md b/gcc/config/riscv/predicates.md
index 5a22c77f0cd0..9db28c2def7e 100644
--- a/gcc/config/riscv/predicates.md
+++ b/gcc/config/riscv/predicates.md
@@ -58,7 +58,7 @@
(match_test "INTVAL (op) + 1 != 0")))

 (define_predicate "const_0_operand"
-  (and (match_code "const_int,const_wide_int,const_vector")
+  (and (match_code "const_int,const_wide_int,const_double,const_vector")
(match_test "op == CONST0_RTX (GET_MODE (op))")))

 (define_predicate "const_1_operand"
diff --git a/gcc/testsuite/gcc.target/riscv/pr110748-1.c 
b/gcc/testsuite/gcc.target/riscv/pr110748-1.c
new file mode 100644
index ..2f5bc08aae72
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/pr110748-1.c
@@ -0,0 +1,10 @@
+/* { dg-do compile } */
+/* { dg-require-effective-target hard_float } */
+/* { dg-options "-march=rv64g -mabi=lp64d -O2" } */
+
+
+void zd(double *d) { *d = 0.0;  }
+void zf(float *f)  { *f = 0.0;  }
+
+/* { dg-final { scan-assembler-not "\tfmv\\.d\\.x\t" } } */
+/* { dg-final { scan-assembler-not "\tfmv\\.s\\.x\t" } } */


IIUC the pattern to emit fmv suffers from the same bug -- it's fixed in the same
way, but I think we might be able to come up with a test for it: `fmv.d.x FREG,
x0` would be the fastest way to generate 0.0, so maybe something like

   double sum(double *d) {
 double sum = 0;
 for (int i = 0; i < 8; ++i)
   sum += d[i];
 return sum;
   }

would do it?  That's generating the fmv on 13 for me, though, so maybe I'm
missing something?


diff --git a/gcc/testsuite/gcc.target/riscv/xtheadfmv-fmv.c 
b/gcc/testsuite/gcc.target/riscv/xtheadfmv-fmv.c
index 1036044291e7..89eb48bed1b9 100644
--- a/gcc/testsuite/gcc.target/riscv/xtheadfmv-fmv.c
+++ b/gcc/testsuite/gcc.target/riscv/xtheadfmv-fmv.c
@@ -18,7 +18,7 @@ d2ll (double d)
 /* { dg-final { scan-assembler "th.fmv.hw.x" } } */
 /* { dg-final { scan-assembler "fmv.x.w" } } */
 /* { dg-final { scan-assembler "th.fmv.x.hw" } } */
-/* { dg-final { scan-assembler-not "sw" } } */
-/* { dg-final { scan-assembler-not "fld" } } */
-/* { dg-final { scan-assembler-not "fsd" } } */
-/* { dg-final { scan-assembler-not "lw" } } */
+/* { dg-final { scan-assembler-not "\tsw\t" } } */
+/* { dg-final { scan-assembler-not "\tfld\t" } } */
+/* { dg-final { scan-assembler-not "\tfsd\t" } } */
+/* { dg-final { scan-assembler-not "\tlw\t" } } */


I think that autovec one is the only possible dependency that might have snuck
in, so we should be safe otherwise.  Thanks!

Reviewed-by: Palmer Dabbelt 


Re: [gcc13 backport 00/12] RISC-V: Implement ISA Manual Table A.6 Mappings

2023-07-25 Thread Palmer Dabbelt

On Tue, 25 Jul 2023 11:01:54 PDT (-0700), Patrick O'Neill wrote:

Discussed during the weekly RISC-V GCC meeting[1] and pre-approved by
Jeff Law.
If there aren't any objections I'll commit this cherry-picked series
on Thursday (July 27th).


+Jakub

According to the "GCC 13.1.1 Status Report (2023-07-20)", it looks like 
we're frozen for 13.2 and thus would need a release maintainer to sign 
off on anything we backport until 13.2 is released.


I'm not opposed to the backport, but it does look like we're down to no 
P1 regressions which means we might release very soon.  So we should at 
least make sure this gets through all the tests and such.  It's kind of 
splitting hairs as this is a pretty bad set of bugs we're fixing and 
distros are probably going to just backport it anyway, so not sure what 
the right answer is.



Patchset on trunk:
https://inbox.sourceware.org/gcc-patches/20230427162301.1151333-1-patr...@rivosinc.com/
First commit: f37a36bce81b50a43ec1613c1d08d803642f7506

Also includes bugfix from:
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109713
commit: 4bd434fbfc7865961a8e10d7e9601b28765ce7be

[1] 
https://inbox.sourceware.org/gcc/mhng-b7423fca-67ec-4ce4-9694-4e062632ceb0@palmer-ri-x1c9/T/#t

Martin Liska (1):
  riscv: fix error: control reaches end of non-void function

Patrick O'Neill (11):
  RISC-V: Eliminate SYNC memory models
  RISC-V: Enforce Libatomic LR/SC SEQ_CST
  RISC-V: Enforce subword atomic LR/SC SEQ_CST
  RISC-V: Enforce atomic compare_exchange SEQ_CST
  RISC-V: Add AMO release bits
  RISC-V: Strengthen atomic stores
  RISC-V: Eliminate AMO op fences
  RISC-V: Weaken LR/SC pairs
  RISC-V: Weaken mem_thread_fence
  RISC-V: Weaken atomic loads
  RISC-V: Table A.6 conformance tests
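
For orientation, the C11 operations whose lowering this series changes look like the following at the source level. This is a generic sketch with hypothetical names (`publish`, `try_consume`, `bump`); which RISC-V instruction sequence each operation maps to is defined by Table A.6 and checked by the tests in the series, and is not reproduced here.

```c
#include <stdatomic.h>

static int payload;
static atomic_int ready;

/* Producer: a plain store to the payload, then a release store that
   publishes it.  */
void publish(int value)
{
    payload = value;
    atomic_store_explicit(&ready, 1, memory_order_release);
}

/* Consumer: an acquire load of the flag.  If the flag is set, the
   payload write is visible.  Returns 1 and fills *out on success.  */
int try_consume(int *out)
{
    if (atomic_load_explicit(&ready, memory_order_acquire) == 0)
        return 0;
    *out = payload;
    return 1;
}

/* A seq_cst read-modify-write, the kind of operation the amo-add
   tests in the series exercise.  Returns the previous value.  */
int bump(atomic_int *counter)
{
    return atomic_fetch_add_explicit(counter, 1, memory_order_seq_cst);
}
```

Code like this is correct only if every toolchain that contributes objects to a program agrees on compatible mappings for the loads, stores, fences and AMOs involved, which is the motivation given for the series.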

 gcc/config/riscv/riscv-protos.h   |   3 +
 gcc/config/riscv/riscv.cc |  66 --
 gcc/config/riscv/sync.md  | 196 --
 .../riscv/amo-table-a-6-amo-add-1.c   |  15 ++
 .../riscv/amo-table-a-6-amo-add-2.c   |  15 ++
 .../riscv/amo-table-a-6-amo-add-3.c   |  15 ++
 .../riscv/amo-table-a-6-amo-add-4.c   |  15 ++
 .../riscv/amo-table-a-6-amo-add-5.c   |  15 ++
 .../riscv/amo-table-a-6-compare-exchange-1.c  |   9 +
 .../riscv/amo-table-a-6-compare-exchange-2.c  |   9 +
 .../riscv/amo-table-a-6-compare-exchange-3.c  |   9 +
 .../riscv/amo-table-a-6-compare-exchange-4.c  |   9 +
 .../riscv/amo-table-a-6-compare-exchange-5.c  |   9 +
 .../riscv/amo-table-a-6-compare-exchange-6.c  |  10 +
 .../riscv/amo-table-a-6-compare-exchange-7.c  |   9 +
 .../gcc.target/riscv/amo-table-a-6-fence-1.c  |  14 ++
 .../gcc.target/riscv/amo-table-a-6-fence-2.c  |  15 ++
 .../gcc.target/riscv/amo-table-a-6-fence-3.c  |  15 ++
 .../gcc.target/riscv/amo-table-a-6-fence-4.c  |  15 ++
 .../gcc.target/riscv/amo-table-a-6-fence-5.c  |  15 ++
 .../gcc.target/riscv/amo-table-a-6-load-1.c   |  16 ++
 .../gcc.target/riscv/amo-table-a-6-load-2.c   |  17 ++
 .../gcc.target/riscv/amo-table-a-6-load-3.c   |  18 ++
 .../gcc.target/riscv/amo-table-a-6-store-1.c  |  16 ++
 .../gcc.target/riscv/amo-table-a-6-store-2.c  |  17 ++
 .../riscv/amo-table-a-6-store-compat-3.c  |  18 ++
 .../riscv/amo-table-a-6-subword-amo-add-1.c   |   9 +
 .../riscv/amo-table-a-6-subword-amo-add-2.c   |   9 +
 .../riscv/amo-table-a-6-subword-amo-add-3.c   |   9 +
 .../riscv/amo-table-a-6-subword-amo-add-4.c   |   9 +
 .../riscv/amo-table-a-6-subword-amo-add-5.c   |   9 +
 gcc/testsuite/gcc.target/riscv/pr89835.c  |   9 +
 libgcc/config/riscv/atomic.c  |   4 +-
 33 files changed, 563 insertions(+), 75 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/riscv/amo-table-a-6-amo-add-1.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/amo-table-a-6-amo-add-2.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/amo-table-a-6-amo-add-3.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/amo-table-a-6-amo-add-4.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/amo-table-a-6-amo-add-5.c
 create mode 100644 
gcc/testsuite/gcc.target/riscv/amo-table-a-6-compare-exchange-1.c
 create mode 100644 
gcc/testsuite/gcc.target/riscv/amo-table-a-6-compare-exchange-2.c
 create mode 100644 
gcc/testsuite/gcc.target/riscv/amo-table-a-6-compare-exchange-3.c
 create mode 100644 
gcc/testsuite/gcc.target/riscv/amo-table-a-6-compare-exchange-4.c
 create mode 100644 
gcc/testsuite/gcc.target/riscv/amo-table-a-6-compare-exchange-5.c
 create mode 100644 
gcc/testsuite/gcc.target/riscv/amo-table-a-6-compare-exchange-6.c
 create mode 100644 
gcc/testsuite/gcc.target/riscv/amo-table-a-6-compare-exchange-7.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/amo-table-a-6-fence-1.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/amo-table-a-6-fence-2.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/amo-table-a-6-fence-3.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/amo-table-a-6-fence-4.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/amo-

Re: [gcc13 backport 00/12] RISC-V: Implement ISA Manual Table A.6 Mappings

2023-07-25 Thread Palmer Dabbelt

On Tue, 25 Jul 2023 12:50:48 PDT (-0700), ja...@redhat.com wrote:

On Tue, Jul 25, 2023 at 11:01:54AM -0700, Patrick O'Neill wrote:

Discussed during the weekly RISC-V GCC meeting[1] and pre-approved by
Jeff Law.
If there aren't any objections I'll commit this cherry-picked series
on Thursday (July 27th).


Please don't before 13.2 is released; the branch is frozen and none of
this seems to be a release blocker.


Sorry I missed this.  IMO it's fine to wait, this has been broken for 
5-10 years so we can wait another cycle ;)




Jakub


Re: [gcc13 backport 00/12] RISC-V: Implement ISA Manual Table A.6 Mappings

2023-07-25 Thread Palmer Dabbelt

On Tue, 25 Jul 2023 14:02:24 PDT (-0700), jeffreya...@gmail.com wrote:



On 7/25/23 13:50, Jakub Jelinek wrote:

On Tue, Jul 25, 2023 at 11:01:54AM -0700, Patrick O'Neill wrote:

Discussed during the weekly RISC-V GCC meeting[1] and pre-approved by
Jeff Law.
If there aren't any objections I'll commit this cherry-picked series
on Thursday (July 27th).


Please don't before 13.2 is released; the branch is frozen and none of
this seems to be a release blocker.

Ugh.  Missed the boat :(

I could make an argument for inclusion given the strong desire to have
compatible mappings across the toolchains and alignment with the RVI
specs -- but I won't.  As Palmer has indicated, it's been broken for a
while and we can manage that breakage.


I think if we just merge it right after 13.2 and indicate that distros 
doing long-term binary builds before 13.3 should backport the patches, we 
should be fine.  I think that's just Debian right now, so while it's an 
important set of bugs to get fixed it's just the single user.


It's certainly a bummer to miss 13.2, but we've just got ourselves to 
blame for forgetting about the backport ;)







jeff


Re: [PATCH] RISC-V: optim const DF +0.0 store to mem [PR/110748]

2023-07-25 Thread Palmer Dabbelt

On Fri, 21 Jul 2023 11:47:58 PDT (-0700), gcc-patches@gcc.gnu.org wrote:

On 7/21/23 12:31, Palmer Dabbelt wrote:

(define_expand "len_mask_gather_load"
   [(match_operand:VNX1_QHSD 0 "register_operand")
-   (match_operand 1 "pmode_reg_or_0_operand")
+   (match_operand:P 1 "pmode_reg_or_0_operand")
    (match_operand:VNX1_QHSDI 2 "register_operand")
    (match_operand 3 "")
    (match_operand 4 "")

a bunch of times, as there's a ton of them?  I'm not entirely sure if that
could manifest as an actual bug, though...

But won't this cause (const_int 0) to no longer match because CONST_INT
nodes are modeless (VOIDmode)?


I poked around a bit and I'm not actually sure, I'm kind of lost on the docs
here.  IIUC we're eliding the VOIDmode in the predicate correctly

   (define_predicate "const_0_operand"
 (and (match_code "const_int,const_wide_int,const_vector")
  (match_test "op == CONST0_RTX (GET_MODE (op))")))

so we're OK there, otherwise we'd presumably have similar problems with
expanders like

   (define_expand "subsi3"
 [(set (match_operand:SI   0 "register_operand" "= r")
  (minus:SI (match_operand:SI 1 "reg_or_0_operand" " rJ")
(match_operand:SI 2 "register_operand" "  r")))]
 ""

which we have a few of -- though it'd be kind of a silent failure, as
presumably we'd just end up with some more move-x0s emitted?
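
One way to reason about the question is a toy model of the implicit mode test that the GCC internals manual describes for predicates written with define_predicate: the test passes if the requested mode is VOIDmode, or the operand has that mode, or the operand is a CONST_INT or CONST_DOUBLE. Everything below (the `toy_` types and functions) is a hypothetical simplification, not GCC code, and whether each predicate in riscv's predicates.md actually gets this treatment is not checked here.

```c
enum toy_mode { TOY_VOIDmode, TOY_SImode, TOY_DImode, TOY_DFmode };
enum toy_code { TOY_REG, TOY_CONST_INT, TOY_CONST_DOUBLE };

struct toy_rtx {
    enum toy_code code;
    enum toy_mode mode;   /* CONST_INT nodes carry TOY_VOIDmode */
    long value;
};

/* Model of the documented implicit mode test.  */
int toy_mode_ok(const struct toy_rtx *op, enum toy_mode mode)
{
    return mode == TOY_VOIDmode
           || op->mode == mode
           || op->code == TOY_CONST_INT
           || op->code == TOY_CONST_DOUBLE;
}

/* Toy "register or integer zero" predicate built on that test.  */
int toy_reg_or_0_operand(const struct toy_rtx *op, enum toy_mode mode)
{
    if (!toy_mode_ok(op, mode))
        return 0;
    if (op->code == TOY_REG)
        return 1;
    return op->code == TOY_CONST_INT && op->value == 0;
}
```

In this model a modeless `(const_int 0)` still matches when a specific mode such as `TOY_DImode` is requested, while a `TOY_DFmode` register is rejected, which is the behavior the `:P` change is hoping for.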


Re: Re: [PATCH v7] RISC-V: Support CALL for RVV floating-point dynamic rounding

2023-07-26 Thread Palmer Dabbelt

On Wed, 26 Jul 2023 08:34:14 PDT (-0700), gcc-patches@gcc.gnu.org wrote:

I would say LCM/PRE is the key to this set of static rounding mode
intrinsics.  Otherwise I think it will push people toward using dynamic
rounding with fesetround or inline asm to set the rounding mode for
performance reasons, which is the opposite of the design concept: we want to
provide a reliable, performant way to control the rounding mode precisely.

The function call issue could in theory be resolved by the fenv_access
pragma, since it can act as an annotation telling the compiler whether a
function modifies fenv, but unfortunately it's not well modeled within GCC
yet, so we must be conservative to make sure we don't break anything.

And also the LLVM side is trying to implement some simple LCM/PRE to
optimize that, so I believe we need LCM/PRE based mode switching to do that.


IMO that's a perfectly reasonable way to start: let's just get something 
that's correct and simple, if we need to do more complicated stuff later 
we can always add it.


There's going to be a very small amount of this code written by a very 
small number of people (that are likely very close to the compiler teams 
doing the optimizations here), so we can just all work with each other 
to sort out any important performance issues as we go.


I think whether LCM or entry/exit performs better is probably just going 
to boil down to some uarch/workload specific decisions, so as long as 
whatever we have is correct and reasonably simple it seems fine for now.  
Given how little of this code there's going to be it's probably not 
worth spending a ton of time on things until we have a concrete use case 
to drive things.


Let's just make sure to also update the intrinsic spec to get rid of the 
grey area here, that way we can point to something if we want to 
optimize differently in the future.
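
The trade-off being debated can be pictured with a toy model that counts writes to a simulated FRM register. None of this is GCC code: `struct frm_state`, `write_frm`, `per_op` and `hoisted` are hypothetical names, and the model ignores calls, which are exactly what makes the real placement problem hard.

```c
/* Simulated FRM CSR plus a counter of how often it was written.  */
struct frm_state {
    unsigned frm;
    unsigned writes;
};

void write_frm(struct frm_state *s, unsigned mode)
{
    s->frm = mode;
    s->writes++;
}

/* Strategy A: back up and restore around every statically rounded
   operation.  Simple and obviously correct: 2 writes per operation.  */
void per_op(struct frm_state *s, unsigned mode, int n_ops)
{
    for (int i = 0; i < n_ops; ++i) {
        unsigned saved = s->frm;
        write_frm(s, mode);
        /* ... the rounded operation itself would go here ... */
        write_frm(s, saved);
    }
}

/* Strategy B: what LCM/PRE style placement aims for when nothing in
   between can observe or change FRM: one switch, one restore.  */
void hoisted(struct frm_state *s, unsigned mode, int n_ops)
{
    unsigned saved = s->frm;
    write_frm(s, mode);
    for (int i = 0; i < n_ops; ++i) {
        /* ... the rounded operations would go here ... */
    }
    write_frm(s, saved);
}
```

For eight operations, strategy A performs 16 CSR writes and strategy B performs 2, and both leave FRM as they found it. How much that difference matters in practice depends on the cost of a CSR write on a given implementation and on how much of this code exists.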



Li, Pan2 wrote on Wed, Jul 26, 2023 at 22:31:


As Juzhe mentioned, the problem with CALL is resolved by LCM/PRE, similar
to the VSETVL pass, which is well proven up to a point.

I would like to propose that we stay focused and move forward with this
patch itself, since the remaining RVV floating-point API support and the
full RVV intrinsic API tests depend on it.

Of course, I am working on PATCH v8, and thanks again for Robin's comments.



Pan



*From:* 钟居哲 
*Sent:* Wednesday, July 26, 2023 10:18 PM
*To:* rdapp.gcc ; Li, Pan2 
*Cc:* rdapp.gcc ; kito.cheng ;
gcc-patches ; Wang, Yanzhang <
yanzhang.w...@intel.com>
*Subject:* Re: Re: [PATCH v7] RISC-V: Support CALL for RVV floating-point
dynamic rounding



Explicitly back up and restore for each intrinsic, just the same as we did
for CALL in this patch.

I don't have data to prove how well LCM/PRE mode switching performs, but I
trust it, since LCM/PRE is the key optimization method of the VSETVL pass,
which is doing a good job on VSETVL instruction optimizations.

I don't think we should give up the LCM/PRE opportunity and just back up and
restore blindly for each intrinsic.




--

juzhe.zh...@rivai.ai



*From:* Robin Dapp 

*Date:* 2023-07-26 21:46

*To:* juzhe.zhong ; Li, Pan2 

*CC:* rdapp.gcc ; Kito Cheng ;
gcc-patches@gcc.gnu.org; Wang, Yanzhang 

*Subject:* Re: [PATCH v7] RISC-V: Support CALL for RVV floating-point
dynamic rounding

> current llvm didn't do any pre optimization.  They always

> backup+restore for each rounding mode intrinsic



I see.  There is still the option of lazily restoring the
(entry) FRM before a function call but not reading the FRM
after every call.  Do we have any data on how well or badly the
mode-switching LCM works when we explicitly back up and restore
for each intrinsic?

Regards
Robin






[RFC] RISC-V: Add support for RV64E/lp64e

2022-07-12 Thread Palmer Dabbelt
gcc/ChangeLog

* config.gcc (riscv): Accept rv64e and lp64e.
* config/riscv/arch-canonicalize: Likewise.
* config/riscv/riscv-c.cc (riscv_cpu_cpp_builtins): Likewise.
* config/riscv/riscv-opts.h (riscv_abi_type): Likewise.
* config/riscv/riscv.cc (riscv_option_override): Likewise
* config/riscv/riscv.h (UNITS_PER_FP_ARG): Likewise.
(STACK_BOUNDARY): Likewise.
(ABI_STACK_BOUNDARY): Likewise.
(MAX_ARGS_IN_REGISTERS): Likewise.
(ABI_SPEC): Likewise.
* config/riscv/riscv.opt (abi_type): Likewise.
* doc/invoke.texi (RISC-V) <-mabi>: Likewise.
---
This is all still in flight, but evidently RV64E exists.  I haven't
tested this at all, but given that we don't even have the ABI docs lined
up yet it's likely a bit away from being mergeable.
---
 gcc/config.gcc |  8 +---
 gcc/config/riscv/arch-canonicalize |  2 +-
 gcc/config/riscv/riscv-c.cc|  1 +
 gcc/config/riscv/riscv-opts.h  |  1 +
 gcc/config/riscv/riscv.cc  |  6 --
 gcc/config/riscv/riscv.h   | 11 +++
 gcc/config/riscv/riscv.opt |  3 +++
 gcc/doc/invoke.texi|  5 +++--
 8 files changed, 25 insertions(+), 12 deletions(-)

diff --git a/gcc/config.gcc b/gcc/config.gcc
index 4e3b15bb5e9..4617ecb8d9b 100644
--- a/gcc/config.gcc
+++ b/gcc/config.gcc
@@ -4637,7 +4637,7 @@ case "${target}" in
 
# Infer arch from --with-arch, --target, and --with-abi.
case "${with_arch}" in
-   rv32e* | rv32i* | rv32g* | rv64i* | rv64g*)
+   rv32e* | rv32i* | rv32g* | rv64e* | rv64i* | rv64g*)
# OK.
;;
"")
@@ -4645,12 +4645,13 @@ case "${target}" in
case "${with_abi}" in
ilp32e) with_arch="rv32e" ;;
ilp32 | ilp32f | ilp32d) with_arch="rv32gc" ;;
+   lp64e) with_arch="rv64e" ;;
lp64 | lp64f | lp64d) with_arch="rv64gc" ;;
*) with_arch="rv${xlen}gc" ;;
esac
;;
*)
-   echo "--with-arch=${with_arch} is not supported.  The argument must begin with rv32e, rv32i, rv32g, rv64i, or rv64g." 1>&2
+   echo "--with-arch=${with_arch} is not supported.  The argument must begin with rv32e, rv32i, rv32g, rv64e, rv64i, or rv64g." 1>&2
exit 1
;;
esac
@@ -4672,6 +4673,7 @@ case "${target}" in
rv32e*) with_abi=ilp32e ;;
rv32*) with_abi=ilp32 ;;
rv64*d* | rv64g*) with_abi=lp64d ;;
+   rv64e*) with_abi=lp64e ;;
rv64*) with_abi=lp64 ;;
esac
;;
@@ -4687,7 +4689,7 @@ case "${target}" in
ilp32,rv32* | ilp32e,rv32e* \
| ilp32f,rv32*f* | ilp32f,rv32g* \
| ilp32d,rv32*d* | ilp32d,rv32g* \
-   | lp64,rv64* \
+   | lp64,rv64* | lp64e,rv64e* \
| lp64f,rv64*f* | lp64f,rv64g* \
| lp64d,rv64*d* | lp64d,rv64g*)
;;
diff --git a/gcc/config/riscv/arch-canonicalize b/gcc/config/riscv/arch-canonicalize
index fd7651ac491..8db3e88ddd7 100755
--- a/gcc/config/riscv/arch-canonicalize
+++ b/gcc/config/riscv/arch-canonicalize
@@ -71,7 +71,7 @@ def arch_canonicalize(arch, isa_spec):
   new_arch = ""
   extra_long_ext = []
   std_exts = []
-  if arch[:5] in ['rv32e', 'rv32i', 'rv32g', 'rv64i', 'rv64g']:
+  if arch[:5] in ['rv32e', 'rv32i', 'rv32g', 'rv64e', 'rv64i', 'rv64g']:
 new_arch = arch[:5].replace("g", "i")
 if arch[:5] in ['rv32g', 'rv64g']:
   std_exts = ['m', 'a', 'f', 'd']
diff --git a/gcc/config/riscv/riscv-c.cc b/gcc/config/riscv/riscv-c.cc
index eb7ef09297e..4614dc6b6d9 100644
--- a/gcc/config/riscv/riscv-c.cc
+++ b/gcc/config/riscv/riscv-c.cc
@@ -67,6 +67,7 @@ riscv_cpu_cpp_builtins (cpp_reader *pfile)
   switch (riscv_abi)
 {
 case ABI_ILP32E:
+case ABI_LP64E:
   builtin_define ("__riscv_abi_rve");
   gcc_fallthrough ();
 
diff --git a/gcc/config/riscv/riscv-opts.h b/gcc/config/riscv/riscv-opts.h
index 1e153b3a6e7..70fe708cbae 100644
--- a/gcc/config/riscv/riscv-opts.h
+++ b/gcc/config/riscv/riscv-opts.h
@@ -27,6 +27,7 @@ enum riscv_abi_type {
   ABI_ILP32F,
   ABI_ILP32D,
   ABI_LP64,
+  ABI_LP64E,
   ABI_LP64F,
   ABI_LP64D
 };
diff --git a/gcc/config/riscv/riscv.cc b/gcc/config/riscv/riscv.cc
index 2e83ca07394..51b7195c17b 100644
--- a/gcc/config/riscv/riscv.cc
+++ b/gcc/config/riscv/riscv.cc
@@ -5047,8 +5047,10 @@ riscv_option_override (void)
 error ("requested ABI requires %<-march%> to subsume the %qc extension",
   UNITS_PER_FP_ARG > 8 ? 'Q' : (UNITS_PER_FP_ARG > 4 ? 'D' : 'F'));
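
For anyone skimming the config.gcc hunks, the following is a rough Python
model of the two case statements as they read with this patch applied.  It
is a sketch only: it covers just the patterns visible in the hunks above
(the real file may have more arms), and default_abi and abi_matches_arch
are made-up names.  Shell case patterns are approximated with
fnmatch.fnmatchcase.

```python
from fnmatch import fnmatchcase

# (arch pattern, default ABI), in the order shown in the hunk.
# As in a shell case statement, the first matching pattern wins.
DEFAULT_ABI_RULES = [
    ("rv32e*", "ilp32e"),
    ("rv32*", "ilp32"),
    ("rv64*d*", "lp64d"),
    ("rv64g*", "lp64d"),
    ("rv64e*", "lp64e"),   # new with this patch
    ("rv64*", "lp64"),
]

# (ABI, arch pattern) pairs accepted by the compatibility check.
COMPATIBLE = [
    ("ilp32", "rv32*"), ("ilp32e", "rv32e*"),
    ("ilp32f", "rv32*f*"), ("ilp32f", "rv32g*"),
    ("ilp32d", "rv32*d*"), ("ilp32d", "rv32g*"),
    ("lp64", "rv64*"), ("lp64e", "rv64e*"),   # lp64e is new
    ("lp64f", "rv64*f*"), ("lp64f", "rv64g*"),
    ("lp64d", "rv64*d*"), ("lp64d", "rv64g*"),
]


def default_abi(arch):
    """Return the ABI inferred from --with-arch, or None if nothing matches."""
    for pattern, abi in DEFAULT_ABI_RULES:
        if fnmatchcase(arch, pattern):
            return abi
    return None


def abi_matches_arch(abi, arch):
    """Return True if the (ABI, arch) pair passes the compatibility check."""
    return any(abi == a and fnmatchcase(arch, p) for a, p in COMPATIBLE)


print(default_abi("rv64e"), default_abi("rv64imac"), default_abi("rv64gc"))
```

Note that the rv64e* arm has to sit before the catch-all rv64* arm, since
otherwise rv64e would fall through to lp64.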
 

Re: [RFC] RISC-V: Add support for RV64E/lp64e

2022-07-14 Thread Palmer Dabbelt

On Tue, 12 Jul 2022 22:09:53 PDT (-0700), Palmer Dabbelt wrote:


Re: [PATCH] Revert "[PATCH] RISC-V: Use new linker emulations for glibc ABI."

2022-07-14 Thread Palmer Dabbelt

On Mon, 20 Jun 2022 20:48:50 PDT (-0700), gcc-patches@gcc.gnu.org wrote:

On Mon, Jun 20, 2022 at 1:21 AM Kito Cheng  wrote:


Generally I agree we should fix that by GCC driver rather than ld
emulation, but I think this should be reverted with the -L path fix,
otherwise that will break multilib on GNU toolchain for linux
immediately?


IIUC just changing this will break even non-multilib systems, though it's 
possible that the symlinks work around that sufficiently.



Thanks for the good consideration. That said, I am unsure any distro
uses this currently.
I think some just work around the possibly non-existent paths by
creating symlinks.
Perhaps we should prioritize on fixing the scheme before distros start
to rely on the behavior.


I'm kind of torn on this one: this has been around for a while and 
dropping it would be an ABI break, but the feedback from distro folks is 
pretty consistently that multilib is broken on RISC-V.  If it's really 
unusably broken then I could buy the argument that there's no binaries 
(and thus no ABI to break), but there's at least some base multilib 
functionality working -- I build multilib cross toolchains regularly, 
for example, and they can build simple stuff.


I always find making that "nobody's used it" argument really hard, 
there's just too many users to try and track everyone down.  We're in 
kind of a weird spot with RISC-V in general when it comes to ABI stuff: 
we were probably a bit overly optimistic about how fast any of this was 
going to get used when we committed to the ABI freeze, but any ABI break 
has such a huge potential for user headaches that I'm not sure it's 
going to be possible.  It means we're stuck with some baggage, and while 
it's a headache to keep around stuff that's probably not all that useful 
I think it's just what we've got to live with.


If multilib really is so broken it's not fixable without an ABI break 
then I guess there's no other option, but I think in this case we have 
some:


One option would be to add an ld argument that says to turn off the 
emulation-specific path resolution, which we could then add to LINK_SPEC 
when we get the library paths sorted out?  We'd still have the 
emulations and the subdirs, but at least we wouldn't need a flag day.


Another option would be to add new multilib paths that don't have the 
subdirectories, as last I checked that was an issue for distros 
(violates FHS, breaks build systems, etc).  If we're going to do that 
anyway then we could piggyback the new behavior on it and deprecate the old 
paths along with whatever behavior is associated with them.



On Wed, Jun 15, 2022 at 4:00 PM Fangrui Song via Gcc-patches
 wrote:
>
> This reverts commit 37d57ac9a636f2235f9060e84fb8dd7968abd1dc.
>
> The resolution to https://sourceware.org/bugzilla/show_bug.cgi?id=22962
> let GCC pass -m emulation to ld and let the ld emulation configure
> default library paths.  This scheme is problematic:
>
> * It's not ld's business to specify default -L.  Different platforms have
> different opinions on the hierarchy and all other arches work well without
> ld's default -L.
> * If some ABI derived library paths are desired, the compiler driver is in a
> better position to make the decision and traditionally has done this.
> * -m emulation is opaque to the compiler driver.  It doesn't affect -B, so
> data files like crt*.o, libasan_preinit.o, and libtsan_preinit.o are not
> affected.
>
> As is, many platforms just use symlinks to fake the lib64/{ilp32{,f},lp64{,f}}
> hierarchies needed by the GNU ld emulation.  They can always specify -L
> explicitly if they want some ABI derived library paths.  See also the rejected
> https://reviews.llvm.org/D95755


I don't do a lot of LLVM stuff, but that has a green check mark that 
says "accepted" at the top.  Does that mean it was merged somewhere, or 
just that it was acked/reviewed and then dropped?



>
> gcc/Changelog:
>
> * config/riscv/linux.h (LD_EMUL_SUFFIX): Remove.
> (LINK_SPEC): Remove LD_EMUL_SUFFIX.
> ---
>  gcc/config/riscv/linux.h | 10 +-
>  1 file changed, 1 insertion(+), 9 deletions(-)
>
> diff --git a/gcc/config/riscv/linux.h b/gcc/config/riscv/linux.h
> index 38803723ba9..e0ff6e6a178 100644
> --- a/gcc/config/riscv/linux.h
> +++ b/gcc/config/riscv/linux.h
> @@ -49,16 +49,8 @@ along with GCC; see the file COPYING3.  If not see
>
>  #define CPP_SPEC "%{pthread:-D_REENTRANT}"
>
> -#define LD_EMUL_SUFFIX \
> -  "%{mabi=lp64d:}" \
> -  "%{mabi=lp64f:_lp64f}" \
> -  "%{mabi=lp64:_lp64}" \
> -  "%{mabi=ilp32d:}" \
> -  "%{mabi=ilp32f:_ilp32f}" \
> -  "%{mabi=ilp32:_ilp32}"
> -
>  #define LINK_SPEC "\
> --melf" XLEN_SPEC DEFAULT_ENDIAN_SPEC "riscv" LD_EMUL_SUFFIX " \
> +-melf" XLEN_SPEC DEFAULT_ENDIAN_SPEC "riscv \
>  %{mno-relax:--no-relax} \
>  %{mbig-endian:-EB} \
>  %{mlittle-endian:-EL} \
> --
> 2.36.1.476.g0c4daa206d-goog
>




--
宋方睿
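
As a reading aid for the LINK_SPEC change quoted above, this sketch shows
how the -m emulation name is assembled before and after the revert.  The
suffix table is copied from the removed LD_EMUL_SUFFIX macro.  Everything
else is an assumption made for illustration: the function name, deriving
XLEN from the ABI name (the real XLEN_SPEC comes from -march), and
DEFAULT_ENDIAN_SPEC expanding to "l" or "b".

```python
# Copied from the removed LD_EMUL_SUFFIX: the double-float ABIs get no suffix.
LD_EMUL_SUFFIX = {
    "lp64d": "",
    "lp64f": "_lp64f",
    "lp64": "_lp64",
    "ilp32d": "",
    "ilp32f": "_ilp32f",
    "ilp32": "_ilp32",
}


def emulation(abi, default_big_endian=False, with_suffix=True):
    """Build the name passed to ld via -m for a given -mabi value.

    with_suffix=True models the behavior being reverted;
    with_suffix=False models LINK_SPEC after the revert.
    """
    xlen = "64" if abi.startswith("lp64") else "32"   # simplification
    endian = "b" if default_big_endian else "l"
    suffix = LD_EMUL_SUFFIX.get(abi, "") if with_suffix else ""
    return f"elf{xlen}{endian}riscv{suffix}"


for abi in ("lp64d", "lp64f", "ilp32"):
    print(abi, emulation(abi), emulation(abi, with_suffix=False))
```

With the suffix, -mabi=lp64f selects a different emulation (and with it
different default library paths) than -mabi=lp64d.  Without it, every ABI
of a given XLEN shares one emulation and path selection is left to the
driver.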


Re: Supporting RISC-V Vendor Extensions in the GNU Toolchain

2022-07-20 Thread Palmer Dabbelt

On Tue, 10 May 2022 17:01:26 PDT (-0700), Palmer Dabbelt wrote:

[Sorry for cross-posting to a bunch of lists, I figured it'd be best to
have all the discussions in one thread.]

We currently only support what is defined by official RISC-V
specifications in the various GNU toolchain projects.  There's certainly
some grey areas there, but in general that means not taking code that
relies on drafts or vendor defined extensions, even if that would result
in higher performance or more featured systems for users.

The original goal of these policies was to steer RISC-V implementers
towards a common set of specifications, but over the last year or so
it's become abundantly clear that this is causing more harm than good.
All extant RISC-V systems rely on behaviors defined outside the official
specifications, and while that's technically always been the case we've
gotten to the point where trying to ignore that fact is impacting real
users on real systems.  There's been consistent feedback from users that
we're not meeting their needs, which can clearly be seen in the many out
of tree patch sets in common use.

There's been a handful of discussions about this, but we've yet to have
a proper discussion on the mailing lists.  From the various discussions
I've had it seems that folks are broadly in favor of supporting vendor
extensions, but the devil's always in the details with this sort of
thing so I thought it'd be best to write something up so we can have a
concrete discussion.

The idea is to start taking code that depends on vendor-defined behavior
into the core GNU toolchain ports, as long as it meets the following
criteria:

* An ISA manual is available that can be redistributed/archived, defines
  the behaviors in question as one or more vendor-specific extensions,
  and is clearly versioned.  The RISC-V foundation is setting various
  guidelines around how vendor-defined extensions and instructions
  should be named, we strongly suggest that vendors follow those
  conventions whenever possible (this is all new, though, so exactly
  what's necessary from vendor specifications will likely evolve as we
  learn).
* There is a substantial user base that depends on the behavior in
  question, which probably means there is hardware in the wild that
  implements the extensions and users that require those extensions in
  order for that hardware to be useful for common applications.  This is
  always going to be a grey area, but it's essentially the same spot
  everyone else is in.
* There is a mechanism for testing the code in question without direct
  access to hardware, which in practice means a QEMU port (or whatever
  simulator is relevant in the space and that folks use for testing) or
  some community commitment to long-term availability of the hardware
  for testing (something like the GCC compile farm, for example).
* It is possible to produce binaries that are compatible with all
  upstream vendors' implementations.  That means we'll need mechanisms
  to allow extensions from multiple vendors to be linked together and
  then probed at runtime.  That's not to say that all binaries will be
  compatible, as users are always free to skip the compatibility code
  and there will be conflicting definitions of instruction encodings,
  but we can at least provide users with the option of compatibility.

These are pretty loosely written on purpose, both because this is all
new and because each project has its own set of contribution
requirements so it's going to be all but impossible to have a single
concrete set of rules that applies everywhere -- that's nothing specific
to the vendor extensions (or even RISC-V), it's just life.  Specifically
a major goal here is to balance the needs of users, both in the short
term (ie, getting new hardware to work) and the long term (ie, the long
term stability of their software).  We're not talking about taking code
that can't be tested, hasn't been reviewed, isn't going to be supported
long-term, or doesn't have a stable ABI; just dropping the specific
requirement that a specification must be furnished by the RISC-V
foundation in order to accept code.

Nothing is decided yet, so happy to hear any thoughts folks have.  This
is certainly a very different development methodology than what we've
done in the past and isn't something that should be entered into
lightly, so any comments are welcome.


I'm going back to the start of the thread as this led to some heated 
discussion, both here and in private.  Clearly there's lots of opinions 
here and everyone wants something different, but the nature of 
compromise is that nobody gets exactly what they want and it looks like 
this is as good as we're going to get any time soon.  So I'm going to 
propose that we go with this.


This was all purposefully a bit vague so we'

Re: [PATCH 1/1 V5] RISC-V: Support Zmmul extension

2022-07-21 Thread Palmer Dabbelt

On Thu, 21 Jul 2022 02:03:35 PDT (-0700), gcc-patches@gcc.gnu.org wrote:

LGTM, will merge once the binutils part is merged.


+Nelson, in case he's already planning on handling those.  If not then 
they're not in my inbox, so just poke me if you want me to review them.


Also some comments on the patch below.



On Wed, Jul 13, 2022 at 10:14 AM  wrote:


From: LiaoShihua 

gcc/ChangeLog:

* common/config/riscv/riscv-common.cc: Add Zmmul.
* config/riscv/riscv-opts.h (MASK_ZMMUL): New.
(TARGET_ZMMUL): Ditto.
* config/riscv/riscv.cc (riscv_option_override):Ditto.
* config/riscv/riscv.md: Add Zmmul
* config/riscv/riscv.opt: Ditto.

gcc/testsuite/ChangeLog:

* gcc.target/riscv/zmmul-1.c: New test.
* gcc.target/riscv/zmmul-2.c: New test.

---
 gcc/common/config/riscv/riscv-common.cc  |  3 +++
 gcc/config/riscv/riscv-opts.h|  3 +++
 gcc/config/riscv/riscv.cc|  8 +--
 gcc/config/riscv/riscv.md| 28 
 gcc/config/riscv/riscv.opt   |  3 +++
 gcc/testsuite/gcc.target/riscv/zmmul-1.c | 20 +
 gcc/testsuite/gcc.target/riscv/zmmul-2.c | 20 +
 7 files changed, 69 insertions(+), 16 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/riscv/zmmul-1.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/zmmul-2.c

diff --git a/gcc/common/config/riscv/riscv-common.cc b/gcc/common/config/riscv/riscv-common.cc
index 0e5be2ce105..20acc590b30 100644
--- a/gcc/common/config/riscv/riscv-common.cc
+++ b/gcc/common/config/riscv/riscv-common.cc
@@ -193,6 +193,8 @@ static const struct riscv_ext_version riscv_ext_version_table[] =
   {"zvl32768b", ISA_SPEC_CLASS_NONE, 1, 0},
   {"zvl65536b", ISA_SPEC_CLASS_NONE, 1, 0},

+  {"zmmul", ISA_SPEC_CLASS_NONE, 1, 0},
+
   /* Terminate the list.  */
   {NULL, ISA_SPEC_CLASS_NONE, 0, 0}
 };
@@ -1148,6 +1150,7 @@ static const riscv_ext_flag_table_t riscv_ext_flag_table[] =
   {"zvl32768b", &gcc_options::x_riscv_zvl_flags, MASK_ZVL32768B},
   {"zvl65536b", &gcc_options::x_riscv_zvl_flags, MASK_ZVL65536B},

+  {"zmmul", &gcc_options::x_riscv_zm_subext, MASK_ZMMUL},

   {NULL, NULL, 0}
 };
diff --git a/gcc/config/riscv/riscv-opts.h b/gcc/config/riscv/riscv-opts.h
index 1e153b3a6e7..9c7d69a6ea3 100644
--- a/gcc/config/riscv/riscv-opts.h
+++ b/gcc/config/riscv/riscv-opts.h
@@ -153,6 +153,9 @@ enum stack_protector_guard {
 #define TARGET_ZICBOM ((riscv_zicmo_subext & MASK_ZICBOM) != 0)
 #define TARGET_ZICBOP ((riscv_zicmo_subext & MASK_ZICBOP) != 0)

+#define MASK_ZMMUL  (1 << 0)
+#define TARGET_ZMMUL((riscv_zm_subext & MASK_ZMMUL) != 0)
+
 /* Bit of riscv_zvl_flags will set contintuly, N-1 bit will set if N-bit is
set, e.g. MASK_ZVL64B has set then MASK_ZVL32B is set, so we can use
popcount to caclulate the minimal VLEN.  */
diff --git a/gcc/config/riscv/riscv.cc b/gcc/config/riscv/riscv.cc
index 2e83ca07394..9ad4181f35f 100644
--- a/gcc/config/riscv/riscv.cc
+++ b/gcc/config/riscv/riscv.cc
@@ -4999,10 +4999,14 @@ riscv_option_override (void)
   /* The presence of the M extension implies that division instructions
  are present, so include them unless explicitly disabled.  */
   if (TARGET_MUL && (target_flags_explicit & MASK_DIV) == 0)
-target_flags |= MASK_DIV;
+if(!TARGET_ZMMUL)
+  target_flags |= MASK_DIV;


Not sure if I'm missing something here, but that doesn't look right: it 
would mean that "-march=rv32im_zmmul" ends up without divide 
instructions.  I think it's fine to just leave this as it was, we're not 
setting TARGET_MUL from "-march...zmmul...", so this should all be OK.



   else if (!TARGET_MUL && TARGET_DIV)
 error ("%<-mdiv%> requires %<-march%> to subsume the %<M%> extension");
-
+
+  if(TARGET_ZMMUL && !TARGET_MUL && TARGET_DIV)
+warning (0, "%<-mdiv%> cannot be used when % extension is present");


That should already be getting caught by the check above, but even so 
it's not quite the right error: "-march=rv32im_zmmul -mdiv" is fine, 
it's just something like "-march=rv32i_zmmul -mdiv" that's the problem.
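
To spell out the cases in the two comments above, here is a small model
of the DIV handling.  It is a sketch, not the GCC code: parse_march is a
hypothetical helper that only understands strings like "rv32im_zmmul",
mdiv=None means neither -mdiv nor -mno-div was given, and patched=True
models only the DIV-enabling part of the proposed hunk.

```python
def parse_march(march):
    """Hypothetical helper: return (has_m, has_zmmul) for e.g. 'rv32im_zmmul'."""
    base, *multi = march.split("_")
    letters = base[4:]                        # drop the 'rv32' / 'rv64' prefix
    has_m = "m" in letters or "g" in letters  # 'g' implies M
    return has_m, "zmmul" in multi


def resolve_div(march, mdiv=None, patched=False):
    """Return (div_enabled, error_message_or_None)."""
    has_m, has_zmmul = parse_march(march)
    div = bool(mdiv)
    error = None
    if has_m and mdiv is None:
        # M implies divide unless the user said otherwise.  The proposed
        # hunk additionally skips this whenever Zmmul is present.
        if not (patched and has_zmmul):
            div = True
    elif not has_m and div:
        error = "-mdiv requires -march to subsume the M extension"
    return div, error


print(resolve_div("rv32im_zmmul"))                # existing logic
print(resolve_div("rv32im_zmmul", patched=True))  # proposed hunk
print(resolve_div("rv32i_zmmul", mdiv=True))      # the genuine error case
```

With the existing logic rv32im_zmmul keeps its divide instructions and
rv32i_zmmul with -mdiv is already rejected.  With the proposed hunk
rv32im_zmmul silently loses divide, which is the problem described above.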



+
   /* Likewise floating-point division and square root.  */
   if (TARGET_HARD_FLOAT && (target_flags_explicit & MASK_FDIV) == 0)
 target_flags |= MASK_FDIV;
diff --git a/gcc/config/riscv/riscv.md b/gcc/config/riscv/riscv.md
index 308b64dd30d..d4e171464ea 100644
--- a/gcc/config/riscv/riscv.md
+++ b/gcc/config/riscv/riscv.md
@@ -763,7 +763,7 @@
   [(set (match_operand:SI  0 "register_operand" "=r")
(mult:SI (match_operand:SI 1 "register_operand" " r")
 (match_operand:SI 2 "register_operand" " r")))]
-  "TARGET_MUL"
+  "TARGET_ZMMUL || TARGET_MUL"
   { return TARGET_64BIT ? "mulw\t%0,%1,%2" : "mul\t%0,%1,%2"; }
   [(set_attr "type" "imul")
(set_attr "mode" "SI")])
@@ -772,7 +772,7 @@
   [(set (match_operand:DI  0 "register_operand" "=r")
(mult:DI (match_operand:DI 1 "register_operand" " r")
   

[PATCH] RISC-V: Use the X iterator for eh_set_lr_{si,di}

2022-08-06 Thread Palmer Dabbelt
These two patterns were independent, but exactly match the semantics of
X.  Replace them with a single parameterized pattern.  Thanks to Andrew
for pointing this one out over IRC.

gcc/ChangeLog

* config/riscv/riscv.md (eh_set_lr_<mode>): New pattern.
(eh_set_lr_si): Remove.
(eh_set_lr_di): Likewise.
---
No new failures on the Linux multilibs on trunk.
---
 gcc/config/riscv/riscv.md | 14 --
 1 file changed, 4 insertions(+), 10 deletions(-)

diff --git a/gcc/config/riscv/riscv.md b/gcc/config/riscv/riscv.md
index 0796f91dd30..11a59f98a9f 100644
--- a/gcc/config/riscv/riscv.md
+++ b/gcc/config/riscv/riscv.md
@@ -2562,16 +2562,10 @@
 ;; Clobber the return address on the stack.  We can't expand this
 ;; until we know where it will be put in the stack frame.
 
-(define_insn "eh_set_lr_si"
-  [(unspec [(match_operand:SI 0 "register_operand" "r")] UNSPEC_EH_RETURN)
-   (clobber (match_scratch:SI 1 "=&r"))]
-  "! TARGET_64BIT"
-  "#")
-
-(define_insn "eh_set_lr_di"
-  [(unspec [(match_operand:DI 0 "register_operand" "r")] UNSPEC_EH_RETURN)
-   (clobber (match_scratch:DI 1 "=&r"))]
-  "TARGET_64BIT"
+(define_insn "eh_set_lr_<mode>"
+  [(unspec [(match_operand:X 0 "register_operand" "r")] UNSPEC_EH_RETURN)
+   (clobber (match_scratch:X 1 "=&r"))]
+  ""
   "#")
 
 (define_split
-- 
2.34.1
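
For readers less familiar with mode iterators, this sketch shows why one
pattern over X can stand in for the two removed ones.  It assumes X is
defined as SI under "!TARGET_64BIT" and DI under "TARGET_64BIT", which is
what the conditions on the removed patterns imply, and it writes the new
pattern name as eh_set_lr_<mode> on the assumption that it carries the
usual <mode> suffix.  The expand function is a made-up stand-in for what
the machine description generator does.

```python
# Assumed definition of the X iterator: (mode, condition) pairs.
X_ITERATOR = [("SI", "!TARGET_64BIT"), ("DI", "TARGET_64BIT")]


def expand(name_template, iterator, extra_condition=""):
    """Expand one parameterized pattern into (name, mode, condition) triples."""
    patterns = []
    for mode, cond in iterator:
        name = name_template.replace("<mode>", mode.lower())
        if extra_condition:
            full = f"({extra_condition}) && ({cond})"
        else:
            full = cond   # an empty pattern condition leaves only the iterator's
        patterns.append((name, mode, full))
    return patterns


for pattern in expand("eh_set_lr_<mode>", X_ITERATOR):
    print(pattern)
```

The two expanded entries have the same names, modes, and conditions as the
hand-written eh_set_lr_si and eh_set_lr_di that the patch removes.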



[PATCH] RISC-V: Fix the sge ..., x0, ... pattern

2022-08-06 Thread Palmer Dabbelt
There's no operand 2 here, so referencing it doesn't make sense.  I
couldn't find a way to trigger bad assembly output so I don't have a
test.

gcc/ChangeLog

PR target/106543
* config/riscv/riscv.md (sge_): Remove
reference to non-existent operand.
---
No new failures on the Linux multilibs on trunk.
---
 gcc/config/riscv/riscv.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/gcc/config/riscv/riscv.md b/gcc/config/riscv/riscv.md
index 0796f91dd30..ed1c7f241e6 100644
--- a/gcc/config/riscv/riscv.md
+++ b/gcc/config/riscv/riscv.md
@@ -2386,7 +2386,7 @@
(any_ge:GPR (match_operand:X 1 "register_operand" " r")
(const_int 1)))]
   ""
-  "slt%i2\t%0,zero,%1"
+  "slt\t%0,zero,%1"
   [(set_attr "type" "slt")
(set_attr "mode" "")])
 
-- 
2.34.1
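
The kind of mistake fixed here is easy to check for mechanically.  The
sketch below scans an output template for operand references and reports
any that point past the operands the pattern declares.  It is a
simplified illustration with made-up function names: it only recognizes
%N and %<letter>N references and is not how GCC itself validates
templates.

```python
import re


def referenced_operands(template):
    """Return the sorted operand numbers referenced as %N or %<letter>N."""
    cleaned = template.replace("%%", "")   # a literal percent sign, not a reference
    return sorted({int(n) for n in re.findall(r"%[A-Za-z]?(\d+)", cleaned)})


def undefined_operands(template, num_operands):
    """Return referenced operand numbers that the pattern does not declare."""
    return [n for n in referenced_operands(template) if n >= num_operands]


# The pattern in this patch declares operands 0 and 1 only.
print(undefined_operands("slt%i2\t%0,zero,%1", 2))   # template before the fix
print(undefined_operands("slt\t%0,zero,%1", 2))      # template after the fix
```

The first call reports operand 2 and the second reports nothing, matching
the one-line change in the diff.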



Re: [committed][RISC-V] Remove errant hunk of code

2023-08-03 Thread Palmer Dabbelt

On Thu, 03 Aug 2023 08:05:09 PDT (-0700), gcc-patches@gcc.gnu.org wrote:


I'm using this hunk locally to more thoroughly exercise the zicond paths
due to inaccuracies elsewhere in the costing model.  It was never
supposed to be part of the costing commit though.  And as we've seen
it's causing problems with the vector bits.

While my testing isn't complete, this hunk was never supposed to be
pushed and it's causing problems.  So I'm just ripping it out.

There's a bigger TODO in this space WRT a top-to-bottom evaluation of
the costing on RISC-V.  I'm still formulating what that evaluation is
going to look like, so don't hold your breath waiting on it.


We touched on it a bit in the call a few days ago, but I definitely 
agree that's worth doing: the current cost and pipeline models are from 
when the ISA was a lot simpler, it's changed a lot over the last few 
years so we're likely to need a bunch of work here.  At a bare minimum 
there's some refactoring that could be done to make the code saner, it's 
been through a few rounds of work so there's some cruft.


Our rough plan was to get together some microbenchmarks to drive the 
model.  We're using Microprobe for that, Patrick has been updating it to 
support various new ISA extensions -- I think that's all upstream 
already.  That'll spit out throughput/latency tables that we can build 
the pipeline/cost models on top of.  The hope was to run these on some 
extant systems so we could start answering questions like the one about 
overlapping stores on the C906, but we haven't gotten around to that 
yet.


Everything past microprobe is just ideas right now.  The cost/pipeline 
model related issues on the scalar side look small so most of the worry 
is on vector.  The general theory is that we're going to need a lot of 
work on vector codegen to get things going fast for us, but it's all 
very microarchitecture specific.  We're not aiming for any of this for 
GCC 14.


So if you guys have time to look that'd be awesome, I don't think anyone 
over here will conflict with it any time soon -- aside from whatever 
falls out of bugs and the generic optimization work, but no way around 
that sort of thing.



Pushed to the trunk.
commit d61efa3cd3378be38738bfb5139925d1505c1325
Author: Jeff Law 
Date:   Thu Aug 3 10:57:23 2023 -0400

[committed][RISC-V] Remove errant hunk of code

I'm using this hunk locally to more thoroughly exercise the zicond paths
due to inaccuracies elsewhere in the costing model.  It was never
supposed to be part of the costing commit though.  And as we've seen
it's causing problems with the vector bits.

While my testing isn't complete, this hunk was never supposed to be
pushed and it's causing problems.  So I'm just ripping it out.

There's a bigger TODO in this space WRT a top-to-bottom evaluation of
the costing on RISC-V.  I'm still formulating what that evaluation is
going to look like, so don't hold your breath waiting on it.

Pushed to the trunk.

gcc/

* config/riscv/riscv.cc (riscv_rtx_costs): Remove errant hunk from
recent commit.

diff --git a/gcc/config/riscv/riscv.cc b/gcc/config/riscv/riscv.cc
index 9e75450aa97..d8fab68dbb4 100644
--- a/gcc/config/riscv/riscv.cc
+++ b/gcc/config/riscv/riscv.cc
@@ -2913,16 +2913,6 @@ riscv_rtx_costs (rtx x, machine_mode mode, int outer_code, int opno ATTRIBUTE_UN
}
   return false;

-case SET:
-  /* A simple SET with a register destination takes its cost solely from
-the SET_SRC operand.  */
-  if (outer_code == INSN && REG_P (SET_DEST (x)))
-   {
- *total = riscv_rtx_costs (SET_SRC (x), mode, SET, opno, total, speed);
- return true;
-   }
-  return false;
-
 default:
   return false;
 }


Re: [PATCH v3] RISC-V: Add Ztso atomic mappings

2023-08-08 Thread Palmer Dabbelt

On Tue, 08 Aug 2023 14:52:14 PDT (-0700), Patrick O'Neill wrote:

The RISC-V Ztso extension currently has no effect on generated code.
With the additional ordering constraints guaranteed by Ztso, we can emit
more optimized atomic mappings than the RVWMO mappings.

This PR implements the Ztso psABI mappings[1].

[1] https://github.com/riscv-non-isa/riscv-elf-psabi-doc/pull/391


Which was just merged this morning, we talked some in the GCC patchwork 
sync.  IIUC that was the last blocker for getting the implementations 
merged, so


Reviewed-by: Palmer Dabbelt 

Thanks!


2023-08-08 Patrick O'Neill 

gcc/ChangeLog:

* common/config/riscv/riscv-common.cc: Add Ztso and mark Ztso as
dependent on 'a' extension.
* config/riscv/riscv-opts.h (MASK_ZTSO): New mask.
(TARGET_ZTSO): New target.
* config/riscv/riscv.cc (riscv_memmodel_needs_amo_acquire): Add
Ztso case.
(riscv_memmodel_needs_amo_release): Add Ztso case.
(riscv_print_operand): Add Ztso case for LR/SC annotations.
* config/riscv/riscv.md: Import sync-rvwmo.md and sync-ztso.md.
* config/riscv/riscv.opt: Add Ztso target variable.
* config/riscv/sync.md (mem_thread_fence_1): Expand to RVWMO or
Ztso specific insn.
(atomic_load): Expand to RVWMO or Ztso specific insn.
(atomic_store): Expand to RVWMO or Ztso specific insn.
* config/riscv/sync-rvwmo.md: New file. Separate out RVWMO
specific load/store/fence mappings.
* config/riscv/sync-ztso.md: New file. Separate out Ztso
specific load/store/fence mappings.

gcc/testsuite/ChangeLog:

* gcc.target/riscv/amo-table-ztso-amo-add-1.c: New test.
* gcc.target/riscv/amo-table-ztso-amo-add-2.c: New test.
* gcc.target/riscv/amo-table-ztso-amo-add-3.c: New test.
* gcc.target/riscv/amo-table-ztso-amo-add-4.c: New test.
* gcc.target/riscv/amo-table-ztso-amo-add-5.c: New test.
* gcc.target/riscv/amo-table-ztso-compare-exchange-1.c: New test.
* gcc.target/riscv/amo-table-ztso-compare-exchange-2.c: New test.
* gcc.target/riscv/amo-table-ztso-compare-exchange-3.c: New test.
* gcc.target/riscv/amo-table-ztso-compare-exchange-4.c: New test.
* gcc.target/riscv/amo-table-ztso-compare-exchange-5.c: New test.
* gcc.target/riscv/amo-table-ztso-compare-exchange-6.c: New test.
* gcc.target/riscv/amo-table-ztso-compare-exchange-7.c: New test.
* gcc.target/riscv/amo-table-ztso-fence-1.c: New test.
* gcc.target/riscv/amo-table-ztso-fence-2.c: New test.
* gcc.target/riscv/amo-table-ztso-fence-3.c: New test.
* gcc.target/riscv/amo-table-ztso-fence-4.c: New test.
* gcc.target/riscv/amo-table-ztso-fence-5.c: New test.
* gcc.target/riscv/amo-table-ztso-load-1.c: New test.
* gcc.target/riscv/amo-table-ztso-load-2.c: New test.
* gcc.target/riscv/amo-table-ztso-load-3.c: New test.
* gcc.target/riscv/amo-table-ztso-store-1.c: New test.
* gcc.target/riscv/amo-table-ztso-store-2.c: New test.
* gcc.target/riscv/amo-table-ztso-store-3.c: New test.
* gcc.target/riscv/amo-table-ztso-subword-amo-add-1.c: New test.
* gcc.target/riscv/amo-table-ztso-subword-amo-add-2.c: New test.
* gcc.target/riscv/amo-table-ztso-subword-amo-add-3.c: New test.
* gcc.target/riscv/amo-table-ztso-subword-amo-add-4.c: New test.
* gcc.target/riscv/amo-table-ztso-subword-amo-add-5.c: New test.

Signed-off-by: Patrick O'Neill 
---
V3:
- Added missing Ztso extension version in riscv-common.cc
- Reformatted patterns/expanders
- Fix minor formatting issues
---
 gcc/common/config/riscv/riscv-common.cc   |   6 +
 gcc/config/riscv/riscv-opts.h |   4 +
 gcc/config/riscv/riscv.cc |  20 +++-
 gcc/config/riscv/riscv.md |   2 +
 gcc/config/riscv/riscv.opt|   3 +
 gcc/config/riscv/sync-rvwmo.md|  96 +++
 gcc/config/riscv/sync-ztso.md |  80 +
 gcc/config/riscv/sync.md  | 111 ++
 .../riscv/amo-table-ztso-amo-add-1.c  |  15 +++
 .../riscv/amo-table-ztso-amo-add-2.c  |  15 +++
 .../riscv/amo-table-ztso-amo-add-3.c  |  15 +++
 .../riscv/amo-table-ztso-amo-add-4.c  |  15 +++
 .../riscv/amo-table-ztso-amo-add-5.c  |  15 +++
 .../riscv/amo-table-ztso-compare-exchange-1.c |  10 ++
 .../riscv/amo-table-ztso-compare-exchange-2.c |  10 ++
 .../riscv/amo-table-ztso-compare-exchange-3.c |  10 ++
 .../riscv/amo-table-ztso-compare-exchange-4.c |  10 ++
 .../riscv/amo-table-ztso-compare-exchange-5.c |  10 ++
 .../riscv/amo-table-ztso-compare-exchange-6.c |  10 ++
 .../riscv/amo-table-ztso-compare-exchange-7.c |  10 ++
 .../gcc.target/riscv/amo-table-ztso-fence-1.c |  14 +++
 .../gcc.target/riscv/amo-t

Re: [PATCH] RISC-V: Handle no_insn in TARGET_SCHED_VARIABLE_ISSUE.

2023-08-10 Thread Palmer Dabbelt

On Thu, 10 Aug 2023 19:12:06 PDT (-0700), gcc-patches@gcc.gnu.org wrote:

On 5/29/23 06:46, Jeff Law wrote:
> 
> 
> On 5/29/23 05:01, Jin Ma wrote:
>> Reference: 
>> https://github.com/gcc-mirror/gcc/commit/d0bc0cb66bcb0e6a5a5a31a9e900e8ccc98e34e5
>>
>> RISC-V should also be implemented to handle no_insn patterns for 
>> pipelining.
>>
>> gcc/ChangeLog:
>>
>> * config/riscv/riscv.cc (riscv_sched_variable_issue): New function.
>> (TARGET_SCHED_VARIABLE_ISSUE): New macro.
>> ---
>>   gcc/config/riscv/riscv.cc | 21 +
>>   1 file changed, 21 insertions(+)
>>
>> diff --git a/gcc/config/riscv/riscv.cc b/gcc/config/riscv/riscv.cc
>> index 3954fc07a8b..559fa9cd7e0 100644
>> --- a/gcc/config/riscv/riscv.cc
>> +++ b/gcc/config/riscv/riscv.cc
>> @@ -6225,6 +6225,24 @@ riscv_issue_rate (void)
>> return tune_param->issue_rate;
>>   }
>> +/* Implement TARGET_SCHED_VARIABLE_ISSUE.  */
>> +
>> +static int
>> +riscv_sched_variable_issue (FILE *, int, rtx_insn *insn, int more)
>> +{
>> +  if (DEBUG_INSN_P (insn))
>> +    return more;
>> +
>> +  rtx_code code = GET_CODE (PATTERN (insn));
>> +  if (code == USE || code == CLOBBER)
>> +    return more;
>> +
>> +  if (get_attr_type (insn) == TYPE_UNKNOWN)
>> +    return more;
>> +
>> +  return more - 1;
>> +}
> The problem is that INSN is *much* more likely to be a real instruction 
> that takes real resources, even if it is TYPE_UNKNOWN.
> TYPE_UNKNOWN here is actually an indicator of what I would consider a 
> bug in the backend, specifically that we have INSNs that do not provide 
> a mapping for the schedulers to suitable types.
> 
> With that in mind I'd much rather get to the point where we can do 
> something like this for TYPE_UNKNOWN:
> 
> type = get_attr_type (insn);
> gcc_assert (type != TYPE_UNKNOWN);
> 
> That way if we ever encounter a TYPE_UNKNOWN during development, we can 
> fix it in the md files in a sensible manner.  I don't know if we are 
> close to being able to do that.  We fixed a ton of stuff in bitmanip.md, 
> but I don't think there's been a thorough review of the port to find 
> other instances of TYPE_UNKNOWN INSNs.


Sorry for being lost here, but I'm not sure where TYPE_UNKNOWN comes 
from.  There's not a whole lot of instances in the code, and they all 
seem to be doing something very special.  Is it just something we didn't 
do a '(set_attr "type" ...)' on?


In that case it seems reasonable to have a dev-mode early failure: we've 
got some odd types now (like just the broad "bitmanip" one), but those 
can be split later.  At least having some classification seems like the 
way to go, it's just an internal interface so we can make it better 
later.


That said, it also smells like this is something that should be more 
generic than backend code?


> The other thing is this code probably wants to handle GHOST type 
> instructions.  While GHOST is used for instructions which generate no 
> code, it might seem they should return "more" as those INSNs take no 
> resources.  But GHOST is actually used for things like the blockage insn 
> which should end a cycle from an issue standpoint.  So the right 
> handling of a GHOST is something like this:
> 
> if (type == TYPE_GHOST)
>    return 0;


Lost again, here, there's almost no references to TYPE_GHOST (aside from 
a MIPS-ism that looks to have ended up in Loongarch).


So there wasn't ever any follow-up.  Given this was something Ventana 
was also carrying locally (with very minor differences) I went ahead and 
merged up the implementations and pushed the final result to the trunk.



Attached is the patch that was actually committed.

Jeff


My fault, I'm very sorry for not replying to the patch follow-up, I just
forgot this :)


Re: [PATCH] RISC-V: Handle no_insn in TARGET_SCHED_VARIABLE_ISSUE.

2023-08-10 Thread Palmer Dabbelt

On Thu, 10 Aug 2023 20:19:02 PDT (-0700), jeffreya...@gmail.com wrote:



On 8/10/23 20:30, Palmer Dabbelt wrote:



Sorry for being lost here, but I'm not sure where TYPE_UNKNOWN comes
from.  There's not a whole lot of instances in the code, and they all
seem to be doing something very special.  Is it just something we didn't
do a '(set_attr "type" ...)' on?

Yup.  TYPE_UNKNOWN means we don't have a type associated with the insn.
As I've mentioned before this isn't a major problem if there's one or
two here and there.  But if most are TYPE_UNKNOWN, then the scheduler is
going to do highly unnatural things.


OK, that seems like the way to go.  I still think it's likely we'll need 
to split up these types more, but that's something we can only deal with 
when there's HW that behaves oddly.



In that case it seems reasonable to have a dev-mode early failure: we've
got some odd types now (like just the broad "bitmanip" one), but those
can be split later.  At least having some classification seems like the
way to go, it's just an internal interface so we can make it better later.

That said, it also smells like this is something that should be more
generic than backend code?

No, it's really a target issue.  And what I was suggesting is that we
get to the point where we can enable the currently #if 0'd assert so
that if we introduce insns without an associated type, we get a nice
early warning.  I wasn't up for tackling that this week ;-)


I was thinking of some sort of "TARGET_ALLOWS_UNKNOWN_INSNS" hook, but 
poking around the uses that might not be meaningfully simpler than just 
rejecting these in the backend -- certainly simpler if we're just 
worried about RISC-V ;)


This seems pretty mechanical: just scrub through our MDs to check for 
any un-typed insns, then add the assert and fix the failures.  You're 
more than welcome to have at it, but LMK if you want me to try and find 
some time for someone to do it -- certainly seems like a good way for 
someone new to dig in a bit.



> The other thing is this code probably wants to handle GHOST type
> instructions.  While GHOST is used for instructions which generate no
> code, it might seem they should return "more" as those INSNs take no
> resources.  But GHOST is actually used for things like the blockage insn
> which should end a cycle from an issue standpoint.  So the right
> handling of a GHOST is something like this:
>
> if (type == TYPE_GHOST)
>    return 0;


Lost again, here, there's almost no references to TYPE_GHOST (aside from
a MIPS-ism that looks to have ended up in Loongarch).

Search for "ghost" in riscv.md ;-)


Thanks, the "return 0" makes sense.
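For reference, the handling discussed in this thread can be modeled outside of GCC roughly as follows.  This is only a minimal sketch: the enums and the struct are hypothetical stand-ins for the real rtx and "type" attribute machinery, and the assert is shown enabled even though the thread says it is still #if 0'd in the committed code.

```c
#include <assert.h>

/* Hypothetical stand-ins for GCC internals; none of these names are
   the real rtx or attribute definitions.  */
enum insn_kind { KIND_NORMAL, KIND_DEBUG, KIND_USE, KIND_CLOBBER };
enum insn_type { TYPE_UNKNOWN, TYPE_GHOST, TYPE_ARITH, TYPE_LOAD };

struct model_insn
{
  enum insn_kind kind;
  enum insn_type type;
};

/* Return how many more insns may issue this cycle after INSN, given
   that MORE could issue before it.  */
static int
model_variable_issue (const struct model_insn *insn, int more)
{
  /* Debug insns and bare USE/CLOBBER patterns emit no code, so they
     do not consume an issue slot.  */
  if (insn->kind != KIND_NORMAL)
    return more;

  /* An untyped insn is treated as a backend bug (dev-mode check).  */
  assert (insn->type != TYPE_UNKNOWN);

  /* GHOST covers things like blockage, which should end the cycle.  */
  if (insn->type == TYPE_GHOST)
    return 0;

  /* Everything else takes one issue slot.  */
  return more - 1;
}
```

With an issue budget of 2, an ordinary typed insn leaves 1, a debug insn leaves 2, and a GHOST insn leaves 0.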



Jeff


Re: [RFC PATCH 0/2] RISC-V: __builtin_riscv_pause for all environment

2023-08-11 Thread Palmer Dabbelt

On Fri, 11 Aug 2023 16:30:29 PDT (-0700), gcc-patches@gcc.gnu.org wrote:



On 8/9/23 16:39, Tsukasa OI wrote:

On 2023/08/10 5:05, Jeff Law wrote:



I'd tend to think we do not want to expose the intrinsic unless the
right extensions are enabled -- even though the encoding is a no-op and
we could emit it as a .insn.


I think that makes sense.  The only reason I implemented the
no-'Zihintpause' version is because GCC 13 implemented the built-in
unconditionally.  If the compatibility breakage is considered minimal (I
don't know, though), I'm ready to submit 'Zihintpause'-only version of
this patch set.

While it's a compatibility break I don't think we have a need to
preserve this kind of compatibility.  I suspect anyone using
__builtin_riscv_pause was probably already turning on Zihintpause and if
they weren't they should have been :-0


I agree it's fine to just call this a bug: the builtin wasn't doing 
anything on non-Zihintpause systems anyway, so it's not like it could 
have been all that useful.
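A minimal sketch of how user code could cope with a Zihintpause-only builtin, assuming the compiler defines the __riscv_zihintpause architecture test macro when the extension is enabled (that macro name is my assumption here, as are the cpu_relax and spin_until_set helper names):

```c
/* Spin-wait hint.  Uses the builtin only when the compiler reports
   Zihintpause; otherwise falls back to a plain compiler barrier.  */
static inline void
cpu_relax (void)
{
#if defined (__riscv) && defined (__riscv_zihintpause)
  __builtin_riscv_pause ();
#else
  __asm__ volatile ("" ::: "memory");
#endif
}

/* Hypothetical helper: poll FLAG at most LIMIT times and return the
   number of polls that saw it clear.  */
static unsigned
spin_until_set (const volatile int *flag, unsigned limit)
{
  unsigned spins = 0;

  while (!*flag && spins < limit)
    {
      cpu_relax ();
      spins++;
    }
  return spins;
}
```

Code written this way should build the same whether or not the builtin is exposed without the extension.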



I'm sure we'll kick this around in the Tuesday meeting and hopefully
make a decision about the desired direction.  You're obviously welcome
to join if you're inclined.  Let me know if you need an invite.

jeff


Re: [PATCH] RISC-V: Fix reduc_strict_run-1 test case.

2023-08-16 Thread Palmer Dabbelt

On Wed, 16 Aug 2023 15:59:13 PDT (-0700), jeffreya...@gmail.com wrote:



On 8/16/23 07:50, Robin Dapp wrote:

But if it's a float16 precision issue then I would have expected both
the computations for the lhs and rhs values to have suffered
similarly.


Yeah, right.  I didn't look closely enough.  The problem is not the
reduction but the additional return-value conversion that is omitted
when calculating the reference value inline.

The attached is simpler and does the trick.

Regards
  Robin

Subject: [PATCH v2] RISC-V: Fix reduc_strict_run-1 test case.

This patch fixes the reduc_strict_run-1 testcase by converting
the reference value to double and back to the tested type.
Without that, the reference computation omitted the implicit return-value
conversion and would produce a different result for _Float16.

gcc/testsuite/ChangeLog:

* gcc.target/riscv/rvv/autovec/reduc/reduc_strict_run-1.c:
Perform type -> double -> type conversion for reference value.

OK


I'm not opposed to merging the test change, but I couldn't figure out 
where in C the implicit conversion was coming from: as far as I can tell 
the macros don't introduce any (it's "return _float16 * _float16"), I'd 
had the patch open since last night but couldn't figure it out.


We get a bunch of half->single->half converting in the generated 
assembly that smelled like we had a bug somewhere else, sorry if I'm 
just missing something...



jeff


Re: [Committed] RISCV: Add rotate immediate regression test

2023-08-17 Thread Palmer Dabbelt

On Thu, 17 Aug 2023 10:10:38 PDT (-0700), Patrick O'Neill wrote:

On 8/16/23 21:36, Jeff Law wrote:




On 8/16/23 19:17, Patrick O'Neill wrote:

This adds new regression tests to ensure half-register rotations are
correctly optimized into rori instructions.

gcc/testsuite/ChangeLog:

* gcc.target/riscv/zbb-rol-ror-08.c: New test.
* gcc.target/riscv/zbb-rol-ror-09.c: New test.

Co-authored-by: Charlie Jenkins 
Signed-off-by: Patrick O'Neill 

OK
jeff

Committed


IIRC this came up in the context of Linux's TCP checksum code.
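For anyone unfamiliar with the idiom, a half-register rotation in C looks roughly like the sketch below.  The function names are made up and this is not the content of the new tests, which isn't quoted in this thread; per the commit message, on a Zbb target one would expect this kind of expression to become a rotate-immediate instruction.

```c
#include <stdint.h>

/* Rotate a 64-bit value by half the register width (32 bits).  */
static uint64_t
rotr64_by_32 (uint64_t x)
{
  return (x >> 32) | (x << 32);
}

/* Same idea for a 32-bit value and a 16-bit rotate.  */
static uint32_t
rotr32_by_16 (uint32_t x)
{
  return (x >> 16) | (x << 16);
}
```

For a half-width rotate, left and right rotation give the same result, so either spelling of the idiom is a candidate for the same instruction.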


Patrick


Re: [PATCH V2] RISC-V: Add the missed half floating-point mode patterns of local_pic_load/store when only use zfhmin or zhinxmin

2023-08-17 Thread Palmer Dabbelt

On Thu, 17 Aug 2023 10:03:04 PDT (-0700), rdapp@gmail.com wrote:

Indeed all ANYLSF patterns have TARGET_HARD_FLOAT (==f extension) which
is incompatible with ZHINX or ZHINXMIN anyway.  That should really be fixed
separately or at least clarified, maybe I'm missing something.


We've also got the broader issue where these PIC patterns are likely not 
the way to go long term, they're just papering around some other issues 
(and are likely why we flip the implicit-relocs behavior implicitly).  
We should probably fix that at some point, but I don't see any reason to 
block a fix on a cleanup.


That said, given that folks are poking around in here it's probably 
worth putting together test cases for the other patterns in there.



Still we can go forward with the patch itself as it improves things
independently, so LGTM.


Ya, IMO it's fine to add these given they fix the issue.


Regards
 Robin


Re: [PATCH] RISC-V: Allow immediates 17-31 for vector shift.

2023-08-18 Thread Palmer Dabbelt

On Fri, 18 Aug 2023 12:37:06 PDT (-0700), rdapp@gmail.com wrote:

Hi,

this patch adds a missing constraint check in order to be able to
print (and not ICE) vector immediates 17-31 for vector shifts.

Regards
 Robin

gcc/ChangeLog:

* config/riscv/riscv.cc (riscv_print_operand):

gcc/testsuite/ChangeLog:

* gcc.target/riscv/rvv/autovec/binop/shift-immediate.c: New test.
---
 gcc/config/riscv/riscv.cc|  3 ++-
 .../riscv/rvv/autovec/binop/shift-immediate.c| 16 
 2 files changed, 18 insertions(+), 1 deletion(-)
 create mode 100644 
gcc/testsuite/gcc.target/riscv/rvv/autovec/binop/shift-immediate.c

diff --git a/gcc/config/riscv/riscv.cc b/gcc/config/riscv/riscv.cc
index 49062bef9fc..0f60ffe5f60 100644
--- a/gcc/config/riscv/riscv.cc
+++ b/gcc/config/riscv/riscv.cc
@@ -4954,7 +4954,8 @@ riscv_print_operand (FILE *file, rtx op, int letter)


Looks like the comment at the top of riscv_print_operand() is way out of 
date.  Maybe we should just toss it?



else if (satisfies_constraint_Wc0 (op))
  asm_fprintf (file, "0");
else if (satisfies_constraint_vi (op)
-|| satisfies_constraint_vj (op))
+|| satisfies_constraint_vj (op)
+|| satisfies_constraint_vk (op))
  asm_fprintf (file, "%wd", INTVAL (elt));
else
  output_operand_lossage ("invalid vector constant");
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/binop/shift-immediate.c 
b/gcc/testsuite/gcc.target/riscv/rvv/autovec/binop/shift-immediate.c
new file mode 100644
index 000..a2e1c33f4fa
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/binop/shift-immediate.c
@@ -0,0 +1,16 @@
+/* { dg-do compile } */
+/* { dg-additional-options "-std=c99 -march=rv32gcv -mabi=ilp32d -O2 
--param=riscv-autovec-preference=scalable" } */
+
+#define uint8_t unsigned char
+
+void foo1 (uint8_t *a)
+{
+uint8_t b = a[0];
+int val = 0;
+
+for (int i = 0; i < 4; i++)
+{
+a[i] = (val & 1) ? (-val) >> 17 : val;
+val += b;
+}
+}



Unless I'm missing something it looks like we're missing at least Wc1 as 
well, and maybe a few others?
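As a rough model of the vi/vj/vk check in the riscv_print_operand hunk quoted above, the sketch below assumes the constraint ranges are -16..15 for vi, -15..16 for vj and 0..31 for vk.  Those ranges are from my memory of constraints.md rather than from this thread, so treat them as an assumption; the function names are made up, and the Wc0 branch is ignored.

```c
#include <stdbool.h>

/* Assumed ranges, not taken from the patch itself.  */
static bool in_vi (long v) { return v >= -16 && v <= 15; } /* 5-bit signed */
static bool in_vj (long v) { return v >= -15 && v <= 16; } /* negated 5-bit signed */
static bool in_vk (long v) { return v >= 0 && v <= 31; }   /* 5-bit unsigned */

/* Which splatted immediates reach the "%wd" print path.  */
static bool
printable_before_patch (long v)
{
  return in_vi (v) || in_vj (v);
}

static bool
printable_after_patch (long v)
{
  return printable_before_patch (v) || in_vk (v);
}
```

Under those assumptions 17 through 31 are exactly the values that previously fell through to output_operand_lossage, which matches the subject line.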


Either way

Reviewed-by: Palmer Dabbelt 

Thanks!


Re: [PATCH] RISC-V: improve codegen for repeating large constants [3]

2023-06-30 Thread Palmer Dabbelt

On Fri, 30 Jun 2023 17:25:54 PDT (-0700), Andrew Waterman wrote:

On Fri, Jun 30, 2023 at 5:13 PM Vineet Gupta  wrote:




On 6/30/23 16:50, Andrew Waterman wrote:
> I don't believe this is correct; the subtraction is needed to account
> for the fact that the low part might be negative, resulting in a
> borrow from the high part.  See the output for your test case below:
>
> $ cat test.c
> #include <stdio.h>
>
> int main()
> {
>   unsigned long result, tmp;
>
>   asm (
>     "li      %1,-252645376\n"
>     "addi    %1,%1,240\n"
>     "slli    %0,%1,32\n"
>     "add     %0,%0,%1"
>     : "=r" (result), "=r" (tmp));
>
>   printf("%lx\n", result);
>
>   return 0;
> }
> $ riscv64-unknown-elf-gcc -O2 test.c
> $ spike pk a.out
> bbl loader
> f0f0f0eff0f0f0f0
> $

Thx for the quick feedback Andrew. I'm clearly lacking in signed math :-(
So is it possible to have a better code seq for the testcase at all ?


You're welcome!

When Zba is implemented, then inserting a zext.w would do the trick;
see below.  (The generalization is that the zext.w is needed if the
32-bit constant is negative.)  When Zba is not implemented, I think
the original sequence is optimal.

li      a5, -252645376
addi    a5, a5, 240
slli    a0, a5, 32
zext.w  a5, a5
add     a0, a0, a5
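The borrow can be checked on any host by modeling the registers as 64-bit integers.  This is just a sketch of the two sequences with made-up function names, not anything GCC generates:

```c
#include <stdint.h>

/* li + addi: materialize the sign-extended 32-bit value 0xf0f0f0f0.  */
static uint64_t
low_word (void)
{
  int64_t a5 = -252645376;      /* li   a5,-252645376 */
  a5 += 240;                    /* addi a5,a5,240     */
  return (uint64_t) a5;         /* 0xfffffffff0f0f0f0 */
}

/* The sequence from the patch: the sign bits of a5 leak into the
   upper half during the final add.  */
static uint64_t
seq_without_zext (void)
{
  uint64_t a5 = low_word ();
  uint64_t a0 = a5 << 32;       /* slli a0,a5,32 */
  return a0 + a5;               /* add  a0,a0,a5 */
}

/* The Zba variant: clear the upper half of a5 before adding.  */
static uint64_t
seq_with_zext (void)
{
  uint64_t a5 = low_word ();
  uint64_t a0 = a5 << 32;       /* slli   a0,a5,32 */
  a5 &= 0xffffffffu;            /* zext.w a5,a5    */
  return a0 + a5;               /* add    a0,a0,a5 */
}
```

The first function reproduces the f0f0f0eff0f0f0f0 seen under spike; the second gives the intended f0f0f0f0f0f0f0f0.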


For the non-Zba case, I think we can leverage the two high parts 
starting out the same to save an instruction generating the constant.  
So for the original code sequence of 


   li      a5,-252645376
   addi    a5,a5,241
   li      a0,-252645376
   slli    a5,a5,32
   addi    a0,a0,240
   add     a0,a5,a0
   ret

we could instead generate

   li      a5,-252645376
   addi    a0,a5,240
   addi    a5,a5,241
   slli    a5,a5,32
   add     a0,a5,a0
   ret

which IIUC produces the same result.  I think something along the lines 
of this (with the corresponding cost function updates) would do it


   diff --git a/gcc/config/riscv/riscv.cc b/gcc/config/riscv/riscv.cc
   index de578b5b899..32b6033a966 100644
   --- a/gcc/config/riscv/riscv.cc
   +++ b/gcc/config/riscv/riscv.cc
   @@ -704,7 +704,13 @@ riscv_split_integer (HOST_WIDE_INT val, machine_mode mode)
      rtx hi = gen_reg_rtx (mode), lo = gen_reg_rtx (mode);

      riscv_move_integer (hi, hi, hival, mode);
   -  riscv_move_integer (lo, lo, loval, mode);
   +  if (riscv_integer_cost (loval - hival) + 1 < riscv_integer_cost (loval)) {
   +    rtx delta = gen_reg_rtx (mode);
   +    riscv_move_integer (delta, delta, loval - hival, mode);
   +    lo = gen_rtx_fmt_ee (PLUS, mode, hi, delta);
   +  } else {
   +    riscv_move_integer (lo, lo, loval, mode);
   +  }

      hi = gen_rtx_fmt_ee (ASHIFT, mode, hi, GEN_INT (32));
      hi = force_reg (mode, hi);

though I suppose that would produce a slightly different sequence that has the
same number of instructions but a slightly longer dependency chain, something
more like

   li      a5,-252645376
   addi    a5,a5,241
   addi    a0,a5,-1
   slli    a5,a5,32
   add     a0,a5,a0
   ret

Take that all with a grain of salt, though, as I just ate some very spicy
chicken and can barely see straight :)
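To double check the arithmetic, here is a host-side sketch of the hival/loval split and of the two sequences above.  sext32 is a made-up stand-in for sext_hwi (v, 32), and the other names are mine as well:

```c
#include <stdint.h>

/* Sign-extend the low 32 bits of V (stand-in for sext_hwi (v, 32)).  */
static int64_t
sext32 (uint64_t v)
{
  uint64_t low = v & 0xffffffffu;
  return (int64_t) (low ^ 0x80000000u) - (int64_t) 0x80000000u;
}

/* Mirrors the loval/hival computation in riscv_split_integer.  */
static void
split_constant (uint64_t val, int64_t *hival, int64_t *loval)
{
  *loval = sext32 (val);
  *hival = sext32 ((val - (uint64_t) *loval) >> 32);
}

/* Value produced by "li reg,-252645376".  */
static const int64_t li_imm = -252645376;

/* Current code: two independent li+addi pairs.  */
static uint64_t
seq_current (void)
{
  uint64_t a5 = (uint64_t) li_imm + 241;    /* li; addi a5,a5,241 */
  uint64_t a0 = (uint64_t) li_imm + 240;    /* li; addi a0,a0,240 */
  return (a5 << 32) + a0;                   /* slli; add          */
}

/* Proposed code: a single li shared by both halves.  */
static uint64_t
seq_shared_li (void)
{
  uint64_t base = (uint64_t) li_imm;        /* li   a5,-252645376 */
  uint64_t a0 = base + 240;                 /* addi a0,a5,240     */
  uint64_t a5 = base + 241;                 /* addi a5,a5,241     */
  return (a5 << 32) + a0;                   /* slli; add          */
}
```

For 0xf0f0f0f0f0f0f0f0 the split gives a hival whose low word is 0xf0f0f0f1, one more than loval's, which is the extra bit that compensates for the borrow from the negative low part.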







-Vineet

>
>
> On Fri, Jun 30, 2023 at 4:42 PM Vineet Gupta  wrote:
>>
>>
>> On 6/30/23 16:33, Vineet Gupta wrote:
>>> Ran into a minor snafu in const splitting code when playing with test
>>> case from an old PR/23813.
>>>
>>>long long f(void) { return 0xF0F0F0F0F0F0F0F0ull; }
>>>
>>> This currently generates
>>>
>>>    li      a5,-252645376
>>>    addi    a5,a5,241
>>>    li      a0,-252645376
>>>    slli    a5,a5,32
>>>    addi    a0,a0,240
>>>    add     a0,a5,a0
>>>    ret
>>>
>>> The signed math in hival extraction introduces an additional bit,
>>> causing loval == hival check to fail.
>>>
>>> | riscv_split_integer (val=-1085102592571150096, mode=E_DImode) at 
../gcc/config/riscv/riscv.cc:702
>>> | 702   unsigned HOST_WIDE_INT loval = sext_hwi (val, 32);
>>> | (gdb)n
>>> | 703   unsigned HOST_WIDE_INT hival = sext_hwi ((val - loval) >> 32, 32);
>>> | (gdb)
>> FWIW (and I missed adding this observation to the changelog) I pondered
>> about using unsigned loval/hival with zext_hwi() but that in certain
>> cases can cause additional insns
>>
>> e.g. constant 0x8000_ is codegen to LI 1 +SLLI 31 vs, LI
>> 0x_8000
>>
>>
>>> | 704   rtx hi = gen_reg_rtx (mode), lo = gen_reg_rtx (mode);
>>> | (gdb) p/x val
>>> | $2 = 0xf0f0f0f0f0f0f0f0
>>> | (gdb) p/x loval
>>> | $3 = 0xf0f0f0f0
>>> | (gdb) p/x hival
>>> | $4 = 0xf0f0f0f1
>>>  ^^^
>>> Fix that by eliding the subtraction in shift.
>>>
>>> With patch:
>>>
>>>    li      a5,-252645376
>>>    addi    a5,a5,240
>>>    slli    a0,a5,32
>>>    add     a0,a0,a5
>>>    ret
>>>
>>> gcc/ChangeLog:
>>>
>>>* config/riscv/riscv.cc (riscv_split_integer): hival computation
>>>  do elide subtraction of loval.
>>>* (riscv_split

Re: [PATCH] RISC-V: improve codegen for repeating large constants [3]

2023-07-01 Thread Palmer Dabbelt

On Sat, 01 Jul 2023 07:04:16 PDT (-0700), jeffreya...@gmail.com wrote:



On 7/1/23 02:00, Andrew Waterman wrote:



Yeah, that might end up being a false economy for superscalars.

In general, I wouldn't recommend spending too many cleverness beans on
non-Zba+Zbb implementations.  Going forward, we should expect that
even very simple cores provide those extensions.

I suspect you under-estimate how difficult it is to get the distros to
move forward on baseline ISAs.


Ya, we haven't even gotten to the point where most implementations are 
shipping with the B extensions, much less to the point where we can 
start ignoring all the pre-B hardware.


Re: [PATCH] RISC-V: fix TARGET_PROMOTE_FUNCTION_MODE hook for libcalls

2023-10-31 Thread Palmer Dabbelt

On Tue, 31 Oct 2023 16:18:35 PDT (-0700), jeffreya...@gmail.com wrote:



On 10/31/23 12:35, Vineet Gupta wrote:

riscv_promote_function_mode doesn't promote a SI to DI for libcalls
case.

The fix is what generic promote_mode () in explow.cc does. I really
don't understand why the old code didn't work, but stepping thru the
debugger shows old code didn't and fixed does.

This showed up when testing Ajit's REE ABI extension series which probes
the ABI (using a NULL tree type) and ends up hitting the libcall code path.

[Usual caveat, I'll wait for Pre-commit CI to run the tests and report]

gcc/ChangeLog:
* config/riscv/riscv.cc (riscv_promote_function_mode): Fix mode
  returned for libcall case.

Hmm.  There may be dragons in here.  I'll need to find and review an old
conversation in this space (libcalls and argument promotions).


We also have a non-orthogonality in the ABI sign extension rules between 
SI and DI, a few of us were talking about it on the internal slack 
(though the specifics were for a different patch, Vineet has a few in 
flight).


Re: RISC-V: Add divmod instruction support

2023-02-18 Thread Palmer Dabbelt

On Fri, 17 Feb 2023 06:02:40 PST (-0800), gcc-patches@gcc.gnu.org wrote:

Hi all,
If we have division and remainder calculations with the same operands:

  a = b / c;
  d = b % c;

We can replace the calculation of remainder with multiplication +
subtraction, using the result from the previous division:

  a = b / c;
  d = a * c;
  d = b - d;

Which will be faster.
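The identity behind this rewrite is easy to check in plain C.  A minimal sketch, with a made-up function name, assuming c != 0 and that b / c does not overflow (so not INT_MIN / -1):

```c
/* Quotient and remainder from one divide, one multiply and one
   subtract, instead of a divide plus a separate remainder.  */
static void
divmod_mul_sub (int b, int c, int *quot, int *rem)
{
  int a = b / c;        /* div */
  int d = a * c;        /* mul */

  *quot = a;
  *rem = b - d;         /* sub: equals b % c, since C division truncates */
}
```

Because C guarantees (b / c) * c + b % c == b, the subtraction yields the same value as %, including for negative operands.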


Do you have any benchmarks that show that performance increase?  The ISA 
manual specifically says the suggested sequence is div+mod, and while 
those suggestions don't always pan out for real hardware it's likely 
that at least some implementations will end up with the ISA-suggested 
fusions.



Currently, it isn't done for RISC-V.

I've added an expander for DIVMOD which replaces 'rem' with 'mul + sub'.

Best regards,
Matevos.

gcc/ChangeLog:

* config/riscv/riscv.md: Added divmod expander.

gcc/testsuite/ChangeLog:
* gcc.target/riscv/divmod.c: New testcase.

--- inline copy of the patch ---

diff --git a/gcc/config/riscv/iterators.md b/gcc/config/riscv/iterators.md
index f95dd405e12..d941483d9f1 100644
--- a/gcc/config/riscv/iterators.md
+++ b/gcc/config/riscv/iterators.md
@@ -148,6 +148,11 @@
 ;; from the same template.
 (define_code_iterator any_mod [mod umod])

+;; These code iterators allow unsigned and signed divmod to be generated
+;; from the same template.
+(define_code_iterator only_div [div udiv])
+(define_code_attr paired_mod [(div "mod") (udiv "umod")])
+
 ;; These code iterators allow the signed and unsigned scc operations to use
 ;; the same template.
 (define_code_iterator any_gt [gt gtu])
@@ -175,7 +180,8 @@
  (gt "") (gtu "u")
  (ge "") (geu "u")
  (lt "") (ltu "u")
- (le "") (leu "u")])
+ (le "") (leu "u")
+ (div "") (udiv "u")])

 ;; <su> is like <u>, but the signed form expands to "s" rather than "".
 (define_code_attr su [(sign_extend "s") (zero_extend "u")])
diff --git a/gcc/config/riscv/riscv.md b/gcc/config/riscv/riscv.md
index c8adc5af5d2..2d48ff3f8de 100644
--- a/gcc/config/riscv/riscv.md
+++ b/gcc/config/riscv/riscv.md
@@ -1044,6 +1044,22 @@
   [(set_attr "type" "idiv")
(set_attr "mode" "DI")])

+(define_expand "divmod4"
+  [(parallel
+ [(set (match_operand:GPR 0 "register_operand")
+   (only_div:GPR (match_operand:GPR 1 "register_operand")
+ (match_operand:GPR 2 "register_operand")))
+  (set (match_operand:GPR 3 "register_operand")
+   (:GPR (match_dup 1) (match_dup 2)))])]
+  "TARGET_DIV"
+  {
+  rtx tmp = gen_reg_rtx (mode);
+  emit_insn (gen_div3 (operands[0], operands[1],
operands[2]));
+  emit_insn (gen_mul3 (tmp, operands[0], operands[2]));
+  emit_insn (gen_sub3 (operands[3], operands[1], tmp));
+  DONE;
+  })
+
 (define_insn "*si3_extended"
   [(set (match_operand:DI 0 "register_operand" "=r")
  (sign_extend:DI
diff --git a/gcc/testsuite/gcc.target/riscv/divmod.c
b/gcc/testsuite/gcc.target/riscv/divmod.c
new file mode 100644
index 000..254b25e654d
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/divmod.c
@@ -0,0 +1,14 @@
+/* { dg-do compile } */
+/* { dg-skip-if "" { *-*-* } { "-O0" "-O1" "-Og" } } */
+
+void
+foo(int a, int b, int *c, int *d)
+{
+   *c = a / b;
+   *d = a % b;
+}
+
+/* { dg-final { scan-assembler-not "rem" } } */
+/* { dg-final { scan-assembler-times "mul" 1 } } */
+/* { dg-final { scan-assembler-times "sub" 1 } } */


Re: RISC-V: Add divmod instruction support

2023-02-18 Thread Palmer Dabbelt

On Sat, 18 Feb 2023 10:42:51 PST (-0800), pins...@gmail.com wrote:

On Sat, Feb 18, 2023 at 10:27 AM Palmer Dabbelt  wrote:


On Fri, 17 Feb 2023 06:02:40 PST (-0800), gcc-patches@gcc.gnu.org wrote:
> Hi all,
> If we have division and remainder calculations with the same operands:
>
>   a = b / c;
>   d = b % c;
>
> We can replace the calculation of remainder with multiplication +
> subtraction, using the result from the previous division:
>
>   a = b / c;
>   d = a * c;
>   d = b - d;
>
> Which will be faster.

Do you have any benchmarks that show that performance increase?  The ISA
manual specifically says the suggested sequence is div+mod, and while
those suggestions don't always pan out for real hardware it's likely
that at least some implementations will end up with the ISA-suggested
fusions.


I suspect I will be needing this kind of patch for the core that I am
going to be using.


OK, good to know.  Presumably you guys aren't ready to show benchmarks, 
though? 


If anything this should be under a tuning option.


That seems likely, as IIRC the SiFive cores do this fusion.  It 
generally seems like we're going to end up with implementations all over 
the place when it comes to what's fused, so I bet we'll have a lot of 
these differences between cores.



Thanks,
Andrew Pinski




> Currently, it isn't done for RISC-V.
>
> I've added an expander for DIVMOD which replaces 'rem' with 'mul + sub'.
>
> Best regards,
> Matevos.
>
> gcc/ChangeLog:
>
> * config/riscv/riscv.md: Added divmod expander.
>
> gcc/testsuite/ChangeLog:
> * gcc.target/riscv/divmod.c: New testcase.
>
> --- inline copy of the patch ---
>
> diff --git a/gcc/config/riscv/iterators.md b/gcc/config/riscv/iterators.md
> index f95dd405e12..d941483d9f1 100644
> --- a/gcc/config/riscv/iterators.md
> +++ b/gcc/config/riscv/iterators.md
> @@ -148,6 +148,11 @@
>  ;; from the same template.
>  (define_code_iterator any_mod [mod umod])
>
> +;; These code iterators allow unsigned and signed divmod to be generated
> +;; from the same template.
> +(define_code_iterator only_div [div udiv])
> +(define_code_attr paired_mod [(div "mod") (udiv "umod")])
> +
>  ;; These code iterators allow the signed and unsigned scc operations to use
>  ;; the same template.
>  (define_code_iterator any_gt [gt gtu])
> @@ -175,7 +180,8 @@
>   (gt "") (gtu "u")
>   (ge "") (geu "u")
>   (lt "") (ltu "u")
> - (le "") (leu "u")])
> + (le "") (leu "u")
> + (div "") (udiv "u")])
>
>  ;; <su> is like <u>, but the signed form expands to "s" rather than "".
>  (define_code_attr su [(sign_extend "s") (zero_extend "u")])
> diff --git a/gcc/config/riscv/riscv.md b/gcc/config/riscv/riscv.md
> index c8adc5af5d2..2d48ff3f8de 100644
> --- a/gcc/config/riscv/riscv.md
> +++ b/gcc/config/riscv/riscv.md
> @@ -1044,6 +1044,22 @@
>[(set_attr "type" "idiv")
> (set_attr "mode" "DI")])
>
> +(define_expand "divmod4"
> +  [(parallel
> + [(set (match_operand:GPR 0 "register_operand")
> +   (only_div:GPR (match_operand:GPR 1 "register_operand")
> + (match_operand:GPR 2 "register_operand")))
> +  (set (match_operand:GPR 3 "register_operand")
> +   (:GPR (match_dup 1) (match_dup 2)))])]
> +  "TARGET_DIV"
> +  {
> +  rtx tmp = gen_reg_rtx (mode);
> +  emit_insn (gen_div3 (operands[0], operands[1],
> operands[2]));
> +  emit_insn (gen_mul3 (tmp, operands[0], operands[2]));
> +  emit_insn (gen_sub3 (operands[3], operands[1], tmp));
> +  DONE;
> +  })
> +
>  (define_insn "*si3_extended"
>[(set (match_operand:DI 0 "register_operand" "=r")
>   (sign_extend:DI
> diff --git a/gcc/testsuite/gcc.target/riscv/divmod.c
> b/gcc/testsuite/gcc.target/riscv/divmod.c
> new file mode 100644
> index 000..254b25e654d
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/riscv/divmod.c
> @@ -0,0 +1,14 @@
> +/* { dg-do compile } */
> +/* { dg-skip-if "" { *-*-* } { "-O0" "-O1" "-Og" } } */
> +
> +void
> +foo(int a, int b, int *c, int *d)
> +{
> +   *c = a / b;
> +   *d = a % b;
> +}
> +
> +/* { dg-final { scan-assembler-not "rem" } } */
> +/* { dg-final { scan-assembler-times "mul" 1 } } */
> +/* { dg-final { scan-assembler-times "sub" 1 } } */


Re: RISC-V: Add divmod instruction support

2023-02-18 Thread Palmer Dabbelt

On Sat, 18 Feb 2023 13:06:02 PST (-0800), jeffreya...@gmail.com wrote:



On 2/18/23 11:26, Palmer Dabbelt wrote:

On Fri, 17 Feb 2023 06:02:40 PST (-0800), gcc-patches@gcc.gnu.org wrote:

Hi all,
If we have division and remainder calculations with the same operands:

  a = b / c;
  d = b % c;

We can replace the calculation of remainder with multiplication +
subtraction, using the result from the previous division:

  a = b / c;
  d = a * c;
  d = b - d;

Which will be faster.


Do you have any benchmarks that show that performance increase?  The ISA
manual specifically says the suggested sequence is div+mod, and while
those suggestions don't always pan out for real hardware it's likely
that at least some implementations will end up with the ISA-suggested
fusions.

It'll almost certainly be visible in mcf.  Been there, done that.  In
fact, that's why I asked the team Matevos works on to poke at this case
as I went through this issue on another processor.

It can also be run through LLVM's MCA to estimate counts if you've got a
pipeline description.  The div+rem will come out at around ~40c while a
div+mul+sub should weigh in around 25c for Veyron v1.


Do you have a link to the patches somewhere?  I couldn't find them 
online, just the custom instruction support.  Or even just some docs 
describing what the pipeline does, as just basing one performance model 
on another is kind of a double-edged sword.


That said, I think just knowing the processor doesn't do the div+mod 
fusion is sufficient to turn something like this on for the mtune for 
that processor.  That's different than turning it on globally, though -- 
unless it turns out nobody is actually doing the fusion suggested in the 
ISA manual, which wouldn't be super surprising.


Maybe some of the SiFive and T-Head folks can chime in on whether or not 
their processors perform the fusion in question -- and if so, do the 
instructions need to stay back-to-back?  It doesn't look like we're 
really targeting the code sequences the ISA suggests as it stands, so 
maybe it's OK to just switch the default over too?


It also brings up the question of mulh+mul fusions, which I don't think 
we've really looked at (though maybe they're a lot less important for 
rv64).


Re: [gcc] RTEMS: Tune multilib selection

2023-02-23 Thread Palmer Dabbelt

On Thu, 23 Feb 2023 03:48:26 PST (-0800), sebastian.hu...@embedded-brains.de 
wrote:

gcc/ChangeLog:

* config/riscv/t-rtems: Keep only -mcmodel=medany 64-bit multilibs.
Add non-compact 32-bit multilibs.
---
 gcc/config/riscv/t-rtems | 17 +
 1 file changed, 9 insertions(+), 8 deletions(-)

diff --git a/gcc/config/riscv/t-rtems b/gcc/config/riscv/t-rtems
index 41f5927fc87..19b12030895 100644
--- a/gcc/config/riscv/t-rtems
+++ b/gcc/config/riscv/t-rtems
@@ -1,8 +1,8 @@
 MULTILIB_OPTIONS   =
 MULTILIB_DIRNAMES  =
 
-MULTILIB_OPTIONS	+= march=rv32i/march=rv32im/march=rv32imafd/march=rv32iac/march=rv32imac/march=rv32imafc/march=rv64imafd/march=rv64imac/march=rv64imafdc

-MULTILIB_DIRNAMES  += rv32i   rv32im   rv32imafd   rv32iac 
  rv32imac   rv32imafc   rv64imafd   rv64imac   rv64imafdc
+MULTILIB_OPTIONS   += 
march=rv32i/march=rv32iac/march=rv32im/march=rv32ima/march=rv32imac/march=rv32imaf/march=rv32imafc/march=rv32imafd/march=rv32imafdc/march=rv64ima/march=rv64imac/march=rv64imafd/march=rv64imafdc
+MULTILIB_DIRNAMES  += rv32i   rv32iac   rv32im   rv32ima   
rv32imac   rv32imaf   rv32imafc   rv32imafd   rv32imafdc   
rv64ima   rv64imac   rv64imafd   rv64imafdc
 
 MULTILIB_OPTIONS	+= mabi=ilp32/mabi=ilp32f/mabi=ilp32d/mabi=lp64/mabi=lp64d

 MULTILIB_DIRNAMES  += ilp32  ilp32f  ilp32d  lp64  lp64d
@@ -12,14 +12,15 @@ MULTILIB_DIRNAMES   += medany
 
 MULTILIB_REQUIRED	=

 MULTILIB_REQUIRED  += march=rv32i/mabi=ilp32
-MULTILIB_REQUIRED  += march=rv32im/mabi=ilp32
-MULTILIB_REQUIRED  += march=rv32imafd/mabi=ilp32d
 MULTILIB_REQUIRED  += march=rv32iac/mabi=ilp32
+MULTILIB_REQUIRED  += march=rv32im/mabi=ilp32
+MULTILIB_REQUIRED  += march=rv32ima/mabi=ilp32
 MULTILIB_REQUIRED  += march=rv32imac/mabi=ilp32
+MULTILIB_REQUIRED  += march=rv32imaf/mabi=ilp32f
 MULTILIB_REQUIRED  += march=rv32imafc/mabi=ilp32f
-MULTILIB_REQUIRED  += march=rv64imafd/mabi=lp64d
-MULTILIB_REQUIRED  += march=rv64imafd/mabi=lp64d/mcmodel=medany
-MULTILIB_REQUIRED  += march=rv64imac/mabi=lp64
+MULTILIB_REQUIRED  += march=rv32imafd/mabi=ilp32d
+MULTILIB_REQUIRED  += march=rv32imafdc/mabi=ilp32d
+MULTILIB_REQUIRED  += march=rv64ima/mabi=lp64/mcmodel=medany
 MULTILIB_REQUIRED  += march=rv64imac/mabi=lp64/mcmodel=medany
-MULTILIB_REQUIRED  += march=rv64imafdc/mabi=lp64d
+MULTILIB_REQUIRED  += march=rv64imafd/mabi=lp64d/mcmodel=medany
 MULTILIB_REQUIRED  += march=rv64imafdc/mabi=lp64d/mcmodel=medany


Reviewed-by: Palmer Dabbelt 

IMO it's fine to remove multilibs from the default set.  It could be 
seen as breaking users, but IIRC last time we talked about something 
like this it was OK as otherwise we're going to end up with a huge set 
of multilibs for defunct ISAs.  This one is also extra safe, since 
moving to medany shouldn't break any users (aside from maybe a slight 
performance issue).


Are you aiming for GCC-13 with this?  I wouldn't be opposed to that: 
there's some risk of breaking users this late in the process, but my 
guess is that most of them aren't looking until release anyway.  Still 
better to hold off, but if there's something in RTEMS land that benefits 
from this being early then I think it's fine.


[PATCH] RISC-V: Disable attribute generation by default

2023-02-23 Thread Palmer Dabbelt
We generate a handful of attributes by default, but they don't really
encode any useful information.  We've broadly stopped ascribing any
meaning to them in binutils; but they trip up LLVM, older toolchains,
and users.  So let's just turn them off by default.  The old binaries
will still be floating around, but at least this way we'll stop tripping
over new incompatibilities.

If we get to a point where there's some attributes that are defined that
we can use then we can sort out how to turn those on without turning on
the old ones, but unless I'm missing something the current set of
attributes are too broken to be useful for anything.

gcc/ChangeLog:

* config.gcc (--with-riscv-attribute): Default to off.
---
I know it's pretty late, but I'd like to target this for GCC-13.  The
Zmmul stuff has resulted in another round of build breakages that we're
going to have to chase down, and while we could update everything to
turn off the attributes it seems easier to just set the default.
---
 gcc/config.gcc | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/gcc/config.gcc b/gcc/config.gcc
index c070e6ecd2e..52639cf26d6 100644
--- a/gcc/config.gcc
+++ b/gcc/config.gcc
@@ -4596,7 +4596,7 @@ case "${target}" in
tm_defines="${tm_defines} TARGET_RISCV_ATTRIBUTE=0"
;;
""|default)
-   tm_defines="${tm_defines} TARGET_RISCV_ATTRIBUTE=1"
+   tm_defines="${tm_defines} TARGET_RISCV_ATTRIBUTE=0"
;;
*)
echo "--with-riscv-attribute=${with_riscv_attribute} is not supported.  The argument must begin with yes, no or default." 1>&2
-- 
2.39.1



Re: [PATCH] RISC-V: Disable attribute generation by default

2023-02-24 Thread Palmer Dabbelt

On Fri, 24 Feb 2023 05:09:30 PST (-0800), gcc-patches@gcc.gnu.org wrote:

It did help people identify which extensions are used in the binary, so I
would prefer to keep it enabled by default.


IMO it actually hurts more than helps, as it's not really encoding what 
extensions are in the binary (or necessary to run the binary) but 
instead just encodes what was in -march (with some noise added due to 
the merging bugs and ISA string changes).  Having the attributes just 
ends up tricking users into thinking the information is accurate when 
it's not.



and lld is beginning to fix those merge issues, so the situation should
improve soon.


If toolchains are just going to ignore the attributes then it's a 
pretty good sign they're not useful.



Palmer Dabbelt wrote on Fri, 24 Feb 2023 at 10:29:


We generate a handful of attributes by default, but they don't really
encode any useful information.  We've broadly stopped ascribing any
meaning to them in binutils; but they trip up LLVM, older toolchains,
and users.  So let's just turn them off by default.  The old binaries
will still be floating around, but at least this way we'll stop tripping
over new incompatibilities.

If we get to a point where there's some attributes that are defined that
we can use then we can sort out how to turn those on without turning on
the old ones, but unless I'm missing something the current set of
attributes are too broken to be useful for anything.

gcc/ChangeLog:

* config.gcc (--with-riscv-attribute): Default to off.
---
I know it's pretty late, but I'd like to target this for GCC-13.  The
Zmmul stuff has resulted in another round of build breakages that we're
going to have to chase down, and while we could update everything to
turn off the attributes it seems easier to just set the default.
---
 gcc/config.gcc | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/gcc/config.gcc b/gcc/config.gcc
index c070e6ecd2e..52639cf26d6 100644
--- a/gcc/config.gcc
+++ b/gcc/config.gcc
@@ -4596,7 +4596,7 @@ case "${target}" in
tm_defines="${tm_defines} TARGET_RISCV_ATTRIBUTE=0"
;;
""|default)
-   tm_defines="${tm_defines} TARGET_RISCV_ATTRIBUTE=1"
+   tm_defines="${tm_defines} TARGET_RISCV_ATTRIBUTE=0"
;;
*)
echo "--with-riscv-attribute=${with_riscv_attribute} is not supported.  The argument must begin with yes, no or default." 1>&2
--
2.39.1




Re: [PATCH] riscv: Clarify vlmax and length handling.

2023-05-10 Thread Palmer Dabbelt

On Wed, 10 May 2023 08:24:40 PDT (-0700), rdapp@gmail.com wrote:

Hi,

this patch tries to improve the wrappers that emit either vlmax or
non-vlmax operations.  Now, emit_len_op can be used to
emit a regular operation.  Depending on whether a length != NULL
is passed either no VLMAX flags are set or we emit a vsetvli and
set VLMAX flags.  The patch also adds some comments that describe
some of the rationale of the current handling of vlmax/nonvlmax
operations.

Bootstrapped and regtested.

Regards
 Robin


It's somewhat common for mail clients to treat "--" as a signature 
delimiter; it's "---" that git uses as a comment delimiter.


Re: [PATCH] riscv: Clarify vlmax and length handling.

2023-05-10 Thread Palmer Dabbelt

On Wed, 10 May 2023 11:50:32 PDT (-0700), rdapp@gmail.com wrote:

It's somewhat common for mail clients to treat "--" as a signature
delimiter; it's "---" that git uses as a comment delimiter.


It's in my muscle memory somehow.  Always did it that way because I
didn't want the same delimiter as in the git part of the message.  Time
to change that habit I suppose :) (or automate more of the process).


I guess if you're committing your own code it doesn't matter, but mixing 
them will trip up git-am and such.
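
For illustration, here's a deliberately simplified C model of the two markers.  It is not git's actual parsing logic (the real code handles more cases than this), and classify_line, trimmed_length and the enum are names I made up.  It only treats a bare "---" line as the git separator and a bare "--" line, with or without trailing whitespace, as a signature marker.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical classification of a single line of a patch email.  */
enum line_kind
{
  LINE_OTHER,
  LINE_SIGNATURE,     /* "--" or "-- ": mail signature marker.  */
  LINE_GIT_SEPARATOR  /* "---": end of the commit message for git.  */
};

/* Length of LINE with trailing spaces, tabs, CR and LF removed.  */
static size_t
trimmed_length (const char *line)
{
  size_t len = strlen (line);
  while (len > 0 && strchr (" \t\r\n", line[len - 1]) != NULL)
    len--;
  return len;
}

/* Simplified model: only bare marker lines are recognized, so a diff
   header such as "--- a/file" is classified as LINE_OTHER.  */
enum line_kind
classify_line (const char *line)
{
  size_t len = trimmed_length (line);
  if (len == 3 && strncmp (line, "---", 3) == 0)
    return LINE_GIT_SEPARATOR;
  if (len == 2 && strncmp (line, "--", 2) == 0)
    return LINE_SIGNATURE;
  return LINE_OTHER;
}
```

The point is just that the two markers differ by a single dash, so notes placed after a "--" line are not guaranteed to be treated as below-the-fold comments when the mail is applied.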


The patch LGTM, but it's mostly Juzhe's code so it's probably best to at 
least give him a chance to see it when he's awake.


Re: [PATCH] riscv: Split off shift patterns for autovectorization.

2023-05-10 Thread Palmer Dabbelt
On Wed, 10 May 2023 08:24:50 PDT (-0700), rdapp@gmail.com wrote:
> Hi,
>
> this patch splits off the shift patterns of the binop patterns.
> This is necessary as the scalar shifts require a Pmode operand
> as shift count.  To this end, a new iterator any_int_binop_no_shift
> is introduced.  At a later point when the binops are split up
> further in commutative and non-commutative patterns (which both
> do not include the shift patterns) we might not need this anymore.
>
> Bootstrapped and regtested.
>
> Regards
>  Robin
>
> --
>
> gcc/ChangeLog:
>
>   * config/riscv/autovec.md (3): Add scalar shift
>   pattern.
>   (v3): Add vector shift pattern.
>   * config/riscv/vector-iterators.md: New iterator.
> ---
>  gcc/config/riscv/autovec.md  | 40 +++-
>  gcc/config/riscv/vector-iterators.md |  4 +++
>  2 files changed, 43 insertions(+), 1 deletion(-)
>
> diff --git a/gcc/config/riscv/autovec.md b/gcc/config/riscv/autovec.md
> index 8347e42bb9c..2da4fc67d51 100644
> --- a/gcc/config/riscv/autovec.md
> +++ b/gcc/config/riscv/autovec.md
> @@ -65,7 +65,7 @@ (define_expand "movmisalign"
>
>  (define_expand "3"
>[(set (match_operand:VI 0 "register_operand")
> -(any_int_binop:VI
> +(any_int_binop_no_shift:VI
>   (match_operand:VI 1 "")
>   (match_operand:VI 2 "")))]
>"TARGET_VECTOR"
> @@ -91,3 +91,41 @@ (define_expand "3"
> NULL_RTX, mode);
>DONE;
>  })
> +
> +;; =
> +;; == Binary integer shifts by scalar.
> +;; =
> +
> +(define_expand "3"
> +  [(set (match_operand:VI 0 "register_operand")
> +(any_shift:VI
> + (match_operand:VI 1 "register_operand")
> + (match_operand:<VEL> 2 "csr_operand")))]

I don't think VEL is _wrong_ here, as it's an integer type that's big
enough to hold the shift amount, but we might get some odd generated
code for the QI and HI flavors as we frequently don't handle the shorter
types well.

"csr_operand" does seem wrong, though, as that just accepts constants.
Maybe "arith_operand" is the way to go?  I haven't looked at the
V immediates though.

> +  "TARGET_VECTOR"
> +{
> +  if (!CONST_SCALAR_INT_P (operands[2]))
> +  operands[2] = gen_lowpart (Pmode, operands[2]);
> +  riscv_vector::emit_len_binop (code_for_pred_scalar
> + (, mode),
> + operands[0], operands[1], operands[2],
> + NULL_RTX, mode, Pmode);
> +  DONE;
> +})
> +
> +;; =
> +;; == Binary integer shifts by vector.
> +;; =
> +
> +(define_expand "v3"
> +  [(set (match_operand:VI 0 "register_operand")
> +(any_shift:VI
> + (match_operand:VI 1 "register_operand")
> + (match_operand:VI 2 "vector_shift_operand")))]
> +  "TARGET_VECTOR"
> +{
> +  riscv_vector::emit_len_binop (code_for_pred
> + (, mode),
> + operands[0], operands[1], operands[2],
> + NULL_RTX, mode);
> +  DONE;
> +})
> diff --git a/gcc/config/riscv/vector-iterators.md 
> b/gcc/config/riscv/vector-iterators.md
> index 42848627c8c..fdb0bfbe3b1 100644
> --- a/gcc/config/riscv/vector-iterators.md
> +++ b/gcc/config/riscv/vector-iterators.md
> @@ -1429,6 +1429,10 @@ (define_code_iterator any_commutative_binop [plus and 
> ior xor
>
>  (define_code_iterator any_non_commutative_binop [minus div udiv mod umod])
>
> +(define_code_iterator any_int_binop_no_shift
> + [plus minus and ior xor smax umax smin umin mult div udiv mod umod
> +])
> +
>  (define_code_iterator any_immediate_binop [plus minus and ior xor])
>
>  (define_code_iterator any_sat_int_binop [ss_plus ss_minus us_plus us_minus])
> --
> 2.40.0

It'd be great to have test cases for the patterns we're adding, at least
for some of the stickier ones.
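
Something along these lines is what I have in mind for the shifts.  This is only a sketch of the loops a test could contain, not an actual testsuite file: the function names are invented, and a real test under gcc.target/riscv/rvv/autovec would also need the usual dg- directives and scan patterns, which I've left out because I haven't checked the exact options.

```c
#include <stdint.h>

/* Shift every element left by the same scalar amount.  SHIFT must be
   less than 32, otherwise the behavior is undefined in C.  This is the
   shape that should hit the shift-by-scalar pattern.  */
void
shl_scalar_u32 (uint32_t *dst, const uint32_t *src, unsigned shift, int n)
{
  for (int i = 0; i < n; i++)
    dst[i] = src[i] << shift;
}

/* Logical right shift by a scalar amount, same restriction on SHIFT.  */
void
shr_scalar_u32 (uint32_t *dst, const uint32_t *src, unsigned shift, int n)
{
  for (int i = 0; i < n; i++)
    dst[i] = src[i] >> shift;
}

/* Per-element shift amounts, which should hit the shift-by-vector
   pattern.  The amounts are masked to keep the C well defined.  */
void
shr_vector_u32 (uint32_t *dst, const uint32_t *src, const uint32_t *amt,
		int n)
{
  for (int i = 0; i < n; i++)
    dst[i] = src[i] >> (amt[i] & 31);
}
```

Variants for the QI and HI element types would be the interesting ones, given the concern above about how the shorter types are handled.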


Re: [PATCH] riscv: Add autovectorization tests for binary integer

2023-05-10 Thread Palmer Dabbelt
On Wed, 10 May 2023 08:24:57 PDT (-0700), rdapp@gmail.com wrote:
> Hi,
>
> this patch adds scan as well as execution tests for vectorized
> binary integer operations.  It is based on Michael Collison's work
> and also includes scalar variants.  The tests are not fully comprehensive
> as the vector type promotions (vec_unpack, extend etc.) are not
> implemented yet.  Also, vmulh, vmulhu, and vmulhsu and others are
> still missing.

Ah, I guess there's the tests... ;)

>
> Regards
>  Robin
>
> --
>
> gcc/testsuite/ChangeLog:
>
>   * gcc.target/riscv/rvv/autovec/shift-rv32gcv.c: New test.
>   * gcc.target/riscv/rvv/autovec/shift-rv64gcv.c: New test.
>   * gcc.target/riscv/rvv/autovec/shift-scalar-rv32gcv.c: New test.
>   * gcc.target/riscv/rvv/autovec/shift-scalar-rv64gcv.c: New test.
>   * gcc.target/riscv/rvv/autovec/shift-scalar-template.h: New test.
>   * gcc.target/riscv/rvv/autovec/shift-template.h: New test.
>   * gcc.target/riscv/rvv/autovec/shift-run-template.h: New test.
>   * gcc.target/riscv/rvv/autovec/vadd-run-template.h: New test.
>   * gcc.target/riscv/rvv/autovec/vadd-rv32gcv.c: New test.
>   * gcc.target/riscv/rvv/autovec/vadd-rv64gcv.c: New test.
>   * gcc.target/riscv/rvv/autovec/vadd-template.h: New test.
>   * gcc.target/riscv/rvv/autovec/vand-run-template.h: New test.
>   * gcc.target/riscv/rvv/autovec/vand-rv32gcv.c: New test.
>   * gcc.target/riscv/rvv/autovec/vand-rv64gcv.c: New test.
>   * gcc.target/riscv/rvv/autovec/vand-template.h: New test.
>   * gcc.target/riscv/rvv/autovec/vdiv-run-template.h: New test.
>   * gcc.target/riscv/rvv/autovec/vdiv-rv32gcv.c: New test.
>   * gcc.target/riscv/rvv/autovec/vdiv-rv64gcv.c: New test.
>   * gcc.target/riscv/rvv/autovec/vdiv-template.h: New test.
>   * gcc.target/riscv/rvv/autovec/vmax-run-template.h: New test.
>   * gcc.target/riscv/rvv/autovec/vmax-rv32gcv.c: New test.
>   * gcc.target/riscv/rvv/autovec/vmax-rv64gcv.c: New test.
>   * gcc.target/riscv/rvv/autovec/vmax-template.h: New test.
>   * gcc.target/riscv/rvv/autovec/vmin-run-template.h: New test.
>   * gcc.target/riscv/rvv/autovec/vmin-rv32gcv.c: New test.
>   * gcc.target/riscv/rvv/autovec/vmin-rv64gcv.c: New test.
>   * gcc.target/riscv/rvv/autovec/vmin-template.h: New test.
>   * gcc.target/riscv/rvv/autovec/vmul-run-template.h: New test.
>   * gcc.target/riscv/rvv/autovec/vmul-rv32gcv.c: New test.
>   * gcc.target/riscv/rvv/autovec/vmul-rv64gcv.c: New test.
>   * gcc.target/riscv/rvv/autovec/vmul-template.h: New test.
>   * gcc.target/riscv/rvv/autovec/vor-run-template.h: New test.
>   * gcc.target/riscv/rvv/autovec/vor-rv32gcv.c: New test.
>   * gcc.target/riscv/rvv/autovec/vor-rv64gcv.c: New test.
>   * gcc.target/riscv/rvv/autovec/vor-template.h: New test.
>   * gcc.target/riscv/rvv/autovec/vrem-run-template.h: New test.
>   * gcc.target/riscv/rvv/autovec/vrem-rv32gcv.c: New test.
>   * gcc.target/riscv/rvv/autovec/vrem-rv64gcv.c: New test.
>   * gcc.target/riscv/rvv/autovec/vrem-template.h: New test.
>   * gcc.target/riscv/rvv/autovec/vsub-run-template.h: New test.
>   * gcc.target/riscv/rvv/autovec/vsub-rv32gcv.c: New test.
>   * gcc.target/riscv/rvv/autovec/vsub-rv64gcv.c: New test.
>   * gcc.target/riscv/rvv/autovec/vsub-template.h: New test.
>   * gcc.target/riscv/rvv/autovec/vxor-run-template.h: New test.
>   * gcc.target/riscv/rvv/autovec/vxor-rv32gcv.c: New test.
>   * gcc.target/riscv/rvv/autovec/vxor-rv64gcv.c: New test.
>   * gcc.target/riscv/rvv/autovec/vxor-template.h: New test.

I just skimmed them, but nothing jumps out as a problem.  IMO that's
good enough to land them on trunk once the dependencies do.

> ---
>  .../riscv/rvv/autovec/shift-run-template.h|  47 +++
>  .../riscv/rvv/autovec/shift-rv32gcv.c |  12 ++
>  .../riscv/rvv/autovec/shift-rv64gcv.c |  12 ++
>  .../riscv/rvv/autovec/shift-scalar-rv32gcv.c  |   7 ++
>  .../riscv/rvv/autovec/shift-scalar-rv64gcv.c  |   7 ++
>  .../riscv/rvv/autovec/shift-scalar-template.h | 119 ++
>  .../riscv/rvv/autovec/shift-template.h|  34 +
>  .../riscv/rvv/autovec/vadd-run-template.h |  64 ++
>  .../riscv/rvv/autovec/vadd-rv32gcv.c  |   8 ++
>  .../riscv/rvv/autovec/vadd-rv64gcv.c  |   8 ++
>  .../riscv/rvv/autovec/vadd-template.h |  56 +
>  .../riscv/rvv/autovec/vand-run-template.h |  64 ++
>  .../riscv/rvv/autovec/vand-rv32gcv.c  |   8 ++
>  .../riscv/rvv/autovec/vand-rv64gcv.c  |   8 ++
>  .../riscv/rvv/autovec/vand-template.h |  56 +
>  .../riscv/rvv/autovec/vdiv-run-template.h |  42 +++
>  .../riscv/rvv/autovec/vdiv-rv32gcv.c  |  10 ++
>  .../riscv/rvv/autovec/vdiv-rv64gcv.c  |  10 ++
>  .../riscv/rvv/autovec/vdiv-template.h |  34 +
>  .../riscv/rvv

Re: [PATCH v2] RISC-V: Split off shift patterns for autovectorization.

2023-05-11 Thread Palmer Dabbelt

On Thu, 11 May 2023 07:21:30 PDT (-0700), jeffreya...@gmail.com wrote:

On 5/11/23 04:33, Robin Dapp wrote:

"csr_operand" does seem wrong, though, as that just accepts constants.
Maybe "arith_operand" is the way to go?  I haven't looked at the
V immediates though.


I was pondering changing the shift-count operand to QImode everywhere
but that indeed does not help code generation across the board.  It can
still work but might require extra patterns here and there.

Yea.  It's a GCC wart and there hasn't ever been a clear best direction
on the mode for the shift count.  If you use QImode, as you note you
often end up having to add various patterns to avoid useless conversions
and such.


Yes, and I think given that we have so much weirdness for the sub-XLEN 
types in the RISC-V port we'd need to have a lot of fairly large 
patterns and some truncation-based fallbacks.  We've got some of those 
for the integer shifts already, though, so maybe it's the way to go?  

FWIW, I was trying to suggest X or REG as the shift amount and thought 
we'd done it that way for the integer shifts too.   I think we can 
reason about that with just some tiny code snippets, even if it's not 
the right way to go long term (as per below).  Probably a minor win, 
though, and I don't think it needs to block the patches.


Also: looks like I was wrong and "csr_operand" does the correct thing 
here because there's only a 5-bit immediate for the shift amounts.  We 
should probably name it something else, though, as this has nothing to 
do with CSRs...
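
To spell out the constraint: as far as I can tell the constant half of csr_operand only accepts values that fit a 5-bit unsigned immediate, and anything else has to come in a register.  A small C sketch of that range check follows.  fits_uimm5 is a hypothetical name, not something in the port, and the 0..31 range is my reading of the predicate rather than something I've re-checked against the sources.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper: true if VALUE can be encoded as a 5-bit
   unsigned immediate, i.e. it lies in the range 0..31.  */
bool
fits_uimm5 (int64_t value)
{
  return value >= 0 && value <= 31;
}
```

A constant shift count that passes this check could be used directly as an immediate; larger or non-constant counts would go through a register instead.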



I suspect QImode isn't ideal on a target like RV where we don't really
have QImode operations.  So all we do is force the introduction of
subregs all over the place to force the operand in to QImode.  It's
something I'd like to explore, but would obviously require a fair amount
of benchmarking to be able to confidently say which is better.


Folks have tried a few times and it's never ended up better.  I do think 
we're at a local minimum here, though -- ie, explicitly handling the 
shorter types would result in better generated code if we got everything 
right.  Gut feeling is that'd require a meaningful amount of middle-end 
work, though, as we're sufficiently different than MIPS here (and 
arm64/x86 have many of the ops).


Nobody in Rivos land is looking at this right now, though it's a pretty 
common red flag for new people and frequently trips up code gen so that 
might change with little notice...



Jeff


[PATCH] RISC-V: Add v_uimm_operand

2023-05-11 Thread Palmer Dabbelt
The vector shift immediates happen to have the same constraints as some
of the CSR-related operands, but it's a different usage.  This adds a
name for them, so I don't get confused again next time.

gcc/ChangeLog:

* config/riscv/autovec.md (shifts): Use v_uimm_operand.
* config/riscv/predicates.md (v_uimm_operand): New predicate.
---
I haven't even build tested this one, I just saw it when reviewing some
patch and figured I'd send it along.
---
 gcc/config/riscv/autovec.md| 2 +-
 gcc/config/riscv/predicates.md | 5 +
 2 files changed, 6 insertions(+), 1 deletion(-)

diff --git a/gcc/config/riscv/autovec.md b/gcc/config/riscv/autovec.md
index ac0c939d277..daad51abbc2 100644
--- a/gcc/config/riscv/autovec.md
+++ b/gcc/config/riscv/autovec.md
@@ -132,7 +132,7 @@ (define_expand "3"
   [(set (match_operand:VI 0 "register_operand")
 (any_shift:VI
  (match_operand:VI 1 "register_operand")
- (match_operand:<VEL> 2 "csr_operand")))]
+ (match_operand:<VEL> 2 "v_uimm_operand")))]
   "TARGET_VECTOR"
 {
   if (!CONST_SCALAR_INT_P (operands[2]))
diff --git a/gcc/config/riscv/predicates.md b/gcc/config/riscv/predicates.md
index e5adf06fa25..62007d6c6e3 100644
--- a/gcc/config/riscv/predicates.md
+++ b/gcc/config/riscv/predicates.md
@@ -43,6 +43,11 @@ (define_predicate "csr_operand"
   (ior (match_operand 0 "const_csr_operand")
(match_operand 0 "register_operand")))
 
+;; V has 32-bit unsigned immediates.  This happens to be the same constraint as
+;  the csr_operand, but it's not CSR related.
+(define_predicate "v_uimm_operand"
+  (match_operand 0 "csr_operand"))
+
 (define_predicate "sle_operand"
   (and (match_code "const_int")
(match_test "SMALL_OPERAND (INTVAL (op) + 1)")))
-- 
2.40.0



[PATCH v2] RISC-V: Add vector_scalar_shift_operand

2023-05-11 Thread Palmer Dabbelt
The vector shift immediates happen to have the same constraints as some
of the CSR-related operands, but it's a different usage.  This adds a
name for them, so I don't get confused again next time.

gcc/ChangeLog:

* config/riscv/autovec.md (shifts): Use
  vector_scalar_shift_operand.
* config/riscv/predicates.md (vector_scalar_shift_operand): New
  predicate.
---
Still haven't build-tested it, my box is busy.

Changes since v1 <20230511182555.26183-1-pal...@rivosinc.com>:
* Change the name to "vector_scalar_shift_operand", as per Juzhe's
  suggestion.
* Add a missing second ";" in the comment.
---
 gcc/config/riscv/autovec.md| 2 +-
 gcc/config/riscv/predicates.md | 5 +
 2 files changed, 6 insertions(+), 1 deletion(-)

diff --git a/gcc/config/riscv/autovec.md b/gcc/config/riscv/autovec.md
index ac0c939d277..4561fcbe957 100644
--- a/gcc/config/riscv/autovec.md
+++ b/gcc/config/riscv/autovec.md
@@ -132,7 +132,7 @@ (define_expand "3"
   [(set (match_operand:VI 0 "register_operand")
 (any_shift:VI
  (match_operand:VI 1 "register_operand")
- (match_operand:<VEL> 2 "csr_operand")))]
+ (match_operand:<VEL> 2 "vector_scalar_shift_operand")))]
   "TARGET_VECTOR"
 {
   if (!CONST_SCALAR_INT_P (operands[2]))
diff --git a/gcc/config/riscv/predicates.md b/gcc/config/riscv/predicates.md
index e5adf06fa25..90e6f942c97 100644
--- a/gcc/config/riscv/predicates.md
+++ b/gcc/config/riscv/predicates.md
@@ -43,6 +43,11 @@ (define_predicate "csr_operand"
   (ior (match_operand 0 "const_csr_operand")
(match_operand 0 "register_operand")))
 
+;; V has 32-bit unsigned immediates.  This happens to be the same constraint as
+;; the csr_operand, but it's not CSR related.
+(define_predicate "vector_scalar_shift_operand"
+  (match_operand 0 "csr_operand"))
+
 (define_predicate "sle_operand"
   (and (match_code "const_int")
(match_test "SMALL_OPERAND (INTVAL (op) + 1)")))
-- 
2.40.0



Re: [PATCH] RISC-V: Add v_uimm_operand

2023-05-11 Thread Palmer Dabbelt

On Thu, 11 May 2023 15:00:48 PDT (-0700), juzhe.zh...@rivai.ai wrote:

 ;; V has 32-bit unsigned immediates.  This happens to be the same constraint as
 ;  the csr_operand, but it's not CSR related.
(define_predicate "v_uimm_operand"
  (match_operand 0 "csr_operand"))

It should be 5-bit unsigned immediates.

To keep the naming consistent, it should start with "vector_", so I suggest 
"vector_scalar_shift_operand".


Makes sense, I sent a v2.


RISC-V Test Errors and Failures

2023-05-16 Thread Palmer Dabbelt

A few of us were talking about test-related issues in the patchwork meeting
this morning.  I bumped to trunk and did a full rebuild, I'm getting the
following (it's in riscv-systems-ci/riscv-gnu-toolchain).  This is about what I
remember seeing last time I ran the tests, which was a week or so ago.  I
figured it'd be best to just blast the lists, as Jeff said his test running had
been hanging so there might be some issue preventing folks from seeing the
failures.

I guess I didn't get time to look last time and I doubt things are looking any
better right now.  I'll try and take a look at some point, but any help would
of course be appreciated.

$ cat toolchain/report
make[1]: Entering directory '/scratch/merges/rgt-gcc-trunk/toolchain'
/scratch/merges/rgt-gcc-trunk/riscv-gnu-toolchain/scripts/testsuite-filter gcc glibc 
/scratch/merges/rgt-gcc-trunk/riscv-gnu-toolchain/test/allowlist `find 
build-gcc-linux-stage2/gcc/testsuite/ -name *.sum |paste -sd "," -`
=== g++: Unexpected fails for rv64imac lp64 medlow ===
FAIL: g++.dg/contracts/contracts-tmpl-spec2.C   output pattern test
=== g++: Unexpected fails for rv32imac ilp32 medlow ===
FAIL: g++.dg/contracts/contracts-tmpl-spec2.C   output pattern test
FAIL: g++.dg/modules/xtreme-header-5_c.C -std=c++2a (test for excess errors)
FAIL: g++.dg/modules/xtreme-header-5_c.C -std=c++2b (test for excess errors)
=== g++: Unexpected fails for rv64imafdc lp64d medlow ===
FAIL: g++.dg/contracts/contracts-tmpl-spec2.C   output pattern test
=== g++: Unexpected fails for rv32imafdc ilp32d medlow ===
FAIL: g++.dg/contracts/contracts-tmpl-spec2.C   output pattern test
FAIL: g++.dg/modules/xtreme-header-5_c.C -std=c++2a (test for excess errors)
FAIL: g++.dg/modules/xtreme-header-5_c.C -std=c++2b (test for excess errors)
=== g++: Unexpected fails for rv64imafdcv lp64d  ===
FAIL: g++.dg/contracts/contracts-tmpl-spec2.C   output pattern test
FAIL: g++.target/riscv/rvv/base/bug-10.C execution test
FAIL: g++.target/riscv/rvv/base/bug-11.C execution test
FAIL: g++.target/riscv/rvv/base/bug-12.C execution test
FAIL: g++.target/riscv/rvv/base/bug-13.C execution test
FAIL: g++.target/riscv/rvv/base/bug-14.C execution test
FAIL: g++.target/riscv/rvv/base/bug-15.C execution test
FAIL: g++.target/riscv/rvv/base/bug-16.C execution test
FAIL: g++.target/riscv/rvv/base/bug-17.C execution test
FAIL: g++.target/riscv/rvv/base/bug-2.C execution test
FAIL: g++.target/riscv/rvv/base/bug-23.C execution test
FAIL: g++.target/riscv/rvv/base/bug-3.C execution test
FAIL: g++.target/riscv/rvv/base/bug-4.C execution test
FAIL: g++.target/riscv/rvv/base/bug-5.C execution test
FAIL: g++.target/riscv/rvv/base/bug-6.C execution test
FAIL: g++.target/riscv/rvv/base/bug-7.C execution test
FAIL: g++.target/riscv/rvv/base/bug-8.C execution test
FAIL: g++.target/riscv/rvv/base/bug-9.C execution test
=== g++: Unexpected fails for rv32imafdcv ilp32d  ===
FAIL: g++.dg/contracts/contracts-tmpl-spec2.C   output pattern test
FAIL: g++.dg/modules/xtreme-header-5_c.C -std=c++2a (test for excess errors)
FAIL: g++.dg/modules/xtreme-header-5_c.C -std=c++2b (test for excess errors)
FAIL: g++.target/riscv/rvv/base/bug-10.C execution test
FAIL: g++.target/riscv/rvv/base/bug-11.C execution test
FAIL: g++.target/riscv/rvv/base/bug-12.C execution test
FAIL: g++.target/riscv/rvv/base/bug-13.C execution test
FAIL: g++.target/riscv/rvv/base/bug-14.C (test for excess errors)
FAIL: g++.target/riscv/rvv/base/bug-15.C execution test
FAIL: g++.target/riscv/rvv/base/bug-16.C execution test
FAIL: g++.target/riscv/rvv/base/bug-17.C execution test
FAIL: g++.target/riscv/rvv/base/bug-18.C (test for excess errors)
FAIL: g++.target/riscv/rvv/base/bug-19.C (test for excess errors)
FAIL: g++.target/riscv/rvv/base/bug-2.C execution test
FAIL: g++.target/riscv/rvv/base/bug-20.C (test for excess errors)
FAIL: g++.target/riscv/rvv/base/bug-21.C (test for excess errors)
FAIL: g++.target/riscv/rvv/base/bug-22.C (test for excess errors)
FAIL: g++.target/riscv/rvv/base/bug-23.C execution test
FAIL: g++.target/riscv/rvv/base/bug-3.C execution test
FAIL: g++.target/riscv/rvv/base/bug-4.C execution test
FAIL: g++.target/riscv/rvv/base/bug-5.C execution test
FAIL: g++.target/riscv/rvv/base/bug-6.C execution test
FAIL: g++.target/riscv/rvv/base/bug-7.C execution test
FAIL: g++.target/riscv/rvv/base/bug-8.C execution test
FAIL: g++.target/riscv/rvv/base/bug-9.C (test for excess errors)
=== g++: Unexpected fails for rv64gczba_zbb_zbc_zbs lp64d  ===
FAIL: g++.dg/contracts/contracts-tmpl-spec2.C   output pattern test
=== gcc: Unexpected fails for rv64imac lp64 medlow ===
ERROR: tcl error sourcing 
/scratch/merges/rgt-gcc-trunk/riscv-gnu-toolchain/gcc/gcc/testsuite/gcc.c-torture/execute/builtins/builtins.exp.
ERROR: torture-init: torture_without_loops is not empty as expected
ERROR: tcl error sourcing 
/scratch/merges/rgt-gcc-trunk/risc

Re: RISC-V Test Errors and Failures

2023-05-16 Thread Palmer Dabbelt

On Tue, 16 May 2023 17:16:11 PDT (-0700), Vineet Gupta wrote:

On 5/16/23 16:06, Palmer Dabbelt wrote:

A few of us were talking about test-related issues in the patchwork
meeting
this morning.  I bumped to trunk and did a full rebuild, I'm getting the
following (it's in riscv-systems-ci/riscv-gnu-toolchain).  This is
about what I
remember seeing last time I ran the tests, which was a week or so ago.  I
figured it'd be best to just blast the lists, as Jeff said his test
running had
been hanging so there might be some issue preventing folks from seeing
the
failures.

I guess I didn't get time to look last time and I doubt things are
looking any
better right now.  I'll try and take a look at some point, but any
help would
of course be appreciated.


Yes, I was seeing similar tcl errors and such, and in my case an even
higher count.
Also, for posterity, what was your configure command line?  Multilibs or no?


If only I'd saved those in the build somewhere... :)

It's all in github.com/palmer-dabbelt/riscv-systems-ci, which points to 
riscv-gnu-toolchain.  I've always got uncommitted diffs in my various 
local checkouts, but I think this would only be


    toolchain: toolchain/install.stamp

    toolchain/install.stamp: toolchain/Makefile
            $(MAKE) -C $(dir $<)
            date > $@

    toolchain/Makefile: riscv-gnu-toolchain/configure
            mkdir -p $(dir $@)
            env -C $(dir $@) $(abspath $<) --prefix="$(abspath $(dir $@)/install)" --enable-linux --enable-multilib --enable-gcc-checking=yes

    toolchain/check.log: toolchain/install.stamp
            $(MAKE) -C $(dir $<) check \
                    GLIBC_TARGET_BOARDS_EXTRA="riscv-sim/-march=rv64gczba_zbb_zbc_zbs/-mabi=lp64d riscv-sim/-march=rv64imafdcv/-mabi=lp64d riscv-sim/-march=rv32imafdcv/-mabi=ilp32d" |& tee $@
            touch -c $@

    toolchain/report: toolchain/check.log
            $(MAKE) -C $(dir $<) report \
                    GLIBC_TARGET_BOARDS_EXTRA="riscv-sim/-march=rv64gczba_zbb_zbc_zbs/-mabi=lp64d riscv-sim/-march=rv64imafdcv/-mabi=lp64d riscv-sim/-march=rv32imafdcv/-mabi=ilp32d" |& tee $@
            touch -c $@


We really need to add some CI around RV toolchains to trip on these sooner!


Sounds like you're volunteering to set one up?



Re: RISC-V Test Errors and Failures

2023-05-16 Thread Palmer Dabbelt

On Tue, 16 May 2023 18:04:37 PDT (-0700), Vineet Gupta wrote:

+ Christoph, Jiawei

On 5/16/23 17:20, Palmer Dabbelt wrote:

We really need to add some CI around RV toolchains to trip on these
sooner !


Sounds like you're volunteering to set one up?


Patrick's GitHub CI patch seems to be a great start. Let's wait for it to
get merged; that will at least catch RV toolchain snafus, although the
granularity of testing is not ideal (tc changes are not so frequent).


You mean riscv-gnu-toolchain changes?  That's not super useful for GCC 
development, they're on a fork.



I think the most pressing need is bleeding edge gcc regression tracking.
  @Jeff is anything setup on sourceware and/or usable ? I thought they
do have existing bots for some arches to spin up build / run - perhaps
runs are native and not qemu.


IIRC Jeff said his builders were hanging right now.


FWIW rivos gitlab CI (not public) has capability to track upstream gcc
(Kevin almost has it working), but there is no easy way to publish it
for rest of the world and I'd rather that be done in a public infra.


+Kevin

At least having the failure lists public would be a must-have, and I 
think that's tricky to do with gitlab.  Bjorn and Conor have something 
glued to the kernel patchwork that uploads test results to github as 
snippets, but IIRC we're trying to replace it with something more 
directly visible.



Didn't ISCAS/PLCT have such infra? Sorry, Kito asked the same question
this morning, but I was not fully awake, so I don't remember what Jiawei
replied.


I didn't even remember he asked ;)


Re: RISC-V Test Errors and Failures

2023-05-16 Thread Palmer Dabbelt

On Tue, 16 May 2023 19:00:12 PDT (-0700), Jeff Law wrote:



On 5/16/23 19:29, Palmer Dabbelt wrote:




I think the most pressing need is bleeding edge gcc regression tracking.
  @Jeff is anything setup on sourceware and/or usable ? I thought they
do have existing bots for some arches to spin up build / run - perhaps
runs are native and not qemu.


IIRC Jeff said his builders were hanging right now.

Correct.  More precisely, the riscv64 builds hang.  Not sure if it's
stage2 or stage3 of the bootstrap.  Been happening for the last couple
weeks.  I suspect some codegen bug in the riscv port.  I'll have to
bisect it which will be quite painful.


Can anyone else do it?  If the only blocker for having an upstream 
regression CI thing is just sorting out why it broke over the last few 
weeks then I'm happy to try and trick someone around here into doing 
some work...


Re: Re: RISC-V Test Errors and Failures

2023-05-16 Thread Palmer Dabbelt

On Tue, 16 May 2023 19:07:01 PDT (-0700), juzhe.zh...@rivai.ai wrote:

Oh, I see. Kito has added /* { dg-do run { target { riscv_vector } } } */.
But not all RVV tests use this, and I am not sure whether it works.
I think Kito can answer that.
If it does, I think we should add it to all of them.


Unless I'm missing something, it looks like that only checks if GCC is 
compiling for V.  Nothing appears to be checking if the system the tests 
are running on supports V.


   # Return 1 if the target has RISC-V vector extension, 0 otherwise.
   # Cache the result.
   
   proc check_effective_target_riscv_vector { } {
       # Check that we are compiling for V by checking the __riscv_v macro.
       return [check_no_compiler_messages riscv_vector assembly {
           #if !defined(__riscv_v)
           #error "__riscv_v not defined!"
           #endif
       }]
   }

Those are really just two different things.

It seems pretty reasonable to me to just avoid running the tests when 
the DUT lacks V, but I'm never great with DG.  We should probably add 
similar checks for the other ISA extensions, there's going to be a bunch 
of this.
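
To make the "two different things" concrete, here is a minimal C sketch of the
two checks side by side.  It is only an illustration, not what the testsuite
does: the helper names are made up, and it uses the single-letter extension
bits of a hwcap word rather than the hwprobe interface mentioned elsewhere in
the thread.  The assumption that bit (letter - 'a') marks a present extension
follows the Linux RISC-V AT_HWCAP convention.

```c
#include <assert.h>
#include <stdbool.h>

/* Compile-time view: 1 only when the compiler was told to target V
   (for example via -march=rv64gcv), which is what __riscv_v reports.  */
#if defined(__riscv_v)
#define BUILT_FOR_V 1
#else
#define BUILT_FOR_V 0
#endif

/* Run-time view: test one single-letter extension bit in a hwcap word.
   Assumption: bit (letter - 'a') is set when the extension is present,
   following the Linux RISC-V AT_HWCAP convention.  */
static bool hwcap_has_ext(unsigned long hwcap, char letter)
{
    assert(letter >= 'a' && letter <= 'z');
    return (hwcap >> (letter - 'a')) & 1UL;
}

/* A vector execution test should run only when both views agree:
   the binary was built for V and the machine reports V.  */
static bool can_run_vector_test(unsigned long hwcap)
{
    return BUILT_FOR_V && hwcap_has_ext(hwcap, 'v');
}
```

On Linux the hwcap word could come from getauxval(AT_HWCAP) in <sys/auxv.h>;
whether a given kernel or qemu-user build actually reports V there is
something to verify rather than assume.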




Thanks.


juzhe.zh...@rivai.ai
 
From: Andrew Pinski

Date: 2023-05-17 10:02
To: juzhe.zh...@rivai.ai
CC: gcc-patches; palmer; Kito.cheng
Subject: Re: RISC-V Test Errors and Failures
On Tue, May 16, 2023 at 6:58 PM juzhe.zh...@rivai.ai
 wrote:


Hi, Palmer.
I saw that your patch shows a lot of run-time failures (execution failures) in
the C++ tests:
bug-*.C

These are RVV intrinsic API tests that came from Kito, and I have already
fixed all of them.
I just double-checked, and they all pass.
I think your regression environment may not have the simulator (QEMU,
Spike, or gem5) set up correctly.
For example, the vector extension may not be enabled in the simulator. I don't
know, but you could try that.
 
So on x86_64, we test to see if you have the right vector unit before
running those tests. The same thing is true on powerpc (and I think
aarch64 does the same for SVE now too). The reason I am asking is
that I would need to run the testsuite using the simulator as set up
for the RISC-V ISA I am using rather than the one with everything on.
So does the RVV runtime testsuite check whether you can run RVV
before running the tests (or does it run them and report that they passed)?
 
Thanks,

Andrew Pinski
 


Thanks.


juzhe.zh...@rivai.ai
 


Re: RISC-V Test Errors and Failures

2023-05-16 Thread Palmer Dabbelt

On Tue, 16 May 2023 19:32:21 PDT (-0700), jeffreya...@gmail.com wrote:



On 5/16/23 20:05, Palmer Dabbelt wrote:

On Tue, 16 May 2023 19:00:12 PDT (-0700), Jeff Law wrote:



On 5/16/23 19:29, Palmer Dabbelt wrote:




I think the most pressing need is bleeding edge gcc regression
tracking.
  @Jeff is anything setup on sourceware and/or usable ? I thought they
do have existing bots for some arches to spin up build / run - perhaps
runs are native and not qemu.


IIRC Jeff said his builders were hanging right now.

Correct.  More precisely, the riscv64 builds hang.  Not sure if it's
stage2 or stage3 of the bootstrap.  Been happening for the last couple
weeks.  I suspect some codegen bug in the riscv port.  I'll have to
bisect it which will be quite painful.


Can anyone else do it?  If the only blocker for having an upstream
regression CI thing is just sorting out why it broke over the last few
weeks then I'm happy to try and trick someone around here into doing
some work...

Probably easiest for me unless someone else has a chroot environment
handy.  It's not hard to do the bisection, it just involves a lot of
waiting.


By "chroot environment" you mean something like a 
debootstrap-into-chroot with qemu-user/binfmt-misc?  I don't have that 
setup right now, but it wouldn't be a big lift.



I've just about got my problem from earlier today under control,
then I can probably start bisection.


That's fine with me, I have plenty of other stuff to do ;)


Re: RISC-V Test Errors and Failures

2023-05-16 Thread Palmer Dabbelt

On Tue, 16 May 2023 19:46:28 PDT (-0700), Vineet Gupta wrote:

On 5/16/23 19:21, Kito Cheng wrote:

Palmer:

For short-term, this should help your internal test:
https://github.com/riscv-collab/riscv-gnu-toolchain/pull/1233


That only helps if using bleeding edge toolchain scripts (which I
regularly do and so did Patrick).

Palmer has a fork of toolchain scripts and I'm assuming he hasn't caught
up to that point ;-)


I'm fine dropping the fork if the bugs have been fixed.  IIRC last week 
we were still waiting for them to merge something?



-Vineet


Re: RISC-V Test Errors and Failures

2023-05-16 Thread Palmer Dabbelt

On Tue, 16 May 2023 19:51:48 PDT (-0700), Patrick O'Neill wrote:


On 5/16/23 19:47, Palmer Dabbelt wrote:

On Tue, 16 May 2023 19:46:28 PDT (-0700), Vineet Gupta wrote:

On 5/16/23 19:21, Kito Cheng wrote:

Palmer:

For short-term, this should help your internal test:
https://github.com/riscv-collab/riscv-gnu-toolchain/pull/1233


That only helps if using bleeding edge toolchain scripts (which I
regularly do and so did Patrick).

Palmer has a fork of toolchain scripts and I'm assuming he hasn't caught
up to that point ;-)


I'm fine dropping the fork if the bugs have been fixed.  IIRC last
week we were still waiting for them to merge something?

The testsuite was broken last week, but was fixed by
https://github.com/riscv-collab/riscv-gnu-toolchain/pull/1247 which was
merged last Friday.

That might be the thing you were thinking about?


Probably, I'll go try and bump stuff and see if it works...

Thanks!


Re: RISC-V Test Errors and Failures

2023-05-16 Thread Palmer Dabbelt

On Tue, 16 May 2023 20:08:26 PDT (-0700), Vineet Gupta wrote:


On 5/16/23 19:53, Palmer Dabbelt wrote:


Probably, I'll go try and bump stuff and see if it works...


Word of caution: Best to not disturb your existing setup, and try a fresh
checkout first


Even easier, I think I can get away with just

diff --git a/scripts/wrapper/qemu/riscv64-unknown-linux-gnu-run b/scripts/wrapper/qemu/riscv64-unknown-linux-gnu-run
index 94d6ec5..efc3a80 100755
--- a/scripts/wrapper/qemu/riscv64-unknown-linux-gnu-run
+++ b/scripts/wrapper/qemu/riscv64-unknown-linux-gnu-run
@@ -12,4 +12,4 @@ done

 xlen="$(readelf -h $1 | grep 'Class' | cut -d: -f 2 | xargs echo | sed 's/^ELF//')"

-qemu-riscv$xlen -r 5.10 "${qemu_args[@]}" -L ${RISC_V_SYSROOT} -cpu rv$xlen,zba=on,zbb=on,zbc=on,zbs=on "$@"
+qemu-riscv$xlen -r 5.10 "${qemu_args[@]}" -L ${RISC_V_SYSROOT} -cpu rv$xlen,zba=on,zbb=on,zbc=on,zbs=on,v=on "$@"

for now.  I'm going to throw together hwprobe for qemu-user, from looking at
the AVX stuff it should be pretty easy to plumb that into DG and then get the
detection going.


Re: [PATCH] RISC-V: improve codegen for large constants with same 32-bit lo and hi parts [2]

2023-05-19 Thread Palmer Dabbelt

On Fri, 19 May 2023 09:33:34 PDT (-0700), jeffreya...@gmail.com wrote:



On 5/18/23 14:57, Vineet Gupta wrote:

[part #2 of PR/109279]

SPEC2017 deepsjeng uses large constants, for which GCC currently generates
less-than-ideal code. This fix improves codegen for large constants whose
low and high 32-bit parts are the same, e.g.

long long f(void) { return 0x0101010101010101ull; }

Before
 li   a5,0x101
 addi a5,a5,0x101
 mv   a0,a5
 slli a5,a5,32
 add  a0,a5,a0
 ret

With patch
 li   a5,0x101
 addi a5,a5,0x101
 slli a0,a5,32
 add  a0,a0,a5
 ret

This is testsuite clean.

gcc/ChangeLog:

* config/riscv/riscv.cc (riscv_split_integer): if loval is equal
  to hival, ASHIFT the corresponding regs.

LGTM.  Please install.  Thanks for taking care of this!  The updated
sequence looks good.


Works for me.  Did you start that performance backports branch?  Either 
way, I think this should go on it.
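
The arithmetic behind the shorter sequence can be sketched in plain C.  This
is an illustration only, not the code in riscv_split_integer: the function
names are invented here, and the halves are treated as zero-extended, whereas
the real splitter works with a sign-extended low part.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* True when the upper and lower 32-bit halves of val are identical.  */
static bool halves_match(uint64_t val)
{
    return (uint32_t)(val >> 32) == (uint32_t)val;
}

/* Rebuild the constant the way the improved sequence does:
   materialize the half once, shift a copy left by 32, then add.  */
static uint64_t build_from_half(uint32_t half)
{
    uint64_t lo = half;          /* li/addi: a5 = half      */
    uint64_t hi = lo << 32;      /* slli:    a0 = a5 << 32  */
    uint64_t result = hi + lo;   /* add:     a0 = a0 + a5   */

    /* Sanity check: the result always has matching halves.  */
    assert(halves_match(result));
    return result;
}
```

With the example from the patch, build_from_half(0x01010101) gives
0x0101010101010101, and only one register copy of the half is needed, which is
why the extra mv disappears.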


Re: [PATCH V2] RISC-V: Fix magic number of RVV auto-vectorization expander

2023-05-23 Thread Palmer Dabbelt
_for_pred_mov (subpart_mode),
-   riscv_vector::RVV_UNOP, operands);
+ rtx operands[] = {subreg, mem, ops[4]};
+ emit_vlmax_insn (code_for_pred_mov (subpart_mode), RVV_UNOP,
+  operands);
}
  else
emit_move_insn (subreg, mem);
@@ -1147,9 +1144,9 @@ expand_tuple_move (rtx *ops)

  if (fractional_p)
{
- rtx operands[3] = {mem, subreg, ops[4]};
- emit_vlmax_insn (code_for_pred_mov (subpart_mode),
-   riscv_vector::RVV_UNOP, operands);
+ rtx operands[] = {mem, subreg, ops[4]};
+ emit_vlmax_insn (code_for_pred_mov (subpart_mode), RVV_UNOP,
+  operands);
}
  else
emit_move_insn (mem, subreg);
@@ -1281,8 +1278,8 @@ expand_vector_init_insert_elems (rtx target, const rvv_builder &builder,
   unsigned int unspec
= FLOAT_MODE_P (mode) ? UNSPEC_VFSLIDE1DOWN : UNSPEC_VSLIDE1DOWN;
   insn_code icode = code_for_pred_slide (unspec, mode);
-  rtx ops[3] = {target, target, builder.elt (i)};
-  emit_vlmax_insn (icode, riscv_vector::RVV_BINOP, ops);
+  rtx ops[] = {target, target, builder.elt (i)};
+  emit_vlmax_insn (icode, RVV_BINOP, ops);
 }
 }

diff --git a/gcc/config/riscv/riscv.cc b/gcc/config/riscv/riscv.cc
index e7300b2e97c..09fc9e5d95e 100644
--- a/gcc/config/riscv/riscv.cc
+++ b/gcc/config/riscv/riscv.cc
@@ -7396,7 +7396,7 @@ vector_zero_call_used_regs (HARD_REG_SET need_zeroed_hardregs)
  emitted_vlmax_vsetvl = true;
}

- rtx ops[3] = {target, CONST0_RTX (mode), vl};
+ rtx ops[] = {target, CONST0_RTX (mode), vl};
  riscv_vector::emit_vlmax_insn (code_for_pred_mov (mode),
 riscv_vector::RVV_UNOP, ops);


Reviewed-by: Palmer Dabbelt 

as both cleanups look better to me.  Thanks!
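
For readers less familiar with the idiom, the "magic number" cleanup amounts to
letting the compiler count the initializers instead of repeating the count in
the declaration.  A small stand-alone C sketch, with a hypothetical operand
type standing in for rtx and an ARRAY_SIZE macro that is not part of the
patch:

```c
#include <assert.h>
#include <stddef.h>

/* Number of elements in a true array (not a pointer).  */
#define ARRAY_SIZE(a) (sizeof(a) / sizeof((a)[0]))

/* Hypothetical stand-in for an operand; GCC's real type is rtx.  */
typedef int operand;

static size_t count_binop_operands(void)
{
    /* No explicit [3]: the compiler counts the initializers, so adding or
       removing an operand cannot leave a stale bound behind.  */
    operand ops[] = {10, 20, 30};
    return ARRAY_SIZE(ops);
}

static size_t count_unop_operands(void)
{
    operand ops[] = {10, 20};
    return ARRAY_SIZE(ops);
}

/* Usage: the deduced sizes match the number of initializers.  */
static void self_check(void)
{
    assert(count_binop_operands() == 3);
    assert(count_unop_operands() == 2);
}
```

With an explicit bound such as ops[3], a list that shrinks to two initializers
would silently gain a zero-initialized third element; the unsized form avoids
that class of mistake.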


Re: Re: [PATCH V2] RISC-V: Fix magic number of RVV auto-vectorization expander

2023-05-23 Thread Palmer Dabbelt

On Tue, 23 May 2023 18:34:00 PDT (-0700), juzhe.zh...@rivai.ai wrote:

Yeah. Can I merge it?


You built it?  Then I'm fine with merging it.





juzhe.zh...@rivai.ai
 
From: Palmer Dabbelt

Date: 2023-05-24 09:32
To: juzhe.zhong
CC: gcc-patches; Kito Cheng; kito.cheng; jeffreyalaw; rdapp.gcc; juzhe.zhong
Subject: Re: [PATCH V2] RISC-V: Fix magic number of RVV auto-vectorization 
expander
On Tue, 23 May 2023 18:28:48 PDT (-0700), juzhe.zh...@rivai.ai wrote:

From: Juzhe-Zhong 

This simple patch removes the magic numbers, which makes the code more
reasonable.

Ok for trunk ?

gcc/ChangeLog:

* config/riscv/riscv-v.cc (expand_vec_series): Remove magic number.
(expand_const_vector): Ditto.
(legitimize_move): Ditto.
(sew64_scalar_helper): Ditto.
(expand_tuple_move): Ditto.
(expand_vector_init_insert_elems): Ditto.
* config/riscv/riscv.cc (vector_zero_call_used_regs): Ditto.

---
 gcc/config/riscv/riscv-v.cc | 53 +
 gcc/config/riscv/riscv.cc   |  2 +-
 2 files changed, 26 insertions(+), 29 deletions(-)

diff --git a/gcc/config/riscv/riscv-v.cc b/gcc/config/riscv/riscv-v.cc
index 478a052a779..fa61a850a22 100644
--- a/gcc/config/riscv/riscv-v.cc
+++ b/gcc/config/riscv/riscv-v.cc
@@ -406,14 +406,14 @@ expand_vec_series (rtx dest, rtx base, rtx step)
   int shift = exact_log2 (INTVAL (step));
   rtx shift_amount = gen_int_mode (shift, Pmode);
   insn_code icode = code_for_pred_scalar (ASHIFT, mode);
-   rtx ops[3] = {step_adj, vid, shift_amount};
-   emit_vlmax_insn (icode, riscv_vector::RVV_BINOP, ops);
+   rtx ops[] = {step_adj, vid, shift_amount};
+   emit_vlmax_insn (icode, RVV_BINOP, ops);
 
Looks like it also removes the "riscv_vector" namespace from some of the 
constants?  No big deal, it's just a different cleanup (assuming it 
still builds and such).
 

 }
   else
 {
   insn_code icode = code_for_pred_scalar (MULT, mode);
-   rtx ops[3] = {step_adj, vid, step};
-   emit_vlmax_insn (icode, riscv_vector::RVV_BINOP, ops);
+   rtx ops[] = {step_adj, vid, step};
+   emit_vlmax_insn (icode, RVV_BINOP, ops);
 }
 }

@@ -428,8 +428,8 @@ expand_vec_series (rtx dest, rtx base, rtx step)
 {
   rtx result = gen_reg_rtx (mode);
   insn_code icode = code_for_pred_scalar (PLUS, mode);
-  rtx ops[3] = {result, step_adj, base};
-  emit_vlmax_insn (icode, riscv_vector::RVV_BINOP, ops);
+  rtx ops[] = {result, step_adj, base};
+  emit_vlmax_insn (icode, RVV_BINOP, ops);
   emit_move_insn (dest, result);
 }
 }
@@ -445,8 +445,8 @@ expand_const_vector (rtx target, rtx src)
   gcc_assert (
 const_vec_duplicate_p (src, &elt)
 && (rtx_equal_p (elt, const0_rtx) || rtx_equal_p (elt, const1_rtx)));
-  rtx ops[2] = {target, src};
-  emit_vlmax_insn (code_for_pred_mov (mode), riscv_vector::RVV_UNOP, ops);
+  rtx ops[] = {target, src};
+  emit_vlmax_insn (code_for_pred_mov (mode), RVV_UNOP, ops);
   return;
 }

@@ -458,16 +458,14 @@ expand_const_vector (rtx target, rtx src)
 we use vmv.v.i instruction.  */
   if (satisfies_constraint_vi (src) || satisfies_constraint_Wc0 (src))
 {
-   rtx ops[2] = {tmp, src};
-   emit_vlmax_insn (code_for_pred_mov (mode), riscv_vector::RVV_UNOP,
-ops);
+   rtx ops[] = {tmp, src};
+   emit_vlmax_insn (code_for_pred_mov (mode), RVV_UNOP, ops);
 }
   else
 {
   elt = force_reg (elt_mode, elt);
-   rtx ops[2] = {tmp, elt};
-   emit_vlmax_insn (code_for_pred_broadcast (mode),
-riscv_vector::RVV_UNOP, ops);
+   rtx ops[] = {tmp, elt};
+   emit_vlmax_insn (code_for_pred_broadcast (mode), RVV_UNOP, ops);
 }

   if (tmp != target)
@@ -536,9 +534,8 @@ legitimize_move (rtx dest, rtx src)
   rtx tmp = gen_reg_rtx (mode);
   if (MEM_P (src))
 {
-   rtx ops[2] = {tmp, src};
-   emit_vlmax_insn (code_for_pred_mov (mode), riscv_vector::RVV_UNOP,
-ops);
+   rtx ops[] = {tmp, src};
+   emit_vlmax_insn (code_for_pred_mov (mode), RVV_UNOP, ops);
 }
   else
 emit_move_insn (tmp, src);
@@ -548,8 +545,8 @@ legitimize_move (rtx dest, rtx src)
   if (satisfies_constraint_vu (src))
 return false;

-  rtx ops[2] = {dest, src};
-  emit_vlmax_insn (code_for_pred_mov (mode), riscv_vector::RVV_UNOP, ops);
+  rtx ops[] = {dest, src};
+  emit_vlmax_insn (code_for_pred_mov (mode), RVV_UNOP, ops);
   return true;
 }

@@ -813,7 +810,7 @@ sew64_scalar_helper (rtx *operands, rtx *scalar_op, rtx vl,
 *scalar_op = force_reg (scalar_mode, *scalar_op);

   rtx tmp = gen_reg_rtx (vector_mode);
-  rtx ops[3] = {tmp, *scalar_op, vl};
+  rtx ops[] = {tmp, *scalar_op, vl};
   riscv_vector::emit_nonvlmax_insn (code_for_pred_broadcast (vector_mode),
 riscv_vector::RVV_UNOP, ops);
   emit_vector_func (operands, tmp);
@@ -1122,9 +1119,9 @@ expand_tuple_move (rtx *ops)

   if (fractional_p)
 {
-   rtx operands[3] = {subreg, mem, ops[4]};
-   emit_vlmax_insn (code_for_pred_mov (subpart_mode),
- riscv_vect

Re: Re: [PATCH V2] RISC-V: Fix magic number of RVV auto-vectorization expander

2023-05-23 Thread Palmer Dabbelt

On Tue, 23 May 2023 18:37:39 PDT (-0700), juzhe.zh...@rivai.ai wrote:

Yes, I built it and regression has passed.


OK, thanks!





juzhe.zh...@rivai.ai
 
From: Palmer Dabbelt

Date: 2023-05-24 09:37
To: juzhe.zhong
CC: gcc-patches; Kito Cheng; kito.cheng; jeffreyalaw; rdapp.gcc
Subject: Re: Re: [PATCH V2] RISC-V: Fix magic number of RVV auto-vectorization 
expander
On Tue, 23 May 2023 18:34:00 PDT (-0700), juzhe.zh...@rivai.ai wrote:

Yeah. Can I merge it?
 
You built it?  Then I'm fine with merging it.
 





Re: [PATCH] RISC-V: Add missing torture-init and torture-finish for rvv.exp

2023-05-24 Thread Palmer Dabbelt

On Wed, 24 May 2023 16:12:20 PDT (-0700), Vineet Gupta wrote:



On 5/24/23 15:13, Vineet Gupta wrote:


PASS: gcc.target/riscv/zmmul-2.c   -O2 -flto -fuse-linker-plugin
-fno-fat-lto-objects  (test for excess errors)
PASS: gcc.target/riscv/zmmul-2.c   -O2 -flto -fuse-linker-plugin
-fno-fat-lto-objects   scan-assembler-times mul\t 1
PASS: gcc.target/riscv/zmmul-2.c   -O2 -flto -fuse-linker-plugin
-fno-fat-lto-objects   scan-assembler-not div\t
PASS: gcc.target/riscv/zmmul-2.c   -O2 -flto -fuse-linker-plugin
-fno-fat-lto-objects   scan-assembler-not rem\t
testcase
/scratch/vineetg/gnu/toolchain-upstream/gcc/gcc/testsuite/gcc.target/riscv/riscv.exp
completed in 60 seconds
Running
/scratch/vineetg/gnu/toolchain-upstream/gcc/gcc/testsuite/gcc.target/riscv/rvv/rvv.exp
...
ERROR: tcl error sourcing
/scratch/vineetg/gnu/toolchain-upstream/gcc/gcc/testsuite/gcc.target/riscv/rvv/rvv.exp.
ERROR: tcl error code NONE
ERROR: torture-init: torture_without_loops is not empty as expected
    while executing
"error "torture-init: torture_without_loops is not empty as expected""
    invoked from within
"if [info exists torture_without_loops] {
    error "torture-init: torture_without_loops is not empty as expected"
    }"
    (procedure "torture-init" line 4)
    invoked from within
"torture-init"
    (file
"/scratch/vineetg/gnu/toolchain-upstream/gcc/gcc/testsuite/gcc.target/riscv/rvv/rvv.exp"
line 42)
    invoked from within
"source
/scratch/vineetg/gnu/toolchain-upstream/gcc/gcc/testsuite/gcc.target/riscv/rvv/rvv.exp"
    ("uplevel" body line 1)
    invoked from within
"uplevel #0 source
/scratch/vineetg/gnu/toolchain-upstream/gcc/gcc/testsuite/gcc.target/riscv/rvv/rvv.exp"
    invoked from within
"catch "uplevel #0 source $test_file_name" msg"
UNRESOLVED: testcase
'/scratch/vineetg/gnu/toolchain-upstream/gcc/gcc/testsuite/gcc.target/riscv/rvv/rvv.exp'
aborted due to Tcl error
testcase
/scratch/vineetg/gnu/toolchain-upstream/gcc/gcc/testsuite/gcc.target/riscv/rvv/rvv.exp
completed in 0 seconds
Running
/scratch/vineetg/gnu/toolchain-upstream/gcc/gcc/testsuite/gcc.target/rl78/rl78.exp
...
...



Never mind. Looks like I found the issue, with just trial and error and
no idea of how this stuff works.
The torture-{init,finish} calls need to be in riscv.exp, not rvv.exp.
Running full tests now.


Thanks!



-Vineet


Re: [PATCH] RISC-V: Enable overlap-by-pieces in case of fast unaliged access

2021-08-16 Thread Palmer Dabbelt

On Mon, 16 Aug 2021 03:02:42 PDT (-0700), Kito Cheng wrote:

Hi Christoph:

Could you submit a v3 patch, which is v1 with the overlap_op_by_pieces field
plus the testcase from v2, and add a few more comments to describe the field?

And add an -mtune=ultra-size so that it can be tested without changing
other behavior?

Hi Palmer:

Are you OK with that?


I'm still not convinced on the performance: like Andrew and I pointed 
out, this is a difficult case for pipelines of this flavor to handle.  
Nobody here knows anything about this pipeline deeply enough to say 
anything definitive, though, so this is really just a guess.


As I'm not convinced this is an obvious performance win I'm not going to 
merge it without a benchmark.  If you're convinced and want to merge it 
that's fine, I don't really care about the performance of the C906 and 
if someone complains we can always just revert it later.
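
For context, the transformation under discussion can be illustrated with a
7-byte copy.  The sketch below is an assumption-laden illustration of the
overlapping by-pieces idea, not GCC's expansion: the function name is invented,
and fixed-size memcpy calls stand in for unaligned word loads and stores.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Copy exactly 7 bytes using two 4-byte pieces that overlap by one byte,
   instead of a 4-byte, a 2-byte and a 1-byte piece.  memcpy with a constant
   size stands in for an unaligned word load/store here.  */
static void copy7_overlapping(unsigned char *dst, const unsigned char *src)
{
    uint32_t first, last;

    assert(dst != NULL && src != NULL);
    memcpy(&first, src, 4);       /* bytes 0..3 */
    memcpy(&last, src + 3, 4);    /* bytes 3..6, re-reading byte 3 */
    memcpy(dst, &first, 4);
    memcpy(dst + 3, &last, 4);    /* rewrites byte 3 with the same value */
}
```

This trades three accesses per direction for two, at the cost of one access
that is misaligned and overlaps the other, which is exactly the case whose
cost on a particular pipeline is in question here.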



On Sat, Aug 14, 2021 at 1:54 AM Christoph Müllner via Gcc-patches
 wrote:


Ping.

On Thu, Aug 5, 2021 at 11:11 AM Christoph Müllner  wrote:
>
> Ping.
>
> > On Thu, Jul 29, 2021 at 9:36 PM Christoph Müllner  wrote:
> >
> > On Thu, Jul 29, 2021 at 8:54 PM Palmer Dabbelt  wrote:
> > >
> > > On Tue, 27 Jul 2021 02:32:12 PDT (-0700), cmuell...@gcc.gnu.org wrote:
> > > > Ok, so if I understand correctly Palmer and Andrew prefer
> > > > overlap_op_by_pieces to be controlled
> > > > by its own field in the riscv_tune_param struct and not by the field
> > > > slow_unaligned_access in this struct
> > > > (i.e. slow_unaligned_access==false is not enough to imply
> > > > overlap_op_by_pieces==true).
> > >
> > > I guess, but I'm not really worried about this at that level of detail
> > > right now.  It's not like the tune structures form any sort of external
> > > interface we have to keep stable, we can do whatever we want with those
> > > fields so I'd just aim for encoding the desired behavior as simply as
> > > possible rather than trying to build something extensible.
> > >
> > > There are really two questions we need to answer: is this code actually
> > > faster for the C906, and is this what the average users wants under -Os.
> >
> > I never mentioned -Os.
> > My main goal is code compiled for -O2, -O3 or even -Ofast.
> > And I want to execute code as fast as possible.
> >
> > Loading hot data from cache is faster when being done by a single
> > load-word instruction than 4 load-byte instructions.
> > Fewer instructions imply less pressure on the instruction cache.
> > Fewer instructions imply less work for a CPU pipeline.
> > Architectures, which don't have a penalty for unaligned accesses
> > therefore observe a performance benefit.
> >
> > What I understand from Andrew's email is that it is not that simple
> > and implementation might have a penalty for overlapping accesses
> > that is high enough to avoid them. I don't have the details for C906,
> > so I can't say if that's the case.
> >
> > > That first one is pretty easy: just running those simple code sequences
> > > under a sweep of page offsets should be sufficient to determine if this
> > > is always faster (in which case it's an easy yes), if it's always slower
> > > (an easy no), or if there's some slow cases like page/cache line
> > > crossing (in which case we'd need to think a bit).
> > >
> > > The second one is a bit trickier.  In the past we'd said this sort of
> > > "actively misalign accesses to generate smaller code" thing
> > > isn't suitable for -Os (as most machines still have very slow unaligned
> > > accesses) but is suitable for -Oz (don't remember if that ever ended up
> > > in GCC, though).  That still seems like a reasonable decision, but if it
> > > turns out that implementations with fast unaligned accesses become the
> > > norm then it'd probably be worth revisiting it.  Not sure exactly how to
> > > determine that tipping point, but I think we're a long way away from it
> > > right now.
> > >
> > > IMO it's really just premature to try and design an encoding of the
> > > tuning parameters until we have an idea of what they are, as we'll just
> > > end up devolving down the path of trying to encode all possible hardware
> > > and that's generally a huge waste of time.  Since there's no ABI here we
> > > can refactor this however we want as new tunings show up.
> >
> > I guess you mean that there needs

Re: [PATCH] RISC-V: Enable overlap-by-pieces in case of fast unaliged access

2021-08-16 Thread Palmer Dabbelt

On Mon, 16 Aug 2021 09:29:16 PDT (-0700), Kito Cheng wrote:

> Could you submit v3 patch which is v1 with overlap_op_by_pieces field,
> testcase from v2 and add a few more comments to describe the field?
>
> And add an -mtune=ultra-size to make it able to test without change
> other behavior?
>
> Hi Palmer:
>
> Are you OK with that?

I'm still not convinced on the performance: like Andrew and I pointed
out, this is a difficult case for pipelines of this flavor to handle.
Nobody here knows anything about this pipeline deeply enough to say
anything definitive, though, so this is really just a guess.


So adding an extra field to indicate this should resolve that?
I believe people should set overlap_op_by_pieces
to true only if they are sure it has benefits.


My only issue there is that we'd have no way to turn it on, but see 
below...



As I'm not convinced this is an obvious performance win I'm not going to
merge it without a benchmark.  If you're convinced and want to merge it
that's fine, I don't really care about the performance of the C906 and
if someone complains we can always just revert it later.


I suppose Christoph has tried it with their internal processor, and it
benefits performance, but that processor can't be open-sourced yet, so the
v2 patch set uses the C906 to demo and test it, since that is
the only processor with slow_unaligned_access=False.


Well, that's a very different discussion.  The C906 tuning model should 
be for the C906, not a proxy for some internal-only processor.  If the 
goal here is to allow this pass to be flipped on by an out-of-tree 
pipeline model then we can talk about it.



I agree on the C906 part: we don't know whether it's a benefit or not, so I
propose adding an -mtune=ultra-size to make this testable rather than changing
the C906.


That's essentially the same conclusion we came to last time this came 
up, except that we were calling it "-Oz" (because LLVM does).  I guess 
we never got around to having the broader GCC discussion about "-Oz".  IIRC 
we had some other "-Oz" candidates we never got around to dealing with, 
but that was a while ago so I'm not sure if any of that panned out.


Re: [PATCH] RISC-V: Enable overlap-by-pieces in case of fast unaliged access

2021-08-16 Thread Palmer Dabbelt

On Mon, 16 Aug 2021 11:56:05 PDT (-0700), pins...@gmail.com wrote:

On Mon, Aug 16, 2021 at 10:10 AM Palmer Dabbelt  wrote:


On Mon, 16 Aug 2021 09:29:16 PDT (-0700), Kito Cheng wrote:
>> > Could you submit v3 patch which is v1 with overlap_op_by_pieces field,
>> > testcase from v2 and add a few more comments to describe the field?
>> >
>> > And add an -mtune=ultra-size to make it able to test without change
>> > other behavior?
>> >
>> > Hi Palmer:
>> >
>> > Are you OK with that?
>>
>> I'm still not convinced on the performance: like Andrew and I pointed
>> out, this is a difficult case for pipelines of this flavor to handle.
>> Nobody here knows anything about this pipeline deeply enough to say
>> anything definitive, though, so this is really just a guess.
>
> So adding an extra field to indicate this should resolve that?
> I believe people should set overlap_op_by_pieces
> to true only if they are sure it has benefits.

My only issue there is that we'd have no way to turn it on, but see
below...

>> As I'm not convinced this is an obvious performance win I'm not going to
>> merge it without a benchmark.  If you're convinced and want to merge it
>> that's fine, I don't really care about the performance of the C906 and
>> if someone complains we can always just revert it later.
>
> I suppose Christoph has tried this with their internal processor and it
> benefits performance, but that processor can't be open-sourced yet, so the
> v2 patch set uses the C906 to demo and test it, since that is the only
> processor with slow_unaligned_access=False.

Well, that's a very different discussion.  The C906 tuning model should
be for the C906, not a proxy for some internal-only processor.  If the
goal here is to allow this pass to be flipped on by an out-of-tree
pipeline model then we can talk about it.

> I agree on the C906 part: we don't know whether it's a benefit or not, so I
> propose adding an -mtune=ultra-size to make this testable rather than
> changing the C906.

That's essentially the same conclusion we came to last time this came
up, except that we were calling it "-Oz" (because LLVM does).  I guess
we never got around to having the broader GCC discussion about "-Oz".  IIRC
we had some other "-Oz" candidates we never got around to dealing with,
but that was a while ago so I'm not sure if any of that panned out.


-Oz was a bad idea that Apple came up with because GCC decided to start
emitting store multiple on PowerPC around 13 years ago.
I don't think we should repeat that mistake for GCC and especially for RISCV.
If people want to optimize for size, they get the performance issues.


Makes sense.  Probably best to avoid adding the RISC-V specific version 
of this as well, then, as it's really just two sides of the same coin.


Sounds like we'll likely want to stop implementing -Os via a tuning on 
RISC-V: that was a convenient way to do it when we didn't have any 
conflicts between -O and -mtune, but assuming this will eventually land 
that won't be valid any more.  That's a pretty mechanical process.


It still leaves us with the question of what to do with this pass, which 
IMO really just depends on what the actual goal is here: if we're trying 
to optimize for the C906 then we should just wait for the benchmarks to 
demonstrate this is worth doing (though again, Kito, if you think this 
is good enough and want to flip this on I don't really care that much), 
but if we're trying to optimize for some other pipeline then we should 
really wait for that to show up.


I'm not going to speculate about what this new pipeline is, but if 
there's anything concrete announced about it then I'm happy to take a 
look.  Historically we've never been super strict about waiting for 
hardware before taking a pipeline model, but I do think we should have 
something as just trying to support any hypothetical future hardware 
will lead to insanity.  IMO we need to be extra explicit that we're 
willing to work with hardware vendors, as due to the nature of RISC-V 
that can get lost in translation, but there has to be some balance.
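
For anyone reading along without the patch in front of them, the idea behind
overlap-by-pieces can be sketched in plain C.  This is only an illustration of
the technique under discussion, not what GCC emits; the helper name
copy7_overlapping is made up for the example, and the memcpy calls merely
stand in for single 32-bit accesses that a target with fast unaligned access
might use.

```c
#include <stdint.h>
#include <string.h>

/* Copy exactly 7 bytes with two 4-byte accesses that overlap by one
   byte, instead of a 4 + 2 + 1 byte sequence.  Hypothetical helper,
   for illustration only.  */
void
copy7_overlapping (unsigned char *dst, const unsigned char *src)
{
  uint32_t lo, hi;

  memcpy (&lo, src, 4);      /* bytes 0..3 */
  memcpy (&hi, src + 3, 4);  /* bytes 3..6, re-reads byte 3 */
  memcpy (dst, &lo, 4);      /* writes bytes 0..3 */
  memcpy (dst + 3, &hi, 4);  /* writes bytes 3..6, rewrites byte 3 */
}
```

The second access of each pair is misaligned whenever the first one is
aligned, which is why this only looks attractive when misaligned accesses are
cheap on the pipeline in question.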


Re: [PATCH] riscv: implement TARGET_MODE_REP_EXTENDED

2022-09-17 Thread Palmer Dabbelt

On Fri, 16 Sep 2022 16:48:24 PDT (-0700), gcc-patches@gcc.gnu.org wrote:


On 9/6/22 05:39, Alexander Monakov via Gcc-patches wrote:

On Mon, 5 Sep 2022, Philipp Tomsich wrote:


+riscv_mode_rep_extended (scalar_int_mode mode, scalar_int_mode mode_rep)
+{
+  /* On 64-bit targets, SImode register values are sign-extended to DImode.  */
+  if (TARGET_64BIT && mode == SImode && mode_rep == DImode)
+return SIGN_EXTEND;

I think this leads to a counter-intuitive requirement that a hand-written
inline asm must sign-extend its output operands that are bound to either
signed or unsigned 32-bit lvalues. Will compiler users be aware of that?


Is this significantly different than on MIPS?  Hand-written code there
also has to ensure that the results are properly sign extended and it's
been that way for 20+ years since the introduction of mips64 IIRC. 
Though I don't think we had MODE_REP_EXTENDED that long.


IMO the problem isn't so much that asm has this constraint, it's that 
it's a new constraint and thus risks breaking code that used to work.  
That said...



Haha, MIPS is the only target that currently defines
TARGET_MODE_REP_EXTENDED :-)





Moreover, without adjusting TARGET_TRULY_NOOP_TRUNCATION this should cause
miscompilation when a 64-bit variable is truncated to 32 bits: the pre-existing
hook says that nothing needs to be done to truncate, but the new hook says
that the result of the truncation is properly sign-extended.

The documentation for TARGET_MODE_REP_EXTENDED warns about that:

 In order to enforce the representation of mode, 
TARGET_TRULY_NOOP_TRUNCATION
 should return false when truncating to mode.


This may well need adjusting in Philipp's patch.   I'd be surprised if
the MIPS definition wasn't usable nearly verbatim here.


Yes, and we have a few MIPS-isms in the ISA but don't have the same 
flavor of TRULY_NOOP_TRUNCATION.  It's been pointed out a handful of 
times and I'm not sure what the right way to go is here, every time I 
try and reason about which is going to produce better code I come up 
with a different answer.  IIRC last time I looked at this I came to the 
conclusion that we're doing the right thing for RISC-V because most of 
our instructions implicitly truncate.  It's pretty easy to generate bad 
code here and I'm pretty sure we could fix some of that by moving to a 
more MIPS-like TRULY_NOOP_TRUNCATION, but I think we'd end up just 
pushing the problems around.


Every time I look at this I also get worried that we've leaked some of 
these internal promotion rules into something visible to inline asm, but 
when I poke around it seems like things generally work.
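
To make the inline asm concern concrete, here is a small model of the
representation the proposed hook describes: a 32-bit value held in a 64-bit
register with the upper half equal to the sign bit.  The function names are
invented for this sketch and nothing here is taken from the patch itself.

```c
#include <stdint.h>

/* Model of the convention being discussed: a 32-bit value lives in a
   64-bit register with bits 63..32 all equal to bit 31, whether the C
   type is signed or unsigned.  Hypothetical helper.  */
uint64_t
canonical_si_in_reg (uint32_t value)
{
  uint64_t reg = value;

  if (value & UINT32_C (0x80000000))
    reg |= UINT64_C (0xffffffff00000000);
  return reg;
}

/* Nonzero if REG is a valid register image of some 32-bit value under
   that convention.  */
int
is_canonical_si (uint64_t reg)
{
  return canonical_si_in_reg ((uint32_t) reg) == reg;
}
```

Under this model an asm that leaves a zero-extended 0x80000000 in an output
register bound to an unsigned 32-bit lvalue produces a non-canonical value,
which is the case Alexander is asking about.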







jeff


Re: [PATCH] RISC-V missing __builtin_lceil and __builtin_lfloor

2022-09-17 Thread Palmer Dabbelt

On Mon, 15 Aug 2022 17:44:35 PDT (-0700), kev...@rivosinc.com wrote:

Hello,
Currently, __builtin_lceil and __builtin_lfloor don't generate the
existing fcvt instruction, but rather call ceil and floor from the
library. This patch adds the missing iterator and attributes for lceil and
lfloor to produce the optimized code.
 The test cases check the correct generation of the fcvt instruction for
float/double to int/long/long long. Passed the test in riscv-linux.
Could this patch be committed?


Reviewed-by: Palmer Dabbelt 
Acked-by: Palmer Dabbelt 

Not sure if Kito had any comments for this one, but it looks good to me.


gcc/ChangeLog:
   Michael Collison  
* config/riscv/riscv.md (RINT): Add iterator for lceil and lfloor.
(rint_pattern): Add ceil and floor.
(rint_rm): Add rup and rdn.

gcc/testsuite/ChangeLog:
Kevin Lee  
* gcc.target/riscv/lfloor-lceil.c: New test.
---
 gcc/config/riscv/riscv.md | 13 ++-
 gcc/testsuite/gcc.target/riscv/lfloor-lceil.c | 79 +++
 2 files changed, 88 insertions(+), 4 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/riscv/lfloor-lceil.c

diff --git a/gcc/config/riscv/riscv.md b/gcc/config/riscv/riscv.md
index c6399b1389e..070004fa7fe 100644
--- a/gcc/config/riscv/riscv.md
+++ b/gcc/config/riscv/riscv.md
@@ -43,6 +43,9 @@ (define_c_enum "unspec" [
   UNSPEC_LRINT
   UNSPEC_LROUND

+  UNSPEC_LCEIL
+  UNSPEC_LFLOOR
+
   ;; Stack tie
   UNSPEC_TIE
 ])
@@ -345,10 +348,12 @@ (define_mode_attr UNITMODE [(SF "SF") (DF "DF")])
 ;; the controlling mode.
 (define_mode_attr HALFMODE [(DF "SI") (DI "SI") (TF "DI")])

-;; Iterator and attributes for floating-point rounding instructions.
-(define_int_iterator RINT [UNSPEC_LRINT UNSPEC_LROUND])
-(define_int_attr rint_pattern [(UNSPEC_LRINT "rint") (UNSPEC_LROUND
"round")])
-(define_int_attr rint_rm [(UNSPEC_LRINT "dyn") (UNSPEC_LROUND "rmm")])
+;; Iterator and attributes for floating-point rounding instructions.f
+(define_int_iterator RINT [UNSPEC_LRINT UNSPEC_LROUND UNSPEC_LCEIL
UNSPEC_LFLOOR])
+(define_int_attr rint_pattern [(UNSPEC_LRINT "rint") (UNSPEC_LROUND
"round")
+ (UNSPEC_LCEIL "ceil") (UNSPEC_LFLOOR
"floor")])
+(define_int_attr rint_rm [(UNSPEC_LRINT "dyn") (UNSPEC_LROUND "rmm")
+(UNSPEC_LCEIL "rup") (UNSPEC_LFLOOR "rdn")])

 ;; Iterator and attributes for quiet comparisons.
 (define_int_iterator QUIET_COMPARISON [UNSPEC_FLT_QUIET UNSPEC_FLE_QUIET])
diff --git a/gcc/testsuite/gcc.target/riscv/lfloor-lceil.c
b/gcc/testsuite/gcc.target/riscv/lfloor-lceil.c
new file mode 100644
index 000..4d81c12cefa
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/lfloor-lceil.c
@@ -0,0 +1,79 @@
+/* { dg-do compile } */
+/* { dg-options "-march=rv64gc -mabi=lp64d" } */
+/* { dg-skip-if "" { *-*-* } { "-O0" "-Og" } } */
+
+int
+ceil1(float i)
+{
+  return __builtin_lceil(i);
+}
+
+long
+ceil2(float i)
+{
+  return __builtin_lceil(i);
+}
+
+long long
+ceil3(float i)
+{
+  return __builtin_lceil(i);
+}
+
+int
+ceil4(double i)
+{
+  return __builtin_lceil(i);
+}
+
+long
+ceil5(double i)
+{
+  return __builtin_lceil(i);
+}
+
+long long
+ceil6(double i)
+{
+  return __builtin_lceil(i);
+}
+
+int
+floor1(float i)
+{
+  return __builtin_lfloor(i);
+}
+
+long
+floor2(float i)
+{
+  return __builtin_lfloor(i);
+}
+
+long long
+floor3(float i)
+{
+  return __builtin_lfloor(i);
+}
+
+int
+floor4(double i)
+{
+  return __builtin_lfloor(i);
+}
+
+long
+floor5(double i)
+{
+  return __builtin_lfloor(i);
+}
+
+long long
+floor6(double i)
+{
+  return __builtin_lfloor(i);
+}
+
+/* { dg-final { scan-assembler-times "fcvt.l.s" 6 } } */
+/* { dg-final { scan-assembler-times "fcvt.l.d" 6 } } */
+/* { dg-final { scan-assembler-not "call" } } */
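
As a reference for what the new patterns are expected to compute, the
semantics of lceil and lfloor can be written without any library calls.  This
is a sketch that assumes the input is finite and fits in a long; ref_lceil and
ref_lfloor are names made up here, not part of the patch or the testsuite.

```c
/* Reference semantics: round toward +inf (lceil) or -inf (lfloor) and
   convert to long.  Assumes X is finite and within the range of long.  */
long
ref_lceil (double x)
{
  long t = (long) x;            /* conversion truncates toward zero */
  return t + ((double) t < x);  /* bump up if truncation rounded down */
}

long
ref_lfloor (double x)
{
  long t = (long) x;
  return t - ((double) t > x);  /* bump down if truncation rounded up */
}
```

The patch gets the same effect from a single fcvt by selecting the rup and rdn
rounding modes through the rint_rm attribute.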


Re: [PATCH] Document -fexcess-precision=16 in tm.texi

2022-09-18 Thread Palmer Dabbelt

On Fri, 09 Sep 2022 02:46:40 PDT (-0700), Palmer Dabbelt wrote:

I just happened to stumble on this one while trying to sort out the
RISC-V bits.

gcc/ChangeLog

* doc/tm.texi (TARGET_C_EXCESS_PRECISION): Add 16.
---
 gcc/doc/tm.texi | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/gcc/doc/tm.texi b/gcc/doc/tm.texi
index 858bfb80cec..7590924f2ca 100644
--- a/gcc/doc/tm.texi
+++ b/gcc/doc/tm.texi
@@ -1009,7 +1009,7 @@ of the excess precision explicitly added.  For
 @code{EXCESS_PRECISION_TYPE_FLOAT16}, and
 @code{EXCESS_PRECISION_TYPE_FAST}, the target should return the
 explicit excess precision that should be added depending on the
-value set for @option{-fexcess-precision=@r{[}standard@r{|}fast@r{]}}.
+value set for @option{-fexcess-precision=@r{[}standard@r{|}fast@r{|}16@r{]}}.
 Note that unpredictable explicit excess precision does not make sense,
 so a target should never return @code{FLT_EVAL_METHOD_UNPREDICTABLE}
 when @var{type} is @code{EXCESS_PRECISION_TYPE_STANDARD},


Just pinging this one as I'm not sure if it's OK to self-approve -- no 
rush on my end, I already figured it out so I don't need the 
documentation any more.


Re: [PATCH] RISC-V: Don't try to vectorize tree-ssa/gen-vect-34.c

2022-09-18 Thread Palmer Dabbelt

On Fri, 02 Sep 2022 18:28:10 PDT (-0700), Palmer Dabbelt wrote:

We don't yet support vectorization on RISC-V.

gcc/testsuite/ChangeLog

* gcc.dg/tree-ssa/gen-vect-34.c: Skip RISC-V targets.
---
 gcc/testsuite/gcc.dg/tree-ssa/gen-vect-34.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/gcc/testsuite/gcc.dg/tree-ssa/gen-vect-34.c 
b/gcc/testsuite/gcc.dg/tree-ssa/gen-vect-34.c
index 8d2d36401fe..41877e05efd 100644
--- a/gcc/testsuite/gcc.dg/tree-ssa/gen-vect-34.c
+++ b/gcc/testsuite/gcc.dg/tree-ssa/gen-vect-34.c
@@ -13,4 +13,4 @@ float summul(int n, float *arg1, float *arg2)
 return res1;
 }

-/* { dg-final { scan-tree-dump-times "vectorized 1 loops" 1 "vect" { target { 
! { avr-*-* pru-*-* } } } } } */
+/* { dg-final { scan-tree-dump-times "vectorized 1 loops" 1 "vect" { target { 
! { avr-*-* pru-*-* riscv*-*-* } } } } } */


Committed.


Re: [PATCH] RISC-V: Support -fexcess-precision=16

2022-09-30 Thread Palmer Dabbelt

On Fri, 09 Sep 2022 02:56:26 PDT (-0700), Kito Cheng wrote:

LGTM, seems like you have landed now, see you soon :)


Committed.



On Fri, Sep 9, 2022 at 5:44 PM Palmer Dabbelt  wrote:


This fixes f19a327077e ("Support -fexcess-precision=16 which will enable
FLT_EVAL_METHOD_PROMOTE_TO_FLOAT16 when backend supports _Float16.") on
RISC-V targets.

gcc/ChangeLog

PR target/106815
* config/riscv/riscv.cc (riscv_excess_precision): Add support
for EXCESS_PRECISION_TYPE_FLOAT16.
---
 gcc/config/riscv/riscv.cc | 1 +
 1 file changed, 1 insertion(+)

diff --git a/gcc/config/riscv/riscv.cc b/gcc/config/riscv/riscv.cc
index 675d92c0961..9b6d3e95b1b 100644
--- a/gcc/config/riscv/riscv.cc
+++ b/gcc/config/riscv/riscv.cc
@@ -5962,6 +5962,7 @@ riscv_excess_precision (enum excess_precision_type type)
   return (TARGET_ZFH ? FLT_EVAL_METHOD_PROMOTE_TO_FLOAT16
 : FLT_EVAL_METHOD_PROMOTE_TO_FLOAT);
 case EXCESS_PRECISION_TYPE_IMPLICIT:
+case EXCESS_PRECISION_TYPE_FLOAT16:
   return FLT_EVAL_METHOD_PROMOTE_TO_FLOAT16;
 default:
   gcc_unreachable ();
--
2.34.1
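
One way to observe the effect of the excess-precision setting from user code
is the standard FLT_EVAL_METHOD macro.  The sketch below is an assumption
about how one might inspect it, not part of the patch; describe_flt_eval_method
is a made-up name, and whether the value 16 is ever visible through the macro
depends on the language mode and on how float.h is configured.

```c
#include <float.h>

/* Describe the evaluation method selected for this translation unit.
   The value 16 corresponds to GCC's internal
   FLT_EVAL_METHOD_PROMOTE_TO_FLOAT16; it may not be exposed through
   the macro in every language mode.  */
const char *
describe_flt_eval_method (void)
{
  switch (FLT_EVAL_METHOD)
    {
    case 0:
      return "evaluate to the range and precision of the type";
    case 1:
      return "evaluate float and double as double";
    case 2:
      return "evaluate everything as long double";
    case 16:
      return "evaluate _Float16 operations as _Float16";
    case -1:
      return "indeterminable";
    default:
      return "implementation-defined";
    }
}
```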





Re: [PATCH] Document -fexcess-precision=16 in tm.texi

2022-09-30 Thread Palmer Dabbelt

On Sat, 24 Sep 2022 19:13:36 PDT (-0700), san...@codesourcery.com wrote:

On 9/18/22 02:47, Palmer Dabbelt wrote:

On Fri, 09 Sep 2022 02:46:40 PDT (-0700), Palmer Dabbelt wrote:

I just happened to stumble on this one while trying to sort out the
RISC-V bits.

gcc/ChangeLog

* doc/tm.texi (TARGET_C_EXCESS_PRECISION): Add 16.
---
 gcc/doc/tm.texi | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/gcc/doc/tm.texi b/gcc/doc/tm.texi
index 858bfb80cec..7590924f2ca 100644
--- a/gcc/doc/tm.texi
+++ b/gcc/doc/tm.texi
@@ -1009,7 +1009,7 @@ of the excess precision explicitly added.  For
 @code{EXCESS_PRECISION_TYPE_FLOAT16}, and
 @code{EXCESS_PRECISION_TYPE_FAST}, the target should return the
 explicit excess precision that should be added depending on the
-value set for @option{-fexcess-precision=@r{[}standard@r{|}fast@r{]}}.
+value set for
@option{-fexcess-precision=@r{[}standard@r{|}fast@r{|}16@r{]}}.
 Note that unpredictable explicit excess precision does not make sense,
 so a target should never return @code{FLT_EVAL_METHOD_UNPREDICTABLE}
 when @var{type} is @code{EXCESS_PRECISION_TYPE_STANDARD},


Just pinging this one as I'm not sure if it's OK to self-approve -- no
rush on my end, I already figured it out so I don't need the
documentation any more.


This is fine, looks like a trivial correction.


Thanks, committed.


Re: [PATCH] Document -fexcess-precision=16 in tm.texi

2022-09-30 Thread Palmer Dabbelt

On Fri, 30 Sep 2022 15:51:02 PDT (-0700), H.J. Lu wrote:

On Fri, Sep 30, 2022 at 3:25 PM Palmer Dabbelt  wrote:


On Sat, 24 Sep 2022 19:13:36 PDT (-0700), san...@codesourcery.com wrote:
> On 9/18/22 02:47, Palmer Dabbelt wrote:
>> On Fri, 09 Sep 2022 02:46:40 PDT (-0700), Palmer Dabbelt wrote:
>>> I just happened to stumble on this one while trying to sort out the
>>> RISC-V bits.
>>>
>>> gcc/ChangeLog
>>>
>>> * doc/tm.texi (TARGET_C_EXCESS_PRECISION): Add 16.
>>> ---
>>>  gcc/doc/tm.texi | 2 +-
>>>  1 file changed, 1 insertion(+), 1 deletion(-)
>>>
>>> diff --git a/gcc/doc/tm.texi b/gcc/doc/tm.texi
>>> index 858bfb80cec..7590924f2ca 100644
>>> --- a/gcc/doc/tm.texi
>>> +++ b/gcc/doc/tm.texi
>>> @@ -1009,7 +1009,7 @@ of the excess precision explicitly added.  For
>>>  @code{EXCESS_PRECISION_TYPE_FLOAT16}, and
>>>  @code{EXCESS_PRECISION_TYPE_FAST}, the target should return the
>>>  explicit excess precision that should be added depending on the
>>> -value set for @option{-fexcess-precision=@r{[}standard@r{|}fast@r{]}}.
>>> +value set for
>>> @option{-fexcess-precision=@r{[}standard@r{|}fast@r{|}16@r{]}}.
>>>  Note that unpredictable explicit excess precision does not make sense,
>>>  so a target should never return @code{FLT_EVAL_METHOD_UNPREDICTABLE}
>>>  when @var{type} is @code{EXCESS_PRECISION_TYPE_STANDARD},
>>
>> Just pinging this one as I'm not sure if it's OK to self-approve -- no
>> rush on my end, I already figured it out so I don't need the
>> documentation any more.
>
> This is fine, looks like a trivial correction.

Thanks, committed.


tm.texi is a generated file.  I am checking in this patch to restore bootstrap.


Sorry about that, and thanks for fixing it.


[PATCH] Fix the build of record_edge_info()

2022-09-30 Thread Palmer Dabbelt
As of 1214196da79 ("More gimple const/copy propagation opportunities"),
I'm getting some build failures during bootstrap

../../gcc/tree-ssa-dom.cc: In function ‘void record_edge_info(basic_block)’:
../../gcc/tree-ssa-dom.cc:689:27: error: ‘dst’ was not declared in this 
scope; did you mean ‘dse’?
  689 |   if (dst == PHI_ARG_DEF (phi, !alternative))
  |   ^~~
  |   dse
In file included from ../../gcc/gimple-ssa.h:24,
 from ../../gcc/ssa.h:27,
 from ../../gcc/tree-ssa-dom.cc:28:
../../gcc/tree-ssa-dom.cc:689:47: error: ‘phi’ was not declared in this 
scope; did you mean ‘gphi’?
  689 |   if (dst == PHI_ARG_DEF (phi, !alternative))
  |   ^~~
../../gcc/tree-ssa-operands.h:82:54: note: in definition of macro 
‘PHI_ARG_DEF’
   82 | #define PHI_ARG_DEF(PHI, I) gimple_phi_arg_def ((PHI), (I))
  |

I've never looked at this stuff before so I've sort of just pattern
matched this, it at least fixes the build.  Happy to go try and
understand what's going on here, but I'm in the middle of a few things
so I figured it'd be better to just send it along in case anyone else is
running into the same issue -- it's more of a bug report than a fix,
though.

gcc/ChangeLog

* tree-ssa-dom.cc (record_edge_info): Move the alternative check
below the phi definition.
---
 gcc/tree-ssa-dom.cc | 10 +-
 1 file changed, 5 insertions(+), 5 deletions(-)

diff --git a/gcc/tree-ssa-dom.cc b/gcc/tree-ssa-dom.cc
index 8d8312ca350..e6b8dace5e9 100644
--- a/gcc/tree-ssa-dom.cc
+++ b/gcc/tree-ssa-dom.cc
@@ -684,11 +684,6 @@ record_edge_info (basic_block bb)
   !gsi_end_p (gsi);
   gsi_next (&gsi))
{
- /* If the other alternative is the same as the result,
-then this is a degenerate and can be ignored.  */
- if (dst == PHI_ARG_DEF (phi, !alternative))
-   continue;
-
  /* Now get the EDGE_INFO class so we can append
 it to our list.  We want the successor edge
 where the destination is not the source of
@@ -697,6 +692,11 @@ record_edge_info (basic_block bb)
  tree src = PHI_ARG_DEF (phi, alternative);
  tree dst = PHI_RESULT (phi);
 
+ /* If the other alternative is the same as the result,
+then this is a degenerate and can be ignored.  */
+ if (dst == PHI_ARG_DEF (phi, !alternative))
+   continue;
+
  if (EDGE_SUCC (bb, 0)->dest
  != EDGE_PRED (bb, !alternative)->src)
edge_info = (class edge_info *)EDGE_SUCC (bb, 0)->aux;
-- 
2.34.1
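
The shape of the fix can be shown with a toy model: the degenerate-PHI check
reads phi and dst, so it has to sit after both are declared inside the loop
body.  The struct and function below are invented for illustration and do not
mirror the real GIMPLE data structures.

```c
/* Toy stand-in for a two-argument PHI node (hypothetical).  */
struct toy_phi
{
  int result;
  int arg[2];
};

/* Count PHIs whose other alternative differs from the result.  The
   check uses PHI and DST, so it must follow their declarations.  */
int
count_non_degenerate (const struct toy_phi *phis, int n, int alternative)
{
  int count = 0;

  for (int i = 0; i < n; i++)
    {
      const struct toy_phi *phi = &phis[i];
      int dst = phi->result;

      /* If the other alternative is the same as the result, this PHI
         is degenerate and can be ignored.  */
      if (dst == phi->arg[!alternative])
        continue;

      count++;
    }
  return count;
}
```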



Re: [PATCH] Fix the build of record_edge_info()

2022-09-30 Thread Palmer Dabbelt

On Fri, 30 Sep 2022 18:01:00 PDT (-0700), jeffreya...@gmail.com wrote:


On 9/30/22 18:57, Palmer Dabbelt wrote:

As of 1214196da79 ("More gimple const/copy propagation opportunities"),
I'm getting some build failures during bootstrap

 ../../gcc/tree-ssa-dom.cc: In function ‘void 
record_edge_info(basic_block)’:
 ../../gcc/tree-ssa-dom.cc:689:27: error: ‘dst’ was not declared in this 
scope; did you mean ‘dse’?
   689 |   if (dst == PHI_ARG_DEF (phi, !alternative))
   |   ^~~
   |   dse
 In file included from ../../gcc/gimple-ssa.h:24,
  from ../../gcc/ssa.h:27,
  from ../../gcc/tree-ssa-dom.cc:28:
 ../../gcc/tree-ssa-dom.cc:689:47: error: ‘phi’ was not declared in this 
scope; did you mean ‘gphi’?
   689 |   if (dst == PHI_ARG_DEF (phi, !alternative))
   |   ^~~
 ../../gcc/tree-ssa-operands.h:82:54: note: in definition of macro 
‘PHI_ARG_DEF’
82 | #define PHI_ARG_DEF(PHI, I) gimple_phi_arg_def ((PHI), (I))
   |

I've never looked at this stuff before so I've sort of just pattern
matched this, it at least fixes the build.  Happy to go try and
understand what's going on here, but I'm in the middle of a few things
so I figured it'd be better to just send it along in case anyone else is
running into the same issue -- it's more of a bug report than a fix,
though.

gcc/ChangeLog

* tree-ssa-dom.cc (record_edge_info): Move the alternative check
below the phi definition.
---


You got it right, but it's already fixed on the trunk (I pushed the
wrong version of the patch).


Thanks, I must have just had some unlucky timing ;)


Re: [PATCH] Enable shrink wrapping for the RISC-V target.

2022-10-02 Thread Palmer Dabbelt

On Tue, 06 Sep 2022 03:39:02 PDT (-0700), manolis.tsa...@vrull.eu wrote:

This commit implements the target macros (TARGET_SHRINK_WRAP_*) that
enable separate shrink wrapping for function prologues/epilogues in
RISC-V.

Tested against SPEC CPU 2017, this change always has a net-positive
effect on the dynamic instruction count.  See the following table for
the breakdown on how this reduces the number of dynamic instructions
per workload on a like-for-like (i.e., same config file; suppressing
shrink-wrapping with -fno-shrink-wrap):


Does this also pass the regression tests?

(there are also some comments on the code in-line)



                  # dynamic instructions
                 w/o shrink-wrap   w/ shrink-wrap      reduction
500.perlbench_r    1265716786593    1262156218578     3560568015   0.28%
500.perlbench_r     779224795689     765337009025    13887786664   1.78%
500.perlbench_r     724087331471     711307152522    12780178949   1.77%
502.gcc_r           204259864844     194517006339     9742858505   4.77%
502.gcc_r           244047794302     231555834722    12491959580   5.12%
502.gcc_r           230896069400     221877703011     9018366389   3.91%
502.gcc_r           192130616624     183856450605     8274166019   4.31%
502.gcc_r           258875074079     247756203226    11118870853   4.30%
505.mcf_r           662653430325     660678680547     1974749778   0.30%
520.omnetpp_r       985114167068     934191310154    50922856914   5.17%
523.xalancbmk_r     927037633578     921688937650     5348695928   0.58%
525.x264_r          490953958454     490565583447      388375007   0.08%
525.x264_r         1994662294421    1993171932425     1490361996   0.07%
525.x264_r         1897617120450    1896062750609     1554369841   0.08%
531.deepsjeng_r    1695189878907    1669304130411    25885748496   1.53%
541.leela_r        1925941222222    1897900861198    28040361024   1.46%
548.exchange2_r    2073816227944    2073816226729           1215   0.00%
557.xz_r            379572090003     379057409041      514680962   0.14%
557.xz_r            953117469352     952680431430      437037922   0.05%
557.xz_r            536859579650     536456690164      402889486   0.08%
totals            18421773405376   18223938521833   197834883543   1.07%
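
For context, the kind of function that tends to benefit from shrink wrapping
looks roughly like the sketch below: an early exit that needs no stack frame,
and a slow path that keeps values live across calls.  The function names are
made up, and whether a given compiler actually moves the saves into the slow
path depends on the optimization level and target.

```c
long expensive (long x);   /* hypothetical callee, defined below */

long
maybe_expensive (long x)
{
  /* Fast path: no calls, so no callee-saved registers are required
     and the prologue work can in principle be skipped.  */
  if (x < 16)
    return x + 1;

  /* Slow path: X and A stay live across calls, so this is the only
     region that needs saves and restores.  */
  long a = expensive (x);
  long b = expensive (x + a);
  return a + b + x;
}

long
expensive (long x)
{
  return x * 3 + 1;
}
```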

Signed-off-by: Manolis Tsamis 

gcc/ChangeLog:

* config/riscv/riscv.cc (struct machine_function): Add array to store
register wrapping information.
(riscv_for_each_saved_reg): Skip registers that are wrapped separately.
(riscv_get_separate_components): New function.
(riscv_components_for_bb): Likewise.
(riscv_disqualify_components): Likewise.
(riscv_process_components): Likewise.
(riscv_emit_prologue_components): Likewise.
(riscv_emit_epilogue_components): Likewise.
(riscv_set_handled_components): Likewise.
(TARGET_SHRINK_WRAP_GET_SEPARATE_COMPONENTS): Define.
(TARGET_SHRINK_WRAP_COMPONENTS_FOR_BB): Likewise.
(TARGET_SHRINK_WRAP_DISQUALIFY_COMPONENTS): Likewise.
(TARGET_SHRINK_WRAP_EMIT_PROLOGUE_COMPONENTS): Likewise.
(TARGET_SHRINK_WRAP_EMIT_EPILOGUE_COMPONENTS): Likewise.
(TARGET_SHRINK_WRAP_SET_HANDLED_COMPONENTS): Likewise.

gcc/testsuite/ChangeLog:

* gcc.target/riscv/shrink-wrap-1.c: New test.

---

 gcc/config/riscv/riscv.cc | 187 +-
 .../gcc.target/riscv/shrink-wrap-1.c  |  25 +++
 2 files changed, 210 insertions(+), 2 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/riscv/shrink-wrap-1.c

diff --git a/gcc/config/riscv/riscv.cc b/gcc/config/riscv/riscv.cc
index 5a0adffb5ce..3b633149a9a 100644
--- a/gcc/config/riscv/riscv.cc
+++ b/gcc/config/riscv/riscv.cc
@@ -25,6 +25,7 @@ along with GCC; see the file COPYING3.  If not see
 #include "config.h"
 #include "system.h"
 #include "coretypes.h"
+#include "backend.h"
 #include "tm.h"
 #include "rtl.h"
 #include "regs.h"
@@ -52,6 +53,7 @@ along with GCC; see the file COPYING3.  If not see
 #include "optabs.h"
 #include "bitmap.h"
 #include "df.h"
+#include "function-abi.h"
 #include "diagnostic.h"
 #include "builtins.h"
 #include "predict.h"
@@ -147,6 +149,11 @@ struct GTY(())  machine_function {

   /* The current frame information, calculated by riscv_compute_frame_info.  */
   struct riscv_frame_info frame;
+
+  /* The components already handled by separate shrink-wrapping, which should
+ not be considered by the prologue and epilogue.  */
+  bool reg_is_wrapped_separately[FIRST_PSEUDO_REGISTER];
+
 };

 /* Information about a single argument.  */
@@ -4209,7 +4216,7 @@ riscv_for_each_saved_reg (HOST_WIDE_INT sp_offset, 
riscv_save_restore_fn fn,
   for (unsigned int regno = GP_REG_FIRST; regno <= GP_REG_LAST; regno++)
 if (BITSET_P (cfun->machine->frame.mask, regno - GP_REG_FIRST))
   {
-   bool handle_reg = TRUE;
+   bool handle_reg = !cfun->machine->reg_is_wrapped_separately[regno];

/* If this is a normal retur

Re: [PATCH] RISC-V missing __builtin_lceil and __builtin_lfloor

2022-10-02 Thread Palmer Dabbelt

On Sat, 17 Sep 2022 14:16:13 PDT (-0700), Kito Cheng wrote:

LGTM, thanks, I guess I just missed this before


No worries, I'd just stumbled on it looking through old stuff.

Kevin: Looks like this got corrupted, possibly from copy/paste into 
gmail.  I resurrected it, but there's a floating-point test failure in 
gfortran.  Looks like it predates this, but I'm trying to bisect it to 
at least have a root cause before just ignoring it.  I've got this 
floating around on a branch and hopefully that'll remind me to commit 
it after I sort that out.




On Sat, Sep 17, 2022 at 23:07, Palmer Dabbelt wrote:


On Mon, 15 Aug 2022 17:44:35 PDT (-0700), kev...@rivosinc.com wrote:
> Hello,
> Currently, __builtin_lceil and __builtin_lfloor don't generate the
> existing fcvt instruction, but rather call ceil and floor from the
> library. This patch adds the missing iterator and attributes for lceil and
> lfloor to produce the optimized code.
>  The test cases check the correct generation of the fcvt instruction for
> float/double to int/long/long long. Passed the test in riscv-linux.
> Could this patch be committed?

Reviewed-by: Palmer Dabbelt 
Acked-by: Palmer Dabbelt 

Not sure if Kito had any comments for this one, but it looks good to me.

> gcc/ChangeLog:
>Michael Collison  
> * config/riscv/riscv.md (RINT): Add iterator for lceil and lfloor.
> (rint_pattern): Add ceil and floor.
> (rint_rm): Add rup and rdn.
>
> gcc/testsuite/ChangeLog:
> Kevin Lee  
> * gcc.target/riscv/lfloor-lceil.c: New test.
> ---
>  gcc/config/riscv/riscv.md | 13 ++-
>  gcc/testsuite/gcc.target/riscv/lfloor-lceil.c | 79 +++
>  2 files changed, 88 insertions(+), 4 deletions(-)
>  create mode 100644 gcc/testsuite/gcc.target/riscv/lfloor-lceil.c
>
> diff --git a/gcc/config/riscv/riscv.md b/gcc/config/riscv/riscv.md
> index c6399b1389e..070004fa7fe 100644
> --- a/gcc/config/riscv/riscv.md
> +++ b/gcc/config/riscv/riscv.md
> @@ -43,6 +43,9 @@ (define_c_enum "unspec" [
>UNSPEC_LRINT
>UNSPEC_LROUND
>
> +  UNSPEC_LCEIL
> +  UNSPEC_LFLOOR
> +
>;; Stack tie
>UNSPEC_TIE
>  ])
> @@ -345,10 +348,12 @@ (define_mode_attr UNITMODE [(SF "SF") (DF "DF")])
>  ;; the controlling mode.
>  (define_mode_attr HALFMODE [(DF "SI") (DI "SI") (TF "DI")])
>
> -;; Iterator and attributes for floating-point rounding instructions.
> -(define_int_iterator RINT [UNSPEC_LRINT UNSPEC_LROUND])
> -(define_int_attr rint_pattern [(UNSPEC_LRINT "rint") (UNSPEC_LROUND
> "round")])
> -(define_int_attr rint_rm [(UNSPEC_LRINT "dyn") (UNSPEC_LROUND "rmm")])
> +;; Iterator and attributes for floating-point rounding instructions.f
> +(define_int_iterator RINT [UNSPEC_LRINT UNSPEC_LROUND UNSPEC_LCEIL
> UNSPEC_LFLOOR])
> +(define_int_attr rint_pattern [(UNSPEC_LRINT "rint") (UNSPEC_LROUND
> "round")
> + (UNSPEC_LCEIL "ceil") (UNSPEC_LFLOOR
> "floor")])
> +(define_int_attr rint_rm [(UNSPEC_LRINT "dyn") (UNSPEC_LROUND "rmm")
> +(UNSPEC_LCEIL "rup") (UNSPEC_LFLOOR "rdn")])
>
>  ;; Iterator and attributes for quiet comparisons.
>  (define_int_iterator QUIET_COMPARISON [UNSPEC_FLT_QUIET
UNSPEC_FLE_QUIET])
> diff --git a/gcc/testsuite/gcc.target/riscv/lfloor-lceil.c
> b/gcc/testsuite/gcc.target/riscv/lfloor-lceil.c
> new file mode 100644
> index 000..4d81c12cefa
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/riscv/lfloor-lceil.c
> @@ -0,0 +1,79 @@
> +/* { dg-do compile } */
> +/* { dg-options "-march=rv64gc -mabi=lp64d" } */
> +/* { dg-skip-if "" { *-*-* } { "-O0" "-Og" } } */
> +
> +int
> +ceil1(float i)
> +{
> +  return __builtin_lceil(i);
> +}
> +
> +long
> +ceil2(float i)
> +{
> +  return __builtin_lceil(i);
> +}
> +
> +long long
> +ceil3(float i)
> +{
> +  return __builtin_lceil(i);
> +}
> +
> +int
> +ceil4(double i)
> +{
> +  return __builtin_lceil(i);
> +}
> +
> +long
> +ceil5(double i)
> +{
> +  return __builtin_lceil(i);
> +}
> +
> +long long
> +ceil6(double i)
> +{
> +  return __builtin_lceil(i);
> +}
> +
> +int
> +floor1(float i)
> +{
> +  return __builtin_lfloor(i);
> +}
> +
> +long
> +floor2(float i)
> +{
> +  return __builtin_lfloor(i);
> +}
> +
> +long long
> +floor3(float i)
> +{
> +  return __builtin_lfloor(i);
> +}
> +
> +int
> +floor4(double i)
> +{
> +  return __builtin_lfloor(i);
> +}
> +
> +long
> +floor5(double i)
> +{
> +  return __builtin_lfloor(i);
> +}
> +
> +long long
> +floor6(double i)
> +{
> +  return __builtin_lfloor(i);
> +}
> +
> +/* { dg-final { scan-assembler-times "fcvt.l.s" 6 } } */
> +/* { dg-final { scan-assembler-times "fcvt.l.d" 6 } } */
> +/* { dg-final { scan-assembler-not "call" } } */



[PATCH] RISC-V: Default to tuning for the thead-c906

2022-10-04 Thread Palmer Dabbelt
The C906 is by far the most widely available RISC-V processor, so let's
default to tuning for it.

gcc/ChangeLog

* config/riscv/riscv.h (RISCV_TUNE_STRING_DEFAULT): Change to
thead-c906.
* doc/invoke.texi (RISC-V -mtune): Change the default to
thead-c906.

---

This has come up a handful of times, most recently during the Cauldron.
It seems like a grey area to me: we're changing the behavior of some
command-line arguments (ie, everything that doesn't specify -mtune), but
we sort of change that anyway as the tuning parameters change between
releases.

I'm not really seeing much of a precedent from the other ports.  It
looks like aarch64 sort of changed the default in 02fdbd5beb0
("[AArch64] [-mtune cleanup 2/5] Tune for Cortex-A53 by default.") but I
think at that point -mtune=generic and -mtune=cortex-a53 were equivalent
so I'm not sure that counts.  I can't quite sort out if the default x86
tuning has ever changed, but the tuning parameters have changed.  I
don't see any way around having the tuning parameters change as they're
pretty tightly coupled to the GCC internals, but changing to a different
tuning target is a bit bigger of a change.

We also have a bit of a special case here: -mtune is in theory only a
performance issue, but this change will emit a lot more misaligned
accesses and we've seen those trigger bugs in the trap handlers before.
Those bugs are elsewhere so it's sort of not a GCC problem, but I'm sure
there's still users out there with broken firmware and this may cause
visible fallout.  We can just tell those users their systems were always
broken, but that's never a fun way to do things.

I figured the easiest way to talk about this would be to just send the
patch, but I definitely don't plan on committing it without some
discussion.
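
For anyone skimming, the selection order that the invoke.texi text in
this patch describes can be sketched as below.  This is only a model
under stated assumptions: select_tune and TUNE_DEFAULT are made-up names
for illustration, not the GCC driver code, and NULL stands for "option
not given".

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for RISCV_TUNE_STRING_DEFAULT after this patch.  */
#define TUNE_DEFAULT "thead-c906"

/* Pick the tuning name: -mtune wins, then -mcpu, then the default.
   A NULL argument means the option was not given on the command line.  */
static const char *
select_tune (const char *mtune, const char *mcpu)
{
  const char *tune = mtune ? mtune : (mcpu ? mcpu : TUNE_DEFAULT);
  assert (strlen (tune) > 0);
  return tune;
}
```

Only the last fallback changes with this patch, so builds that pass
-mtune or -mcpu explicitly should see no difference.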
---
 gcc/config/riscv/riscv.h | 2 +-
 gcc/doc/invoke.texi  | 2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/gcc/config/riscv/riscv.h b/gcc/config/riscv/riscv.h
index 363113c6511..1d9379fa5ee 100644
--- a/gcc/config/riscv/riscv.h
+++ b/gcc/config/riscv/riscv.h
@@ -40,7 +40,7 @@ along with GCC; see the file COPYING3.  If not see
 #endif
 
 #ifndef RISCV_TUNE_STRING_DEFAULT
-#define RISCV_TUNE_STRING_DEFAULT "rocket"
+#define RISCV_TUNE_STRING_DEFAULT "thead-c906"
 #endif
 
 extern const char *riscv_expand_arch (int argc, const char **argv);
diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
index e0c2c57c9b2..2a9ea3455f6 100644
--- a/gcc/doc/invoke.texi
+++ b/gcc/doc/invoke.texi
@@ -28529,7 +28529,7 @@ particular CPU name.  Permissible values for this option are: @samp{rocket},
 @samp{thead-c906}, @samp{size}, and all valid options for @option{-mcpu=}.
 
 When @option{-mtune=} is not specified, use the setting from @option{-mcpu},
-the default is @samp{rocket} if both are not specified.
+the default is @samp{thead-c906} if both are not specified.
 
 The @samp{size} choice is not intended for use by end-users.  This is used
 when @option{-Os} is specified.  It overrides the instruction cost info
-- 
2.34.1



Re: [PATCH] PR middle-end/88345: Honor -falign-functions=N even optimized for size.

2022-10-06 Thread Palmer Dabbelt

On Thu, 06 Oct 2022 21:03:25 PDT (-0700), kito.ch...@sifive.com wrote:

From: Monk Chiang 

Currently the -falign-functions=N setting is ignored if the function
is optimized for size or marked as a cold function.

However, function alignment is needed even when optimizing for size in
some situations.  The RISC-V target is an example: the RISC-V kernel
implements patchable functions, which require the function to be aligned
to at least 4 bytes so the function prologue can be patched atomically.

Here are 4 ways to fix that:
1. Fix -falign-functions=N, letting it work as expected with -Os and all
cold functions, which is this patch.
2. Force alignment to 4 bytes if -fpatchable-function-entry is given, by
adjusting RISC-V's FUNCTION_BOUNDARY.
3. Adjust RISC-V's FUNCTION_BOUNDARY to let it honor -falign-functions=N.
4. Add a -malign-functions=N for RISC-V...which x86 has already deprecated.

And this patch is the first approach.

gcc/ChangeLog:

PR middle-end/88345
* varasm.cc (assemble_start_function): Adjust function align
even optimized for size.
* doc/invoke.texi (Os): Remove -falign-functions= from the exclusion
list of -Os.

gcc/testsuite/ChangeLog:

PR middle-end/88345
* gcc.target/i386/pr88345-1.c: New.
* gcc.target/i386/pr88345-2.c: Ditto.
* gcc.target/riscv/pr88345-1.c: Ditto.
* gcc.target/riscv/pr88345-2.c: Ditto.
---
 gcc/doc/invoke.texi| 2 +-
 gcc/testsuite/gcc.target/i386/pr88345-1.c  | 5 +
 gcc/testsuite/gcc.target/i386/pr88345-2.c  | 5 +
 gcc/testsuite/gcc.target/riscv/pr88345-1.c | 5 +
 gcc/testsuite/gcc.target/riscv/pr88345-2.c | 5 +
 gcc/varasm.cc  | 3 +--
 6 files changed, 22 insertions(+), 3 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/i386/pr88345-1.c
 create mode 100644 gcc/testsuite/gcc.target/i386/pr88345-2.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/pr88345-1.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/pr88345-2.c

diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
index a2b0b9636f0..acf98c68825 100644
--- a/gcc/doc/invoke.texi
+++ b/gcc/doc/invoke.texi
@@ -11381,7 +11381,7 @@ results.  This is the default.
 Optimize for size.  @option{-Os} enables all @option{-O2} optimizations
 except those that often increase code size:

-@gccoptlist{-falign-functions  -falign-jumps @gol
+@gccoptlist{-falign-jumps @gol
 -falign-labels  -falign-loops @gol
 -fprefetch-loop-arrays  -freorder-blocks-algorithm=stc}

diff --git a/gcc/testsuite/gcc.target/i386/pr88345-1.c b/gcc/testsuite/gcc.target/i386/pr88345-1.c
new file mode 100644
index 000..226bb9ffc82
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/pr88345-1.c
@@ -0,0 +1,5 @@
+/* { dg-do compile } */
+/* { dg-options "-falign-functions=128" } */
+/* { dg-final { scan-assembler-times "\.p2align\t7" 1 } } */
+
+__attribute__((__cold__)) void a() {}
diff --git a/gcc/testsuite/gcc.target/i386/pr88345-2.c b/gcc/testsuite/gcc.target/i386/pr88345-2.c
new file mode 100644
index 000..a7fc3b162dd
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/pr88345-2.c
@@ -0,0 +1,5 @@
+/* { dg-do compile } */
+/* { dg-options "-falign-functions=128 -Os" } */
+/* { dg-final { scan-assembler-times "\.p2align\t7" 1 } } */
+
+void a() {}
diff --git a/gcc/testsuite/gcc.target/riscv/pr88345-1.c b/gcc/testsuite/gcc.target/riscv/pr88345-1.c
new file mode 100644
index 000..7d5afe683eb
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/pr88345-1.c
@@ -0,0 +1,5 @@
+/* { dg-do compile } */
+/* { dg-options "-falign-functions=128" } */
+/* { dg-final { scan-assembler-times "\.align\t7" 1 } } */
+
+__attribute__((__cold__)) void a() {}
diff --git a/gcc/testsuite/gcc.target/riscv/pr88345-2.c b/gcc/testsuite/gcc.target/riscv/pr88345-2.c
new file mode 100644
index 000..d4fc89d86d8
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/pr88345-2.c
@@ -0,0 +1,5 @@
+/* { dg-do compile } */
+/* { dg-options "-falign-functions=128 -Os" } */
+/* { dg-final { scan-assembler-times "\.align\t7" 1 } } */
+
+void a() {}
diff --git a/gcc/varasm.cc b/gcc/varasm.cc
index 423f3f91af8..4001648b214 100644
--- a/gcc/varasm.cc
+++ b/gcc/varasm.cc
@@ -1923,8 +1923,7 @@ assemble_start_function (tree decl, const char *fnname)
  Note that we still need to align to DECL_ALIGN, as above,
  because ASM_OUTPUT_MAX_SKIP_ALIGN might not do any alignment at all.  */
   if (! DECL_USER_ALIGN (decl)
-  && align_functions.levels[0].log > align
-  && optimize_function_for_speed_p (cfun))
+  && align_functions.levels[0].log > align)
 {
 #ifdef ASM_OUTPUT_MAX_SKIP_ALIGN
   int align_log = align_functions.levels[0].log;


Reviewed-by: Palmer Dabbelt 

Though I'm not a global reviewer, so not sure how much that helps...
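
For reference, the effect of the varasm.cc hunk can be modeled as a
small predicate.  This is a simplified sketch, not the GCC code: the
function and parameter names are made up for illustration, and only the
three conditions visible in the hunk are modeled.

```c
#include <assert.h>
#include <stdbool.h>

/* user_align: the function carries its own alignment request
   flag_log:   log2 of the -falign-functions value
   decl_log:   log2 of the alignment the declaration already has
   for_speed:  the function is optimized for speed (not -Os, not cold)  */

/* Before the patch: the flag only applies to functions optimized for
   speed.  */
static bool
flag_applies_before (bool user_align, int flag_log, int decl_log,
                     bool for_speed)
{
  assert (flag_log >= 0 && decl_log >= 0);
  return !user_align && flag_log > decl_log && for_speed;
}

/* After the patch: the speed check is gone, so -Os and cold functions
   are aligned as well.  */
static bool
flag_applies_after (bool user_align, int flag_log, int decl_log)
{
  assert (flag_log >= 0 && decl_log >= 0);
  return !user_align && flag_log > decl_log;
}
```

In both versions a function with a user-specified alignment is left
alone, which is the DECL_USER_ALIGN part of the condition.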


[PATCH] doc: -falign-functions doesn't override the __attribute__((align(N)))

2022-10-07 Thread Palmer Dabbelt
I found this when reading the documentation for Kito's recent patch.
From the discussion it sounds like this is the desired behavior, so
let's document it.
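
A small illustration of the case being documented, as a sketch only.
Note that the GCC attribute is spelled `aligned`, not `align` as in the
subject line.  The names hot_path and is_aligned are made up, and the
address check assumes a target where a function pointer is the plain
code address (no function descriptors, no Thumb-style low bit).

```c
#include <assert.h>
#include <stdint.h>

/* Explicit request: the function asks for 64-byte alignment itself.
   Going by the DECL_USER_ALIGN check in assemble_start_function, a
   larger -falign-functions=N on the command line should not raise
   this further.  */
__attribute__ ((aligned (64))) void
hot_path (void)
{
}

/* Return nonzero if FN starts on a BYTES boundary (BYTES must be a
   power of two).  */
static int
is_aligned (void (*fn) (void), uintptr_t bytes)
{
  assert (bytes != 0 && (bytes & (bytes - 1)) == 0);
  return ((uintptr_t) fn & (bytes - 1)) == 0;
}
```

Checking is_aligned (hot_path, 64) only confirms the attribute's own
guarantee; it says nothing about what the flag would have done.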

gcc/doc/ChangeLog

* invoke.texi (-falign-functions): Mention __align__
---
 gcc/doc/invoke.texi | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
index 2a9ea3455f6..8326a60dcf1 100644
--- a/gcc/doc/invoke.texi
+++ b/gcc/doc/invoke.texi
@@ -13136,7 +13136,9 @@ effective only in combination with @option{-fstrict-aliasing}.
 Align the start of functions to the next power-of-two greater than or
 equal to @var{n}, skipping up to @var{m}-1 bytes.  This ensures that at
 least the first @var{m} bytes of the function can be fetched by the CPU
-without crossing an @var{n}-byte alignment boundary.
+without crossing an @var{n}-byte alignment boundary.  This does not override
+functions that otherwise specify their own alignment constraints, such as via
+an alignment attribute.
 
 If @var{m} is not specified, it defaults to @var{n}.
 
-- 
2.34.1



Re: [PATCH] PR middle-end/88345: Honor -falign-functions=N even optimized for size.

2022-10-07 Thread Palmer Dabbelt

On Fri, 07 Oct 2022 05:56:39 PDT (-0700), hubi...@ucw.cz wrote:

On Fri, Oct 7, 2022 at 6:04 AM Kito Cheng  wrote:
>
> From: Monk Chiang 
>
> Currently the -falign-functions=N setting is ignored if the function
> is optimized for size or marked as a cold function.
>
> However, function alignment is needed even when optimizing for size in
> some situations.  The RISC-V target is an example: the RISC-V kernel
> implements patchable functions, which require the function to be aligned
> to at least 4 bytes so the function prologue can be patched atomically.
>
> Here are 4 ways to fix that:
> 1. Fix -falign-functions=N, letting it work as expected with -Os and all
> cold functions, which is this patch.
> 2. Force alignment to 4 bytes if -fpatchable-function-entry is given, by
> adjusting RISC-V's FUNCTION_BOUNDARY.
> 3. Adjust RISC-V's FUNCTION_BOUNDARY to let it honor -falign-functions=N.
> 4. Add a -malign-functions=N for RISC-V...which x86 has already deprecated.
>
> And this patch is the first approach.

The behavior changed with r0-42853-g194734e9e5501f but documentation
wasn't changed to reflect that -falign-functions=N is now only a hint.

I'm not sure in what circumstances users are expected to use -falign-functions
and whether if it is for ABI reasons (as in this case) we should instead have
a -malign-functions or other magic.

Honza - any comment on your change?


This was done a long time ago and -falign-functions was/is about
performance.

The basic idea of the patch was to use minimal required alignment for
-Os and cold functions while use -falign-functions for functions
optimized for speed.  This is no different from what we do for other
optimization options.


I'd also noticed last night that -falign-functions doesn't override 
__attribute__((align(N))), but got too tired to write up the patch.  I 
wasn't sure it was the desired behavior at the time, but sounds like it 
is?  I just sent a doc patch to mention that.



i386 targets define quite large alignments (especially for older ones
that needed function entry to be separated by given number of
instructions from cache page boundary) so enabling it for all functions
unconditionally is expensive (in code size).

We have FUNCTION_BOUNDARY to specify minimal function alignment required
by the ABI.  So I think if live patches require bigger alignment, I
would go with adjusting FUNCTION_BOUNDARY with -flive-patching.
Alternatively we could introduce -malign-all-function or
-falign-all-functions to adjust minimal alignment if this is specific to
a particular implementation of live patching in the kernel.

Honza


Thanks,
Richard.

> gcc/ChangeLog:
>
> PR middle-end/88345
> * varasm.cc (assemble_start_function): Adjust function align
> even optimized for size.
> * doc/invoke.texi (Os): Remove -falign-functions= from the exclusion
> list of -Os.
>
> gcc/testsuite/ChangeLog:
>
> PR middle-end/88345
> * gcc.target/i386/pr88345-1.c: New.
> * gcc.target/i386/pr88345-2.c: Ditto.
> * gcc.target/riscv/pr88345-1.c: Ditto.
> * gcc.target/riscv/pr88345-2.c: Ditto.
> ---
>  gcc/doc/invoke.texi| 2 +-
>  gcc/testsuite/gcc.target/i386/pr88345-1.c  | 5 +
>  gcc/testsuite/gcc.target/i386/pr88345-2.c  | 5 +
>  gcc/testsuite/gcc.target/riscv/pr88345-1.c | 5 +
>  gcc/testsuite/gcc.target/riscv/pr88345-2.c | 5 +
>  gcc/varasm.cc  | 3 +--
>  6 files changed, 22 insertions(+), 3 deletions(-)
>  create mode 100644 gcc/testsuite/gcc.target/i386/pr88345-1.c
>  create mode 100644 gcc/testsuite/gcc.target/i386/pr88345-2.c
>  create mode 100644 gcc/testsuite/gcc.target/riscv/pr88345-1.c
>  create mode 100644 gcc/testsuite/gcc.target/riscv/pr88345-2.c
>
> diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
> index a2b0b9636f0..acf98c68825 100644
> --- a/gcc/doc/invoke.texi
> +++ b/gcc/doc/invoke.texi
> @@ -11381,7 +11381,7 @@ results.  This is the default.
>  Optimize for size.  @option{-Os} enables all @option{-O2} optimizations
>  except those that often increase code size:
>
> -@gccoptlist{-falign-functions  -falign-jumps @gol
> +@gccoptlist{-falign-jumps @gol
>  -falign-labels  -falign-loops @gol
>  -fprefetch-loop-arrays  -freorder-blocks-algorithm=stc}
>
> diff --git a/gcc/testsuite/gcc.target/i386/pr88345-1.c b/gcc/testsuite/gcc.target/i386/pr88345-1.c
> new file mode 100644
> index 000..226bb9ffc82
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/i386/pr88345-1.c
> @@ -0,0 +1,5 @@
> +/* { dg-do compile } */
> +/* { dg-options "-falign-functions=128" } */
> +/* { dg-final { scan-assembler-times "\.p2align\t7" 1 } } */
> +
> +__attribute__((__cold__)) void a() {}
> diff --git a/gcc/testsuite/gcc.target/i386/pr88345-2.c b/gcc/testsuite/gcc.target/i386/pr88345-2.c
> new file mode 100644
> index 000..a7fc3b162dd
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/i386/pr88345-2.c
> @@ -0,0 +1,5 @@
> +/* { dg-do compile } */
> +/* { dg-options "-fa

Re: [PATCH v2 00/10] [RISC-V] Atomics improvements [PR100265/PR100266]

2022-10-11 Thread Palmer Dabbelt

On Tue, 11 Oct 2022 12:06:27 PDT (-0700), Vineet Gupta wrote:

Hi Christoph, Kito,

On 5/5/21 12:36, Christoph Muellner via Gcc-patches wrote:

This series provides a cleanup of the current atomics implementation
of RISC-V:

* PR100265: Use proper fences for atomic load/store
* PR100266: Provide programmatic implementation of CAS

As both are very related, I merged the patches into one series.

The first patch could be squashed into the following patches,
but I found it easier to understand the changes with it in place.

The series has been tested as follows:
* Building and testing a multilib RV32/64 toolchain
   (bootstrapped with riscv-gnu-toolchain repo)
* Manual review of generated sequences for GCC's atomic builtins API

The programmatic re-implementation of CAS benefits from a REE improvement
(see PR100264):
   https://gcc.gnu.org/pipermail/gcc-patches/2021-April/568680.html
If this patch is not in place, then an additional extension instruction
is emitted after the SC.W (in case of RV64 and CAS for uint32_t).

Further, the new CAS code requires cbranch INSN helpers to be present:
   https://gcc.gnu.org/pipermail/gcc-patches/2021-May/569689.html


I was wondering if this patchset is blocked on some technical grounds.


There's a v3 (though I can't find all of it, so not quite sure what 
happened), but IIUC that still has the same fundamental problems that 
all these have had: changing over to the new fence model may be an ABI 
break and the split CAS implementation doesn't ensure eventual success 
(see Jim's comments).  Not sure if there's other comments floating 
around, though, that's just what I remember.


+Andrea, in case he has time to look at the memory model / ABI issues.  

We'd still need to sort out the CAS issues, though, and it's not 
abundantly clear it's worth the work: we're essentially constrained to 
just emitting those fixed CAS sequences due to the eventual success 
rules, so it's not clear what the benefit of splitting those up is.  
With WRS there are some routines we might want to generate code for 
(cond_read_acquire() in Linux, for example) but we'd really need to dig 
into those to see if it's even sane/fast.
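
For readers without the background on "eventual success": a source-level
strong CAS like the one below is what the port has to lower to an LR/SC
retry loop on targets without a single CAS instruction.  As far as the
ISA manual goes, the forward-progress guarantee only covers constrained
LR/SC loops (short, base-ISA-only sequences with no other loads or
stores between the LR and the SC), which is why a split sequence that
the compiler may reschedule or spill into is a concern.  This is a
minimal sketch; the helper name cas_u32 is made up for illustration.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Strong CAS on a 32-bit value: store DESIRED if *P still equals
   EXPECTED, and report whether the store happened.  Sequentially
   consistent ordering is used for both success and failure.  */
static bool
cas_u32 (_Atomic uint32_t *p, uint32_t expected, uint32_t desired)
{
  assert (p != NULL);
  return atomic_compare_exchange_strong (p, &expected, desired);
}
```

The uint32_t case is the one mentioned earlier in the thread, where on
RV64 an extra extension instruction can show up after the SC.W.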


There's another patch set to fix the lack of inline atomic routines 
without breaking stuff, there were some minor comments from Kito and 
IIRC I had some test failures that I needed to chase down as well.  
That's a much safer fix in the short term, we'll need to deal with this 
eventually but at least we can stop the libatomic issues for the distro 
folks.




Thx,
-Vineet


Changes for v2:
* Guard LL/SC sequence by compiler barriers ("blockage")
   (suggested by Andrew Waterman)
* Changed commit message for AMOSWAP->STORE change
   (suggested by Andrew Waterman)
* Extracted cbranch4 patch from patchset (suggested by Kito Cheng)
* Introduce predicate riscv_sync_memory_operand (suggested by Jim Wilson)
* Fix small code style issue

Christoph Muellner (10):
   RISC-V: Simplify memory model code [PR 100265]
   RISC-V: Emit proper memory ordering suffixes for AMOs [PR 100265]
   RISC-V: Eliminate %F specifier from riscv_print_operand() [PR 100265]
   RISC-V: Use STORE instead of AMOSWAP for atomic stores [PR 100265]
   RISC-V: Emit fences according to chosen memory model [PR 100265]
   RISC-V: Implement atomic_{load,store} [PR 100265]
   RISC-V: Model INSNs for LR and SC [PR 100266]
   RISC-V: Add s.ext-consuming INSNs for LR and SC [PR 100266]
   RISC-V: Provide programmatic implementation of CAS [PR 100266]
   RISC-V: Introduce predicate "riscv_sync_memory_operand" [PR 100266]

  gcc/config/riscv/riscv-protos.h |   1 +
  gcc/config/riscv/riscv.c| 136 +---
  gcc/config/riscv/sync.md| 216 +---
  3 files changed, 235 insertions(+), 118 deletions(-)



[PATCH v2 1/3] doc: -falign-functions doesn't override the __attribute__((align(N)))

2022-10-11 Thread Palmer Dabbelt
I found this when reading the documentation for Kito's recent patch.
From the discussion it sounds like this is the desired behavior, so
let's document it.

gcc/doc/ChangeLog

* invoke.texi (-falign-functions): Mention __align__
---
 gcc/doc/invoke.texi | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
index 2a9ea3455f6..8326a60dcf1 100644
--- a/gcc/doc/invoke.texi
+++ b/gcc/doc/invoke.texi
@@ -13136,7 +13136,9 @@ effective only in combination with @option{-fstrict-aliasing}.
 Align the start of functions to the next power-of-two greater than or
 equal to @var{n}, skipping up to @var{m}-1 bytes.  This ensures that at
 least the first @var{m} bytes of the function can be fetched by the CPU
-without crossing an @var{n}-byte alignment boundary.
+without crossing an @var{n}-byte alignment boundary.  This does not override
+functions that otherwise specify their own alignment constraints, such as via
+an alignment attribute.
 
 If @var{m} is not specified, it defaults to @var{n}.
 
-- 
2.34.1



[PATCH v2 0/3] doc: -falign-functions improvements

2022-10-11 Thread Palmer Dabbelt
There were some recent discussions about the desired behavior of
-falign-functions, which is behaving as desired.  This improves the
documentation to make that explicit.

Change since v1 <20221007134901.5078-1-pal...@rivosinc.com>:

* New patch 2 and 3
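
As a worked example of the "next power-of-two greater than or equal to
n" wording in the docs, the rounding can be sketched as below.  The
function name align_log2 is made up for illustration and this is not
how GCC computes it internally; the 65536 bound is the documented
maximum for n.

```c
#include <assert.h>

/* Smallest LOG such that (1 << LOG) >= N, for 1 <= N <= 65536.  */
static int
align_log2 (unsigned int n)
{
  assert (n >= 1 && n <= 65536);
  int log = 0;
  while ((1u << log) < n)
    log++;
  return log;
}
```

With n = 128 this gives 7, which matches the `.p2align 7` and `.align 7`
directives that the pr88345 tests scan for.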




[PATCH v2 3/3] doc: -falign-functions is ignored for cold/size-optimized functions

2022-10-11 Thread Palmer Dabbelt
gcc/doc/ChangeLog

* invoke.texi (-falign-functions): Mention cold/size-optimized
functions.
---
 gcc/doc/invoke.texi | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
index a24798d5029..6af18ae9bfd 100644
--- a/gcc/doc/invoke.texi
+++ b/gcc/doc/invoke.texi
@@ -13138,7 +13138,8 @@ equal to @var{n}, skipping up to @var{m}-1 bytes.  This ensures that at
 least the first @var{m} bytes of the function can be fetched by the CPU
 without crossing an @var{n}-byte alignment boundary.  This does not override
 functions that otherwise specify their own alignment constraints, such as via
-an alignment attribute.
+an alignment attribute.  Functions that are optimized for size, for example
+cold functions, are not aligned.
 
 If @var{m} is not specified, it defaults to @var{n}.
 
-- 
2.34.1



Re: [PATCH] doc: -falign-functions doesn't override the __attribute__((align(N)))

2022-10-11 Thread Palmer Dabbelt

On Sun, 09 Oct 2022 23:07:21 PDT (-0700), richard.guent...@gmail.com wrote:

On Fri, Oct 7, 2022 at 3:50 PM Palmer Dabbelt  wrote:


I found this when reading the documentation for Kito's recent patch.
From the discussion it sounds like this is the desired behavior, so
let's document it.


Maybe also mention that the alignment doesn't apply to functions
optimized for size?


Oops, I guess that was the whole point of the discussion ;).  I sent a 
v2, which also mentions -Os but not sure we need to do that explicitly.





gcc/doc/ChangeLog

* invoke.texi (-falign-functions): Mention __align__
---
 gcc/doc/invoke.texi | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
index 2a9ea3455f6..8326a60dcf1 100644
--- a/gcc/doc/invoke.texi
+++ b/gcc/doc/invoke.texi
@@ -13136,7 +13136,9 @@ effective only in combination with @option{-fstrict-aliasing}.
 Align the start of functions to the next power-of-two greater than or
 equal to @var{n}, skipping up to @var{m}-1 bytes.  This ensures that at
 least the first @var{m} bytes of the function can be fetched by the CPU
-without crossing an @var{n}-byte alignment boundary.
+without crossing an @var{n}-byte alignment boundary.  This does not override
+functions that otherwise specify their own alignment constraints, such as via
+an alignment attribute.

 If @var{m} is not specified, it defaults to @var{n}.

--
2.34.1



[PATCH v2 2/3] doc: -falign-functions is ignored under -Os

2022-10-11 Thread Palmer Dabbelt
This is implicitly mentioned in the docs, but there were some questions
in a recent patch.  This makes it more explicit that -falign-functions is
meant to be ignored under -Os.

gcc/doc/ChangeLog

* invoke.texi (-falign-functions): Mention -Os
---
 gcc/doc/invoke.texi | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
index 8326a60dcf1..a24798d5029 100644
--- a/gcc/doc/invoke.texi
+++ b/gcc/doc/invoke.texi
@@ -13164,7 +13164,8 @@ equivalent and mean that functions are not aligned.
 If @var{n} is not specified or is zero, use a machine-dependent default.
 The maximum allowed @var{n} option value is 65536.
 
-Enabled at levels @option{-O2}, @option{-O3}.
+Enabled at levels @option{-O2}, @option{-O3}.  This has no behavior under
+@option{-Os}.
 
 @item -flimit-function-alignment
 If this option is enabled, the compiler tries to avoid unnecessarily
-- 
2.34.1



Re: [PATCH v2 00/10] [RISC-V] Atomics improvements [PR100265/PR100266]

2022-10-11 Thread Palmer Dabbelt

On Tue, 11 Oct 2022 16:31:25 PDT (-0700), Vineet Gupta wrote:



On 10/11/22 13:46, Christoph Müllner wrote:

On Tue, Oct 11, 2022 at 9:31 PM Palmer Dabbelt  wrote:

On Tue, 11 Oct 2022 12:06:27 PDT (-0700), Vineet Gupta wrote:
> Hi Christoph, Kito,
>
> On 5/5/21 12:36, Christoph Muellner via Gcc-patches wrote:
>> This series provides a cleanup of the current atomics implementation
>> of RISC-V:
>>
>> * PR100265: Use proper fences for atomic load/store
>> * PR100266: Provide programmatic implementation of CAS
>>
>> As both are very related, I merged the patches into one series.
>>
>> The first patch could be squashed into the following patches,
>> but I found it easier to understand the changes with it in place.
>>
>> The series has been tested as follows:
>> * Building and testing a multilib RV32/64 toolchain
>>    (bootstrapped with riscv-gnu-toolchain repo)
>> * Manual review of generated sequences for GCC's atomic builtins API
>>
>> The programmatic re-implementation of CAS benefits from a REE improvement
>> (see PR100264):
>> https://gcc.gnu.org/pipermail/gcc-patches/2021-April/568680.html
>> If this patch is not in place, then an additional extension instruction
>> is emitted after the SC.W (in case of RV64 and CAS for uint32_t).
>>
>> Further, the new CAS code requires cbranch INSN helpers to be present:
>> https://gcc.gnu.org/pipermail/gcc-patches/2021-May/569689.html
>
> I was wondering if this patchset is blocked on some technical grounds.

There's a v3 (though I can't find all of it, so not quite sure what
happened), but IIUC that still has the same fundamental problems that
all these have had: changing over to the new fence model may be an ABI
break and the split CAS implementation doesn't ensure eventual success
(see Jim's comments).  Not sure if there's other comments floating
around, though, that's just what I remember.


v3 was sent on May 27, 2022, when I rebased this on an internal tree:
https://gcc.gnu.org/pipermail/gcc-patches/2022-May/595712.html
I dropped the CAS patch in v3 (issue: stack spilling under extreme 
register pressure instead of erroring out) as I thought that this was 
the blocker for the series.
I just learned a few weeks ago, when I asked Palmer at the GNU 
Cauldron about this series, that the ABI break is the blocker.


Yeah I was confused about the ABI aspect as I didn't see any mention of 
that in the public reviews of v1 and v2.


Sorry, I thought we'd talked about it somewhere but it must have just 
been in meetings and such.  Patrick was writing a similar patch set 
around the same time so it probably just got tied up in that, we ended 
up reducing it to just the strong CAS inline stuff because we couldn't 
sort out the correctness of the rest of it.


My initial understanding was that fixing something broken cannot be an 
ABI break.
And that the mismatch of the implementation in 2021 and the 
recommended mappings in the ratified specification from 2019 is 
something that is broken. I still don't know the background here, but 
I guess this assumption is incorrect from a historical point of view.


We agreed that we wouldn't break binaries back when we submitted the 
port.  The ISA has changed many times since then, including adding the 
recommended mappings, but those binaries exist and we can't just 
silently break things for users.


However, I'm sure that I am not the only one that assumes the mappings 
in the specification to be implemented in compilers and tools. 
Therefore I still consider the implementation of the RISC-V atomics in 
GCC as broken (at least w.r.t. user expectation from people that lack 
the historical background and just read the RISC-V specification).


You can't just read one of those RISC-V PDFs and assume that 
implementations that match those words will function correctly.  Those 
words regularly change in ways where reasonable readers would end up 
with incompatible implementations due to those differences.  That's why 
we're so explicit about versions and such these days, we're just getting 
burned by these old mappings because they're from back when we thought 
the RISC-V definition of compatibility was going to match the more 
common one and we didn't build in fallbacks.



+Andrea, in case he has time to look at the memory model / ABI issues.

We'd still need to sort out the CAS issues, though, and it's not
abundantly clear it's worth the work: we're essentially constrained to
just emitting those fixed CAS sequences due to the eventual success
rules, so it&

Re: [PATCH] RISC-V: Implement ZTSO extension.

2022-03-21 Thread Palmer Dabbelt

On Thu, 17 Mar 2022 23:52:04 PDT (-0700), gcc-patches@gcc.gnu.org wrote:

Hi Shi-Hua:

Thanks, this patch is LGTM, but I would defer that until stage 1,
because the binutils part isn't merged yet.


IMO we should at least have a __riscv_ztso define, and ideally have the 
relevant builtins ported (atomics, fences, etc) as well.  Otherwise this 
is really just setting a bit that makes binaries incompatible without 
providing any real benefit.  That'll also let us work through how these 
mappings should be implemented, so we don't end up with issues like we 
did with WMO.
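
To make the request concrete, this is roughly how user code could key
off such a define.  It is a sketch under assumptions: __riscv_ztso is
the macro proposed here and is not provided by the patch under review,
and the enum and function names are made up for illustration.  __riscv
itself is predefined on RISC-V targets.

```c
#include <assert.h>

enum mem_model { MODEL_UNKNOWN, MODEL_RVWMO, MODEL_RVTSO };

/* Report which RISC-V memory model this translation unit was built
   for, based on predefined macros only.  */
static enum mem_model
target_mem_model (void)
{
#if !defined (__riscv)
  return MODEL_UNKNOWN;
#elif defined (__riscv_ztso)
  return MODEL_RVTSO;
#else
  return MODEL_RVWMO;
#endif
}

/* Printable name for a memory model value.  */
static const char *
mem_model_name (enum mem_model m)
{
  assert (m == MODEL_UNKNOWN || m == MODEL_RVWMO || m == MODEL_RVTSO);
  switch (m)
    {
    case MODEL_RVTSO:
      return "rvtso";
    case MODEL_RVWMO:
      return "rvwmo";
    default:
      return "unknown";
    }
}
```

Without a define like this, code has no compile-time way to tell a Ztso
build from a WMO build, even though the resulting binaries are marked
as incompatible.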




On Tue, Mar 15, 2022 at 5:10 PM  wrote:


From: LiaoShihua 

  ZTSO is the extension for the total store order model.
  This extension adds no new instructions to the ISA, and you can use it with arch "ztso".
  If you use it, the TSO flag will be generated in the ELF header.

gcc/ChangeLog:

* common/config/riscv/riscv-common.cc: define new arch.
* config/riscv/riscv-opts.h (MASK_ZTSO): Ditto.
(TARGET_ZTSO):Ditto.
* config/riscv/riscv.opt:Ditto.

---
 gcc/common/config/riscv/riscv-common.cc | 4 +++-
 gcc/config/riscv/riscv-opts.h   | 3 +++
 gcc/config/riscv/riscv.opt  | 3 +++
 3 files changed, 9 insertions(+), 1 deletion(-)

diff --git a/gcc/common/config/riscv/riscv-common.cc b/gcc/common/config/riscv/riscv-common.cc
index a904893b9ed..f4730b991d7 100644
--- a/gcc/common/config/riscv/riscv-common.cc
+++ b/gcc/common/config/riscv/riscv-common.cc
@@ -185,6 +185,8 @@ static const struct riscv_ext_version riscv_ext_version_table[] =
   {"zvl32768b", ISA_SPEC_CLASS_NONE, 1, 0},
   {"zvl65536b", ISA_SPEC_CLASS_NONE, 1, 0},

+  {"ztso", ISA_SPEC_CLASS_NONE, 0, 1},
+
   /* Terminate the list.  */
   {NULL, ISA_SPEC_CLASS_NONE, 0, 0}
 };
@@ -1080,7 +1082,7 @@ static const riscv_ext_flag_table_t riscv_ext_flag_table[] =
   {"zvl32768b", &gcc_options::x_riscv_zvl_flags, MASK_ZVL32768B},
   {"zvl65536b", &gcc_options::x_riscv_zvl_flags, MASK_ZVL65536B},

-
+  {"ztso", &gcc_options::x_riscv_ztso_subext, MASK_ZTSO},
   {NULL, NULL, 0}
 };

diff --git a/gcc/config/riscv/riscv-opts.h b/gcc/config/riscv/riscv-opts.h
index 929e4e3a7c5..9cb5f2a550a 100644
--- a/gcc/config/riscv/riscv-opts.h
+++ b/gcc/config/riscv/riscv-opts.h
@@ -136,4 +136,7 @@ enum stack_protector_guard {
 #define TARGET_ZVL32768B ((riscv_zvl_flags & MASK_ZVL32768B) != 0)
 #define TARGET_ZVL65536B ((riscv_zvl_flags & MASK_ZVL65536B) != 0)

+#define MASK_ZTSO (1 <<  0)
+#define TARGET_ZTSO ((riscv_ztso_subext & MASK_ZTSO) != 0)
+
 #endif /* ! GCC_RISCV_OPTS_H */
diff --git a/gcc/config/riscv/riscv.opt b/gcc/config/riscv/riscv.opt
index 9fffc08220d..6128bfa31dc 100644
--- a/gcc/config/riscv/riscv.opt
+++ b/gcc/config/riscv/riscv.opt
@@ -209,6 +209,9 @@ int riscv_vector_eew_flags
 TargetVariable
 int riscv_zvl_flags

+TargetVariable
+int riscv_ztso_subext
+
 Enum
 Name(isa_spec_class) Type(enum riscv_isa_spec_class)
 Supported ISA specs (for use with the -misa-spec= option):
--
2.31.1.windows.1



Re: [PATCH] RISC-V: Implement ZTSO extension.

2022-03-21 Thread Palmer Dabbelt

On Mon, 21 Mar 2022 19:39:24 PDT (-0700), kito.ch...@sifive.com wrote:

Hi Palmer:

I guess the problem is binutils isn't included and it's too close to the
GCC release, and binutils will report errors if it has any unsupported
extensions.


Ya, sorry, I was trying to say that we should have more than just the 
binutils support -- IIUC having binutils support the GCC flags at 
release is the standard way to do things, and I don't see any reason to 
rush this.




Most distros will use GCC 12 + binutils 2.38 or GCC 11 + binutils 2.38, so
neither combination works for an march string with ztso.

So that's why I am not intending to include that at this moment, but maybe
we could include that first and it'll work once binutils 2.39 is released,
then we can have GCC 12 + binutils 2.39 in the next few months.

Anyway, I think I am fine with that, and I'll ping Nelson for the binutils
part.

On Tue, Mar 22, 2022 at 9:13 AM Palmer Dabbelt  wrote:


On Thu, 17 Mar 2022 23:52:04 PDT (-0700), gcc-patches@gcc.gnu.org wrote:
> Hi Shi-Hua:
>
> Thanks, this patch is LGTM, but I would defer that until stage 1,
> because the binutils part isn't merged yet.

IMO we should at least have a __riscv_ztso define, and ideally have the
relevant builtins ported (atomics, fences, etc) as well.  Otherwise this
is really just setting a bit that makes binaries incompatible without
providing any real benefit.  That'll also let us work through how these
mappings should be implemented, so we don't end up with issues like we
did with WMO.

>
> On Tue, Mar 15, 2022 at 5:10 PM  wrote:
>>
>> From: LiaoShihua 
>>
>>   ZTSO is the extension for the total store order model.
>>   This extension adds no new instructions to the ISA, and you can use it with arch "ztso".
>>   If you use it, the TSO flag will be generated in the ELF header.
>>
>> gcc/ChangeLog:
>>
>> * common/config/riscv/riscv-common.cc: define new arch.
>> * config/riscv/riscv-opts.h (MASK_ZTSO): Ditto.
>> (TARGET_ZTSO):Ditto.
>> * config/riscv/riscv.opt:Ditto.
>>
>> ---
>>  gcc/common/config/riscv/riscv-common.cc | 4 +++-
>>  gcc/config/riscv/riscv-opts.h   | 3 +++
>>  gcc/config/riscv/riscv.opt  | 3 +++
>>  3 files changed, 9 insertions(+), 1 deletion(-)
>>
>> diff --git a/gcc/common/config/riscv/riscv-common.cc b/gcc/common/config/riscv/riscv-common.cc
>> index a904893b9ed..f4730b991d7 100644
>> --- a/gcc/common/config/riscv/riscv-common.cc
>> +++ b/gcc/common/config/riscv/riscv-common.cc
>> @@ -185,6 +185,8 @@ static const struct riscv_ext_version riscv_ext_version_table[] =
>>{"zvl32768b", ISA_SPEC_CLASS_NONE, 1, 0},
>>{"zvl65536b", ISA_SPEC_CLASS_NONE, 1, 0},
>>
>> +  {"ztso", ISA_SPEC_CLASS_NONE, 0, 1},
>> +
>>/* Terminate the list.  */
>>{NULL, ISA_SPEC_CLASS_NONE, 0, 0}
>>  };
>> @@ -1080,7 +1082,7 @@ static const riscv_ext_flag_table_t
riscv_ext_flag_table[] =
>>{"zvl32768b", &gcc_options::x_riscv_zvl_flags, MASK_ZVL32768B},
>>{"zvl65536b", &gcc_options::x_riscv_zvl_flags, MASK_ZVL65536B},
>>
>> -
>> +  {"ztso", &gcc_options::x_riscv_ztso_subext, MASK_ZTSO},
>>{NULL, NULL, 0}
>>  };
>>
>> diff --git a/gcc/config/riscv/riscv-opts.h
b/gcc/config/riscv/riscv-opts.h
>> index 929e4e3a7c5..9cb5f2a550a 100644
>> --- a/gcc/config/riscv/riscv-opts.h
>> +++ b/gcc/config/riscv/riscv-opts.h
>> @@ -136,4 +136,7 @@ enum stack_protector_guard {
>>  #define TARGET_ZVL32768B ((riscv_zvl_flags & MASK_ZVL32768B) != 0)
>>  #define TARGET_ZVL65536B ((riscv_zvl_flags & MASK_ZVL65536B) != 0)
>>
>> +#define MASK_ZTSO (1 << 0)
>> +#define TARGET_ZTSO ((riscv_ztso_subext & MASK_ZTSO) != 0)
>> +
>>  #endif /* ! GCC_RISCV_OPTS_H */
>> diff --git a/gcc/config/riscv/riscv.opt b/gcc/config/riscv/riscv.opt
>> index 9fffc08220d..6128bfa31dc 100644
>> --- a/gcc/config/riscv/riscv.opt
>> +++ b/gcc/config/riscv/riscv.opt
>> @@ -209,6 +209,9 @@ int riscv_vector_eew_flags
>>  TargetVariable
>>  int riscv_zvl_flags
>>
>> +TargetVariable
>> +int riscv_ztso_subext
>> +
>>  Enum
>>  Name(isa_spec_class) Type(enum riscv_isa_spec_class)
>>  Supported ISA specs (for use with the -misa-spec= option):
>> --
>> 2.31.1.windows.1
>>



Re: [PATCH] libgcc, riscv: Add restore libcalls to be used by tail calling functions

2022-03-31 Thread Palmer Dabbelt

On Tue, 29 Mar 2022 07:08:35 PDT (-0700), lewis.rev...@embecosm.com wrote:

Currently the existing libcalls for restoring registers have the
requirement that they must be tail called by the parent function, so
that they can safely return through the restored return address
register. This does impose the restriction that the libcalls cannot be
used if there already exists a tail call at the end of the parent
function in question, and as such this patch forms part of an effort to
rectify this situation.

There already exist patches to LLVM and Compiler-RT to add the libcalls
and the capability for the compiler to generate them
(https://reviews.llvm.org/D91720 and https://reviews.llvm.org/D91719),
and the behaviour that we want to standardize across the compilers is
documented in the following pull request to the RISC-V toolchain
conventions repository:
https://github.com/riscv-non-isa/riscv-toolchain-conventions/pull/10


This generally looks good to me, but the timing is awkward: we're in 
stage 4 (so features need an exception), but my bigger worry is that 
taking support for a draft spec so late in the cycle puts us at serious 
risk of shipping the draft and being stuck with it (which is bad for 
everyone).  It looks like the spec is just waiting on GCC, though, so 
maybe we're in that chicken-and-egg stage -- a bit of a headache for 
that to show up in stage 4 as there's no room for error, but this one 
seems manageable.


If this is aimed at GCC-13, then I think it's best to make sure we also 
have the GCC support for emitting calls to those routines -- otherwise 
it'll be very hard to test this.  The good news is that in that case 
there's time, it's just a chunk of extra work to do.  That should also 
make alignment with the spec timeline easy, as we'll have many months of 
slack.


Regardless, it seems like this is mostly Jim's code so I'll defer to him 
here.


Thanks!


The libcalls added in this patch follow that documented behaviour and
are based off a suggested implementation provided by Jim Wilson in the
thread of that pull request. Similar to the existing restore libcalls,
restores are grouped according to the expected stack alignment, and the
'upper' libcalls fall through to the lower libcalls, finally ending in
return through the temporary register t1.
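
The grouping by stack alignment can be sketched with a little arithmetic. The helper below is hypothetical (it is not part of the patch); it assumes each routine pops ra plus the requested s-registers and pads the total to the 16-byte stack alignment, which is inferred from the .cfi_def_cfa_offset values in the RV64 listing below.

```c
/* Round VALUE up to the next multiple of ALIGN, which must be a power
   of two.  */
static unsigned
round_up (unsigned value, unsigned align)
{
  return (value + align - 1) & ~(align - 1);
}

/* Hypothetical helper: bytes of stack released by the restore routine
   for N_SREGS callee-saved registers (s0 .. s<N_SREGS-1>) plus ra,
   each XLEN_BYTES wide, padded to 16-byte stack alignment.  */
static unsigned
restore_frame_bytes (unsigned n_sregs, unsigned xlen_bytes)
{
  return round_up ((n_sregs + 1) * xlen_bytes, 16);
}
```

On RV64 this gives 112 bytes for 12 registers, 96 for both 11 and 10, and 80 for both 9 and 8, which is why those entry points can share a body and fall through to the next lower group.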

libgcc/

* config/riscv/restore-tail.S: Add restore libcalls compatible
with use from functions ending in tail calls.
* config/riscv/t-elf: Add file restore-tail.S.
---
 libgcc/config/riscv/restore-tail.S | 279 +
 libgcc/config/riscv/t-elf  |   1 +
 2 files changed, 280 insertions(+)
 create mode 100644 libgcc/config/riscv/restore-tail.S

diff --git a/libgcc/config/riscv/restore-tail.S
b/libgcc/config/riscv/restore-tail.S
new file mode 100644
index 000..54116beff17
--- /dev/null
+++ b/libgcc/config/riscv/restore-tail.S
@@ -0,0 +1,279 @@
+/* Tail-call compatible callee-saved register restore routines for RISC-V.
+
+   Copyright (C) 2022 Free Software Foundation, Inc.
+
+This file is part of GCC.
+
+GCC is free software; you can redistribute it and/or modify it under
+the terms of the GNU General Public License as published by the Free
+Software Foundation; either version 3, or (at your option) any later
+version.
+
+GCC is distributed in the hope that it will be useful, but WITHOUT ANY
+WARRANTY; without even the implied warranty of MERCHANTABILITY or
+FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License
+for more details.
+
+Under Section 7 of GPL version 3, you are granted additional
+permissions described in the GCC Runtime Library Exception, version
+3.1, as published by the Free Software Foundation.
+
+You should have received a copy of the GNU General Public License and
+a copy of the GCC Runtime Library Exception along with this program;
+see the files COPYING3 and COPYING.RUNTIME respectively.  If not, see
+.  */
+
+#include "riscv-asm.h"
+
+  .text
+
+#if __riscv_xlen == 64
+
+FUNC_BEGIN (__riscv_restore_tailcall_12)
+  .cfi_startproc
+  .cfi_def_cfa_offset 112
+  .cfi_offset 27, -104
+  .cfi_offset 26, -96
+  .cfi_offset 25, -88
+  .cfi_offset 24, -80
+  .cfi_offset 23, -72
+  .cfi_offset 22, -64
+  .cfi_offset 21, -56
+  .cfi_offset 20, -48
+  .cfi_offset 19, -40
+  .cfi_offset 18, -32
+  .cfi_offset 9, -24
+  .cfi_offset 8, -16
+  .cfi_offset 1, -8
+  ld s11, 8(sp)
+  .cfi_restore 27
+  addi sp, sp, 16
+
+FUNC_BEGIN (__riscv_restore_tailcall_11)
+FUNC_BEGIN (__riscv_restore_tailcall_10)
+  .cfi_restore 27
+  .cfi_def_cfa_offset 96
+  ld s10, 0(sp)
+  .cfi_restore 26
+  ld s9, 8(sp)
+  .cfi_restore 25
+  addi sp, sp, 16
+
+FUNC_BEGIN (__riscv_restore_tailcall_9)
+FUNC_BEGIN (__riscv_restore_tailcall_8)
+  .cfi_restore 25
+  .cfi_restore 26
+  .cfi_restore 27
+  .cfi_def_cfa_offset 80
+  ld s8, 0(sp)
+  .cfi_restore 24
+  ld s7, 8(sp)
+  .cfi_restore 23
+  addi sp, sp, 16
+
+FUNC_BEGIN (__riscv_restore_tailcall_7)
+FUNC_BEGIN (__riscv_restore

[PATCH v1] libstdc++: Default to mutex-based atomics on RISC-V

2022-04-07 Thread Palmer Dabbelt
The RISC-V port requires libatomic to be linked in order to resolve
various atomic functions, which results in builds that have
"--with-libstdcxx-lock-policy=auto" defaulting to mutex-based locks.
Changing this to direct atomics breaks the ABI, so this patch forces the
auto-detection to pick mutex-based atomics on RISC-V in order to avoid a
silent ABI break for users.

See Bug 84568 for more discussion.  In the long run there may be a way
to get the higher-performance atomics without an ABI flag day, but
that's going to be a much more complicated operation.  We don't even
have support for the inline atomics yet, but given that some folks have
been discussing hacks to make these libatomic routines appear implicitly
it seems prudent to just turn off the automatic detection for RISC-V.

libstdc++-v3/ChangeLog

* acinclude.m4 (GLIBCXX_ENABLE_LOCK_POLICY): Force auto to mutex
  for RISC-V.

---

I haven't even built this one, as I'm sure there's a better way to do it
than sticking some more C code in there.
---
 libstdc++-v3/acinclude.m4 | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/libstdc++-v3/acinclude.m4 b/libstdc++-v3/acinclude.m4
index f53461c85a5..945c0c66f8d 100644
--- a/libstdc++-v3/acinclude.m4
+++ b/libstdc++-v3/acinclude.m4
@@ -3612,6 +3612,9 @@ AC_DEFUN([GLIBCXX_ENABLE_LOCK_POLICY], [
 dnl Why don't we check 8-byte CAS for sparc64, where _Atomic_word is long?!
 dnl New targets should only check for CAS for the _Atomic_word type.
 AC_TRY_COMPILE([
+#if defined __riscv
+# error "Defaulting to mutex-based locks for ABI compatibility"
+#endif
 #if ! defined __GCC_HAVE_SYNC_COMPARE_AND_SWAP_2
 # error "No 2-byte compare-and-swap"
 #elif ! defined __GCC_HAVE_SYNC_COMPARE_AND_SWAP_4
-- 
2.34.1
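
For readers less familiar with the configure probe, the decision it encodes can be restated as ordinary C. This is only a sketch: auto_lock_policy is a hypothetical name, and it mirrors just the checks visible in the hunk above (the real AC_TRY_COMPILE body continues past the lines shown).

```c
/* Hypothetical restatement of the "auto" lock-policy probe.  Any
   branch that would hit #error in the configure test maps to "mutex"
   here; falling through all checks maps to "atomic".  */
static const char *
auto_lock_policy (void)
{
#if defined __riscv
  /* Forced by the patch, for ABI compatibility.  */
  return "mutex";
#elif ! defined __GCC_HAVE_SYNC_COMPARE_AND_SWAP_2
  /* No 2-byte compare-and-swap.  */
  return "mutex";
#elif ! defined __GCC_HAVE_SYNC_COMPARE_AND_SWAP_4
  /* No 4-byte compare-and-swap.  */
  return "mutex";
#else
  return "atomic";
#endif
}
```

Because the RISC-V check comes first, the result on that target no longer depends on whether inline atomics become available later.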



Re: [PATCH v1] libstdc++: Default to mutex-based atomics on RISC-V

2022-04-14 Thread Palmer Dabbelt

On Thu, 14 Apr 2022 08:08:17 PDT (-0700), jwak...@redhat.com wrote:

On 07/04/22 11:46 -0700, Palmer Dabbelt wrote:

The RISC-V port requires libatomic to be linked in order to resolve
various atomic functions, which results in builds that have
"--with-libstdcxx-lock-policy=auto" defaulting to mutex-based locks.
Changing this to direct atomics breaks the ABI, so this patch forces the
auto-detection to pick mutex-based atomics on RISC-V in order to avoid a
silent ABI break for users.

See Bug 84568 for more discussion.  In the long run there may be a way
to get the higher-performance atomics without an ABI flag day, but
that's going to be a much more complicated operation.  We don't even
have support for the inline atomics yet, but given that some folks have
been discussing hacks to make these libatomic routines appear implicitly
it seems prudent to just turn off the automatic detection for RISC-V.

libstdc++-v3/ChangeLog

* acinclude.m4 (GLIBCXX_ENABLE_LOCK_POLICY): Force auto to mutex
  for RISC-V.


As documented at https://gcc.gnu.org/lists.html all patches for
libstdc++ need to go to the libstdc++ list as well as gcc-patches
(otherwise I won't see them).


Thanks, I'll try to remember to look next time.


We'd usually do something like:

case "${host}" in
   *-*-riscv) libstdcxx_atomic_lock_policy=mutex ;;
   *-*-*) AC_TRY_COMPILE([ ... ],,[],[])
esac

but this way is simpler. If we add more customization for other
targets we can reconsider using the 'case "${host}"' form.


Ya, that's kind of where I came to as well -- the proper autoconf flavor 
would scale way better, but hopefully nobody else makes this mistake and 
thus we don't need to worry about that.


I'm fine with either way (though I think we'd need a "riscv*" there, to 
match riscv32 and riscv64?), so if you want to swap it over (or have me 
re-spin this) it's no big deal on my end -- also fine, as per below, 
with you just committing this ;)
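
The concern about the pattern can be checked with shell-style globbing. The sketch below uses POSIX fnmatch as a stand-in for the matching done by a shell case statement; host_matches is a hypothetical helper, and the pattern "riscv*-*-*" is an assumption about what a corrected case arm would look like.

```c
#include <fnmatch.h>

/* Return nonzero when HOST (a configure triplet) matches PATTERN under
   shell-style globbing, roughly as `case "${host}" in` would.  */
static int
host_matches (const char *pattern, const char *host)
{
  return fnmatch (pattern, host, 0) == 0;
}
```

With this helper, "*-*-riscv" does not match "riscv64-unknown-linux-gnu" because the triplet does not end in "-riscv", while "riscv*-*-*" matches both riscv32 and riscv64 triplets.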



So this is OK for trunk, modulo regenerating libstdc++-v3/configure
with this change. Let me know if you want me to do that regen for you
(or commit the whole thing for you).


That'd be great, thanks!  It usually takes me a while to get all the 
autotools versions lined up (we just got new machines at the office), 
so that way I won't have to do so.






---

I haven't even built this one, as I'm sure there's a better way to do it
than sticking some more C code in there.
---
libstdc++-v3/acinclude.m4 | 3 +++
1 file changed, 3 insertions(+)

diff --git a/libstdc++-v3/acinclude.m4 b/libstdc++-v3/acinclude.m4
index f53461c85a5..945c0c66f8d 100644
--- a/libstdc++-v3/acinclude.m4
+++ b/libstdc++-v3/acinclude.m4
@@ -3612,6 +3612,9 @@ AC_DEFUN([GLIBCXX_ENABLE_LOCK_POLICY], [
dnl Why don't we check 8-byte CAS for sparc64, where _Atomic_word is long?!
dnl New targets should only check for CAS for the _Atomic_word type.
AC_TRY_COMPILE([
+#if defined __riscv
+# error "Defaulting to mutex-based locks for ABI compatibility"
+#endif
#if ! defined __GCC_HAVE_SYNC_COMPARE_AND_SWAP_2
# error "No 2-byte compare-and-swap"
#elif ! defined __GCC_HAVE_SYNC_COMPARE_AND_SWAP_4


Re: [PATCH v1] libstdc++: Default to mutex-based atomics on RISC-V

2022-04-14 Thread Palmer Dabbelt

On Thu, 14 Apr 2022 08:22:05 PDT (-0700), jwak...@redhat.com wrote:

On Thu, 14 Apr 2022 at 16:18, Palmer Dabbelt wrote:


On Thu, 14 Apr 2022 08:08:17 PDT (-0700), jwak...@redhat.com wrote:
> On 07/04/22 11:46 -0700, Palmer Dabbelt wrote:
>>The RISC-V port requires libatomic to be linked in order to resolve
>>various atomic functions, which results in builds that have
>>"--with-libstdcxx-lock-policy=auto" defaulting to mutex-based locks.
>>Changing this to direct atomics breaks the ABI, so this patch forces the
>>auto-detection to pick mutex-based atomics on RISC-V in order to avoid a
>>silent ABI break for users.
>>
>>See Bug 84568 for more discussion.  In the long run there may be a way
>>to get the higher-performance atomics without an ABI flag day, but
>>that's going to be a much more complicated operation.  We don't even
>>have support for the inline atomics yet, but given that some folks have
>>been discussing hacks to make these libatomic routines appear implicitly
>>it seems prudent to just turn off the automatic detection for RISC-V.
>>
>>libstdc++-v3/ChangeLog
>>
>>  * acinclude.m4 (GLIBCXX_ENABLE_LOCK_POLICY): Force auto to mutex
>>for RISC-V.
>
> As documented at https://gcc.gnu.org/lists.html all patches for
> libstdc++ need to go to the libstdc++ list as well as gcc-patches
> (otherwise I won't see them).

Thanks, I'll try to remember to look next time.

> We'd usually do something like:
>
> case "${host}" in
>*-*-riscv) libstdcxx_atomic_lock_policy=mutex ;;
>*-*-*) AC_TRY_COMPILE([ ... ],,[],[])
> esac
>
> but this way is simpler. If we add more customization for other
> targets we can reconsider using the 'case "${host}"' form.

Ya, that's kind of where I came to as well -- the proper autoconf flavor
would scale way better, but hopefully nobody else makes this mistake and
thus we don't need to worry about that.





I'm fine with either way (though I think we'd need a "riscv*" there, to
match riscv32 and riscv64?), so if you want to swap it over (or have me
re-spin this) it's no big deal on my end -- also fine, as per below,
with you just committing this ;)


Yeah, I figured *-*-riscv probably wasn't right, so that's another
reason to prefer your approach.




> So this is OK for trunk, modulo regenerating libstdc++-v3/configure
> with this change. Let me know if you want me to do that regen for you
> (or commit the whole thing for you).

That'd be great, thanks!  It usually takes me a while to get all the
autotools versions lined up (we just got new machines at the office),
so that way I won't have to do so.


No problem, I can regen+push for you.


Great, thanks!


Re: Reply: [PATCH] Asan changes for RISC-V.

2022-04-20 Thread Palmer Dabbelt

On Tue, 19 Apr 2022 23:13:15 PDT (-0700), gcc-patches@gcc.gnu.org wrote:

Does Asan work for RISC-V currently? It seems that '-fsanitize=address' is 
still unsupported for RISC-V. If I add '--enable-libsanitizer' in Makefile.in 
to reconfigure, there are compiling errors.
Is it because of this comment in Makefile.in: "# libsanitizer not supported 
rv32, but it will break the rv64 multi-lib build, so we disable that 
temporally until rv32 supported"?


Not quite sure what's going on here, I keep getting copies of this 
message that look empty in gmail.


I was under the impression that asan worked on rv64, but remember there 
being some worrisome constants floating around (as Jim alludes to in the 
forwarded patch).  As far as I can tell there's no libsanitizer support 
for rv32 (upstream is at LLVM), probably because we didn't have a stable 
uABI back then.  It's not super hard to do a libsanitizer port, but I 
don't see any other 32-bit targets with asan so either I'm missing 
something or it's tricky (and we don't have much free VA space, so not 
sure if it'd even run anything useful).



--
From: Jim Wilson 
Sent: Thursday, October 29, 2020 07:59
To: gcc-patches 
Cc: cooper.joshua ; Jim Wilson 

Subject: [PATCH] Asan changes for RISC-V.

We have only riscv64 asan support, there is no riscv32 support as yet.  So I
need to be able to conditionally enable asan support for the riscv target.  I
implemented this by returning zero from the asan_shadow_offset function.  This
requires a change to toplev.c and docs in target.def.

The asan support works on a 5.5 kernel, but does not work on a 4.15 kernel.
The problem is that the asan high memory region is a small wedge below
0x40.  The new kernel puts shared libraries at 0x3f and going
down which works.  But the old kernel puts shared libraries at 0x20
and going up which does not work, as it isn't in any recognized memory
region.  This might be fixable with more asan work, but we don't really need
support for old kernel versions.

The asan port is curious in that it uses 1<<29 for the shadow offset, but all
other 64-bit targets use a number larger than 1<<32.  But what we have is
working OK for now.
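
To make the role of the shadow offset concrete, here is a minimal sketch of the usual AddressSanitizer mapping, where one shadow byte covers eight application bytes. The macro and function names are hypothetical; only the 1<<29 value comes from the description above.

```c
#include <stdint.h>

/* Each shadow byte describes 8 application bytes.  */
#define ASAN_SHADOW_SHIFT 3

/* Shadow offset quoted above for rv64.  */
#define RV64_SHADOW_OFFSET (UINT64_C (1) << 29)

/* Map an application address to the address of its shadow byte.  */
static uint64_t
asan_shadow_address (uint64_t addr, uint64_t shadow_offset)
{
  return (addr >> ASAN_SHADOW_SHIFT) + shadow_offset;
}
```

With this offset the shadow region starts at 0x20000000, well below 1<<32, which is the oddity noted above relative to other 64-bit targets.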

I did a make check RUNTESTFLAGS="asan.exp" on Fedora rawhide image running on
qemu and the results look reasonable.

  === gcc Summary ===

# of expected passes  1905
# of unexpected failures 11
# of unsupported tests  224

  === g++ Summary ===

# of expected passes  2002
# of unexpected failures 6
# of unresolved testcases 1
# of unsupported tests  175

OK?

Jim

2020-10-28  Jim Wilson  

 gcc/
 * config/riscv/riscv.c (riscv_asan_shadow_offset): New.
 (TARGET_ASAN_SHADOW_OFFSET): New.
 * doc/tm.texi: Regenerated.
 * target.def (asan_shadow_offset): Mention that it can return zero.
 * toplev.c (process_options): Check for and handle zero return from
 targetm.asan_shadow_offset call.

Co-Authored-By: cooper.joshua 
---
 gcc/config/riscv/riscv.c | 16 
 gcc/doc/tm.texi  |  3 ++-
 gcc/target.def   |  3 ++-
 gcc/toplev.c |  3 ++-
 4 files changed, 22 insertions(+), 3 deletions(-)

diff --git a/gcc/config/riscv/riscv.c b/gcc/config/riscv/riscv.c
index 989a9f15250..6909e200de1 100644
--- a/gcc/config/riscv/riscv.c
+++ b/gcc/config/riscv/riscv.c
@@ -5299,6 +5299,19 @@ riscv_gpr_save_operation_p (rtx op)
   return true;
 }

+/* Implement TARGET_ASAN_SHADOW_OFFSET.  */
+
+static unsigned HOST_WIDE_INT
+riscv_asan_shadow_offset (void)
+{
+  /* We only have libsanitizer support for RV64 at present.
+
+ This number must match kRiscv*_ShadowOffset* in the file
+ libsanitizer/asan/asan_mapping.h which is currently 1<<29 for rv64,
+ even though 1<<36 makes more sense.  */
+  return TARGET_64BIT ? (HOST_WIDE_INT_1 << 29) : 0;
+}
+
 /* Initialize the GCC target structure.  */
 #undef TARGET_ASM_ALIGNED_HI_OP
 #define TARGET_ASM_ALIGNED_HI_OP "\t.half\t"
@@ -5482,6 +5495,9 @@ riscv_gpr_save_operation_p (rtx op)
 #undef TARGET_NEW_ADDRESS_PROFITABLE_P
 #define TARGET_NEW_ADDRESS_PROFITABLE_P riscv_new_address_profitable_p

+#undef TARGET_ASAN_SHADOW_OFFSET
+#define TARGET_ASAN_SHADOW_OFFSET riscv_asan_shadow_offset
+
 struct gcc_target targetm = TARGET_INITIALIZER;

 #include "gt-riscv.h"
diff --git a/gcc/doc/tm.texi b/gcc/doc/tm.texi
index 24c37f655c8..39c596b647a 100644
--- a/gcc/doc/tm.texi
+++ b/gcc/doc/tm.texi
@@ -12078,7 +12078,8 @@ is zero, which disables this optimization.
 @deftypefn {Target Hook} {unsigned HOST_WIDE_INT} TARGET_ASAN_SHADOW_OFFSET 
(void)
 Return the offset bitwise ored into shifted address to get corresponding
 Address Sanitizer shadow memory address.  NULL if Address Sanitizer is not
-supported by the target.
+supported by the target.  May return 0 if Address Sanitizer is not supported
+by a subtarget.
 @end deftypefn

 @deftypefn {Target Hook} {unsigned HOST_WIDE_INT} TARGET_MEMMODEL_CHECK 
(unsigned HOST_WIDE_INT @var{val})
diff --git a/gcc/target.d

[PATCH] c++: Add srodata to the allowed sections

2022-04-20 Thread Palmer Dabbelt
The g++.dg/opt/const7.C test currently fires errors like

FAIL: g++.dg/opt/const7.C  -std=c++14  scan-assembler-symbol-section symbol 
b_var (found _ZL5b_var) has section ^\\.(const|rodata)|\\[RO\\] (found .srodata)

on RISC-V, where RO data can end up in the srodata section.

gcc/testsuite/ChangeLog:

* g++.dg/opt/const7.C: Allow symbols in .srodata

---

I didn't actually re-run the test suite, as I was poking around with
something else.  This one seems pretty trivial, though.  Happy to do so
before committing, but figured I'd send it out anyway in case anyone
else is triaging our bugs.
---
 gcc/testsuite/g++.dg/opt/const7.C | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/gcc/testsuite/g++.dg/opt/const7.C 
b/gcc/testsuite/g++.dg/opt/const7.C
index 5bcf94897a8..8bbd9db973f 100644
--- a/gcc/testsuite/g++.dg/opt/const7.C
+++ b/gcc/testsuite/g++.dg/opt/const7.C
@@ -4,4 +4,4 @@
 
 struct B { B()=default; };
 static const B b_var;  //  { dg-bogus "" }
-// { dg-final { scan-assembler-symbol-section {b_var} 
{^\.(const|rodata)|\[RO\]} } }
+// { dg-final { scan-assembler-symbol-section {b_var} 
{^\.(const|rodata|srodata)|\[RO\]} } }
-- 
2.34.1
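
The effect of the pattern change can be checked outside the testsuite. The sketch below uses POSIX extended regular expressions as an approximation of the Tcl regexps that DejaGnu applies; section_matches is a hypothetical helper, not something the testsuite provides.

```c
#include <regex.h>
#include <stddef.h>

/* Return nonzero when SECTION matches the extended regular expression
   PATTERN.  A pattern that fails to compile counts as no match.  */
static int
section_matches (const char *pattern, const char *section)
{
  regex_t re;
  int matched;

  if (regcomp (&re, pattern, REG_EXTENDED | REG_NOSUB) != 0)
    return 0;
  matched = regexec (&re, section, 0, NULL, 0) == 0;
  regfree (&re);
  return matched;
}
```

Under this approximation the old pattern rejects ".srodata" because the name must begin with ".const" or ".rodata", while the new pattern accepts it.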



Re: [PATCH] Asan changes for RISC-V.

2022-04-21 Thread Palmer Dabbelt

On Wed, 20 Apr 2022 18:41:08 PDT (-0700), gcc-patches@gcc.gnu.org wrote:

Hi Joshua:


[from the other thread: Thanks, no idea how I missed all those 32-bit 
ports...]





Does Asan work for RISC-V currently? It seems that '-fsanitize=address' is 
still unsupported for RISC-V. If I add '--enable-libsanitizer' in Makefile.in 
to reconfigure, there are compiling errors.

Is it because of this comment in Makefile.in: "# libsanitizer not supported
rv32, but it will break the rv64 multi-lib build, so we disable that
temporally until rv32 supported"?

IIUC, you mean the Makefile in riscv-gnu-toolchain instead of upstream
GCC, right? I guess we can make a configure option to enable that and
check it does not come with multi-lib, or maybe you could fix that on
GCC's configure script to make the multi-lib build be ignored for
rv32?


A super simple option is to just let folks select this at configure time 
in riscv-gnu-toolchain, there's already a bunch of options like that and 
there's probably more logic to the "do we want libsanitizer" than we 
want to bake into the riscv-gnu-toolchain -- doubly so as it's really a 
developer thing these days.


I just opened a PR to pass through the top-level configure argument 
.





On Wed, Apr 20, 2022 at 2:13 PM joshua via Gcc-patches
 wrote:


Does Asan work for RISC-V currently? It seems that '-fsanitize=address' is 
still unsupported for RISC-V. If I add '--enable-libsanitizer' in Makefile.in 
to reconfigure, there are compiling errors.
Is it because of this comment in Makefile.in: "# libsanitizer not supported 
rv32, but it will break the rv64 multi-lib build, so we disable that 
temporally until rv32 supported"?


--
From: Jim Wilson 
Sent: Thursday, October 29, 2020 07:59
To: gcc-patches 
Cc: cooper.joshua ; Jim Wilson 

Subject: [PATCH] Asan changes for RISC-V.

We have only riscv64 asan support, there is no riscv32 support as yet.  So I
need to be able to conditionally enable asan support for the riscv target.  I
implemented this by returning zero from the asan_shadow_offset function.  This
requires a change to toplev.c and docs in target.def.

The asan support works on a 5.5 kernel, but does not work on a 4.15 kernel.
The problem is that the asan high memory region is a small wedge below
0x40.  The new kernel puts shared libraries at 0x3f and going
down which works.  But the old kernel puts shared libraries at 0x20
and going up which does not work, as it isn't in any recognized memory
region.  This might be fixable with more asan work, but we don't really need
support for old kernel versions.

The asan port is curious in that it uses 1<<29 for the shadow offset, but all
other 64-bit targets use a number larger than 1<<32.  But what we have is
working OK for now.

I did a make check RUNTESTFLAGS="asan.exp" on Fedora rawhide image running on
qemu and the results look reasonable.

  === gcc Summary ===

# of expected passes  1905
# of unexpected failures 11
# of unsupported tests  224

  === g++ Summary ===

# of expected passes  2002
# of unexpected failures 6
# of unresolved testcases 1
# of unsupported tests  175

OK?

Jim

2020-10-28  Jim Wilson  

 gcc/
 * config/riscv/riscv.c (riscv_asan_shadow_offset): New.
 (TARGET_ASAN_SHADOW_OFFSET): New.
 * doc/tm.texi: Regenerated.
 * target.def (asan_shadow_offset): Mention that it can return zero.
 * toplev.c (process_options): Check for and handle zero return from
 targetm.asan_shadow_offset call.

Co-Authored-By: cooper.joshua 
---
 gcc/config/riscv/riscv.c | 16 
 gcc/doc/tm.texi  |  3 ++-
 gcc/target.def   |  3 ++-
 gcc/toplev.c |  3 ++-
 4 files changed, 22 insertions(+), 3 deletions(-)

diff --git a/gcc/config/riscv/riscv.c b/gcc/config/riscv/riscv.c
index 989a9f15250..6909e200de1 100644
--- a/gcc/config/riscv/riscv.c
+++ b/gcc/config/riscv/riscv.c
@@ -5299,6 +5299,19 @@ riscv_gpr_save_operation_p (rtx op)
   return true;
 }

+/* Implement TARGET_ASAN_SHADOW_OFFSET.  */
+
+static unsigned HOST_WIDE_INT
+riscv_asan_shadow_offset (void)
+{
+  /* We only have libsanitizer support for RV64 at present.
+
+ This number must match kRiscv*_ShadowOffset* in the file
+ libsanitizer/asan/asan_mapping.h which is currently 1<<29 for rv64,
+ even though 1<<36 makes more sense.  */
+  return TARGET_64BIT ? (HOST_WIDE_INT_1 << 29) : 0;
+}
+
 /* Initialize the GCC target structure.  */
 #undef TARGET_ASM_ALIGNED_HI_OP
 #define TARGET_ASM_ALIGNED_HI_OP "\t.half\t"
@@ -5482,6 +5495,9 @@ riscv_gpr_save_operation_p (rtx op)
 #undef TARGET_NEW_ADDRESS_PROFITABLE_P
 #define TARGET_NEW_ADDRESS_PROFITABLE_P riscv_new_address_profitable_p

+#undef TARGET_ASAN_SHADOW_OFFSET
+#define TARGET_ASAN_SHADOW_OFFSET riscv_asan_shadow_offset
+
 struct gcc_target targetm = TARGET_INITIALIZER;

 #include "gt-riscv.h"
diff --git a/gcc/doc/tm.texi b/gcc/doc/tm.texi
index 24c37f655c8..39c596b647a 100644

Re: [PATCH] riscv: fix -Wformat-diag errors.

2022-01-18 Thread Palmer Dabbelt

On Tue, 18 Jan 2022 08:31:12 PST (-0800), gcc-patches@gcc.gnu.org wrote:

Thanks Martin!


Yep.  Seeing this go by, though, I think there are some English issues 
with the original messages.  I'd write it more like this, but I'm never 
100% sure on these things:


   diff --git a/gcc/common/config/riscv/riscv-common.cc 
b/gcc/common/config/riscv/riscv-common.cc
   index 004822bfe6c..2f83303ca51 100644
   --- a/gcc/common/config/riscv/riscv-common.cc
   +++ b/gcc/common/config/riscv/riscv-common.cc
   @@ -375,7 +375,7 @@ riscv_subset_list::add (const char *subset, int 
major_version,
  else
error_at (
  m_loc,
   -  "%<-march=%s%>: extension %qs appear more than one time",
   +  "%<-march=%s%>: extension %qs appears more than one time",
  m_arch,
  subset);
   
   @@ -620,7 +620,7 @@ riscv_subset_list::parsing_subset_version (const char *ext,

if (!major_p)
  {
error_at (m_loc, "%<-march=%s%>: for %<%s%dp%dp?%>, version "
   -  "number with more than 2 level is not supported",
   +  "numbers with more than 2 levels are not supported",
  m_arch, ext, major, version);
return NULL;
  }
   @@ -701,8 +701,9 @@ riscv_subset_list::parse_std_ext (const char *p)
  /* std_ext_p= */ true, &explicit_version_p);
  if (major_version != 0 || minor_version != 0)
{
   -  warning_at (m_loc, 0, "version of % will be omitted, please "
   -"specify version for individual extension");
   +  warning_at (m_loc, 0, "version of % will be ignored, please "
   +"specify versions for each individual "
   +"extension");
}
   
  /* We have special rule for G, we disallow rv32gm2p but allow rv32g_zicsr





On Wed, Jan 19, 2022 at 12:23 AM Martin Liška  wrote:


Pushed as pre-approved by Jeff. The patch fixes -Wformat-diag warnings.

Martin

gcc/ChangeLog:

* common/config/riscv/riscv-common.cc (riscv_subset_list::add):
Wrap keywords with quotes and remove trailing dots.
(riscv_subset_list::parsing_subset_version): Likewise.
(riscv_subset_list::parse_std_ext): Likewise.
(riscv_subset_list::parse_multiletter_ext): Likewise.
* config/riscv/riscv.cc (riscv_handle_type_attribute): Likewise.
---
  gcc/common/config/riscv/riscv-common.cc | 16 
  gcc/config/riscv/riscv.cc   |  4 ++--
  2 files changed, 10 insertions(+), 10 deletions(-)

diff --git a/gcc/common/config/riscv/riscv-common.cc 
b/gcc/common/config/riscv/riscv-common.cc
index c1d8431c1fa..004822bfe6c 100644
--- a/gcc/common/config/riscv/riscv-common.cc
+++ b/gcc/common/config/riscv/riscv-common.cc
@@ -375,7 +375,7 @@ riscv_subset_list::add (const char *subset, int 
major_version,
else
error_at (
  m_loc,
- "%<-march=%s%>: Extension `%s' appear more than one time.",
+ "%<-march=%s%>: extension %qs appear more than one time",
  m_arch,
  subset);

@@ -613,14 +613,14 @@ riscv_subset_list::parsing_subset_version (const char 
*ext,
  {
if (!ISDIGIT (*(p+1)))
  {
-   error_at (m_loc, "%<-march=%s%>: Expect number "
- "after %<%dp%>.", m_arch, version);
+   error_at (m_loc, "%<-march=%s%>: expect number "
+ "after %<%dp%>", m_arch, version);
return NULL;
  }
if (!major_p)
  {
-   error_at (m_loc, "%<-march=%s%>: For %<%s%dp%dp?%>, version "
- "number with more than 2 level is not supported.",
+   error_at (m_loc, "%<-march=%s%>: for %<%s%dp%dp?%>, version "
+ "number with more than 2 level is not supported",
  m_arch, ext, major, version);
return NULL;
  }
@@ -701,8 +701,8 @@ riscv_subset_list::parse_std_ext (const char *p)
  /* std_ext_p= */ true, &explicit_version_p);
if (major_version != 0 || minor_version != 0)
{
- warning_at (m_loc, 0, "version of `g` will be omitted, please "
-   "specify version for individual extension.");
+ warning_at (m_loc, 0, "version of % will be omitted, please "
+   "specify version for individual extension");
}

/* We have special rule for G, we disallow rv32gm2p but allow 
rv32g_zicsr
@@ -906,7 +906,7 @@ riscv_subset_list::parse_multiletter_ext (const char *p,

if (*p != '\0' && *p != '_')
{
- error_at (m_loc, "%<-march=%s%>: %s must separate with _",
+ error_at (m_loc, "%<-march=%s%>: %s must separate with %<_%>",
m_arch, ext_type_str);
   

[PATCH] RISC-V: Document the degree of position independence that medany affords

2022-01-18 Thread Palmer Dabbelt
The code generated by -mcmodel=medany is defined to be
position-independent, but is not guaranteed to function correctly when
linked into position-independent executables or libraries.  See the
recent discussion at the psABI specification [1] for more details.

It would be better to reject these invalid sequences when linking, but
as pointed out in a recent LD bug [2] there may be some compatibility
issues related to the PCREL_HI20 relocations used to initialize GP.
Given the complexity here it's unlikely we'll be able to reject these
sequences any time soon, so instead just document that these may not
work.

[1]: https://github.com/riscv-non-isa/riscv-elf-psabi-doc/issues/245
[2]: https://sourceware.org/bugzilla/show_bug.cgi?id=28789

gcc/ChangeLog:

* doc/invoke.texi: Document the degree of position independence
that -mcmodel=medany affords.

Signed-off-by: Palmer Dabbelt 

---

Changes since v1:

* Fix spelling of "guaranteed", twice.
* Reference the binutils bug on rejecting these sequences, for more
  context.
---
 gcc/doc/invoke.texi | 4 
 1 file changed, 4 insertions(+)

diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
index 5504971ea81..7bca621535f 100644
--- a/gcc/doc/invoke.texi
+++ b/gcc/doc/invoke.texi
@@ -27568,6 +27568,10 @@ Generate code for the medium-any code model. The 
program and its statically
 defined symbols must be within any single 2 GiB address range. Programs can be
 statically or dynamically linked.
 
+The code generated by the medium-any code model is position-independent, but is
+not guaranteed to function correctly when linked into position-independent
+executables or libraries.
+
 @item -mexplicit-relocs
 @itemx -mno-exlicit-relocs
 Use or do not use assembler relocation operators when dealing with symbolic
-- 
2.32.0



Re: [PATCH][GCC13?] RISC-V: Replace `smin'/`smax' RTL patterns with `fmin'/`fmax'

2022-01-20 Thread Palmer Dabbelt

On Thu, 20 Jan 2022 07:44:25 PST (-0800), ma...@embecosm.com wrote:

RISC-V FMIN and FMAX machine instructions are IEEE-754-conformant[1]:

"For FMIN and FMAX, if at least one input is a signaling NaN, or if both
inputs are quiet NaNs, the result is the canonical NaN.  If one operand
is a quiet NaN and the other is not a NaN, the result is the non-NaN
operand."

as required by our `fminM3' and `fmaxM3' standard RTL patterns.

However we only define `sminM3' and `smaxM3' standard RTL patterns to
produce the FMIN and FMAX machine instructions, which in turn causes the
`__builtin_fmin' and `__builtin_fmax' family of intrinsics to emit the
corresponding libcalls rather than the relevant machine instructions.

Rename the `smin<mode>3' and `smax<mode>3' patterns to `fmin<mode>3' and
`fmax<mode>3' respectively then, removing the need to use libcalls for
IEEE 754 semantics with the minimum and maximum operations.

[1] "The RISC-V Instruction Set Manual, Volume I: User-Level ISA",
Document Version 2.2, May 7, 2017, Section 8.3 "NaN Generation and
Propagation", p. 48

gcc/
* config/riscv/riscv.md (smin<mode>3): Rename pattern to...
(fmin<mode>3): ... this.
(smax<mode>3): Likewise...
(fmax<mode>3): ... this.
---
Hi,

 It's not clear to me how it's been missed or whether there is anything I
might be actually missing.  It looks to me like a clear oversight however.


I'm not really a floating point person, but IIUC it's actually on
purpose: earlier versions of the ISA spec didn't have this behavior, and
at the time we originally merged the GCC port we decided to play it
safe.  Pretty sure we discussed this before on the GCC mailing list,
maybe around the time the glibc port was going upstream?  I think Jim
was the one who figured out how all the specs fit together.


I can't find those older discussions, but this definitely came up
recently in glibc:
https://sourceware.org/pipermail/libc-alpha/2021-October/131637.html .  
Looks like back then nobody knew of any hardware that ran glibc and 
implemented the old behavior, but there also haven't been patches posted 
yet so it's not set in stone.


It's probably worth repeating the question here since there are a lot of 
RISC-V users that don't use glibc but do use GCC.  I don't know of 
anyone who implemented the old floating point standards off the top of 
my head, even in embedded land, but I'm pretty lost when it comes to ISA 
versioning these days so I might be missing something.


One option could be to tie this to the ISA spec version and emit the 
required emulation routines, but I don't think that's worth bothering to 
do unless someone knows of an implementation that implements the old 
behavior.



And in any case this change has passed full GCC regression testing (except
for the D frontend, which has stopped being built recently due to a defect
in Debian I haven't yet got to getting fixed) with the `riscv64-linux-gnu'
target using the HiFive Unmatched (U74 CPU) target board, so it seems to
be doing the right thing.

 Timing might be a bit unfortunate for this submission and given that it is
not a regression fix I guess this is GCC 13 material.  Please let me know
otherwise.

 In any case OK to apply (when the time comes)?


IMO waiting is the right way to go, as if this does uncover any issues 
they'll be a long-tail sort of thing.  That way we'll at least have a 
whole release cycle for folks to test on their hardware, which is about 
as good as we can do here.


Acked-by: Palmer Dabbelt  # for 13

Someone should probably do the glibc version, too ;)



  Maciej
---
 gcc/config/riscv/riscv.md |4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

gcc-riscv-fmin-fmax.diff
Index: gcc/gcc/config/riscv/riscv.md
===
--- gcc.orig/gcc/config/riscv/riscv.md
+++ gcc/gcc/config/riscv/riscv.md
@@ -1214,7 +1214,7 @@
 ;;
 ;;  

-(define_insn "smin<mode>3"
+(define_insn "fmin<mode>3"
   [(set (match_operand:ANYF 0 "register_operand" "=f")
(smin:ANYF (match_operand:ANYF 1 "register_operand" " f")
   (match_operand:ANYF 2 "register_operand" " f")))]
@@ -1223,7 +1223,7 @@
   [(set_attr "type" "fmove")
(set_attr "mode" "")])

-(define_insn "smax<mode>3"
+(define_insn "fmax<mode>3"
   [(set (match_operand:ANYF 0 "register_operand" "=f")
(smax:ANYF (match_operand:ANYF 1 "register_operand" " f")
   (match_operand:ANYF 2 "register_operand" " f")))]


Re: [PATCH] dwarf2out: Fix -gsplit-dwarf on riscv [PR103874]

2022-01-20 Thread Palmer Dabbelt

On Thu, 20 Jan 2022 02:45:53 PST (-0800), gcc-patches@gcc.gnu.org wrote:

Hi!

riscv*-*-* are the only modern targets that !HAVE_AS_LEB128 (apparently
due to some aggressive linker optimizations).


I don't really understand the rest of this, but we do have a subset of 
LEB128 (constant expressions only).  I'm not sure exactly what the 
requirements are here, but one could imagine extending our assembler 
support to cover them -- we might never have full support for LEB128 
expressions (because of linker relaxation), but we might be able to make 
more stuff work.


I'm not sure if that helps or hurts, though, as we'll still be a special 
case.



As the following testcase shows, we mishandle in index_rnglists the
!HAVE_AS_LEB128 && !have_multiple_function_sections case.

output_rnglists does roughly:
  FOR_EACH_VEC_SAFE_ELT (ranges_table, i, r)
    {
      ...
      if (block_num > 0)
        {
          ...
          if (HAVE_AS_LEB128)
            {
              if (!have_multiple_function_sections)
                {
                  // code not using r->*_entry
                  continue;
                }
              // code that sometimes doesn't use r->*_entry,
              // sometimes r->begin_entry
            }
          else if (dwarf_split_debug_info)
            {
              // code that uses both r->begin_entry and r->end_entry
            }
          else
            {
              // code not using r->*_entry
            }
        }
      else if (block_num < 0)
        {
          if (!have_multiple_function_sections)
            gcc_unreachable ();
          ...
        }
    }
and index_rnglists is what sets up those r->{begin,end}_entry members.
The code did an early if (!have_multiple_function_sections) continue;
which is fine for the HAVE_AS_LEB128 case, because r->*_entry is not
used in that case, but not for !HAVE_AS_LEB128 that uses it anyway.
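
The mismatch can be restated as a small standalone C sketch.  Everything here is a simplified stand-in: `struct cfg' and the three functions are hypothetical names that mirror the conditions described above and in the patch below, not the real dwarf2out.cc code, and `output_reads_entries' models only the !HAVE_AS_LEB128 split-DWARF branch.

```c
#include <stdbool.h>

/* Hypothetical stand-ins for the GCC macro and globals named above.  */
struct cfg {
    bool have_as_leb128;
    bool have_multiple_function_sections;
    bool dwarf_split_debug_info;
};

/* True when the output side, as summarized above, reads both
   r->begin_entry and r->end_entry for this range.  */
bool output_reads_entries(const struct cfg *c, int block_num)
{
    return block_num > 0
           && !c->have_as_leb128
           && c->dwarf_split_debug_info;
}

/* Old early-continue condition in the indexing loop.  */
bool old_index_skips(const struct cfg *c, int block_num)
{
    (void) block_num;
    return !c->have_multiple_function_sections;
}

/* Early-continue condition after the proposed fix.  */
bool new_index_skips(const struct cfg *c, int block_num)
{
    return (c->have_as_leb128 || block_num < 0)
           && !c->have_multiple_function_sections;
}
```

For a configuration without HAVE_AS_LEB128, with a single function section and -gsplit-dwarf, the old condition skips a range whose entries the output side then reads; the new condition does not skip it.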

Fixed thusly, tested on the testcase with x86_64 -> riscv64 cross,
bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

2022-01-20  Jakub Jelinek  

PR debug/103874
* dwarf2out.cc (index_rnglists): For !HAVE_AS_LEB128 and
block_num > 0, index entry even if !have_multiple_function_sections.

* gcc.dg/debug/dwarf2/pr103874.c: New test.

--- gcc/dwarf2out.cc.jj 2022-01-18 11:58:59.0 +0100
+++ gcc/dwarf2out.cc 2022-01-19 13:30:08.936008194 +0100
@@ -12094,9 +12094,10 @@ index_rnglists (void)
   if (r->label && r->idx != DW_RANGES_IDX_SKELETON)
     r->idx = rnglist_idx++;

-  if (!have_multiple_function_sections)
-    continue;
   int block_num = r->num;
+  if ((HAVE_AS_LEB128 || block_num < 0)
+      && !have_multiple_function_sections)
+    continue;
   if (HAVE_AS_LEB128 && (r->label || r->maybe_new_sec))
     base = false;
   if (block_num > 0)
--- gcc/testsuite/gcc.dg/debug/dwarf2/pr103874.c.jj 2022-01-19 13:35:25.485631843 +0100
+++ gcc/testsuite/gcc.dg/debug/dwarf2/pr103874.c 2022-01-19 13:36:53.608413534 +0100
@@ -0,0 +1,12 @@
+/* PR debug/103874 */
+/* { dg-do compile } */
+/* { dg-options "-O2 -g -gsplit-dwarf -dA -Wno-implicit-function-declaration" } */
+
+void
+foo (void)
+{
+  {
+    bar ();
+    baz ();
+  }
+}

Jakub


Re: [PATCH] dwarf2out: Fix -gsplit-dwarf on riscv [PR103874]

2022-01-20 Thread Palmer Dabbelt

On Thu, 20 Jan 2022 13:20:34 PST (-0800), gcc-patches@gcc.gnu.org wrote:

On Thu, Jan 20, 2022 at 01:13:45PM -0800, Palmer Dabbelt wrote:

On Thu, 20 Jan 2022 02:45:53 PST (-0800), gcc-patches@gcc.gnu.org wrote:
> riscv*-*-* are the only modern targets that !HAVE_AS_LEB128 (apparently
> due to some aggressive linker optimizations).

I don't really understand the rest of this, but we do have a subset of
LEB128 (constant expressions only).  I'm not sure exactly what the
requirements are here, but one could imagine extending our assembler support
to cover them -- we might never have full support for LEB128 expressions
(because of linker relaxation), but we might be able to make more stuff
work.

I'm not sure if that helps or hurts, though, as we'll still be a special
case.


HAVE_AS_LEB128 really needs to be able to handle both constants and
difference of two labels in the same section.
Most targets resolve something like that in the assembler as constant which
they encode into sleb128 or uleb128 and put into the section that uses
those directives (typically debugging sections).
If a target performs aggressive linker relaxation, then probably some
relocation would need to be added (but one that can encode the two symbols
for the difference, so perhaps two relocations that must be consecutive or
something similar) and resolve that by the linker.  Though, that would mean
that even in the debugging section offsets wouldn't be fixed during
assembly...


Differences are the hard case for RISC-V, as they can grow numerically.  
That could then cause the LEB to grow in byte size, possibly violating 
one of our linker relaxation invariants.  The only way I've come up with 
to support these would be to pad the LEBs, and I'm not sure if that's 
legal.
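
To make the size concern concrete, here is a minimal C sketch of ULEB128 sizing plus the padding idea.  The function names are invented for this example and are not binutils or GCC APIs.  Padding is done by setting the continuation bit on otherwise-zero trailing bytes; whether debug-info consumers accept that form is exactly the open question, and this sketch only shows that a straightforward decoder reads it back as the same value.

```c
#include <stddef.h>
#include <stdint.h>

/* Number of bytes in the shortest ULEB128 encoding of v.  */
size_t uleb128_size(uint64_t v)
{
    size_t n = 1;
    while (v >= 0x80) {
        v >>= 7;
        n++;
    }
    return n;
}

/* Encode v into exactly `width' bytes, padding with continuation
   bytes.  Returns width, or 0 if v does not fit.  The caller must
   provide at least `width' bytes in out.  */
size_t uleb128_encode_padded(uint64_t v, uint8_t *out, size_t width)
{
    if (width < uleb128_size(v))
        return 0;
    for (size_t i = 0; i < width; i++) {
        uint8_t byte = v & 0x7f;
        v >>= 7;
        if (i + 1 < width)
            byte |= 0x80;       /* continuation bit on all but the last */
        out[i] = byte;
    }
    return width;
}

/* Decode one ULEB128 value; *len receives the bytes consumed.  */
uint64_t uleb128_decode(const uint8_t *in, size_t *len)
{
    uint64_t v = 0;
    unsigned shift = 0;
    size_t i = 0;
    for (;;) {
        uint8_t byte = in[i++];
        if (shift < 64)
            v |= (uint64_t) (byte & 0x7f) << shift;
        shift += 7;
        if (!(byte & 0x80))
            break;
    }
    *len = i;
    return v;
}
```

A difference of 127 fits in one byte while 128 needs two, so a value that grows across that boundary would change the size of the section it lives in.  Encoding 5 with a width of 3 gives the bytes 0x85 0x80 0x00, which decode back to 5.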


Not sure if I'm missing something, though.


Re: [PATCH] dwarf2out: Fix -gsplit-dwarf on riscv [PR103874]

2022-01-20 Thread Palmer Dabbelt

On Thu, 20 Jan 2022 13:33:35 PST (-0800), Palmer Dabbelt wrote:

On Thu, 20 Jan 2022 13:20:34 PST (-0800), gcc-patches@gcc.gnu.org wrote:

On Thu, Jan 20, 2022 at 01:13:45PM -0800, Palmer Dabbelt wrote:

On Thu, 20 Jan 2022 02:45:53 PST (-0800), gcc-patches@gcc.gnu.org wrote:
> riscv*-*-* are the only modern targets that !HAVE_AS_LEB128 (apparently
> due to some aggressive linker optimizations).

I don't really understand the rest of this, but we do have a subset of
LEB128 (constant expressions only).  I'm not sure exactly what the
requirements are here, but one could imagine extending our assembler support
to cover them -- we might never have full support for LEB128 expressions
(because of linker relaxation), but we might be able to make more stuff
work.

I'm not sure if that helps or hurts, though, as we'll still be a special
case.


HAVE_AS_LEB128 really needs to be able to handle both constants and
difference of two labels in the same section.
Most targets resolve something like that in the assembler as constant which
they encode into sleb128 or uleb128 and put into the section that uses
those directives (typically debugging sections).
If a target performs aggressive linker relaxation, then probably some
relocation would need to be added (but one that can encode the two symbols
for the difference, so perhaps two relocations that must be consecutive or
something similar) and resolve that by the linker.  Though, that would mean
that even in the debugging section offsets wouldn't be fixed during
assembly...


Differences are the hard case for RISC-V, as they can grow numerically.
That could then cause the LEB to grow in byte size, possibly violating
one of our linker relaxation invariants.  The only way I've come up with
to support these would be to pad the LEBs, and I'm not sure if that's
legal.

Not sure if I'm missing something, though.


Andrew points out that label differences within the same section can't 
increase, so this might be a lot more manageable than I thought it was.
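
A small C sketch of why that helps, assuming relaxation only ever deletes bytes between two labels in the same section.  The names `uleb128_len' and `reserved_size_still_fits' are hypothetical and exist only for this illustration.

```c
#include <stddef.h>
#include <stdint.h>

/* Number of bytes in the shortest ULEB128 encoding of v.  */
size_t uleb128_len(uint64_t v)
{
    size_t n = 1;
    while (v >= 0x80) {
        v >>= 7;
        n++;
    }
    return n;
}

/* Returns 1 if a label difference `diff', after relaxation deletes
   `deleted' bytes between the labels, still fits in the number of
   bytes its assembly-time value needed.  Returns 0 for the invalid
   case where more bytes are deleted than the difference holds.  */
int reserved_size_still_fits(uint64_t diff, uint64_t deleted)
{
    if (deleted > diff)
        return 0;
    return uleb128_len(diff - deleted) <= uleb128_len(diff);
}
```

Since the encoded length never grows when the value shrinks, the size picked at assembly time is an upper bound.  The value may still end up needing fewer bytes than were reserved, which is where the padding question comes back in.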

