; (rs6000_builtin_types[RS6000_BTI_ptr_V16QI])
Not new of course, but those outer parens are pointless. In macros
write extra parens around uses of parameters, and nowhere else.
Okay for trunk with the formatting fixed. Thanks!
Segher
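The parenthesisation advice above can be illustrated with a small sketch. The table and macro names here (table, LOAD_BAD, LOAD_GOOD, THIRD, THIRD_PAREN) are made up for illustration and are not from the patch; the point is only that parens around a parameter use change the meaning, while parens around a body that is already a postfix expression do not.

```c
/* Hypothetical data, standing in for something like the builtin type table.  */
static const int table[3] = { 10, 20, 30 };

/* Parameter used without parens: LOAD_BAD (table + 1) expands to
   *table + 1, which reads element 0 and then adds 1.  */
#define LOAD_BAD(p)  *p

/* Parens around the use of the parameter: LOAD_GOOD (table + 1)
   expands to *(table + 1), which reads element 1.  */
#define LOAD_GOOD(p) *(p)

/* The body is already a postfix expression, so outer parens add nothing.  */
#define THIRD_PAREN (table[2])
#define THIRD       table[2]

int load_bad (void)  { return LOAD_BAD (table + 1); }
int load_good (void) { return LOAD_GOOD (table + 1); }
int third_forms_agree (void) { return THIRD_PAREN == THIRD; }
```

Here load_bad and load_good differ only in the parens around the parameter, while the two THIRD forms are interchangeable in any expression context.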
On Thu, Jul 29, 2021 at 08:30:57AM -0500, Bill Schmidt wrote:
> * config/rs6000/rs6000-builtin-new.def: Add power10 and power10-64
> stanzas.
> + void __builtin_altivec_tr_stxvrbx (vsq, signed long, signed char *);
> +TR_STXVRBX vsx_stxvrbx {stvec}
> +
> + void __builtin_altivec_
: it is possible
it makes more opportunities to use unpack etc. insns invisible than that
it helps over unspec. This needs to be tested, and the usual idioms
need testcases, is that what you add here? (/me reads on...)
> + if (BYTES_BIG_ENDIAN)
> +emit_insn (gen_altivec_vmrgh (res, vzero, op1));
> + else
> +emit_insn (gen_altivec_vmrgl (res, op1, vzero));
Ah, so it is *not* using unspecs? Excellent.
Okay for trunk. Thank you!
Segher
Hi!
On Tue, Aug 24, 2021 at 09:20:09AM -0500, Bill Schmidt wrote:
> On 8/23/21 4:40 PM, Segher Boessenkool wrote:
> >On Thu, Jul 29, 2021 at 08:30:55AM -0500, Bill Schmidt wrote:
> >>+; These things need some review to see whether they really require
> >>+; MASK_POW
u should do this like change_zero_ext is done, and
perhaps make sure you do not introduce new is_just_move insns that can
make 2->2 combinations do the wrong thing.
Also somehow make this not take exponential time? It looks like this
should only be done in cases where change_zero_ext is not, and the
reverse, so this will work fine with a little attention to detail.
gl;hf,
Segher
xt* thing is there
because combine *itself* creates a lot of extra zext*, whether those
exist for the target or not. So this isn't obvious precedent (and that
wouldn't mean it is a good idea anyway ;-) )
Segher
On Fri, Aug 13, 2021 at 11:18:46AM +0800, Kewen.Lin wrote:
> on 2021/8/12 下午11:51, Segher Boessenkool wrote:
> > It is a bad idea to initialise things unnecessarily: it hinders many
> > optimisations, but much more importantly, it silences warnings without
> > fixing the proble
Hi!
On Fri, Aug 13, 2021 at 10:34:46AM +0800, Kewen.Lin wrote:
> on 2021/8/12 下午11:10, Segher Boessenkool wrote:
> >> + && VECTOR_UNIT_ALTIVEC_OR_VSX_P (in_vmode))
> >> +{
> >> + machine_mode exp_mode = DImode;
> >> + mach
ig/rs6000/rs6000-call.c +++
> >b/gcc/config/rs6000/rs6000-call.c @@ -12159,6 +12159,11 @@
> >rs6000_gimple_fold_builtin (gimple_stmt_iterator *gsi) return true; /*
> >flavors of vec_min. */ case VSX_BUILTIN_XVMINDP: + case
format=flawed :-(
Segher
ribute. Use
> UUNSPEC_XXSPLTIDP instead of UNSPEC_XXSPLTID.
Typo ("UU").
Okay for trunk with those trivial fixes. Also okay for backport to 11,
it is trivial enough. Thanks!
Out of interest, did you notice any scheduling differences with this?
Segher
On Wed, Aug 25, 2021 at 02:22:06PM -0400, Michael Meissner wrote:
> On Wed, Aug 25, 2021 at 12:44:16PM -0500, Segher Boessenkool wrote:
> > Out of interest, did you notice any scheduling differences with this?
>
> I don't use the built-ins so I wouldn't notice a differen
Hi!
On Thu, Jul 29, 2021 at 08:30:58AM -0500, Bill Schmidt wrote:
> * config/rs6000/rs6000-builtin-new.def: Add mma stanza.
Okay for trunk. Thanks!
Segher
On Thu, Jul 29, 2021 at 08:30:59AM -0500, Bill Schmidt wrote:
> * config/rs6000/rs6000-builtin-new.def: Add ieee128-hw, dfp,
> crypto, and htm stanzas.
Okay for trunk. Thanks!
Segher
On Thu, Jul 29, 2021 at 08:31:00AM -0500, Bill Schmidt wrote:
> * config/rs6000/rs6000-builtin-new.def: Add cell stanza.
This one is fine, too. Thanks!
Segher
; **
> +; Deprecated overloads that should never have existed at all
> +; **
> +; **
The coding conventions say not to use showy block comments like that,
but it seems appropriate here :-)
Okay for trunk with the looked at. Please don't repost this one.
Thanks!
Segher
to do serious engineering on. If we want
any serious optimisation on it we should do that at tree level (why does
that not happen yet anyway?), and inline all of this. This patch is
really just to make benchmark results saner ;-)
Thanks David!
Segher
Hi!
On Thu, Aug 26, 2021 at 07:59:04AM -0500, Bill Schmidt wrote:
> On 8/25/21 6:27 PM, Segher Boessenkool wrote:
> >On Thu, Jul 29, 2021 at 08:31:01AM -0500, Bill Schmidt wrote:
> >>* config/rs6000/rs6000-overload.def: Add remaining overloads.
> >>+; TODO: Note
c. My use of $HOME/usr for PATH and
> LD_LIBRARY_PATH may be inadequate for bypassing the older environment.
> I don't have access to a current (el8 or later) IBM environment.
You need a newer glibc to get new features provided by that. But,
everything should work with the older versions as well.
Segher
, always refer to
That is historical as well. Long ago the only 128-bit format was
double-double, and those mode names became part of the symbol names.
Changing this to __divic3 etc. would not really help.
Since the internal GCC symbol names are not really user-visible it does
not matter so much.
Segher
g-final { scan-assembler-not {stfd 1,8\(3\)\n\tstfd 3,16\(3\)} } } */
Heh. A little fragile, the compiler could reorder the stores for other
reasons, but the best we can do here I guess.
Okay for trunk with the trivial cleanups. Thanks!
Segher
cc_qsub use? This is fine for complexity, it is just
a simple tail-call jump, just wondering what the compiler thinks is best
here (it matters in other cases, if the inline function has conditional
branches for example).
Segher
ave an #ifdef but
an empty macro (or a "do {} while (0)"), etc.
Okay for trunk, if this is revisited later. Thanks!
Segher
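The start of the message above is cut off, but the "empty macro" idiom it mentions usually looks something like the following. This is a sketch only: ENABLE_TRACE, TRACE and classify are hypothetical names, not taken from the patch under review.

```c
/* ENABLE_TRACE is a hypothetical configuration macro.  The #ifdef lives
   in one place, at the macro definition, instead of at every use.  */
#ifdef ENABLE_TRACE
#define TRACE(counter) do { ++*(counter); } while (0)
#else
#define TRACE(counter) do { } while (0)
#endif

/* Because both variants are a single statement that needs a trailing
   semicolon, the macro is safe in an unbraced if/else.  */
int
classify (int x, int *trace_count)
{
  if (x < 0)
    TRACE (trace_count);
  else
    x *= 2;
  return x;
}
```

With ENABLE_TRACE undefined the call site compiles to nothing, and the code using it does not need its own conditional compilation.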
On Fri, Aug 27, 2021 at 07:35:05AM -0500, Bill Schmidt wrote:
> On 8/26/21 6:15 PM, Segher Boessenkool wrote:
> >On Thu, Jul 29, 2021 at 08:31:02AM -0500, Bill Schmidt wrote:
> >>+ /* Execute the autogenerated initialization code for builtins. */
> >>+ rs6000_autoin
"p?" as well, or would it
be bad if that ever is used, is this test testing it is not done?
Segher
unk. Thanks!
Segher
e
should be optimised for the human reader, the compiler does not care at
all, you almost never can use that as excuse :-) )
Anyway, you know what is needed :-) Okay for trunk. Thanks!
Segher
though those types are called "vector", they are not, so this does
work correctly even if not TARGET_EXTRA_BUILTINS.
Ideally we will have that macro always enabled eventually, but that is later
work. Okay for trunk. Thanks!
Segher
ror ("invalid parameter combination for AltiVec intrinsic %qs", name);
> +return error_mark_node;
> + }
A huge function with a lot of "goto bad;" just *screams* "this needs to
be factored".
> +case ENB_P5:
> + if (!TARGET_POPCNTB)
> + return false;
> + break;
case ENB_P5:
return TARGET_POPCNTB;
and similar for all further cases. It is shorter and does not have
negations, win-win!
> + break;
> +};
Stray semicolon. Did this not warn?
Could you please try to factor this better?
Segher
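The "case ENB_P5" rewrite suggested above can be sketched as two equivalent functions. Only ENB_P5 and TARGET_POPCNTB come from the quoted patch; the other enumerators, the second flag, and the plain variables standing in for the target macros are assumptions made so the example is self-contained.

```c
#include <stdbool.h>

/* Plain variables standing in for target flags such as TARGET_POPCNTB.
   target_cmpb is a hypothetical second flag.  */
bool target_popcntb;
bool target_cmpb;

/* ENB_ALWAYS and ENB_P6 are hypothetical neighbours of ENB_P5.  */
enum enable_kind { ENB_ALWAYS, ENB_P5, ENB_P6 };

/* Shape of the code as posted: negated test, early return, break.  */
bool
enabled_verbose (enum enable_kind kind)
{
  switch (kind)
    {
    case ENB_ALWAYS:
      break;
    case ENB_P5:
      if (!target_popcntb)
        return false;
      break;
    case ENB_P6:
      if (!target_cmpb)
        return false;
      break;
    }
  return true;
}

/* Shape suggested in the review: return the condition directly.  */
bool
enabled_direct (enum enable_kind kind)
{
  switch (kind)
    {
    case ENB_ALWAYS:
      return true;
    case ENB_P5:
      return target_popcntb;
    case ENB_P6:
      return target_cmpb;
    }
  return true;
}
```

Both functions give the same answer for every enumerator; the second has no negations and one line per case.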
d the motivating
> >example above is derived from the behaviour of backend patches not yet
> >in the tree [nvptx is currently a STORE_FLAG_VALUE=-1 target].
Should the TARGET_TRULY_NOOP_TRUNCATION documentation be updated now?
In particular the part that talks about TARGET_MODES_TIEABLE_P.
Segher
Hi!
I have backported 9080a3bf2329, a3f6bd789149, and f0529d96f567 to the
GCC 11 branch. This solves PR102062, but is a >2% performance win for
such a trivial patch, too, which is enough reason on its own :-)
Segher
On Fri, Sep 11, 2020 at 04:29:16PM -0500, Qing Zhao wrote:
> > On Sep 11, 2020, at 4:03 PM, Segher Boessenkool
> > wrote:
> >> The parameters that are passed to sys call will be destroyed, therefore,
> >> the attack will likely fail.
> >
> >
instructions.
> * config/rs6000/power10.md (power10-mffgpr, power10-mftgpr): Rename to
> mtvsr/mfvsr.
Please spell out the new names in full, so that it can be searched for.
Okay for trunk. Thank you!
Segher
On Mon, Sep 14, 2020 at 09:46:11AM +0200, Richard Biener wrote:
> On Fri, Sep 11, 2020 at 4:18 PM Segher Boessenkool
> wrote:
> > Until 2014 (and documented just days ago ;-) ) all bits of a partial
> > integer mode were considered unknown.
>
> All bits or all bits outs
is is not a predicate. Do not name it _p please.
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/powerpc/pr97019.c
> @@ -0,0 +1,79 @@
> +/* { dg-do compile { target { powerpc_p8vector_ok && le } } } */
Why only on LE? (If there is a reason, the testcase should say; if
there is not, well, it shouldn't say it does then :-) )
Please resend with those things fixed.
Segher
On Thu, Sep 10, 2020 at 12:08:44PM +0200, Richard Biener wrote:
> On Wed, Sep 9, 2020 at 6:03 PM Segher Boessenkool
> wrote:
> > There often are problems over function calls (where the compiler cannot
> > usually *see* how something is used).
>
> Yep. The best way wou
GPR[RA].
The contents of bits 32:63 of GPR[RB] are placed into
byte elements index:index+3 of VSR[VRT+32].
All other byte elements of VSR[VRT+32] are not
modified.
If index is greater than 12, the result is undefined.
Segher
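The instruction description quoted above can be modelled in plain C. A minimal sketch, assuming that array slot i stands for byte element i and that the word is stored most-significant byte first; insert_word and its return convention are invented here for illustration, and how index is derived from GPR[RA] is left out because that part of the text is cut off.

```c
#include <stdint.h>

/* Model of the quoted behaviour: place a 32-bit word into byte elements
   index..index+3 of a 16-byte register image and leave all other bytes
   alone.  Returns 0 on success and -1 for index > 12, where the quoted
   text says the result is undefined.  Byte order within the word is an
   assumption (most-significant byte first).  */
int
insert_word (uint8_t vsr[16], unsigned index, uint32_t word)
{
  if (index > 12)
    return -1;

  for (unsigned i = 0; i < 4; i++)
    vsr[index + i] = (uint8_t) (word >> (24 - 8 * i));

  return 0;
}
```

The index limit of 12 falls out of the layout: a four-byte field starting at 13 or later would run past byte element 15.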
requires this
> treatment and other optabs don't.
Yeah. The register allocator is normally very good in using the same
reg in both places, if that is useful. And it also handles the case
where your machine insns require the two to be the same pretty well.
Not restricting this stuff before RA should be a win.
Segher
After the previous patch we are left with an unreachable BB. This will
ICE if either we have -fschedule-fusion, or we do not have peephole2.
This fixes it. Okay for trunk?
Segher
2020-09-14 Segher Boessenkool
PR rtl-optimization/96475
* bb-reorder.c
On Fri, Sep 11, 2020 at 05:41:47PM -0500, Qing Zhao wrote:
> > On Sep 11, 2020, at 4:51 PM, Segher Boessenkool
> > wrote:
> > It is definitely *not* effective if there are gadgets that set rax to
> > a value the attacker wants and then do a syscall.
>
>
one
> major issue with this as
> Segher mentioned, the middle end does not know some details on the registers,
> lacking such
> detailed information might result in incorrect code generation at middle end.
>
> For example, on x86_64 target, when “return” with pop, the scratch regis
On Mon, Sep 14, 2020 at 05:33:33PM +0100, Richard Sandiford wrote:
> > However, for the cases on Power as Segher mentioned, there are also some
> > scratch registers used for
> > Other purpose, not sure whether we can correctly generate zeroing in
> > middle-end for Pow
On Tue, Sep 15, 2020 at 08:32:54AM +0200, Richard Biener wrote:
> On Tue, Sep 15, 2020 at 12:06 AM Segher Boessenkool
> wrote:
> >
> > After the previous patch we are left with an unreachable BB. This will
> > ICE if either we have -fschedule-fusion, or we do not have
ns do).
> +/* { dg-final { scan-assembler "rldic r?\[0-9\]+,r?\[0-9\]+,8,8" } } */
> +/* { dg-final { scan-assembler "rldic r?\[0-9\]+,r?\[0-9\]+,24,8" } } */
> +/* { dg-final { scan-assembler "rldic r?\[0-9\]+,r?\[0-9\]+,40,8" } } */
> +/* { dg-final { scan-assembler "rldic r?\[0-9\]+,r?\[0-9\]+,40,48" } } */
> +/* { dg-final { scan-assembler "rldic r?\[0-9\]+,r?\[0-9\]+,40,23" } } */
Please use {} quotes, and \m\M. \d can be helpful, too.
Segher
rotldi 3,3,26
> blr
>
> would be two valid possibilities for test3 and test5 that don't use
> rldic. Ideally the test would verify the actual values generated by
> the test functions and count instructions.
Well, the point of the test is to verify we get the expected code for
this?
Maybe we should just count insns here? But that would be a different
test. I'm a bit worried about how often the one-bit thing will do
something unexpected, but the rest should be fine and not cause churn.
Segher
ejagnu-cpu=power8" } */
Do you need to test for LE? If not, just always run it? If it works,
it works, it doesn't matter that you do not expect it to ever fail (we
do not really expect *any* test we have to ever fail *anywhere*, heh).
> +/* { dg-final { scan-assembler-not "rldicr\[ \t\]+\[0-9\]+,\[0-9\]+,0,59" }
> } */
Please use {} quotes, and \s and \d.
You can also use {(?n)rldicr.*,0,59} since (?n) makes . not match
newlines anymore.
Okay for trunk with or without those suggestions. Thanks!
Segher
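The regex suggestions above could look like this in a testcase. This is a sketch: the function body is a placeholder so that the file compiles, not the function from the patch, and the two directives are alternatives of which a real test would keep only one.

```c
/* { dg-do compile } */

/* Brace-quoted form.  Tcl does no substitution inside {}, so brackets
   need no backslashes; \s and \d replace the explicit character
   classes, and \m and \M anchor the mnemonic at word boundaries.  */
/* { dg-final { scan-assembler-not {\mrldicr\s+\d+,\d+,0,59\M} } } */

/* Alternative using (?n), which stops . from matching a newline, so
   the .* cannot run on into the following instructions.  */
/* { dg-final { scan-assembler-not {(?n)rldicr.*,0,59} } } */

/* Placeholder function (hypothetical): clears the low four bits.  */
unsigned long
clear_low_bits (unsigned long x)
{
  return x & ~0xfUL;
}
```

In the double-quoted original, each bracket had to be escaped because Tcl would otherwise try command substitution on it; the brace form hands the pattern to the regex engine unchanged.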
ote that the builtin is not the same as the machine instruction --
here there shouldn't be a difference if compiling for a new enough ISA,
but the builtin is available on anything with at least AltiVec.
Segher
On Mon, Sep 14, 2020 at 10:07:31PM -0500, Qing Zhao wrote:
> > On Sep 14, 2020, at 6:09 PM, Segher Boessenkool
> > wrote:
> >> Gadget 1:
> >>
> >> mov rax, value
> >> syscall
> >> ret
> >
> > No, just
> >
> >
On Tue, Sep 15, 2020 at 12:46:00PM +0100, Richard Sandiford wrote:
> Segher Boessenkool writes:
> > On Mon, Sep 14, 2020 at 05:33:33PM +0100, Richard Sandiford wrote:
> >> > However, for the cases on Power as Segher mentioned, there are also some
> >> > scratc
u need to prevent the
inserted (zeroing) insns from moving -- if you don't, the code after
some zeroing can be used as gadget! You want to always have all
zeroing insns after *any* computational insn, or it becomes a gadget.
Segher
at describes what regex patterns are allowed.
"man re_syntax", and "man tcl" for the one-page tcl intro (it describes
the whole language: the substitutions, the quotes, etc.)
https://www.tcl.tk/man/tcl8.6/TclCmd/re_syntax.htm
https://www.tcl.tk/man/tcl8.6/TclCmd/Tcl.htm
> This all said, Alan's rtx_costs patch touches this same area and he talked
> about removing a similar splitter, so I think I will wait for his code to
> be committed and then rework this on top of his changes.
Yes, good plan. Thanks!
Segher
fs on these expressions, because of some constexpr issue I
> haven't really looked into.
Yeah, the system compiler is 4.8.5 (this is centos7).
> I'm testing this patch. I'll check it in when I'm done.
It is pre-approved, just check it in already please!
Segher
> ---
xample where it is used: it can much easier
say something much more generic!
(And then send a patch first doing FP just as SFDF and replacing it
where we want it; and then a later patch adding KF. That way, your
patch might be readable!)
Thanks,
Segher
Wow, did I miss that? Whoops. (That was PR93012, 72b2f3317b44.)
Okay for trunk. Thanks!
Segher
Okay for trunk. Thanks!
(Btw, please use [patch 2/6] etc. markers? It helps refer to them :-) )
Segher
other passes here
are harmful (maybe the shorten stuff)? But. Targets can also insert
more passes here.
If you want the zeroing insns to stay with the return, you have to
express that in RTL. Anything else is extremely fragile.
Segher
"might break users' code"
thing ;-)
Segher
On Tue, Sep 15, 2020 at 08:51:57PM -0500, Qing Zhao wrote:
> > On Sep 15, 2020, at 6:09 PM, Segher Boessenkool
> > wrote:
> > If you want the zeroing insns to stay with the return, you have to
> > express that in RTL.
>
> What do you mean by “express that in RT
On Wed, Sep 16, 2020 at 10:31:54AM +0200, Richard Biener wrote:
> On Tue, Sep 15, 2020 at 6:18 PM Segher Boessenkool
> wrote:
> >
> > On Tue, Sep 15, 2020 at 08:51:09AM +0200, Richard Biener wrote:
> > > On Tue, Sep 15, 2020 at 5:56 AM luoxhu wrote:
On Wed, Sep 16, 2020 at 08:36:35AM -0500, Bill Schmidt wrote:
> This is a cleanup requested by Segher in a previous review. Most
> uses of rs6000_pcrel_p are called for the current function. A
> specialized version for cfun is more efficient for these uses.
Then rename th
... and it causes testsuite regressions on Power. We haven't determined
yet if it actually is worse code, but there are testcases that trip on
it. Either way, all patches should be sent to gcc-patches@, whether
pre-approved or not.
Please correct this. Thanks!
Segher
Hi!
On Wed, Sep 16, 2020 at 08:37:34PM +0200, Andrea Corallo wrote:
> Segher Boessenkool writes:
>
> > ... and it causes testsuite regressions on Power. We haven't determined
> > yet if it actually is worse code, but there are testcases that trip on
> > it. Eith
de, too.
> If we do that, we should be able to remove the handling of
> extract-based addresses in aarch64_classify_index & co.
If we do what? I don't follow, sorry.
(Patch to combine sounds fine fwiw; patches welcome, as always.)
Segher
ore
> + complex" is determined by having a higher set_src_cost. So for
> + example, if we want a plain (reg) address to be replaced with
> + (plus (reg) (const)) when possible then PLUS needs to cost more
> + than zero here. */
Maybe it helps if you more prominently mention set_rtx_cost and
set_src_cost? Either way, okay for trunk. Thanks!
Segher
(rs6000_rtx_costs): Cost multi-insn
> constants.
Okay for trunk. Note that some p10 insns take a floating point
immediate, but those need to be handled specially anyway.
Thanks!
Segher
ction? Just the insert insns part, not all IOR.
Okay for trunk with the comments changed to the correct syntax, and
factoring masked insert out to a separate function pre-approved if you
want to do that. Thanks!
Segher
t, don't we have a problem
already?
Generic things. Please split this patch up when sending it again, it
does too many different things, and many of those are not obvious.
All such changes that aren't completely obvious (like the previous ones
were) should have some measurement. We are in stage1, and we will
notice (non-trivial) degradations, but if we can expect degradations
(like for this patch), it needs benchmarking.
Since you add !speed all over the place, maybe we should just have a
separate function that does !speed? It looks like quite a few things
will simplify.
Segher
Hi!
On Thu, Sep 17, 2020 at 08:10:22AM +0100, Richard Sandiford wrote:
> Alex Coplan writes:
> The combine parts LGTM otherwise, but Segher should have the
> final say.
I am doubtful this does not regress on many targets.
I'll test it, we'll see :-)
Segher
ay, to resources it does not know about), but you can use it
for anything you want executed approximately as written.
> UNSPEC_VOLATILEs can't be deleted.
(If they are executed at all, anyway ;-) )
Segher
Hi Jeff,
On Thu, Sep 17, 2020 at 05:12:17PM -0600, Jeff Law wrote:
> On 9/3/20 4:37 PM, Segher Boessenkool wrote:
> >> Apart from that, one P9 specific point is that the update form load isn't
> >> preferred, the reason is that the instruction can not retire until bo
-vec-length-full-6.c: Adjust.
The testcase part of course is okay for trunk, if this is the expected
(and good :-) ) code. Thanks,
Segher
On Fri, Sep 18, 2020 at 01:08:42PM +0930, Alan Modra wrote:
> On Thu, Sep 17, 2020 at 12:51:25PM -0500, Segher Boessenkool wrote:
> > > - if (CONST_INT_P (XEXP (x, 1))
> > > - && satisfies_constraint_I (XEXP (x, 1)))
> > > + if (!speed)
> >
ee_code code = TREE_CODE (TREE_TYPE(view_op0));
^
Missing space here ---/
Thanks,
Segher
> + operands[2]);
> + rtx sub_target
> + = simplify_gen_subreg (GET_MODE (operands[0]), target, V16QImode, 0);
> + emit_insn (gen_rtx_SET (operands[0], sub_target));
> + DONE;
> +}
Please handle this in rs6000_expand_vector_set, instead. It is fine
to call rs6000_vector_set_var early in that, for example.
> diff --git a/gcc/config/rs6000/vsx.md b/gcc/config/rs6000/vsx.md
> index dd750210758..7e82690d12d 100644
> --- a/gcc/config/rs6000/vsx.md
> +++ b/gcc/config/rs6000/vsx.md
> @@ -5349,7 +5349,7 @@ (define_expand "xl_len_r"
>rtx rtx_vtmp = gen_reg_rtx (V16QImode);
>rtx tmp = gen_reg_rtx (DImode);
>
> - emit_insn (gen_altivec_lvsl_reg (shift_mask, operands[2]));
> + emit_insn (gen_altivec_lvsl_reg_di2 (shift_mask, operands[2]));
So this becomes
emit_insn (gen_altivec_lvsl_reg (DImode, shift_mask, operands[2]));
if you use parameterized names.
Tests... You don't need lp64 anywhere, but then you probably need to
disallow the 64-bit tests. That may be hard?
When you resend, please split it into separate patches for separate
things. Small patches are fine, no, *good* even!
Segher
this just to make the xor thing work? i386 has a peephole to
transform the mov to a xor for this (and the backend could just handle
it in its mov patterns, maybe a peephole was easier for i386, no
idea).
Segher
the libgcc
routines are just too slow? Is it because (the generic) libgcc does not
trap for MIN_INT / -1 ? Some other reason?
(I'm just curious; I cannot figure it out :-) )
Segher
at (it depends on the insn costs if that can
succeed). On gimple it is always cheaper, of course.
Segher
Hi!
On Thu, Sep 17, 2020 at 01:12:19PM +0930, Alan Modra wrote:
> On Wed, Sep 16, 2020 at 07:02:06PM -0500, Segher Boessenkool wrote:
> > > + /* Test both regs even though the one in the mask is
> > > + constrained to be equal to th
Please write (in comments) how many of each insn are expected, and
possibly for what function? Also, bonus points if you make this work
for 32 bit as well (it is almost required even).
Segher
On Mon, Sep 21, 2020 at 09:13:58AM -0500, Qing Zhao wrote:
> > On Sep 18, 2020, at 5:51 PM, Segher Boessenkool
> > wrote:
> >> B. Will provide a default definition in middle end to generate the
> >> zeroing insn for selected registers. Then
"): the unpatched compiler ICEs! (At least three
times, even).
during RTL pass: reload
/home/segher/src/kernel/kernel/cgroup/cgroup.c: In function 'rebind_subsystems':
/home/segher/src/kernel/kernel/cgroup/cgroup.c:1777:1: internal compiler error:
in lra_set_insn_recog_data,
On Mon, Sep 21, 2020 at 03:58:25PM -0500, Qing Zhao wrote:
> > On Sep 21, 2020, at 3:34 PM, Segher Boessenkool
> > wrote:
> > But you cannot *add* anything with this interface, and it cannot return
> > different results depending on which return insn this is. It is no
Hi Alex,
On Tue, Sep 22, 2020 at 08:40:07AM +0100, Alex Coplan wrote:
> On 21/09/2020 18:35, Segher Boessenkool wrote:
> Thanks for doing this testing. The results look good, then: no code size
> changes and no build regressions.
No *code* changes. I cannot test aarch64 like this
e, and
it isn't (only functions can be predicates, by definition).
"new_exit_prob" maybe?
> + /* Update BB counts in loop body.
> + COUNT = COUNT
> + COUNT = COUNT * exit_edge_probility
> + The COUNT=COUNT * old_exit_p / new_exit_p. */
Spaces around the "=" please?
Segher
volatile references, any intervening
> volatile insn might affect machine state. */
Confusingly stated, but essentially correct (it is possible we place
the volatile at I2, and everything would still be sequenced correctly,
but combine does not guarantee that).
> is_volatile_p = volatile_refs_p (PATTERN (insn))
> ? volatile_refs_p
> : volatile_insn_p;
Too much subtlety in there, heh.
Segher
+ tail calls. Tail calls don't count against crtl->is_leaf. */
> + for (insn = get_topmost_sequence ()->first; insn; insn = NEXT_INSN
> (insn))
> + if (CALL_P (insn))
> + break;
> + if (!insn)
> + return;
> +}
I don't think that get_topmost_sequence is correct.
Other than that this is fine for trunk (and backports). Thanks!
Segher
(ffi_call_LINUX64): Don't emit global
> entry when __PCREL__. Call using @notoc.
> (ffi_closure_LINUX64, ffi_go_closure_linux64): Likewise.
This is okay for trunk, and for backports (possibly expedited, talk
with Peter for what is wanted/needed for AT).
Thanks!
Segher
On Wed, Sep 23, 2020 at 10:00:01AM +0930, Alan Modra wrote:
> Hi Segher,
>
> On Tue, Sep 22, 2020 at 06:59:42PM -0500, Segher Boessenkool wrote:
> > On Tue, Sep 22, 2020 at 09:55:12AM +0930, Alan Modra wrote:
> > >if (!info->push_p)
> > > -return;
>
On Wed, Sep 23, 2020 at 09:28:33AM -0500, Qing Zhao wrote:
> > On Sep 22, 2020, at 5:37 PM, Segher Boessenkool
> > wrote:
> >> which is very similar to the unspec_volatile case we're talking about.
> >
> > So just like volatile memory accesses, they ha
ot use unspecs unless you have to: they hinder
optimisation much, and if that was your actual *goal*, you will often
find that they do not prevent every optimisation you wanted them to.
Segher
d.texi: Add 'd' for doubleword variant of
> vector insert instruction.
Pushed to trunk as trivial and obvious. Thanks Paul!
Segher
pc does _not_ have a VSX instruction
> like xxinsertw r34, r8, r12 where r8 denotes
> the vector element (or byte position or whatever).
vins[bhwd][v][lr]x does this. Those are Power10 instructions.
Segher
9,2,.LC0@toc@ha
mtvsrwz 32,5
mtvsrwz 33,6
addi 9,9,.LC0@toc@l
lxvw4x 45,0,9
xxspltw 32,32,1
xxspltw 33,33,1
vcmpequw 0,0,13
xxsel 34,34,33,32
blr
Segher
TI 0 "gpc_reg_operand" "=v")
> +(unspec:TI [(match_operand:TI 1 "gpc_reg_operand" "v")]
> + UNSPEC_EXTENDDITI2))]
> + "TARGET_POWER10"
> + "vextsd2q %0,%1"
> + [(set_attr "type" "exts")])
This should use something with sign_extend.
Okay for trunk. Thanks! But the unspecs really need to go sooner
rather than later (these are by far not the only ones, so :-( ).
Segher
ong const __D, int const __N)
> +{
> + __v2di result = (__v2di)__A;
> +
> + result [(__N & 0b1)] = __D;
Especially single-digit numbers look really goofy (like 0x0, but even
worse for binary somehow).
Anyway, okay for trunk, with or without those things improved. Thanks!
Segher
On Wed, Sep 23, 2020 at 05:12:45PM -0500, Paul A. Clarke wrote:
> Copied from gcc.target/i386.
Okay for trunk then. Thanks!
(I peeked, it is just fine ;-) )
Segher
t and zero/sign extend). */
> +
> +/* { dg-do compile {target power10_ok} } */
> +/* { dg-do run {target power10_hw} } */
> +/* { dg-require-effective-target power10_ok } */
> +/* { dg-options "-mdejagnu-cpu=power10 -O0" } */
Please comment here what that -O0 is for? So that we still know when we
read it decades from now ;-)
> +/* { dg-final { scan-assembler-times {\mlxvrwx\M} 2 } } */
> +/* { dg-final { scan-assembler-times {\mlwax\M} 0 } } */
Maybe all of {\mlwa} here?
Segher
ly removed.
No, it cannot.
This is used for pdepd/pextd/cntlzdm/cnttzdm/cfuged, all of which do
need 64-bit registers to do anything sane.
This should really have defined some new builtin class, and I thought we
could just be tricky and take a massive shortcut. Bill has been hit by
this already as well, sigh :-(
Segher
ecause the RTL would be undefined!
> In ISA 3.1 (power10) the decision was made to only provide the "C" form on
> maximum and minimum.
... for quad precision.
Segher
ersion
> for GCC >= 4.5.
Okay for trunk (and whatever backports you want of course, if any).
Thanks!
Segher
On Fri, Sep 25, 2020 at 08:58:35AM +0200, Richard Biener wrote:
> On Thu, Sep 24, 2020 at 9:38 PM Segher Boessenkool
> wrote:
> > after which I get (-march=znver2)
> >
> > setg:
> > vmovd %edi, %xmm1
> > vmovd %esi, %xmm2
>