*should* be volatile, or else the compiler might optimise it away
in unexpected cases, etc.
Segher
On Mon, Mar 09, 2020 at 07:42:20PM +0100, J.W. Jagersma wrote:
> On 2020-03-09 19:01, Segher Boessenkool wrote:
> > On Mon, Mar 09, 2020 at 01:54:53PM +0100, Richard Biener wrote:
> >> int foo = 0
> >> try
> >> {
> >>
r if you disallowed this option combination? Don't
allow an option that says ISA X insns are allowed, but ISA Y insns are
not, with Y < X. In this case X is 2.06 and Y is 2.02 or 2.03 or 2.04.
Segher
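The rule above (never allow "ISA X on, ISA Y off" with Y < X) can be sketched as a check over an ordered set of feature bits. This is only an illustration: the enum names and the bit layout are hypothetical and are not the real rs6000_isa_flags masks.

```c
#include <stdbool.h>

/* Hypothetical feature bits, ordered from the oldest ISA level to the
   newest.  These names are illustrative, not GCC's OPTION_MASK_* values.  */
enum {
  FEAT_ISA_2_02 = 1u << 0,
  FEAT_ISA_2_03 = 1u << 1,
  FEAT_ISA_2_04 = 1u << 2,
  FEAT_ISA_2_06 = 1u << 3,
  FEAT_COUNT = 4
};

/* Return true if no newer level is enabled while an older one is
   disabled, i.e. the enabled levels form a prefix of the ordering.  */
static bool
isa_flags_consistent (unsigned flags)
{
  bool seen_disabled = false;
  for (int bit = 0; bit < FEAT_COUNT; bit++)
    {
      bool enabled = (flags >> bit) & 1u;
      if (!enabled)
        seen_disabled = true;
      else if (seen_disabled)
        return false;	/* Level X is on, but some Y < X is off.  */
    }
  return true;
}
```

With this layout, enabling 2.06 while 2.04 is disabled is rejected, and so is every other "hole" in the sequence.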
from nearby code: error ("%qs requires %qs", "-mdirect-move",
> "-mvsx");
Yes, that makes the translators' job easier (and forces more consistent
output quoting, etc.)
Segher
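A rough picture of why passing the option names as arguments helps: the translatable string stays the same for every option pair, and the quoting is applied in one place. The sketch below uses plain snprintf with %s and literal quotes as a stand-in; %qs itself belongs to GCC's own diagnostic formatter, and format_requires is a made-up helper.

```c
#include <stdio.h>

/* Stand-in for a diagnostic call: one fixed (translatable) format
   string, with the option names supplied as arguments.  The helper
   name and the plain quotes are illustrative assumptions.  */
static int
format_requires (char *buf, size_t size,
		 const char *option, const char *required)
{
  return snprintf (buf, size, "'%s' requires '%s'", option, required);
}
```

Calling format_requires (buf, sizeof buf, "-mdirect-move", "-mvsx") fills buf with '-mdirect-move' requires '-mvsx'.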
then we don't
> need to say whether the asm may clobber them before or after throwing.
Yeah. Which users will *never* get right, that is, it would be hard to
use any such interface correctly.
Segher
It would be nice if we could keep it cached after it has been resolved
once, this has potential for regressing performance if we don't? And
LD_BIND_NOW should keep working just as fast as it is now, too?
Segher
T_MOVE)
error ("%qs requires %qs", "-mdirect-move", "-mvsx");
rs6000_isa_flags &= ~OPTION_MASK_DIRECT_MOVE;
}
(and many other cases there), we could do this there as well (so, don't
allow -mvsx (maybe via a -mcpu= etc.) at the same time as -mno-fprnd).
Do you see problems with that?
Segher
On Thu, Mar 12, 2020 at 12:03:08PM -0600, Jeff Law wrote:
> On Sat, 2020-02-08 at 10:41 -0600, Segher Boessenkool wrote:
> > I don't think each stanza of code should use its own "noop-ness", no.
> Richard's patch is actually better than mine in that regard as
On Thu, Mar 12, 2020 at 07:42:30PM +0100, J.W. Jagersma wrote:
> On 2020-03-12 16:32, Segher Boessenkool wrote:
> > On Thu, Mar 12, 2020 at 02:08:18PM +0100, Richard Biener wrote:
> >>> It wasn't clear from my message above, but: I was mostly worried about
> >>
On Thu, Mar 12, 2020 at 12:47:04PM -0600, Jeff Law wrote:
> On Thu, 2020-03-12 at 13:23 -0500, Segher Boessenkool wrote:
> > > else if (n_sets == 1
> > > -&& MEM_P (trial)
> > > +&& ! CALL_P (insn)
> &
The df dataflow solvers use the aux field in the basic_block struct,
although that is reserved for any use by passes. And not only that,
it is required that you set all such fields to NULL before calling
the solvers, or you quietly get wrong results.
This changes the solvers to use a local array
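As a rough illustration of the idea (not the actual df code), a graph walk can keep its per-block state in a local array indexed by block number, so it neither depends on nor disturbs the aux field. The struct and function below are simplified, hypothetical stand-ins.

```c
#include <stdlib.h>

/* Simplified, hypothetical stand-in for a basic block.  */
struct block
{
  int index;		/* Position in the block array.  */
  int n_succs;		/* Number of valid entries in succs.  */
  int succs[2];		/* Indices of successor blocks.  */
  void *aux;		/* Scratch field owned by the running pass.  */
};

/* Count the blocks reachable from block 0.  The "seen" marks live in
   a local array, so block->aux is never read or written.  Returns -1
   if memory allocation fails.  */
static int
count_reachable (const struct block *blocks, int n_blocks)
{
  if (n_blocks <= 0)
    return 0;

  unsigned char *seen = calloc ((size_t) n_blocks, 1);
  int *stack = malloc ((size_t) n_blocks * sizeof *stack);
  if (!seen || !stack)
    {
      free (seen);
      free (stack);
      return -1;
    }

  int top = 0, count = 0;
  stack[top++] = 0;
  seen[0] = 1;
  while (top > 0)
    {
      const struct block *b = &blocks[stack[--top]];
      count++;
      for (int i = 0; i < b->n_succs; i++)
	{
	  int s = b->succs[i];
	  if (!seen[s])
	    {
	      /* Mark on push, so each block is pushed at most once.  */
	      seen[s] = 1;
	      stack[top++] = s;
	    }
	}
    }

  free (seen);
  free (stack);
  return count;
}
```

A pass that has its own data hanging off aux can call such a routine without clearing anything first.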
be thorough tests on many archs, showing it helps on
average, and it doesn't regress anything. I can do that for you, but
not right now).
The code needs more comments, and the commit message should say what is
done and why you made those choices.
In general, we should have *fewer* zero_extract, not more.
Segher
that crosses basic blocks.
Segher
Hi,
On Thu, Mar 12, 2020 at 10:29:45PM +, Segher Boessenkool wrote:
> The df dataflow solvers use the aux field in the basic_block struct,
> although that is reserved for any use by passes. And not only that,
> it is required that you set all such fields to NULL before calling
>
>
> - the source code can't make any assumption about the values bound
> to output operands when an exception is raised
And the easiest (and only feasible?) way to do this is for the compiler
to automatically make an input for every output as well, imo.
Segher
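What "an input for every output" means can be shown by hand with a read-write operand. This is only a sketch in GNU C extended asm (GCC or Clang): the asm template is empty, so nothing is thrown and nothing is written, and keep_old_value is a made-up name. It shows that with "+r" the old value is still there when the asm does not store to the operand.

```c
/* Requires GNU C extended asm (GCC or Clang).  */

/* "+r" makes X both an input and an output of the asm.  Because the
   (empty) template never writes the operand, the value that went in
   is the value that comes out.  With a plain "=r" output the result
   would be indeterminate instead.  */
static int
keep_old_value (int x)
{
  __asm__ volatile ("" : "+r" (x));
  return x;
}
```

If the compiler did this automatically for an asm that may throw, the variable would simply keep its previous value on the exceptional path.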
Hi!
On Fri, Mar 13, 2020 at 10:06:01AM +1030, Alan Modra wrote:
> On Thu, Mar 12, 2020 at 11:57:17AM -0500, Segher Boessenkool wrote:
> > On Thu, Mar 12, 2020 at 01:18:50PM +1030, Alan Modra wrote:
> > > With lazy PLT resolution the first load of a PLT entry may be a value
case of "=rm" and similar which have not been
> discussed so far, but are (in my experience) the most common operand
> constraint.
On some CISC targets, sure. Not common on load-store architectures
(like most things from the last 30+ years).
Segher
just fine;
3) The optimizers do not handle zero_extract very well at all (this
includes simplify-rtx, to start with).
sign_extract is nastier -- we really want to have a sign_extend that
works on separate bits, not as coarse as address units as we have now --
but it currently isn't handled much either.
Segher
On Fri, Mar 13, 2020 at 07:30:17AM +0100, Richard Biener wrote:
> On March 12, 2020 11:29:45 PM GMT+01:00, Segher Boessenkool
> wrote:
> >The df dataflow solvers use the aux field in the basic_block struct,
> >although that is reserved for any use by passes. And not only that,
Run tests should use vmx_hw, not just powerpc_altivec_ok. Committed.
Segher
gcc/testsuite/
PR target/94176
* gcc.target/powerpc/fold-vec-mule-misc.c: Use vmx_hw selector.
---
gcc/testsuite/ChangeLog | 5 +
gcc/testsuite/gcc.target/powerpc
f those matters (assuming that matches reality well, which it has to
anyway); latency matters. How many insns are zero_extract instead of
something else is not a good indicator of performance.
Segher
On Mon, Mar 16, 2020 at 05:47:03PM +, Richard Sandiford wrote:
> Segher Boessenkool writes:
> >> we do delete "x = 1" for f1. I think that's the expected behaviour.
> >> We don't yet delete the initialisation in f2, but I think in principle
sting
> > inline asm will have UB, with -fnon-call-exceptions. I think that is
> > an even less desirable option.
(I'm assuming we do not want that option).
> Normally, for SSA names in something like:
>
> _1 = foo ()
>
> the definition of _1 does not take place when foo throws.
But that is not how RTL works, afaik.
Segher
I admit the optab hack can immediately make it work. :)
But it opens up all kinds of other problems. To begin with, how is a
short vector mapped to a "real" vector?
We don't have ops on short integer types, either, for similar reasons.
Segher
Hi!
On Sat, Mar 14, 2020 at 09:30:02AM +1030, Alan Modra wrote:
> On Fri, Mar 13, 2020 at 10:40:38AM -0500, Segher Boessenkool wrote:
> > > Using a call-saved register to cache a load out of the PLT looks
> > > really silly
> >
> > Who said anything about usin
"canonicalisation" rules that are really just "this
works better on targets A and B" do not usually work well. Rules that
are predictable and that actually simplify the code might still need
all targets to update (and target maintainers will grumble, myself
included), but at least that is a way forwards (and not backwards or
sideways).
Segher
hink this is a difficult decision
to make, considering that you already get pretty bad performance with
that flag (if indeed it works correctly at all).
Segher
Hi!
On Thu, Mar 19, 2020 at 09:18:06AM +0100, Richard Biener wrote:
> On Wed, Mar 18, 2020 at 8:34 PM Segher Boessenkool
> wrote:
> > We don't have ops on short integer types, either, for similar reasons.
>
> How do you represent two vector input shuffles? The usu
ool global_init_p)
>rs6000_isa_flags &= ~OPTION_MASK_CRYPTO;
> }
>
> + if (!TARGET_FPRND && TARGET_VSX)
> +{
> + if (rs6000_isa_flags_explicit & OPTION_MASK_FPRND)
> + /* TARGET_VSX = 1 implies Power 7 and newer */
> + error ("%qs not compatible with Power 7 and newer", "-mno-fprnd");
> + rs6000_isa_flags &= ~OPTION_MASK_FPRND;
> +}
Please make such changes if you agree. Either way, okay for trunk.
Thank you, and sorry the review took so long.
Segher
> No, this does not mean an equality comparison with zero. I have mentioned
> this in my previous mail.
This should be simplified to
(set (reg:CC_NZ 66 cc)
(compare:CC_NZ (and:SI (reg:SI 103)
(const_int 1536))
(const_int 0)))
(but it isn't), and that is just *and3nr_compare0, which is a
"tst" instruction. If this is fixed (in simplify-rtx.c), it will work
as you want.
Segher
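The arithmetic behind that simplification can be checked in plain C. The sketch assumes the unsimplified form is a logical right shift by 8 followed by an AND with 6; since 6 << 8 is 1536, both tests look at the same two bits. It only models the "is the result zero" view, not the CC_NZ mode itself, and the function names are made up.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed unsimplified form: shift right by 8, then mask with 6.  */
static bool
test_shifted (uint32_t x)
{
  return ((x >> 8) & 6u) != 0;
}

/* Simplified form: mask in place with 6 << 8 == 1536, which is what
   a single "tst"-style instruction can do.  */
static bool
test_masked (uint32_t x)
{
  return (x & 1536u) != 0;
}
```

Both functions inspect bits 9 and 10 of x, so they agree for every 32-bit input.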
Hi!
On Thu, Mar 19, 2020 at 07:46:53PM -0500, Segher Boessenkool wrote:
> Please make such changes if you agree. Either way, okay for trunk.
Oh, and okay for backport to 9 next week :-)
Segher
u actually use ANY_EXTEND, which makes a lot more sense :-)
Did you see combine create a sign_extend, ever? Or do those just come
from combining other insns that already contain a sign_extend?
Segher
simpler (and more obviously correct) if it
was more explicit CC_REGNUM is a fixed register, and the code would use
it directly everywhere?
(Something for stage1 I suppose, if you / the aarch people want to do
this at all :-) )
This patch does look correct to me, fwiw.
Segher
CC_NZ (and:SI (lshiftrt:SI (reg:SI 102)
> (const_int 8 [0x8]))
> (const_int 6 [0x6]))
> (const_int 0 [0]))
> The reason is that it knows nothing about CC_NZ.
Yeah, maybe not in simplify-rtx.c, hrm. There is SELECT_CC_MODE for
these things, and combine knows about that (not many passes do).
Segher
'll try it out, see what it does on other
targets. (It will have to wait for GCC 11 stage 1, of course).
Thanks!
Segher
p.s. Please use a correct mime type? application/octet-stream isn't
something I can reply to. Just text/plain is fine :-)
n verify the issue and the fix, it would be appreciated.
It looks correct, yes.
Is there some test that could catch this? And similar cases (*are* there
any similar builtins / macros / etc.?)
Okay for trunk either way. Thanks! Also okay for backporting, after
letting it simmer for a bit.
patch with those
fixes please? Make sure the changelog agrees with the patch (and don't
say "why" in changelog -- say that in the commit message. Which is
free form, so you have much more freedom to explain things in a useful
order).
Segher
rent functions is larger than this difference; to be dealt
> > > with separately.
> > >
> > > Tested cris-elf, x86_64-linux, powerpc64le-linux, 2/3 through
> > > aarch64-linux (unexpectedly slow).
> > >
> > > Ok to commit?
> >
> > No, sorry.
>
> Sigh. I'm very interested in what your investigation will show.
It shows we can change to use single_set here.
> > One thing that could work is allowing (unused) clobbers of fixed
> > registers (like you have here), or maybe some hook is needed to say this
> > register is like a flags register, or similar. That should work for you,
> > and not regress other targets, maybe even help a little? We'll see.
>
> Still, there is already TARGET_FLAGS_REGNUM (a "hooked"
> constant), so I take it you would be happy if we recognize a
> clobber of *just that*, in a parallel? (I'll take care of
> updating tm.texi of course.)
Most targets do not use cmpelim, and many *can not* define
TARGET_FLAGS_REGNUM, unfortunately.
I'll review the original patch again, to point out where it still needs
changing.
Thanks,
Segher
ust the type when calling single_set.
Don't change the name please. Changing it to take an rtx_insn* is fine,
we aren't likely to change back to testing the resulting patterns.
> I checked the original commit, c4c5ad1d6d1e1e a.k.a r263067 and
The history is years older (some of which is on gcc-patches@).
I'll make a simpler patch. Thanks!
Segher
p2 = gen_reg_rtx (DImode);
> +
> + rtx mask = GEN_INT (HOST_WIDE_INT_M1U << 32);
> + emit_insn (gen_anddi3 (op2, op1, mask));
Groovy :-)
So, it looks like you can remove the ? and ! alternatives, leaving just
the first alternative?
Segher
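For reference, the effect of that mask in ordinary C, as a sketch: all-ones shifted left by 32 keeps the high half of a 64-bit value and clears the low half. The function name is made up, and uint64_t stands in for DImode here.

```c
#include <stdint.h>

/* All-ones shifted left by 32 is 0xffffffff00000000, so the AND keeps
   the upper 32 bits and zeroes the lower 32 bits.  */
static uint64_t
keep_high_32 (uint64_t x)
{
  const uint64_t mask = ~UINT64_C (0) << 32;
  return x & mask;
}
```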
l to more than 25
insns total (which is what the "only small loops" does, sort of -- it
also avoids unrolling 3x a bit, yes), and don't unroll to more than 2
calls, and not to more than 4 branches (I'm making up those numbers, of
course, and PARAMS would be helpful). Some of this already does exist,
and might need retuning for us?
Segher
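Limits of that shape could be combined along these lines. This is a sketch only: the struct, the function, and the way the limits interact are assumptions, and the numbers a caller passes in (25 insns, 2 calls, 4 branches in the text above) are explicitly made up there as well.

```c
/* Hypothetical per-iteration statistics for a loop body.  */
struct loop_stats
{
  int insns;
  int calls;
  int branches;
};

/* Largest unroll factor that keeps the unrolled body within all three
   limits.  A result of 1 means "do not unroll".  */
static int
max_unroll_factor (struct loop_stats body,
		   int max_insns, int max_calls, int max_branches)
{
  int factor = body.insns > 0 ? max_insns / body.insns : 1;

  if (body.calls > 0 && max_calls / body.calls < factor)
    factor = max_calls / body.calls;
  if (body.branches > 0 && max_branches / body.branches < factor)
    factor = max_branches / body.branches;

  return factor < 1 ? 1 : factor;
}
```

For a 5-insn body with one branch and no calls, limits of 25/2/4 give a factor of 4: the branch limit is what binds, not the insn count.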
ested for by that power10_hw selector.
> + if ( !__builtin_cpu_supports ("mma"))
> +{
> + printf ("Error: __builtin_cpu_supports says mma not supported.\n");
> + ret++;
> +}
And for this, we probably want a mma_hw sooner rather than later.
Segher
perands[0]));
> + DONE;
> +}
> + [(set_attr "length" "12")
> + (set_attr "type" "vecfloat")
> + (set_attr "isa" "p8v")])
> +
No extra whiteline please.
Maybe change it back to just SI? It won't match often at all for QI or
HI anyway, it seems. Sorry for that detour. Should be good with the
above nits fixed :-)
Segher
On Wed, Jul 08, 2020 at 11:39:56AM +0800, Jiufu Guo wrote:
> Segher Boessenkool writes:
> > I am not happy about what is considered "a complex loop" here.
> For early exit, which may cause and *next* unrolled iterations may be
> not executed, then unroll may be not benif
I should know :-/
Please fix everything Will found (as always, thanks Will!) I don't see
more problems, so fourth time should be the charm? :-)
Segher
On Wed, Jul 08, 2020 at 10:55:59AM -0500, will schmidt wrote:
> On Tue, 2020-06-30 at 18:39 -0500, Segher Boessenkool wrote:
> > On Tue, Jun 30, 2020 at 12:57:45PM -0500, will schmidt wrote:
> > > Add support for the vmsumudm instruction and tie it into the
> > > v
Hi Jiufu,
On Thu, Jul 09, 2020 at 04:01:38PM +0800, Jiufu Guo wrote:
> Segher Boessenkool writes:
> >> But for each single condition, loop unrolling may still be helpful.
> >> While, if these conditions are all occur in a loop, it would be more
> >> possible
> +++ b/gcc/testsuite/gcc.target/powerpc/pr96125.c
> @@ -0,0 +1,47 @@
> +/* { dg-do compile } */
> +/* { dg-require-effective-target powerpc_vsx_ok } */
> +/* { dg-options "-mdejagnu-cpu=power8 -O2" } */
powerpc_vsx_ok is not the right test for -mcpu=power8 (it means p7).
Usually powerpc_p8vector_ok is used... not a great situation, but :-)
Segher
SX_MM [(match_operand:DI 1 "gpc_reg_operand" "b")]
> +UNSPEC_MTVSBM))]
> + "TARGET_POWER10"
> + "mtvsrm %0,%1";
> + [(set_attr "type" "vecsimple")])
This can take any "r", not just "b" I think? I.e. r0 is allowed here.
The rest looks great. Okay with those tweaks. Thanks!
Segher
allow it with CMODEL_SMALL; no other action in this block).
Put that
((rs6000_isa_flags & OPTION_MASK_MINIMAL_TOC) != 0)
check inside the block please, together with the CMODEL_SMALL check?
Segher
&& vsx_reg_sfsubreg_ok (operands[0], SFmode)"
Put this in the insn condition? And since this is just a predicate,
you can just use it instead of gpc_reg_operand.
(The split condition becomes "&& 1" then, not "").
Segher
On Thu, Jul 09, 2020 at 04:10:41PM -0500, Peter Bergner wrote:
> On 7/9/20 12:11 PM, Segher Boessenkool wrote:
> [snip]
> > So maybe we should just do all builtins always?
>
> I think that is the correct thing to do, but I think maybe that
> should wait for Bill'
gcc.target/powerpc/mma-double-test.c: New file.
Okay for trunk, and for GCC 10 backport as well. Thanks!
Segher
insn (target, gen_lowpart (V4SFmode, tmpV2DI));
(This is a good example of why: it isn't obvious from just seeing this
that the tmpV2DI is a variable, while the V4SFmode is a symbolic
constant).
Looks fine other than that :-)
Segher
arting with those strings).
> +/* { dg-final { scan-assembler-times {\mlwz\M} 4 } } */
> +/* { dg-final { scan-assembler-times {\mrldimi\M} 2 } } */
> +/* { dg-final { scan-assembler-times {\mmtvsrdd\M} 1 } } */
Okay for trunk with those changes (or post again if you prefer). Thanks!
Segher
m other similar defines.) Thanks.
Good question. I do not know.
Well... Since this define_insn* requires p8 *anyway*, we do not need
any of these sf_subreg things? We always know for each one if it should
be true or false.
> + "TARGET_NO_SF_SUBREG"
But here we should require p8 some other way, then.
> + (set_attr "isa" "p8v")])
(This isn't enough, unfortunately).
Segher
> says we get the same code before/after (which pass undoes that; RA?).
combine can do it just fine, in the simpler cases. RA can do more cases.
Segher
" { return 1 }
> default { return 0 }
> }
Hrm, do we want it to be named ppc_mma_hw? Why not mma_hw just like
most other things?
And keep all rs6000 keywords together please (also power10_hw).
(I also wonder why rs6000 does things differently here, it's the only
arch that uses this apparently, hrm).
Okay for trunk (and backport to 10) modulo those nits. Thanks!
Segher
ar / same as above.
This is okay for trunk (with that improved a bit, also the typos and
other doc things Will found). Thanks!
Segher
On Thu, Jul 09, 2020 at 11:02:39AM -0500, will schmidt wrote:
> > * config/rs6000/rs6000-call.c (P10_BUILTIN_VEC_REPLACE_ELT,
> > P10_BUILTIN_VEC_REPLACE_UN): New.
>
> New what?
Just "New." is fine :-)
Segher
Hi!
On Wed, Jul 08, 2020 at 12:59:12PM -0700, Carl Love wrote:
> [PATCH 3/6] rs6000, Add vector replace builtin support
This is okay for trunk. Thanks!
Segher
rds optimizes the
> > > (reg:SI a) = (zero_extend:SI (reg:QI a))
> > > ... (subreg:QI (reg:SI a) 0) ...
> >
> > So the above isn't fixable? Because it would probably be the more
> > generic fix, right?
>
> I'm afraid it is not, CCing Segher on that. The question is
we should just
call the function check_effective_target_power10_hw and then everything
works as we want?
(For future consideration, of course :-) )
Segher
s6000_expand_ternop_builtin) [CODE_FOR_vsldb_v16qi,
> CODE_FOR_vsldb_v8hi, CODE_FOR_vsldb_v4si, CODE_FOR_vsldb_v2di,
> CODE_FOR_vsrdb_v16qi, CODE_FOR_vsrdb_v8hi, CODE_FOR_vsrdb_v4si,
> CODE_FOR_vsrdb_v2di}: Add clauses.
"]" (not "}").
Okay for trunk. Thank you!
Segher
On Mon, Jul 13, 2020 at 07:29:00AM +0200, Hans-Peter Nilsson wrote:
> > From: Segher Boessenkool
> > Date: Tue, 7 Jul 2020 22:50:43 +0200
>
> > I'll make a simpler patch. Thanks!
>
> You're welcome. So, you'll take care of the updated patch
> your
u mean "will show whether" or is it already complete?
It did complete, yes (and didn't change a single resulting instruction).
So that was easy :-)
> > I'll review the original patch again, to point out where it still needs
> > changing.
>
> ...but if you're in progress with a single_set variant, I'm all
> for it.
Yup, it's pretty simple actually :-)
Thanks,
Segher
ludes
> the result of expand_compound_operation/make_compound_operation.)
2-2 is always reducing latency if the costs are equal (and sane ;-) ),
that is a large part of what makes 2-2 combinations useful. Originally
the output of i2 is input to i3, but not anymore in the new insns.
Segher
ich gets rid of one log_link. If the insn_cost
stays the same, it always wins something else.
> Alternatives from the top of my head, one of:
...
5) Improve your target so that its insn_cost reflects the costs of
the insns better.
Can you share some typical examples where things are worse with the
current behaviour?
Segher
PARAM here. */
Okay for trunk with that. Thanks!
Segher
cint_operand" "n")]
> + UNSPEC_XXSPLTI32DX))]
> + "TARGET_POWER10"
> + "xxsplti32dx %x0,%2,%3"
> + [(set_attr "type" "vecsimple")])
(a space too much indent here)
> +;; Return 1 if op is a unsigned 1-bit constant integer.
> +(define_predicate "u1bit_cint_operand"
"an unsigned"
> +long long
> +rs6000_const_f32_to_i32 (rtx operand)
> +{
> + long long value;
> + const struct real_value *rv = CONST_DOUBLE_REAL_VALUE (operand);
> +
> + gcc_assert (GET_MODE (operand) == SFmode);
> + REAL_VALUE_TO_TARGET_SINGLE (*rv, value);
> + return value;
> +}
Can this just return "int"? (Or "unsigned int"?)
The rest of the patch looks good.
Segher
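The point about the return type rests on a single-precision value having a 32-bit image. A host-side sketch of that, assuming the host float is IEEE binary32 (this is not how REAL_VALUE_TO_TARGET_SINGLE works internally, and f32_to_bits is a made-up name):

```c
#include <stdint.h>
#include <string.h>

/* Assumption: the host float is a 32-bit IEEE binary32 value.  */
_Static_assert (sizeof (float) == sizeof (uint32_t),
		"float is expected to be 32 bits wide");

/* Copy the bit pattern of F into a 32-bit integer.  memcpy avoids the
   aliasing problems of a pointer cast.  */
static uint32_t
f32_to_bits (float f)
{
  uint32_t bits;
  memcpy (&bits, &f, sizeof bits);
  return bits;
}
```

The whole image fits in 32 bits, so an int or unsigned int return type loses nothing compared to long long.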
ra indent both
before and after it; } always aligns exactly with the {.
Okay for GCC 8 with that cleaned up. Thank you!
Segher
st-model to conv-vectorize-[12].c
or even
rs6000/test: Add -fno-vect-cost-model to some tests
Okay for trunk. Thanks!
Segher
On Tue, Jul 14, 2020 at 04:33:42PM -0500, Segher Boessenkool wrote:
> > If combine only did lower-cost combinations (perhaps with
> > Richard Sandifords lower-size-when-tied suggestion), I guess
> > this wouldn't happen. 0:-)
>
> And we would regress (a LOT).
Li
on powerpc64-linux {-m32,-m64}. Committed.
Segher
2020-07-16 Hans-Peter Nilsson
Segher Boessenkool
PR target/93372
* combine.c (is_just_move): Take an rtx_insn* as argument. Use
single_set on it.
---
gcc/combine.c | 11 ++-
1 file changed, 6
s !dfp usages are basically disabling those tests completely.
> What we really want is to know whether the compiler is generating
> hardware instructions or calling the libcalls. For that, we need
> to test hard_dfp.
>
> This patch bootstrapped and regtested with no regressions
On Fri, Jul 17, 2020 at 04:18:55PM -0500, Peter Bergner wrote:
> On 7/17/20 3:23 PM, Segher Boessenkool wrote:
> > On ISA 3.0B and later you can do
> >
> > mffscdrni %3,7
> > drdpq %2,%1
> > mffscdrn %3,%3
> > drsp %0,%2
> >
> >
inal { scan-assembler-not {\msldi\M} } } */
> +/* { dg-final { scan-assembler-times {\mrldicr\M} 1 } } */
I'm not sure that works on older cpus? Please test there, and add
-mdejagnu-cpu=power8 to the dg-options if needed. Also test on BE please.
Okay for trunk with those last details taken care of. Thank you!
Segher
really test on a BE p9, but ideally you would
do that as well ;-) )
So, okay for trunk if all patches that are required for these tests have
been committed. Thanks!
Segher
*-linux anyway, it will work. But we should
probably have a selector for this, or alternatively, allow the option
always (the target cannot run the resulting code, but we have many other
options like that, starting with -mcpu=). David, what is your
preference?
The rs6000 parts of this patch are fine for trunk. Thanks!
Segher
o
not work with this insn) might be helpful still.
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/powerpc/vec-splati-runnable.c
> +/* { dg-final { scan-assembler-times {\msplati\M} 6 } } */
> +/* { dg-final { scan-assembler-times {\msrdbi\M} 6 } } */
Those aren't existing instructions? How did this pass testing? I guess
the testcase was skipped?
Segher
ens here please?
> +/* { dg-final { scan-assembler-times {\msplati\M} 6 } } */
> +/* { dg-final { scan-assembler-times {\msrdbi\M} 6 } } */
Same issue here as in 5/6.
Okay for trunk with those things fixed, and the testcases tested and
the expected assembler code corrected. Thanks!
Segher
On Tue, Jul 21, 2020 at 06:37:29PM -0400, David Edelsohn wrote:
> On Tue, Jul 21, 2020 at 5:54 PM Segher Boessenkool
> wrote:
> > always (the target cannot run the resulting code, but we have many other
> > options like that, starting with -mcpu=). David, what is your
> >
but that _is_ obviously the more generic name, so
that is good.
Thanks,
Segher
of machine code? Like,
per fetch group.
Can you not use ASM_OUTPUT_ALIGN_WITH_NOP (or ASM_OUTPUT_MAX_SKIP_ALIGN
even) then? GCC has infrastructure for that, already.
Segher
>loop_info)
> -rs6000_density_test (cost_data);
> +{
> + adjust_vect_cost (cost_data);
> + rs6000_density_test (cost_data);
> +}
^^^ consistency :-)
The rs6000 parts are fine for trunk, thanks!
Segher
/mma-double-test.c: Update storing results for
> correct little-endian ordering.
> * gcc.target/powerpc/mma-single-test.c: Likewise.
Okay for trunk. It's not going to benefit from any soak-in time other
than what you have tested already, so it is fine for 10 immediately as
well. Thanks!
Segher
TARGET_INSN_COSTS.
It is already printed in the generated asm with -dp? Not sure if you
want more detail than that.
'-dp'
Annotate the assembler output with a comment indicating which
pattern and alternative is used. The length and cost of each
instruction are also printed.
Segher
Peter saw, the analysis when to do or not do this isn't
as good as could be hoped for.
> This can be particularly bad for performance if you have classes with no call
> saved registers.
Do we though? Well, "d" for vectors, but there shouldn't be insns that
require that?
Segher
On Thu, Jul 23, 2020 at 02:29:00PM -0600, Jeff Law wrote:
> On Thu, 2020-07-23 at 15:19 -0500, Segher Boessenkool wrote:
> > On Thu, Jul 23, 2020 at 01:42:59PM -0600, Jeff Law wrote:
> > > On Thu, 2020-07-23 at 14:25 -0500, Pat Haugen via Gcc-patches wrote:
> > >
ew cores tend to have big granules code size
> would blow. One advantage of the implemented algorithm is that even if
> slightly conservative it's impacting code size only where an high branch
> density shows up.
What is "big granules" for you?
Segher
> +error ("%qs requires %qs", "-mpower10", "-mcpu=power10");
This still allows -mpower10 without corresponding -mcpu=. We should
just remove this command line flag (but keep the internal flag); for
power10 we can do that without any issues, it is new (some testcases
will need fixing, but it is that: fixing).
Segher
Hi!
On Fri, Jul 24, 2020 at 09:01:33AM +0200, Andrea Corallo wrote:
> Segher Boessenkool writes:
> >> Correct, it's a sliding window only because the real load address is not
> >> known to the compiler and the algorithm is conservative. I believe we
> >> cou
Hi Jozef,
On Fri, Jul 24, 2020 at 12:50:48PM +0100, Jozef Lawrynowicz wrote:
> On Thu, Jul 23, 2020 at 01:34:22PM -0500, Segher Boessenkool wrote:
> > On Thu, Jul 23, 2020 at 04:56:14PM +0100, Jozef Lawrynowicz wrote:
> > > + /* The returned cost must be relative to COSTS_N_
i (arg))
{
blalalala
(whichever reads best in this context).
> + // fix PR96247
Repeating that many times isn't helping the reader... It isn't
particularly useful even a single time, anyway? It is clear what this
does, and if anyone wants to see history, we have Git.
Segher
e generated compiler slower? It will
at least potentially have fewer inlining opportunities, but does that
matter?
Thanks for working on this,
Segher
On Fri, Jul 24, 2020 at 11:10:29AM -0500, Peter Bergner wrote:
> On 7/24/20 6:32 AM, Segher Boessenkool wrote:
> > On Thu, Jul 23, 2020 at 08:15:42PM -0500, Peter Bergner wrote:
> >> + /* If the user explicitly uses -mpower10, ensure our ISA flags are
> >> + compati
ithout very
well defined semantics even. Not without first being shown no
alternatives are acceptable, anyway :-)
Segher
)\mb.*\n\snop\n} } } */
(\m is a zero-width start-of-word match, like \< in grep; (?n) means .
does not match newlines (if you know Perl, it turns /m on and /s off --
the opposite of the defaults for Tcl)).
(or you could do [^\n]* or even just \S* , no (?n) needed then).
Segher
00/rs6000.opt: Add -mblock-ops-unaligned-vsx.
> * doc/invoke.texi: Document -mblock-ops-unaligned-vsx.
> + if (TARGET_BLOCK_OPS_UNALIGNED_VSX)
> printf("TARGET_BLOCK_OPS_UNALIGNED_VSX\n");
Stray debug code?
Okay for trunk with those details taken care of. Thank you!
Segher
, and will happen later.
Tested on powerpc64-linux {-m32,-m64}; committed. (Testcase is from
Peter Bergner originally; all mangling is mine).
2020-07-24 Segher Boessenkool
* config/rs6000/rs6000.opt: Delete -mpower10.
gcc/testsuite/
* gcc.target/powerpc/pr95907.c: New.
---
gcc