On Mon, Nov 18, 2024 at 09:32:29AM +0100, Andreas Krebbel wrote:
> Hi Stefan,
> 
> 
> On 11/12/24 10:35, Stefan Schulze Frielinghaus wrote:
> > > > +  rtx cond = gen_rtx_LTU (<MODE>mode, gen_rtx_REG (CCL1mode, 
> > > > CC_REGNUM), const0_rtx);
> > > > +  if (operands[4] == const0_rtx)
> > > > +    emit_insn (gen_add<mode>3_carry1_cc (operands[0], operands[2], 
> > > > operands[3]));
> > > > +  else
> > > If we would just generate the alc pattern with a carry in of 0, wouldn't 
> > > the
> > > other optimizers be able to figure out that the a normal add would do 
> > > here?
> > > This path does not seem to get exercised by your testcases.
> > The zero carry in can occur due to
> > https://gcc.gnu.org/git/gitweb.cgi?p=gcc.git;h=a7aec76a74dd38524be325343158d3049b6ab3ac
> > which is why we have a special case just in order to emit an add with
> > carry out only.
> 
> Ok. I was hoping that this downgrade from add with carry in/out to add with
> carry out would happen somewhat "automatically", if the the carry in happens
> to be a constant zero. But probably not.
> 
> Your testcases invokes the pattern also with a constant 0 as carry in, but
> since you prevent inlining the pattern is never expanded with a const0_rtx.
> The testcase in the commit above is x86 specific, so it might make sense to
> add an invocation which triggers the code path explicitly. Just to be sure.

Ok, just for the curiosity, my take is that the middle end only emits an
u{add,sub}c optab with a constant zero if a pattern is detected as it
would occur e.g. for bitints, i.e., where multiple limbs are about to be
handled and the first one has a zero carry in.  Thus, just invoking the
test case with a zero results in an add-overflow only as expected.
Anyhow, I have added tests u{add,sub}c-3.c in order to test this path,
too.

> 
> > > create a CCU mode comparison result, which is currently consumed as CCL1 
> > > by
> > > the ALC pattern. This seems to be inconsistent. s390_emit_compare returns
> > > the condition, which I think needs to be fed into the alc_carry1_cc 
> > > pattern
> > > as input condition.
> > > 
> > > Also is LTU really the correct code here? CCU + LTU would expect CC1, but
> > > you want CC2 or CC3 for the carry in, so GTU should be the better choice.
> > > s390_alc_comparison should make sure that only valid combinations are
> > > accepted, CCU + GTU would be one of them.
> > I was coming up with my own condition since conditions created by
> > s390_emit_compare() are of void mode which is why the alc predicates
> > s390_alc_comparison() failed since these require GPR mode.  I've fixed
> > that by using PUT_MODE_RAW on those.  I think we could also remove the
> > mode from the match_operand which might be more appropriate.  I've done
> > it the latter way for sub<mode>3_slb_borrow1_cc.  Once we have settled
> > for one or the other version I will align uaddc/usubc.
> 
> The PLUS/MINUS arithmetic operations require both operands to have a proper
> integer mode. So I think dropping the mode from match_operand would be
> wrong. On the other hand, an IF_THEN_ELSE always requires the comparison to
> have VOIDmode, that's what s390_emit_compare is supposed to be used for. So
> taking what s390_emit_compare generates and changing the mode is the right
> thing here. Since PUT_MODE is always a bit ugly, we might also want go with
> extending s390_emit_compare with a target mode operand defaulting to
> VOIDmode instead.

Ok, I have added a mode operand to s390_emit_compare() and changed
everything accordingly.

> 
> > > > +/* { dg-do run } */
> > > > +/* { dg-options "-O2 -mzarch -save-temps -fdump-tree-optimized" }  */
> > > > +/* { dg-final { scan-tree-dump-times "\\.UADDC \\(" 2 "optimized" { 
> > > > target lp64 } } } */
> > > > +/* { dg-final { scan-tree-dump-times "\\.UADDC \\(" 1 "optimized" { 
> > > > target { ! lp64 } } } } */
> > > > +/* { dg-final { scan-assembler-times "\\talcr\\t" 1 } } */
> > > > +/* { dg-final { scan-assembler-times "\\talcgr\\t" 1 { target lp64 } } 
> > > > } */
> > > Your checks seem to rely on the testcase being compiled with at least
> > > -march=z13.
> 
> I think for the run test you would have to make sure that the test is not
> executed on machines older than z13 then.
> 
> /* { dg-do run { target { s390_useable_hw } } } */

Didn't know about this neat predicate.  Will probably use it more often
in the future :)

I will send a v2 shortly.

Cheers,
Stefan

Reply via email to