luoxhu <[email protected]> writes:
> + /* Fold (add -1; zero_ext; add +1) operations to zero_ext. i.e:
> +
> + 73: r145:SI=r123:DI#0-0x1
> + 74: r144:DI=zero_extend (r145:SI)
> + 75: r143:DI=r144:DI+0x1
> + ...
> + 31: r135:CC=cmp (r123:DI,0)
> + 72: {pc={(r143:DI!=0x1)?L70:pc};r143:DI=r143:DI-0x1;clobber
> + scratch;clobber scratch;}
Minor, but it might be worth stubbing out the clobbers, since they're
not really necessary to understand the comment:
72: {pc={(r143:DI!=0x1)?L70:pc};r143:DI=r143:DI-0x1;...}
> +
> + r123:DI#0-0x1 is param count derived from loop->niter_expr equal to the
> + loop iterations, if loop iterations expression doesn't overflow, then
> + (zero_extend (r123:DI#0-1))+1 could be simplified to zero_extend only.
> + */
> + bool simplify_zext = false;
I think it'd be easier to follow if this was split out into
a subroutine, rather than having the simplify_zext variable.
> + rtx extop0 = XEXP (count, 0);
> + if (GET_CODE (count) == ZERO_EXTEND && GET_CODE (extop0) == PLUS)
This isn't valid: we can only do XEXP (count, 0) *after* checking
for a ZERO_EXTEND. (It'd be good to test the patch with
--enable-checking=yes,extra,rtl , which hopefully would have
caught this.)
> + {
> + rtx addop0 = XEXP (extop0, 0);
> + rtx addop1 = XEXP (extop0, 1);
> +
> + int nonoverflow = 0;
> + unsigned int_mode
> + = GET_MODE_PRECISION (as_a<scalar_int_mode> GET_MODE (addop0));
Heh. I wondered at first how on earth this compiled. It looked like
there was a missing "(...)" around the GET_MODE. But of course,
GET_MODE adds its own parentheses, so it all works out. :-)
Please add the "(...)" anyway though. We shouldn't rely on that.
"int_mode" seems a bit of a confusing name, since it's actually a precision
in bits rather than a mode.
> + unsigned HOST_WIDE_INT int_mode_max
> + = (HOST_WIDE_INT_1U << (int_mode - 1) << 1) - 1;
> + if (get_max_loop_iterations (loop, &iterations)
> + && wi::ltu_p (iterations, int_mode_max))
You could use GET_MODE_MASK instead of int_mode_max here.
For extra safety, it would be good to add a HWI_COMPUTABLE_P test,
to make sure that using HWIs is valid.
> + nonoverflow = 1;
> +
> + if (nonoverflow
Having the nonoverflow variable doesn't seem necessary. We could
just fuse the two "if" conditions together.
> + && CONST_SCALAR_INT_P (addop1)
> + && GET_MODE_PRECISION (mode) == int_mode * 2
This GET_MODE_PRECISION condition also shouldn't be necessary.
If we can prove that the subtraction doesn't wrap, we can extend
to any wider mode, not just to double the width.
> + && addop1 == GEN_INT (-1))
This can just be:
addop1 == constm1_rtx
There's then no need for the CONST_SCALAR_INT_P check.
Thanks,
Richard