On Wed, 11 Nov 2015, Prathamesh Kulkarni wrote:
> On 10 November 2015 at 20:11, Richard Biener <[email protected]> wrote:
> > On Mon, 9 Nov 2015, Prathamesh Kulkarni wrote:
> >
> >> On 4 November 2015 at 20:35, Richard Biener <[email protected]> wrote:
> >> >
> >> > Btw, did you investigate code gen differences on x86_64/i586? That
> >> > target expands all divisions/modulo ops via divmod, relying on CSE
> >> > solely as the HW always computes both div and mod (IIRC).
> >> x86_64 has optab_handler for divmod defined, so the transform won't
> >> take place on x86.
> >
> > Ok.
> >
> >> > +
> >> > + gassign *assign_stmt = gimple_build_assign (gimple_assign_lhs
> >> > (use_stmt), rhs);
> >> > + gimple_stmt_iterator gsi = gsi_for_stmt (use_stmt);
> >> >
> >> > Ick. Please use
> >> >
> >> > gimple_set_rhs_from_tree (use_stmt, res);
> >> Um there doesn't seem to be gimple_set_rhs_from_tree.
> >> I used gimple_assign_set_rhs_from_tree which requires gsi for use_stmt.
> >> Is that OK ?
> >
> > Yes.
> >
> >> > update_stmt (use_stmt);
> >> > if (maybe_clean_or_replace_eh_stmt (use_stmt, use_stmt))
> >> > cfg_changed = true;
> >> >
> >> > + free_dominance_info (CDI_DOMINATORS);
> >> >
> >> > do not free dominators.
> >>
> >> I have done the suggested changes in the attached patch.
> >> I have a few questions:
> >>
> >> a) Does the change to insert DIVMOD call before topmost div or mod
> >> stmt with matching operands
> >> look correct ?
> >
> > + /* Insert call-stmt just before the topmost div/mod stmt.
> > + top_bb dominates all other basic blocks containing div/mod stms
> > + so, the topmost stmt would be the first div/mod stmt with matching
> > operands
> > + in top_bb. */
> > +
> > + gcc_assert (top_bb != 0);
> > + gimple_stmt_iterator gsi;
> > + for (gsi = gsi_after_labels (top_bb); !gsi_end_p (gsi); gsi_next
> > (&gsi))
> > + {
> > + gimple *g = gsi_stmt (gsi);
> > + if (is_gimple_assign (g)
> > + && (gimple_assign_rhs_code (g) == TRUNC_DIV_EXPR
> > + || gimple_assign_rhs_code (g) == TRUNC_MOD_EXPR)
> > + && operand_equal_p (op1, gimple_assign_rhs1 (g), 0)
> > + && operand_equal_p (op2, gimple_assign_rhs2 (g), 0))
> > + break;
> >
> > Looks overly complicated to me. Just remember "topmost" use_stmt
> > alongside top_bb (looks like you'll no longer need top_bb if you
> > retail top_stmt). And then do
> >
> > gsi = gsi_for_stmt (top_stmt);
> >
> > and insert before that.
> Thanks, done in this patch. Does it look OK ?
> IIUC gimple_uid (stmt1) < gimple_uid (stmt2) can be used to check if
> stmt1 occurs before stmt2
> only if stmt1 and stmt2 are in the same basic block ?
> >
> >> b) Handling constants - I dropped handling constants in the attached
> >> patch. IIUC we don't want
> >> to enable this transform if there's a specialized expansion for some
> >> constants for div or mod ?
> >
> > See expand_divmod which has lots of special cases for constant operands
> > not requiring target support for div or mod.
> Thanks, would it be OK if I do this in follow up patch ?
Well, just not handle them like in your patch is fine.
> >
> >> I suppose this would also be target dependent and require a target hook ?
> >> For instance arm defines modsi3 pattern to expand mod when 2nd operand
> >> is constant and <= 0 or power of 2,
> >> while for other cases goes the expand_divmod() route to generate call
> >> to __aeabi_idivmod libcall.
> >
> > Ok, so it lacks a signed mod instruction.
> >
> >> c) Gating the divmod transform -
> >> I tried gating it on checks for optab_handlers on div and mod, however
> >> this doesn't enable transform for arm cortex-a9
> >> anymore (cortex-a9 doesn't have hardware instructions for integer div and
> >> mod).
> >> IIUC for cortex-a9,
> >> optab_handler (sdivmod_optab, SImode) returns CODE_FOR_nothing because
> >> HAVE_divsi3 is 0.
> >> However optab_handler (smod_optab, SImode) matches since optab_handler
> >> only checks for existence of pattern
> >> (and not whether the pattern gets matched).
> >> I suppose we should enable the transform only if the divmod, div, and
> >> mod pattern do not match rather than checking
> >> if the patterns exist via optab_handler ? For a general x % y, modsi3
> >> would fail to match but optab_handler(smod_optab, SImode ) still
> >> says it's matched.
> >
> > Ah, of course. Querying for an optab handler is just a cheap
> > guesstimate... Not sure how to circumvent this best (sub-target
> > enablement of patterns). RTL expansion just goes ahead (of course)
> > and sees if expansion eventually fails. Richard?
> >
> >> Should we define a new target hook combine_divmod, which returns true
> >> if transforming to divmod is desirable for that
> >> target ?
> >> The default definition could be:
> >> bool default_combine_divmod (enum machine_mode mode, tree op1, tree op2)
> >> {
> >> // check for optab_handlers for div/mod/divmod and libfunc for divmod
> >> }
> >>
> >> And for arm, it could be over-ridden to return false if op2 is
> >> constant and <= 0 or power of 2.
> >> I am not really sure if this is a good idea since I am replicating
> >> information from modsi3 pattern.
> >> Any change to the pattern may require corresponding change to the hook :/
> >
> > Yeah, I don't think that is desirable. Ideally we'd have a way
> > to query HAVE_* for CODE_FOR_* which would mean target-insns.def
> > support for all div/mod/divmod patterns(?) and queries...
> >
> > Not sure if what would be enough though.
> >
> > Note that the divmod check is equally flawed.
> >
> > I think with the above I'd enable the transform when
> >
> > + if (optab_handler (divmod_optab, mode) != CODE_FOR_nothing
> > + || (optab_libfunc (divmod_optab, mode) != NULL_RTX
> > && optab_handler ([su]div_optab, mode) == CODE_FOR_nothing))
> > + return false;
> Um this fails for the arm backend (for cortex-a9) because
> optab_handler (divmod_optab, mode) != CODE_FOR_nothing is false
> optab_libfunc (divmod_optab, mode) != NULL_RTX is true.
> optab_handler (div_optab, mode) == CODE_FOR_nothing is true.
> which comes down to false || (true && true) which is true and we hit
> return false.
Oh, sorry to mess up the test - it was supposed to be inverted.
> AFAIU, we want the transform to be disabled if:
> a) optab_handler exists for divmod.
> b) optab_handler exists for div.
> c) optab_libfunc does not exist for divmod. */
>
> + if (optab_handler (divmod_optab, mode) != CODE_FOR_nothing
> + || optab_handler (div_optab, mode) != CODE_FOR_nothing
> + || optab_libfunc (divmod_optab, mode) == NULL_RTX)
> + return false;
> Does that look correct ?
No. That will disable if we have a divmod optab. Instead try
if (! (optab_handler (divmod_optab, mode) != CODE_FOR_nothing
|| (optab_libfunc (divmod_optab, mode) != NULL_RTX
&& optab_handler ([su]div_optab, mode) == CODE_FOR_nothing)))
return false;
which is what I intended. If we have a divmod optab go ahead.
If we have a libfunc and not a div optab then as well.
> >
> > so we either will have a divmod instruction (hopefully not sub-target
> > disabled for us) or a libfunc for divmod and for sure no HW divide
> > instruction (HW mod can be emulated by HW divide but not the other
> > way around).
> >
> >> d) Adding effective-target-check for divmod: I just enabled it for
> >> arm*-*-* for now. I could additionally append more targets,
> >> not sure if this is the right approach.
> >
> > Looks good to me.
> Is this version OK if bootstrap/testing passes ?
Ok with adjusting the optab check like above.
Thanks,
Richard.