On 25 May 2016 at 12:52, Richard Biener <[email protected]> wrote:
> On Tue, 24 May 2016, Prathamesh Kulkarni wrote:
>
>> On 24 May 2016 at 19:39, Richard Biener <[email protected]> wrote:
>> > On Tue, 24 May 2016, Prathamesh Kulkarni wrote:
>> >
>> >> On 24 May 2016 at 17:42, Richard Biener <[email protected]> wrote:
>> >> > On Tue, 24 May 2016, Prathamesh Kulkarni wrote:
>> >> >
>> >> >> On 23 May 2016 at 17:35, Richard Biener <[email protected]>
>> >> >> wrote:
>> >> >> > On Mon, May 23, 2016 at 10:58 AM, Prathamesh Kulkarni
>> >> >> > <[email protected]> wrote:
>> >> >> >> Hi,
>> >> >> >> I have updated my patch for divmod (attached), which was originally
>> >> >> >> based on Kugan's patch.
>> >> >> >> The patch transforms stmts with code TRUNC_DIV_EXPR and
>> >> >> >> TRUNC_MOD_EXPR
>> >> >> >> having same operands to divmod representation, so we can cse
>> >> >> >> computation of mod.
>> >> >> >>
>> >> >> >> t1 = a TRUNC_DIV_EXPR b;
>> >> >> >> t2 = a TRUNC_MOD_EXPR b
>> >> >> >> is transformed to:
>> >> >> >> complex_tmp = DIVMOD (a, b);
>> >> >> >> t1 = REALPART_EXPR (complex_tmp);
>> >> >> >> t2 = IMAGPART_EXPR (complex_tmp);
>> >> >> >>
>> >> >> >> * New hook divmod_expand_libfunc
>> >> >> >> The rationale for introducing the hook is that different targets
>> >> >> >> have
>> >> >> >> incompatible calling conventions for divmod libfunc.
>> >> >> >> Currently three ports define divmod libfunc: c6x, spu and arm.
>> >> >> >> c6x and spu follow the convention of libgcc2.c:__udivmoddi4:
>> >> >> >> return quotient and store remainder in argument passed as pointer,
>> >> >> >> while the arm version takes two arguments and returns both
>> >> >> >> quotient and remainder having mode double the size of the operand
>> >> >> >> mode.
>> >> >> >> The port should hence override the hook expand_divmod_libfunc
>> >> >> >> to generate call to target-specific divmod.
>> >> >> >> Ports should define this hook if:
>> >> >> >> a) The port does not have divmod or div insn for the given mode.
>> >> >> >> b) The port defines divmod libfunc for the given mode.
>> >> >> >> The default hook default_expand_divmod_libfunc() generates call
>> >> >> >> to libgcc2.c:__udivmoddi4 provided the operands are unsigned and
>> >> >> >> are of DImode.
>> >> >> >>
>> >> >> >> Patch passes bootstrap+test on x86_64-unknown-linux-gnu and
>> >> >> >> cross-tested on arm*-*-*.
>> >> >> >> Bootstrap+test in progress on arm-linux-gnueabihf.
>> >> >> >> Does this patch look OK ?
>> >> >> >
>> >> >> > diff --git a/gcc/targhooks.c b/gcc/targhooks.c
>> >> >> > index 6b4601b..e4a021a 100644
>> >> >> > --- a/gcc/targhooks.c
>> >> >> > +++ b/gcc/targhooks.c
>> >> >> > @@ -1965,4 +1965,31 @@ default_optab_supported_p (int, machine_mode,
>> >> >> > machine_mode, optimization_type)
>> >> >> > return true;
>> >> >> > }
>> >> >> >
>> >> >> > +void
>> >> >> > +default_expand_divmod_libfunc (bool unsignedp, machine_mode mode,
>> >> >> > + rtx op0, rtx op1,
>> >> >> > + rtx *quot_p, rtx *rem_p)
>> >> >> >
>> >> >> > functions need a comment.
>> >> >> >
>> >> >> > ISTR it was suggested that ARM change to libgcc2.c__udivmoddi4
>> >> >> > style? In that
>> >> >> > case we could avoid the target hook.
>> >> >> Well I would prefer adding the hook because that's more easier -;)
>> >> >> Would it be ok for now to go with the hook ?
>> >> >> >
>> >> >> > + /* If target overrides expand_divmod_libfunc hook
>> >> >> > + then perform divmod by generating call to the
>> >> >> > target-specifc divmod
>> >> >> > libfunc. */
>> >> >> > + if (targetm.expand_divmod_libfunc !=
>> >> >> > default_expand_divmod_libfunc)
>> >> >> > + return true;
>> >> >> > +
>> >> >> > + /* Fall back to using libgcc2.c:__udivmoddi4. */
>> >> >> > + return (mode == DImode && unsignedp);
>> >> >> >
>> >> >> > I don't understand this - we know optab_libfunc returns non-NULL for
>> >> >> > 'mode'
>> >> >> > but still restrict this to DImode && unsigned? Also if
>> >> >> > targetm.expand_divmod_libfunc
>> >> >> > is not the default we expect the target to handle all modes?
>> >> >> Ah indeed, the check for DImode is unnecessary.
>> >> >> However I suppose the check for unsignedp should be there,
>> >> >> since we want to generate call to __udivmoddi4 only if operand is
>> >> >> unsigned ?
>> >> >
>> >> > The optab libfunc for sdivmod should be NULL in that case.
>> >> Ah indeed, thanks.
>> >> >
>> >> >> >
>> >> >> > That said - I expected the above piece to be simply a 'return true;'
>> >> >> > ;)
>> >> >> >
>> >> >> > Usually we use some can_expand_XXX helper in optabs.c to query if
>> >> >> > the target
>> >> >> > supports a specific operation (for example SImode divmod would use
>> >> >> > DImode
>> >> >> > divmod by means of widening operands - for the unsigned case of
>> >> >> > course).
>> >> >> Thanks for pointing out. So if a target does not support divmod
>> >> >> libfunc for a mode
>> >> >> but for a wider mode, then we could zero-extend operands to the
>> >> >> wider-mode,
>> >> >> perform divmod on the wider-mode, and then cast result back to the
>> >> >> original mode.
>> >> >> I haven't done that in this patch, would it be OK to do that as a
>> >> >> follow up ?
>> >> >
>> >> > I think that you should conservatively handle the div_optab query, thus
>> >> > if
>> >> > the target has a HW division in a wider mode don't use the divmod IFN.
>> >> > You'd simply iterate over GET_MODE_WIDER_MODE and repeat the
>> >> > if (optab_handler (div_optab, mode) != CODE_FOR_nothing) check, bailing
>> >> > out if that is available.
>> >> Done.
>> >> >
>> >> >> > + /* Disable the transform if either is a constant, since
>> >> >> > division-by-constant
>> >> >> > + may have specialized expansion. */
>> >> >> > + if (TREE_CONSTANT (op1) || TREE_CONSTANT (op2))
>> >> >> > + return false;
>> >> >> >
>> >> >> > please use CONSTANT_CLASS_P (op1) || CONSTANT_CLASS_P (op2)
>> >> >> >
>> >> >> > + if (TYPE_OVERFLOW_TRAPS (type))
>> >> >> > + return false;
>> >> >> >
>> >> >> > why's that? Generally please first test cheap things (trapping,
>> >> >> > constant-ness)
>> >> >> > before checking expensive stuff (target_supports_divmod_p).
>> >> >> I added TYPE_OVERFLOW_TRAPS (type) based on your suggestion in:
>> >> >> https://www.mail-archive.com/[email protected]/msg78534.html
>> >> >> "When looking at TRUNC_DIV_EXPR you should also exclude
>> >> >> the case where TYPE_OVERFLOW_TRAPS (type) as that should
>> >> >> expand using the [su]divv optabs (no trapping overflow
>> >> >> divmod optab exists)."
>> >> >
>> >> > Ok, didn't remember that.
>> >> >
>> >> >> >
>> >> >> > +static bool
>> >> >> > +convert_to_divmod (gassign *stmt)
>> >> >> > +{
>> >> >> > + if (!divmod_candidate_p (stmt))
>> >> >> > + return false;
>> >> >> > +
>> >> >> > + tree op1 = gimple_assign_rhs1 (stmt);
>> >> >> > + tree op2 = gimple_assign_rhs2 (stmt);
>> >> >> > +
>> >> >> > + vec<gimple *> stmts = vNULL;
>> >> >> >
>> >> >> > use an auto_vec <gimple *> - you currently leak it in at least one
>> >> >> > place.
>> >> >> >
>> >> >> > + if (maybe_clean_or_replace_eh_stmt (use_stmt, use_stmt))
>> >> >> > + cfg_changed = true;
>> >> >> >
>> >> >> > note that this suggests you should check whether any of the stmts
>> >> >> > may throw
>> >> >> > internally as you don't update / transfer EH info correctly. So for
>> >> >> > 'stmt' and
>> >> >> > all 'use_stmt' check stmt_can_throw_internal and if so do not add it
>> >> >> > to
>> >> >> > the list of stmts to modify.
>> >> >> >
>> >> >> > Btw, I think you should not add 'stmt' immediately but when
>> >> >> > iterating over
>> >> >> > all uses also gather uses in TRUNC_MOD_EXPR.
>> >> >> >
>> >> >> > Otherwise looks ok.
>> >> >> Done changes in this version. I am gathering mod uses same time as div
>> >> >> uses,
>> >> >> so this imposes a constraint that mod dominates mod. I am not sure if
>> >> >> that's desirable.
>> >> >
>> >> > I think you also need a mod_seen variable now that you don't necessarily
>> >> > end up with 'stmt' in the vector of stmts. I don't see how there is a
>> >> > constraint that mod dominates mod - it's just that the top_stmt needs
>> >> > to dominate all other uses that can be replaced with replacing top_stmt
>> >> > with a divmod. It's just that the actual stmt set we choose may now
>> >> > depend on immediate uses order which on a second thought is bad
>> >> > as immediate uses order could be affected by debug stmts ... hmm.
>> >> >
>> >> > To avoid this please re-add the code adding 'stmt' to stmts immediately
>> >> > and add a use_stmt != stmt check to the immediate use processing loop
>> >> > so that we don't end up adding it twice.
>> >> Well I wonder what will happen for the following case:
>> >> t1 = x / y;
>> >> if (cond)
>> >> t2 = x % y;
>> >> else
>> >> t3 = x % y;
>> >>
>> >> Assuming stmt == top_stmt is "t2 = x % y" and use_stmt is "t3 = x % y",
>> >> use_stmt will not get added to stmts vector, since top_stmt and
>> >> use_stmt are not in same bb,
>> >> and bb's containing top_stmt and use_stmt don't dominate each other.
>> >> Not sure if this is practical case (I assume fre will hoist mod
>> >> outside if-else?)
>> >>
>> >> Now that we immediately add stmt to stmts vector, I suppose mod_seen
>> >> shall not be required ?
>> >
>> > In that case mod_seen is not needed. But the situation you say will
>> > still happen so I wonder if we'd need a better way of iterating over
>> > immediate uses, like first pushing all candidates into a worklist
>> > vector and then iterating over that until we find no more candidates.
>> >
>> > You can then also handle the case of more than one group of stmts
>> > (the pass currently doesn't iterate in any particular useful order
>> > over BBs).
>> IIUC, we want to perform the transform if:
>> i) there exists top_stmt with code trunc_div_expr/trunc_mod_expr and
>> having same operands as stmt.
>> ii) top_stmt dominates all other stmts with code
>> trunc_div_expr/trunc_mod_expr and having same operands as top_stmt.
>>
>> Firstly, we try to get to top_stmt if it exists, by iterating over uses of
>> stmt,
>> and then iterate over all uses of top_stmt and add them to stmts vector
>> only if top_stmt dominates all the stmts with same operands as top_stmt
>> and have code trunc_div_expr/trunc_mod_expr.
>>
>> /* Get to top_stmt. */
>> top_stmt = stmt;
>> top_bb = gimple_bb (stmt);
>>
>> FOR_EACH_IMM_USE_STMT (use_stmt, use_iter, op1)
>> {
>> if (use_stmt code is TRUNC_DIV_EXPR or TRUNC_MOD_EXPR
>> && use_stmt has same operands as stmt)
>> {
>> if (gimple_bb (use_stmt) dominates top_bb)
>> {
>> top_bb = gimple_bb (use_stmt);
>> top_stmt = use_stmt;
>> }
>> else if (gimple_bb (use_stmt) == top_stmt
>> && gimple_uid (use_stmt) < top_stmt)
>> top_stmt = use_stmt;
>> }
>> }
>>
>> /* Speculatively consider top_stmt as dominating all other
>> div_expr/mod_expr stmts with same operands as stmt. */
>> stmts.safe_push (top_stmt);
>>
>> /* Walk uses of top_stmt to ensure that all stmts are dominated by top_stmt.
>> */
>> top_op1 = gimple_assign_rhs1 (top_stmt);
>> FOR_EACH_IMM_USE_STMT (use_stmt, use_iter, top_op1)
>> {
>> if (use_stmt code is TRUNC_DIV_EXPR or TRUNC_MOD_EXPR
>> && use_stmt has same operands as top_stmt)
>> {
>> if (use_stmt == top_stmt)
>> continue;
>>
>> /* No top_stmt exits, do not proceed with transform */
>> if (top_bb does not dominate gimple_bb (use_stmt))
>> return false;
>>
>> stmts.safe_push (use_stmt);
>> }
>> }
>>
>> For the case:
>> t1 = x / y;
>> if (cond)
>> t2 = x % y;
>> else
>> t3 = x % y;
>>
>> Assuming stmt is "t2 = x % y", it will walk uses of stmt and set
>> top_stmt to "t1 = x / y"
>> Then it will walk all uses of top_stmt:
>> "t2 = x % y" -> dominated by top_stmt
>> "t3 = x % y" -> dominated by top_stmt
>> Since all stmts are dominated by top_stmt, we add all three stmts to
>> vector of stmts and proceed with transform.
>>
>> For the case where, top_stmt dominates original stmt but not all stmts:
>>
>> if (cond)
>> t1 = x / y;
>> else
>> {
>> t2 = x % y;
>> return;
>> }
>>
>> t3 = x % y;
>>
>> Assuming stmt is "t3 = x % y",
>> Walking stmt uses will set top_stmt to "t1 = x / y";
>>
>> Walking immediate uses of top_stmt, we find that "t2 = x % y" is not
>> dominated by top_stmt,
>> and hence don't do the transform.
>>
>> Does this sound reasonable ?
>
> Yes, that's reasonable.
Thanks, I have attached patch that implements above approach.
Does it look OK ?
The patch does not still handle the following case:
int f(int x, int y)
{
extern int cond;
int q, r;
if (cond)
q = x % y;
else
q = x % y;
r = x % y;
return q + r;
}
In above case although the mod stmt is not dominated by either div
stmt, I suppose the transform
is still possible by inserting DIVMOD (x, y) before if-else ?
For the following test-case, I am surprised why CSE didn't take place before
widening_mul pass ?
int
f_1 (int x, int y)
{
int q = x / y;
int r1 = 0, r2 = 0;
if (cond)
r1 = x % y;
else
r2 = x % y;
return q + r1 + r2;
}
The input to widening_mul pass is:
f_1 (int x, int y)
{
int r2;
int r1;
int q;
int cond.0_1;
int _2;
int _11;
<bb 2>:
q_7 = x_5(D) / y_6(D);
cond.0_1 = cond;
if (cond.0_1 != 0)
goto <bb 3>;
else
goto <bb 4>;
<bb 3>:
r1_9 = x_5(D) % y_6(D);
goto <bb 5>;
<bb 4>:
r2_10 = x_5(D) % y_6(D);
<bb 5>:
# r1_3 = PHI <r1_9(3), 0(4)>
# r2_4 = PHI <0(3), r2_10(4)>
_2 = r1_3 + q_7;
_11 = _2 + r2_4;
return _11;
}
Thanks,
Prathamesh
>
> Richard.
>
>> Thanks,
>> Prathamesh
>> >
>> > Richard.
>> >
>>
>>
>
> --
> Richard Biener <[email protected]>
> SUSE LINUX GmbH, GF: Felix Imendoerffer, Jane Smithard, Graham Norton, HRB
> 21284 (AG Nuernberg)
diff --git a/gcc/doc/tm.texi b/gcc/doc/tm.texi
index 8c7f2a1..111f19f 100644
--- a/gcc/doc/tm.texi
+++ b/gcc/doc/tm.texi
@@ -6963,6 +6963,12 @@ This is firstly introduced on ARM/AArch64 targets,
please refer to
the hook implementation for how different fusion types are supported.
@end deftypefn
+@deftypefn {Target Hook} void TARGET_EXPAND_DIVMOD_LIBFUNC (bool
@var{unsignedp}, machine_mode @var{mode}, @var{rtx}, @var{rtx}, rtx
*@var{quot}, rtx *@var{rem})
+Define this hook if the port does not have hardware div and divmod insn for
+the given mode but has divmod libfunc, which is incompatible
+with libgcc2.c:__udivmoddi4
+@end deftypefn
+
@node Sections
@section Dividing the Output into Sections (Texts, Data, @dots{})
@c the above section title is WAY too long. maybe cut the part between
diff --git a/gcc/doc/tm.texi.in b/gcc/doc/tm.texi.in
index f963a58..2c9a800 100644
--- a/gcc/doc/tm.texi.in
+++ b/gcc/doc/tm.texi.in
@@ -4848,6 +4848,8 @@ them: try the first ones in this list first.
@hook TARGET_SCHED_FUSION_PRIORITY
+@hook TARGET_EXPAND_DIVMOD_LIBFUNC
+
@node Sections
@section Dividing the Output into Sections (Texts, Data, @dots{})
@c the above section title is WAY too long. maybe cut the part between
diff --git a/gcc/internal-fn.c b/gcc/internal-fn.c
index c867ddc..0cb59f7 100644
--- a/gcc/internal-fn.c
+++ b/gcc/internal-fn.c
@@ -2276,6 +2276,48 @@ multi_vector_optab_supported_p (convert_optab optab,
tree_pair types,
#define direct_mask_store_optab_supported_p direct_optab_supported_p
#define direct_store_lanes_optab_supported_p multi_vector_optab_supported_p
+/* Expand DIVMOD() using:
+ a) optab handler for udivmod/sdivmod if it is available.
+ b) If optab_handler doesn't exist, Generate call to
+ optab_libfunc for udivmod/sdivmod. */
+
+static void
+expand_DIVMOD (internal_fn, gcall *stmt)
+{
+ tree lhs = gimple_call_lhs (stmt);
+ tree arg0 = gimple_call_arg (stmt, 0);
+ tree arg1 = gimple_call_arg (stmt, 1);
+
+ gcc_assert (TREE_CODE (TREE_TYPE (lhs)) == COMPLEX_TYPE);
+ tree type = TREE_TYPE (TREE_TYPE (lhs));
+ machine_mode mode = TYPE_MODE (type);
+ bool unsignedp = TYPE_UNSIGNED (type);
+ optab tab = (unsignedp) ? udivmod_optab : sdivmod_optab;
+
+ rtx op0 = expand_normal (arg0);
+ rtx op1 = expand_normal (arg1);
+ rtx target = expand_expr (lhs, NULL_RTX, VOIDmode, EXPAND_WRITE);
+
+ rtx quotient, remainder;
+
+ /* Check if optab handler exists for [u]divmod. */
+ if (optab_handler (tab, mode) != CODE_FOR_nothing)
+ {
+ quotient = gen_reg_rtx (mode);
+ remainder = gen_reg_rtx (mode);
+ expand_twoval_binop (tab, op0, op1, quotient, remainder, unsignedp);
+ }
+ else
+ targetm.expand_divmod_libfunc (unsignedp, mode, op0, op1,
+ "ient, &remainder);
+
+ /* Wrap the return value (quotient, remainder) within COMPLEX_EXPR. */
+ expand_expr (build2 (COMPLEX_EXPR, TREE_TYPE (lhs),
+ make_tree (TREE_TYPE (arg0), quotient),
+ make_tree (TREE_TYPE (arg1), remainder)),
+ target, VOIDmode, EXPAND_NORMAL);
+}
+
/* Return true if FN is supported for the types in TYPES when the
optimization type is OPT_TYPE. The types are those associated with
the "type0" and "type1" fields of FN's direct_internal_fn_info
diff --git a/gcc/internal-fn.def b/gcc/internal-fn.def
index e729d85..56a80f1 100644
--- a/gcc/internal-fn.def
+++ b/gcc/internal-fn.def
@@ -194,6 +194,9 @@ DEF_INTERNAL_FN (ATOMIC_BIT_TEST_AND_SET, ECF_LEAF |
ECF_NOTHROW, NULL)
DEF_INTERNAL_FN (ATOMIC_BIT_TEST_AND_COMPLEMENT, ECF_LEAF | ECF_NOTHROW, NULL)
DEF_INTERNAL_FN (ATOMIC_BIT_TEST_AND_RESET, ECF_LEAF | ECF_NOTHROW, NULL)
+/* Divmod function. */
+DEF_INTERNAL_FN (DIVMOD, ECF_CONST | ECF_LEAF, NULL)
+
#undef DEF_INTERNAL_INT_FN
#undef DEF_INTERNAL_FLT_FN
#undef DEF_INTERNAL_OPTAB_FN
diff --git a/gcc/target.def b/gcc/target.def
index 6392e73..4496f9a 100644
--- a/gcc/target.def
+++ b/gcc/target.def
@@ -4948,6 +4948,16 @@ Normally, this is not needed.",
bool, (const_tree field, machine_mode mode),
default_member_type_forces_blk)
+/* See tree-ssa-math-opts.c:divmod_candidate_p for conditions that gate
+ the divmod transform. */
+DEFHOOK
+(expand_divmod_libfunc,
+ "Define this hook if the port does not have hardware div and divmod insn
for\n\
+the given mode but has divmod libfunc, which is incompatible\n\
+with libgcc2.c:__udivmoddi4",
+ void, (bool unsignedp, machine_mode mode, rtx, rtx, rtx *quot, rtx *rem),
+ default_expand_divmod_libfunc)
+
/* Return the class for a secondary reload, and fill in extra information. */
DEFHOOK
(secondary_reload,
diff --git a/gcc/targhooks.c b/gcc/targhooks.c
index 6b4601b..20327a6 100644
--- a/gcc/targhooks.c
+++ b/gcc/targhooks.c
@@ -1965,4 +1965,31 @@ default_optab_supported_p (int, machine_mode,
machine_mode, optimization_type)
return true;
}
+/* Generate call to
+ DImode __udivmoddi4 (DImode op0, DImode op1, DImode *rem). */
+
+void
+default_expand_divmod_libfunc (bool unsignedp, machine_mode mode,
+ rtx op0, rtx op1,
+ rtx *quot_p, rtx *rem_p)
+{
+ gcc_assert (mode == DImode);
+ gcc_assert (unsignedp);
+
+ rtx libfunc = optab_libfunc (udivmod_optab, DImode);
+ gcc_assert (libfunc);
+
+ rtx remainder = assign_stack_temp (DImode, GET_MODE_SIZE (DImode));
+ rtx address = XEXP (remainder, 0);
+
+ rtx quotient = emit_library_call_value (libfunc, NULL_RTX, LCT_CONST,
+ DImode, 3,
+ op0, GET_MODE (op0),
+ op1, GET_MODE (op1),
+ address, GET_MODE (address));
+
+ *quot_p = quotient;
+ *rem_p = remainder;
+}
+
#include "gt-targhooks.h"
diff --git a/gcc/targhooks.h b/gcc/targhooks.h
index 7687c39..dc5e8e7 100644
--- a/gcc/targhooks.h
+++ b/gcc/targhooks.h
@@ -254,4 +254,7 @@ extern void default_setup_incoming_vararg_bounds
(cumulative_args_t ca ATTRIBUTE
extern bool default_optab_supported_p (int, machine_mode, machine_mode,
optimization_type);
+extern void default_expand_divmod_libfunc (bool, machine_mode,
+ rtx, rtx, rtx *, rtx *);
+
#endif /* GCC_TARGHOOKS_H */
diff --git a/gcc/tree-ssa-math-opts.c b/gcc/tree-ssa-math-opts.c
index 81688cd..7b6d983 100644
--- a/gcc/tree-ssa-math-opts.c
+++ b/gcc/tree-ssa-math-opts.c
@@ -112,6 +112,9 @@ along with GCC; see the file COPYING3. If not see
#include "params.h"
#include "internal-fn.h"
#include "case-cfn-macros.h"
+#include "optabs-libfuncs.h"
+#include "tree-eh.h"
+#include "targhooks.h"
/* This structure represents one basic block that either computes a
division, or is a common dominator for basic block that compute a
@@ -184,6 +187,9 @@ static struct
/* Number of fp fused multiply-add ops inserted. */
int fmas_inserted;
+
+ /* Number of divmod calls inserted. */
+ int divmod_calls_inserted;
} widen_mul_stats;
/* The instance of "struct occurrence" representing the highest
@@ -3784,6 +3790,233 @@ match_uaddsub_overflow (gimple_stmt_iterator *gsi,
gimple *stmt,
return true;
}
+/* Return true if target has support for divmod. */
+
+static bool
+target_supports_divmod_p (optab divmod_optab, optab div_optab, machine_mode
mode)
+{
+ /* If target supports hardware divmod insn, use it for divmod. */
+ if (optab_handler (divmod_optab, mode) != CODE_FOR_nothing)
+ return true;
+
+ /* Check if libfunc for divmod is available. */
+ if (optab_libfunc (divmod_optab, mode) != NULL_RTX)
+ {
+ /* If optab_handler exists for div_optab, perhaps in a wider mode,
+ we don't want to use the libfunc even if it exists for given mode. */
+ for (machine_mode div_mode = mode;
+ div_mode != VOIDmode;
+ div_mode = GET_MODE_WIDER_MODE (div_mode))
+ if (optab_handler (div_optab, div_mode) != CODE_FOR_nothing)
+ return false;
+
+ return true;
+ }
+
+ return false;
+}
+
+/* Check if stmt is candidate for divmod transform. */
+
+static bool
+divmod_candidate_p (gassign *stmt)
+{
+ tree type = TREE_TYPE (gimple_assign_lhs (stmt));
+ enum machine_mode mode = TYPE_MODE (type);
+ optab divmod_optab, div_optab;
+
+ if (TYPE_UNSIGNED (type))
+ {
+ divmod_optab = udivmod_optab;
+ div_optab = udiv_optab;
+ }
+ else
+ {
+ divmod_optab = sdivmod_optab;
+ div_optab = sdiv_optab;
+ }
+
+ tree op1 = gimple_assign_rhs1 (stmt);
+ tree op2 = gimple_assign_rhs2 (stmt);
+
+ /* Disable the transform if either is a constant, since division-by-constant
+ may have specialized expansion. */
+ if (CONSTANT_CLASS_P (op1) || CONSTANT_CLASS_P (op2))
+ return false;
+
+ /* Exclude the case where TYPE_OVERFLOW_TRAPS (type) as that should
+ expand using the [su]divv optabs. */
+ if (TYPE_OVERFLOW_TRAPS (type))
+ return false;
+
+ if (!target_supports_divmod_p (divmod_optab, div_optab, mode))
+ return false;
+
+ return true;
+}
+
+/* This function looks for:
+ t1 = a TRUNC_DIV_EXPR b;
+ t2 = a TRUNC_MOD_EXPR b;
+ and transforms it to the following sequence:
+ complex_tmp = DIVMOD (a, b);
+ t1 = REALPART_EXPR(a);
+ t2 = IMAGPART_EXPR(b);
+ For conditions enabling the transform see divmod_candidate_p().
+
+ The pass works in two phases:
+ 1) Walk through all immediate uses of stmt's operand and find a
+ TRUNC_DIV_EXPR with matching operands and if such a stmt is found add
+ it to stmts vector.
+ 2) Insert DIVMOD call before first div/mod stmt in top_bb (basic block that
+ dominates other div/mod stmts with same operands) and update entries in
+ stmts vector to use return value of DIMOVD (REALEXPR_PART for div,
+ IMAGPART_EXPR for mod). */
+
+static bool
+convert_to_divmod (gassign *stmt)
+{
+ if (!divmod_candidate_p (stmt))
+ return false;
+
+ tree op1 = gimple_assign_rhs1 (stmt);
+ tree op2 = gimple_assign_rhs2 (stmt);
+
+ imm_use_iterator use_iter;
+ gimple *use_stmt;
+ auto_vec<gimple *> stmts;
+
+ gimple *top_stmt = NULL;
+ basic_block top_bb = NULL;
+
+ /* Try to set top_stmt to "topmost" stmt
+ with code TRUNC_DIV_EXPR/TRUNC_MOD_EXPR having same operands as stmt. */
+
+ FOR_EACH_IMM_USE_STMT (use_stmt, use_iter, op1)
+ {
+ if (is_gimple_assign (use_stmt)
+ && (gimple_assign_rhs_code (use_stmt) == TRUNC_DIV_EXPR
+ || gimple_assign_rhs_code (use_stmt) == TRUNC_MOD_EXPR)
+ && operand_equal_p (op1, gimple_assign_rhs1 (use_stmt), 0)
+ && operand_equal_p (op2, gimple_assign_rhs2 (use_stmt), 0))
+ {
+ if (stmt_can_throw_internal (use_stmt))
+ continue;
+
+ basic_block bb = gimple_bb (use_stmt);
+
+ if (top_bb == NULL)
+ {
+ top_stmt = stmt;
+ top_bb = bb;
+ }
+ else if (bb == top_bb && gimple_uid (use_stmt) < gimple_uid
(top_stmt))
+ top_stmt = use_stmt;
+ else if (dominated_by_p (CDI_DOMINATORS, top_bb, bb))
+ {
+ top_bb = bb;
+ top_stmt = use_stmt;
+ }
+ }
+ }
+
+ if (top_stmt == NULL)
+ return false;
+
+ bool div_seen = false;
+ bool mod_seen = false;
+
+ tree top_op1 = gimple_assign_rhs1 (top_stmt);
+ tree top_op2 = gimple_assign_rhs2 (top_stmt);
+
+ stmts.safe_push (top_stmt);
+ if (gimple_assign_rhs_code (top_stmt) == TRUNC_DIV_EXPR)
+ div_seen = true;
+ else if (gimple_assign_rhs_code (top_stmt) == TRUNC_MOD_EXPR)
+ mod_seen = true;
+
+ /* Ensure that gimple_bb (use_stmt) is dominated by top_bb. */
+
+ FOR_EACH_IMM_USE_STMT (use_stmt, use_iter, top_op1)
+ {
+ if (is_gimple_assign (use_stmt)
+ && (gimple_assign_rhs_code (use_stmt) == TRUNC_DIV_EXPR
+ || gimple_assign_rhs_code (use_stmt) == TRUNC_MOD_EXPR)
+ && operand_equal_p (top_op1, gimple_assign_rhs1 (use_stmt), 0)
+ && operand_equal_p (top_op2, gimple_assign_rhs2 (use_stmt), 0))
+ {
+ if (use_stmt == top_stmt)
+ continue;
+
+ if (stmt_can_throw_internal (use_stmt))
+ continue;
+
+ if (!dominated_by_p (CDI_DOMINATORS, gimple_bb (use_stmt), top_bb))
+ {
+ end_imm_use_stmt_traverse (&use_iter);
+ return false;
+ }
+
+ stmts.safe_push (use_stmt);
+ if (gimple_assign_rhs_code (use_stmt) == TRUNC_DIV_EXPR)
+ div_seen = true;
+ else if (gimple_assign_rhs_code (use_stmt) == TRUNC_MOD_EXPR)
+ mod_seen = true;
+ }
+ }
+
+ if (!(div_seen && mod_seen))
+ return false;
+
+ /* Create libcall to internal fn DIVMOD:
+ divmod_tmp = DIVMOD (op1, op2). */
+
+ gcall *call_stmt = gimple_build_call_internal (IFN_DIVMOD, 2, op1, op2);
+ tree res = make_temp_ssa_name (
+ build_complex_type (TREE_TYPE (op1)),
+ call_stmt, "divmod_tmp");
+ gimple_call_set_lhs (call_stmt, res);
+
+ /* Insert the call before top_stmt. */
+ gimple_stmt_iterator top_stmt_gsi = gsi_for_stmt (top_stmt);
+ gsi_insert_before (&top_stmt_gsi, call_stmt, GSI_SAME_STMT);
+
+ widen_mul_stats.divmod_calls_inserted++;
+
+ /* Update all statements in stmts.
+ if stmt is lhs = op1 TRUNC_DIV_EXPR op2, change to lhs =
REALPART_EXPR<divmod_tmp>
+ if stmt is lhs = op1 TRUNC_MOD_EXPR op2, change to lhs =
IMAGPART_EXPR<divmod_tmp>. */
+
+ bool cfg_changed = false;
+ for (unsigned i = 0; stmts.iterate (i, &use_stmt); ++i)
+ {
+ tree new_rhs;
+
+ switch (gimple_assign_rhs_code (use_stmt))
+ {
+ case TRUNC_DIV_EXPR:
+ new_rhs = fold_build1 (REALPART_EXPR, TREE_TYPE (op1), res);
+ break;
+
+ case TRUNC_MOD_EXPR:
+ new_rhs = fold_build1 (IMAGPART_EXPR, TREE_TYPE (op2), res);
+ break;
+
+ default:
+ gcc_unreachable ();
+ }
+
+ gimple_stmt_iterator gsi = gsi_for_stmt (use_stmt);
+ gimple_assign_set_rhs_from_tree (&gsi, new_rhs);
+ update_stmt (use_stmt);
+
+ if (maybe_clean_or_replace_eh_stmt (use_stmt, use_stmt))
+ cfg_changed = true;
+ }
+
+ stmts.release ();
+ return cfg_changed;
+}
/* Find integer multiplications where the operands are extended from
smaller types, and replace the MULT_EXPR with a WIDEN_MULT_EXPR
@@ -3828,6 +4061,8 @@ pass_optimize_widening_mul::execute (function *fun)
bool cfg_changed = false;
memset (&widen_mul_stats, 0, sizeof (widen_mul_stats));
+ calculate_dominance_info (CDI_DOMINATORS);
+ renumber_gimple_stmt_uids ();
FOR_EACH_BB_FN (bb, fun)
{
@@ -3861,6 +4096,10 @@ pass_optimize_widening_mul::execute (function *fun)
match_uaddsub_overflow (&gsi, stmt, code);
break;
+ case TRUNC_MOD_EXPR:
+ cfg_changed = convert_to_divmod (as_a<gassign *> (stmt));
+ break;
+
default:;
}
}
@@ -3907,6 +4146,8 @@ pass_optimize_widening_mul::execute (function *fun)
widen_mul_stats.maccs_inserted);
statistics_counter_event (fun, "fused multiply-adds inserted",
widen_mul_stats.fmas_inserted);
+ statistics_counter_event (fun, "divmod calls inserted",
+ widen_mul_stats.divmod_calls_inserted);
return cfg_changed ? TODO_cleanup_cfg : 0;
}