On Tue, Sep 18, 2012 at 11:32 AM, Bin Cheng <bin.ch...@arm.com> wrote: > Ping.
I already approved your original patch upthread. Richard. >> -----Original Message----- >> From: gcc-patches-ow...@gcc.gnu.org [mailto:gcc-patches-ow...@gcc.gnu.org] > On >> Behalf Of Bin.Cheng >> Sent: Tuesday, September 04, 2012 11:20 PM >> To: Richard Guenther; gcc-patches@gcc.gnu.org >> Cc: Richard Earnshaw >> Subject: Re: Ping^2: [PATCH]Remove duplicate check on BRANCH_COST in fold- >> const.c >> >> Sorry, I mis-sent this offline. >> >> On Tue, Sep 4, 2012 at 11:00 PM, Bin.Cheng <amker.ch...@gmail.com> wrote: >> >>> >> >>> It's not ok (I btw fail to see the patch in this thread). The >> >>> current way LOGICAL_OP_NON_SHORT_CIRCUIT is implemented/used should >> >>> instead be changed to always match the pattern >> >>> >> >>> LOGICAL_OP_NON_SHORT_CIRCUIT >> >>> && (BRANCH_COST (optimize_function_for_speed_p (cfun), >> >>> false) >= 2) >> >>> >> >>> and the default value of LOGICAL_OP_NON_SHORT_CIRCUIT should be 1, >> >>> defined in defaults.h (and the docs updated). >> >>> >> >> >> >> That's not going to work for modern ARM cores. We want to set >> >> BRANCH_COST to 1 but still have it generate the non-short-circuit >> >> code (because conditional compares are really cheap. >> >> >> > >> > Hi Richard, >> > For now, LOGICAL_OP_NON_SHORT_CIRCUIT macro is defined as below, which >> > is duplicate of the BRANCH_COST condition. >> > >> > #ifndef LOGICAL_OP_NON_SHORT_CIRCUIT >> > #define LOGICAL_OP_NON_SHORT_CIRCUIT \ >> > (BRANCH_COST (optimize_function_for_speed_p (cfun), \ >> > false) >= 2) >> > #endif >> > >> > Recently we measured performance on some ARM processors and found it >> > would be better to have non-short-circuit optimization while setting >> > BRANCH_COST to 1, which is impossible with present codes. So here >> > comes this patch as below: >> > >> > Index: gcc/fold-const.c >> > =================================================================== >> > --- gcc/fold-const.c (revision 189835) >> > +++ gcc/fold-const.c (working copy) >> > @@ -8443,9 +8443,7 @@ >> > if ((tem = fold_truth_andor_1 (loc, code, type, arg0, arg1)) != 0) >> > return tem; >> > >> > - if ((BRANCH_COST (optimize_function_for_speed_p (cfun), >> > - false) >= 2) >> > - && LOGICAL_OP_NON_SHORT_CIRCUIT >> > + if (LOGICAL_OP_NON_SHORT_CIRCUIT >> > && (code == TRUTH_AND_EXPR >> > || code == TRUTH_ANDIF_EXPR >> > || code == TRUTH_OR_EXPR >> > >> > The purpose is to remove the duplicate check on BRANCH_COST. >> > >> > As Andrew pointed out that the patch may change behavior if some >> > back-ends define the macro independent of BRANCH_COST. After looking >> > into the code, there are two uses of the macro in fold-const.c, each >> > controls one kind code transformation. The first use is: >> > >> > else if (LOGICAL_OP_NON_SHORT_CIRCUIT >> > && lhs != 0 && rhs != 0 >> > && (code == TRUTH_ANDIF_EXPR >> > || code == TRUTH_ORIF_EXPR) >> > && operand_equal_p (lhs, rhs, 0)) >> > >> > The second one is: >> > >> > if ((BRANCH_COST (optimize_function_for_speed_p (cfun), >> > false) >= 2) >> > && LOGICAL_OP_NON_SHORT_CIRCUIT >> > && (code == TRUTH_AND_EXPR >> > || code == TRUTH_ANDIF_EXPR >> > || code == TRUTH_OR_EXPR >> > || code == TRUTH_ORIF_EXPR)) >> > >> > I am not sure why the 2nd condition is designed in current way and >> > haven't found any useful changelog on it. >> > But considering back end can factor BRANCH_COST in >> > LOGICAL_OP_NON_SHORT_CIRCUIT or not, we can conclude that the behavior >> > will only be changed if some back-end want to control the two >> > transformations differently. So the problem becomes whether the 2nd >> > condition should be changed. Either way there is scenario cannot be >> > covered. >> > >> >> And for now, >> FTR, only two targets redefine L_O_N_S_C: mips and rs6000. Both set it to >> zero so won't be affected by this change. >> > > Hi Richard, > I have tried to explain the change, but I am not sure whether it is agreed > or... > > Thanks very much. > > >