On 09/06/17 14:51, Richard Earnshaw (lists) wrote: > On 06/09/17 13:44, Bernd Edlinger wrote: >> On 09/04/17 21:54, Bernd Edlinger wrote: >>> Hi Kyrill, >>> >>> Thanks for your review! >>> >>> >>> On 09/04/17 15:55, Kyrill Tkachov wrote: >>>> Hi Bernd, >>>> >>>> On 18/01/17 15:36, Bernd Edlinger wrote: >>>>> On 01/13/17 19:28, Bernd Edlinger wrote: >>>>>> On 01/13/17 17:10, Bernd Edlinger wrote: >>>>>>> On 01/13/17 14:50, Richard Earnshaw (lists) wrote: >>>>>>>> On 18/12/16 12:58, Bernd Edlinger wrote: >>>>>>>>> Hi, >>>>>>>>> >>>>>>>>> this is related to PR77308, the follow-up patch will depend on this >>>>>>>>> one. >>>>>>>>> >>>>>>>>> When trying the split the *arm_cmpdi_insn and *arm_cmpdi_unsigned >>>>>>>>> before reload, a mis-compilation in libgcc function >>>>>>>>> __gnu_satfractdasq >>>>>>>>> was discovered, see [1] for more details. >>>>>>>>> >>>>>>>>> The reason seems to be that when the *arm_cmpdi_insn is directly >>>>>>>>> followed by a *arm_cmpdi_unsigned instruction, both are split >>>>>>>>> up into this: >>>>>>>>> >>>>>>>>> [(set (reg:CC CC_REGNUM) >>>>>>>>> (compare:CC (match_dup 0) (match_dup 1))) >>>>>>>>> (parallel [(set (reg:CC CC_REGNUM) >>>>>>>>> (compare:CC (match_dup 3) (match_dup 4))) >>>>>>>>> (set (match_dup 2) >>>>>>>>> (minus:SI (match_dup 5) >>>>>>>>> (ltu:SI (reg:CC_C CC_REGNUM) >>>>>>>>> (const_int >>>>>>>>> 0))))])] >>>>>>>>> >>>>>>>>> [(set (reg:CC CC_REGNUM) >>>>>>>>> (compare:CC (match_dup 2) (match_dup 3))) >>>>>>>>> (cond_exec (eq:SI (reg:CC CC_REGNUM) (const_int 0)) >>>>>>>>> (set (reg:CC CC_REGNUM) >>>>>>>>> (compare:CC (match_dup 0) (match_dup 1))))] >>>>>>>>> >>>>>>>>> The problem is that the reg:CC from the *subsi3_carryin_compare >>>>>>>>> is not mentioning that the reg:CC is also dependent on the reg:CC >>>>>>>>> from before. Therefore the *arm_cmpsi_insn appears to be >>>>>>>>> redundant and thus got removed, because the data values are >>>>>>>>> identical. >>>>>>>>> >>>>>>>>> I think that applies to a number of similar pattern where data >>>>>>>>> flow is happening through the CC reg. >>>>>>>>> >>>>>>>>> So this is a kind of correctness issue, and should be fixed >>>>>>>>> independently from the optimization issue PR77308. >>>>>>>>> >>>>>>>>> Therefore I think the patterns need to specify the true >>>>>>>>> value that will be in the CC reg, in order for cse to >>>>>>>>> know what the instructions are really doing. >>>>>>>>> >>>>>>>>> >>>>>>>>> Bootstrapped and reg-tested on arm-linux-gnueabihf. >>>>>>>>> Is it OK for trunk? >>>>>>>>> >>>>>>>> I agree you've found a valid problem here, but I have some issues >>>>>>>> with >>>>>>>> the patch itself. >>>>>>>> >>>>>>>> >>>>>>>> (define_insn_and_split "subdi3_compare1" >>>>>>>> [(set (reg:CC_NCV CC_REGNUM) >>>>>>>> (compare:CC_NCV >>>>>>>> (match_operand:DI 1 "register_operand" "r") >>>>>>>> (match_operand:DI 2 "register_operand" "r"))) >>>>>>>> (set (match_operand:DI 0 "register_operand" "=&r") >>>>>>>> (minus:DI (match_dup 1) (match_dup 2)))] >>>>>>>> "TARGET_32BIT" >>>>>>>> "#" >>>>>>>> "&& reload_completed" >>>>>>>> [(parallel [(set (reg:CC CC_REGNUM) >>>>>>>> (compare:CC (match_dup 1) (match_dup 2))) >>>>>>>> (set (match_dup 0) (minus:SI (match_dup 1) (match_dup >>>>>>>> 2)))]) >>>>>>>> (parallel [(set (reg:CC_C CC_REGNUM) >>>>>>>> (compare:CC_C >>>>>>>> (zero_extend:DI (match_dup 4)) >>>>>>>> (plus:DI (zero_extend:DI (match_dup 5)) >>>>>>>> (ltu:DI (reg:CC_C CC_REGNUM) (const_int 0))))) >>>>>>>> (set (match_dup 3) >>>>>>>> (minus:SI (minus:SI (match_dup 4) (match_dup 5)) >>>>>>>> (ltu:SI (reg:CC_C CC_REGNUM) (const_int 0))))])] >>>>>>>> >>>>>>>> >>>>>>>> This pattern is now no-longer self consistent in that before the >>>>>>>> split >>>>>>>> the overall result for the condition register is in mode CC_NCV, but >>>>>>>> afterwards it is just CC_C. >>>>>>>> >>>>>>>> I think CC_NCV is correct mode (the N, C and V bits all correctly >>>>>>>> reflect the result of the 64-bit comparison), but that then >>>>>>>> implies that >>>>>>>> the cc mode of subsi3_carryin_compare is incorrect as well and >>>>>>>> should in >>>>>>>> fact also be CC_NCV. Thinking about this pattern, I'm inclined to >>>>>>>> agree >>>>>>>> that CC_NCV is the correct mode for this operation >>>>>>>> >>>>>>>> I'm not sure if there are other consequences that will fall out from >>>>>>>> fixing this (it's possible that we might need a change to >>>>>>>> select_cc_mode >>>>>>>> as well). >>>>>>>> >>>>>>> Yes, this is still a bit awkward... >>>>>>> >>>>>>> The N and V bit will be the correct result for the subdi3_compare1 >>>>>>> a 64-bit comparison, but zero_extend:DI (match_dup 4) (plus:DI ...) >>>>>>> only gets the C bit correct, the expression for N and V is a different >>>>>>> one. >>>>>>> >>>>>>> It probably works, because the subsi3_carryin_compare instruction sets >>>>>>> more CC bits than the pattern does explicitly specify the value. >>>>>>> We know the subsi3_carryin_compare also computes the NV bits, but >>>>>>> it is >>>>>>> hard to write down the correct rtl expression for it. >>>>>>> >>>>>>> In theory the pattern should describe everything correctly, >>>>>>> maybe, like: >>>>>>> >>>>>>> set (reg:CC_C CC_REGNUM) >>>>>>> (compare:CC_C >>>>>>> (zero_extend:DI (match_dup 4)) >>>>>>> (plus:DI (zero_extend:DI (match_dup 5)) >>>>>>> (ltu:DI (reg:CC_C CC_REGNUM) (const_int 0))))) >>>>>>> set (reg:CC_NV CC_REGNUM) >>>>>>> (compare:CC_NV >>>>>>> (match_dup 4)) >>>>>>> (plus:SI (match_dup 5) (ltu:SI (reg:CC_C CC_REGNUM) >>>>>>> (const_int 0))) >>>>>>> set (match_dup 3) >>>>>>> (minus:SI (minus:SI (match_dup 4) (match_dup 5)) >>>>>>> (ltu:SI (reg:CC_C CC_REGNUM) (const_int 0))))) >>>>>>> >>>>>>> >>>>>>> But I doubt that will work to set CC_REGNUM with two different modes >>>>>>> in parallel? >>>>>>> >>>>>>> Another idea would be to invent a CC_CNV_NOOV mode, that implicitly >>>>>>> defines C from the DImode result, and NV from the SImode result, >>>>>>> similar to the CC_NOOVmode, that also leaves something open what >>>>>>> bits it really defines? >>>>>>> >>>>>>> >>>>>>> What do you think? >>>>>>> >>>>>>> >>>>>>> Thanks >>>>>>> Bernd. >>>>>> I think maybe the right solution is to invent a new CCmode >>>>>> that defines C as if the comparison is done in DImode >>>>>> but N and V as if the comparison is done in SImode. >>>>>> >>>>>> I thought maybe I would call it CC_NCV_CIC (CIC = Carry-In-Compare), >>>>>> furthermore I think the CC_NOOV should be renamed to CC_NZ (because >>>>>> only N and Z are set correctly), but in a different patch of course. >>>>>> >>>>>> Attached is a new version that implements the new CCmode. >>>>>> >>>>>> How do you like this new version? >>>>>> >>>>>> It seems to be able to build a cross-compiler at least. >>>>>> >>>>>> I will start a new bootstrap with this new patch, but that can take >>>>>> some >>>>>> time until I have definitive results. >>>>>> >>>>>> Is there still a chance that it can go into gcc-7 or should it wait >>>>>> for the next stage1? >>>>>> >>>>>> Thanks >>>>>> Bernd. >>>>> >>>>> I thought I should also look at where the subdi_compare1 amd the >>>>> negdi2_compare patterns are used, and look if the caller is fine with >>>>> not having all CC bits available. >>>>> >>>>> And indeed usubv<mode>4 turns out to be questionabe, because it >>>>> emits gen_sub<mode>3_compare1 and uses arm_gen_unlikely_cbranch (LTU, >>>>> CCmode) which is inconsistent when subdi3_compare1 no longer uses >>>>> CCmode. >>>>> >>>>> To correct this, the branch should use CC_Cmode which is always defined. >>>>> >>>>> So I tried to test this pattern, with the following test programs, >>>>> and found that the code actually improves when the branch uses CC_Cmode >>>>> instead of CCmode, both for SImode and DImode data, which was a bit >>>>> surprising. >>>>> >>>>> I used this test program to see how the usubv<mode>4 pattern works: >>>>> >>>>> cat test.c (DImode) >>>>> unsigned long long x, y, z; >>>>> int b; >>>>> void test() >>>>> { >>>>> b = __builtin_sub_overflow (y,z, &x); >>>>> } >>>>> >>>>> >>>>> unpatched code used 8 byte more stack than patched, >>>>> because the DImode subtraction is effectively done twice. >>>>> >>>>> cat test1.c (SImode) >>>>> unsigned long x, y, z; >>>>> int b; >>>>> void test() >>>>> { >>>>> b = __builtin_sub_overflow (y,z, &x); >>>>> } >>>>> >>>>> which generates (unpatched): >>>>> cmp r3, r0 >>>>> sub ip, r3, r0 >>>>> >>>>> instead of expected (patched): >>>>> subs r3, r3, r2 >>>>> >>>>> >>>>> The condition is extracted by ifconversion and/or combine >>>>> and complicates the resulting code instead of simplifying. >>>>> >>>>> I think this happens only when the branch and the subsi/di3_compare1 >>>>> is using the same CC mode. >>>>> >>>>> That does not happen when the CC modes disagree, as with the >>>>> proposed patch. All other uses of the pattern are already using >>>>> CC_Cmode or CC_Vmode in the branch, and these do not change. >>>>> >>>>> Attached is an updated version of the patch, that happens to >>>>> improve the code generation of the usubsi4 and usubdi4 pattern, >>>>> as a side effect. >>>>> >>>>> >>>>> Bootstrapped and reg-tested on arm-linux-gnueabihf. >>>>> Is it OK for trunk? >>>> >>>> I'm very sorry it has taken so long to review. >>>> I've been ramping up on the context recently now so I'll try to move >>>> this along... >>>> >>>> This patch looks mostly ok to me from reading the patterns and the >>>> discussion around it. >>>> I have one concern: >>>> >>>> >>>> (define_insn_and_split "negdi2_compare" >>>> - [(set (reg:CC CC_REGNUM) >>>> - (compare:CC >>>> + [(set (reg:CC_NCV CC_REGNUM) >>>> + (compare:CC_NCV >>>> (const_int 0) >>>> (match_operand:DI 1 "register_operand" "0,r"))) >>>> (set (match_operand:DI 0 "register_operand" "=r,&r") >>>> @@ -4647,8 +4650,12 @@ >>>> (compare:CC (const_int 0) (match_dup 1))) >>>> (set (match_dup 0) (minus:SI (const_int 0) >>>> (match_dup 1)))]) >>>> - (parallel [(set (reg:CC CC_REGNUM) >>>> - (compare:CC (const_int 0) (match_dup 3))) >>>> + (parallel [(set (reg:CC_NCV_CIC CC_REGNUM) >>>> + (compare:CC_NCV_CIC >>>> + (const_int 0) >>>> + (plus:DI >>>> + (zero_extend:DI (match_dup 3)) >>>> + (ltu:DI (reg:CC_C CC_REGNUM) (const_int 0))))) >>>> (set (match_dup 2) >>>> (minus:SI >>>> (minus:SI (const_int 0) (match_dup 3)) >>>> >>>> >>>> I was somewhat concerned with having the first operand of the COMPARE >>>> being a const_int 0 and the second being >>>> a complex expression as the RTL canonicalization rules usually require >>>> the complex operand going first if possible. >>>> Reading the RTL rules in rtl.texi I see it says this: >>>> "If one of the operands is a constant, it should be placed in the >>>> second operand and the comparison code adjusted as appropriate." >>>> So it seems that the pre-existing pattern that puts const_int 0 as the >>>> first operand already breaks that rule. >>>> I think we should fix that and update the use of condition code to a >>>> GEU rather than LTU as well. >>>> >>> >> >> Well, the sentence before that one is even more explicit: >> >> "Normally, @var{x} and @var{y} must have the same mode. Otherwise, >> @code{compare} is valid only if the mode of @var{x} is in class >> @code{MODE_INT} and @var{y} is a @code{const_int} or >> @code{const_double} with mode @code{VOIDmode}." >> >> So because the const_int 0 has VOIDmode the comparison is done >> in y-mode not x-mode. >> >> But unfortunately I see no way how to accomplish this, >> because this assumes that the compare can be easily swapped >> if the conditional instruction just uses one of GT/GE/LE/LT >> or GTU/GEU/LEU/LTU. But that is only the case for plain CCmode. >> >> And in this example we ask for "overflow", but while 0-X can >> overflow X-0 simply can't. And moreover there are non-symmetric >> modes like CC_NCVmode which only support LT/GE/LTU/GEU but not >> the swapped conditions GT/LE/GTU/LEU. >> >> I think the only solution would be to adjust the spec to >> reflect the implementation: >> >> Index: rtl.texi >> =================================================================== >> --- rtl.texi (revision 251752) >> +++ rtl.texi (working copy) >> @@ -2252,6 +2252,13 @@ >> If one of the operands is a constant, it should be placed in the >> second operand and the comparison code adjusted as appropriate. >> >> +There may be exceptions from this rule if the mode @var{m} carries >> +not enough information for the swapped comparison operator, or > > There may be exceptions _to_ ... if mode @var{m} does not carry enough... > >> +if we ask for overflow from the subtraction. > > Aren't we really trying to 'detect overflow' rather than 'ask' for it? > >> That means, while >> +0-X may overfow X-0 can never overflow. Under these conditions >> +a compare may have the constant expression at the left side. > > In these circumstances the constant will be in the first operand . > > (left and right don't really make sense for RTL). >> +Examples are the ARM negdi2_compare pattern and similar. >> + >> A @code{compare} specifying two @code{VOIDmode} constants is not valid >> since there is no way to know in what mode the comparison is to be >> performed; the comparison must either be folded during the compilation >> >> >> >> Please advise. >> >> Thanks >> Bernd. >> >> >>> >>> Hmmm... >>> >>> I think the compare is not a commutative operation, and swapping >>> the arguments will imply a different value in the flags. >>> >>> So if I write >>> (set (reg:CC_NCV CC_REGNUM) >>> (compare:CC_NCV >>> (const_int 0) >>> (reg:DI 123))) >>> >>> I have C,N,V set to the result of (0 - r123), C = usable for LTU or GEU, >>> N,V = usable for LT, GE >>> >>> But if I write >>> (set (reg:CC_NCV CC_REGNUM) >>> (compare:CC_NCV >>> (reg:DI 123) >>> (const_int 0))) >>> >>> I have C,N,V set to the result of (r123 - 0), but the expansion stays >>> the same and the actual value in the flags is defined by the expansion. >>> Of course there exists probably no matching expansion for that. >>> >>> Note that both LTU in the above hunk are in a parallel-stmt and operate >>> on the flags from the previous pattern, so changing these to GEU >>> will probably be wrong. >>> >>> Both (ltu:SI (reg:CC_C CC_REGNUM) (const_int 0)) in the negdi2_compare >>> use the flags from the previous (set (reg:CC CC_REGNUM) (compare:CC >>> (const_int 0) (match_dup 1)). >>> >>> One use of the resulting flags (I know of) is in negvdi3 where we >>> have: >>> >>> emit_insn (gen_negdi2_compare (operands[0], operands[1])); >>> arm_gen_unlikely_cbranch (NE, CC_Vmode, operands[2]); >>> >>> I think only 0-x can overflow while x-0 can never overflow. >>> >>> Of course the CC_NCV_CIC mode bends the definition of the RTL compare >>> a lot and I guess if this pattern is created by a splitter, this can >>> only be expanded by an exactly matching pattern, there is (hopefully) >>> no way how combine could mess with this pattern due to the exotic >>> CCmode. So while I think it would work to swap only the notation of >>> all CC_NCV_CIC patterns, _without_ changing the assembler-parts and the >>> consuming statements, that would make it quite hard to follow for the >>> human reader at least. >>> >>> What do you think? >>> >>> >>> Bernd. >
Attached is the patch with an update to the rtl.texi documentation. The code does not change, so I did no new bootstrap. Is it OK for trunk? Thanks Bernd.
2017-09-06 Bernd Edlinger <bernd.edlin...@hotmail.de> PR target/77308 * doc/rtl.texi: Update documentation. * config/arm/arm-modes.def (CC_NCV_CIC): New mode. * config/arm/arm.md (adddi3_compareV, *addsi3_compareV_upper, adddi3_compareC, *addsi3_compareC_upper, subdi3_compare1, subsi3_carryin_compare, subsi3_carryin_compare_const, negdi2_compare, *negsi2_carryin_compare, *arm_cmpdi_insn): Fix the CC reg dataflow. (usubv<mode>4): Use CC_Cmode for the branch. Index: gcc/config/arm/arm-modes.def =================================================================== --- gcc/config/arm/arm-modes.def (revision 244439) +++ gcc/config/arm/arm-modes.def (working copy) @@ -38,6 +38,8 @@ (used for DImode unsigned comparisons). CC_NCVmode should be used if only the N, C, and V flags are correct (used for DImode signed comparisons). + CC_NCV_CICmode defines N and V in SImode and C in DImode + (used for carryin_compare patterns). CCmode should be used otherwise. */ CC_MODE (CC_NOOV); @@ -44,6 +46,7 @@ CC_MODE (CC_Z); CC_MODE (CC_CZ); CC_MODE (CC_NCV); +CC_MODE (CC_NCV_CIC); CC_MODE (CC_SWP); CC_MODE (CCFP); CC_MODE (CCFPE); Index: gcc/config/arm/arm.md =================================================================== --- gcc/config/arm/arm.md (revision 244439) +++ gcc/config/arm/arm.md (working copy) @@ -669,17 +669,15 @@ (set (match_dup 0) (plus:SI (match_dup 1) (match_dup 2)))]) (parallel [(set (reg:CC_V CC_REGNUM) (ne:CC_V - (plus:DI (plus:DI - (sign_extend:DI (match_dup 4)) - (sign_extend:DI (match_dup 5))) - (ltu:DI (reg:CC_C CC_REGNUM) (const_int 0))) - (plus:DI (sign_extend:DI - (plus:SI (match_dup 4) (match_dup 5))) - (ltu:DI (reg:CC_C CC_REGNUM) (const_int 0))))) - (set (match_dup 3) (plus:SI (plus:SI - (match_dup 4) (match_dup 5)) - (ltu:SI (reg:CC_C CC_REGNUM) - (const_int 0))))])] + (plus:DI (plus:DI (sign_extend:DI (match_dup 4)) + (sign_extend:DI (match_dup 5))) + (ltu:DI (reg:CC_C CC_REGNUM) (const_int 0))) + (sign_extend:DI + (plus:SI (plus:SI (match_dup 4) (match_dup 5)) + (ltu:SI (reg:CC_C CC_REGNUM) (const_int 0)))))) + (set (match_dup 3) (plus:SI (plus:SI (match_dup 4) (match_dup 5)) + (ltu:SI (reg:CC_C CC_REGNUM) + (const_int 0))))])] " { operands[3] = gen_highpart (SImode, operands[0]); @@ -713,13 +711,13 @@ [(set (reg:CC_V CC_REGNUM) (ne:CC_V (plus:DI - (plus:DI - (sign_extend:DI (match_operand:SI 1 "register_operand" "r")) - (sign_extend:DI (match_operand:SI 2 "register_operand" "r"))) - (ltu:DI (reg:CC_C CC_REGNUM) (const_int 0))) - (plus:DI (sign_extend:DI - (plus:SI (match_dup 1) (match_dup 2))) - (ltu:DI (reg:CC_C CC_REGNUM) (const_int 0))))) + (plus:DI + (sign_extend:DI (match_operand:SI 1 "register_operand" "r")) + (sign_extend:DI (match_operand:SI 2 "register_operand" "r"))) + (ltu:DI (reg:CC_C CC_REGNUM) (const_int 0))) + (sign_extend:DI (plus:SI (plus:SI (match_dup 1) (match_dup 2)) + (ltu:SI (reg:CC_C CC_REGNUM) + (const_int 0)))))) (set (match_operand:SI 0 "register_operand" "=r") (plus:SI (plus:SI (match_dup 1) (match_dup 2)) @@ -748,17 +746,15 @@ (set (match_dup 0) (plus:SI (match_dup 1) (match_dup 2)))]) (parallel [(set (reg:CC_C CC_REGNUM) (ne:CC_C - (plus:DI (plus:DI - (zero_extend:DI (match_dup 4)) - (zero_extend:DI (match_dup 5))) - (ltu:DI (reg:CC_C CC_REGNUM) (const_int 0))) - (plus:DI (zero_extend:DI - (plus:SI (match_dup 4) (match_dup 5))) - (ltu:DI (reg:CC_C CC_REGNUM) (const_int 0))))) - (set (match_dup 3) (plus:SI - (plus:SI (match_dup 4) (match_dup 5)) - (ltu:SI (reg:CC_C CC_REGNUM) - (const_int 0))))])] + (plus:DI (plus:DI (zero_extend:DI (match_dup 4)) + (zero_extend:DI (match_dup 5))) + (ltu:DI (reg:CC_C CC_REGNUM) (const_int 0))) + (zero_extend:DI + (plus:SI (plus:SI (match_dup 4) (match_dup 5)) + (ltu:SI (reg:CC_C CC_REGNUM) (const_int 0)))))) + (set (match_dup 3) (plus:SI (plus:SI (match_dup 4) (match_dup 5)) + (ltu:SI (reg:CC_C CC_REGNUM) + (const_int 0))))])] " { operands[3] = gen_highpart (SImode, operands[0]); @@ -777,17 +773,16 @@ [(set (reg:CC_C CC_REGNUM) (ne:CC_C (plus:DI - (plus:DI - (zero_extend:DI (match_operand:SI 1 "register_operand" "r")) - (zero_extend:DI (match_operand:SI 2 "register_operand" "r"))) - (ltu:DI (reg:CC_C CC_REGNUM) (const_int 0))) - (plus:DI (zero_extend:DI - (plus:SI (match_dup 1) (match_dup 2))) - (ltu:DI (reg:CC_C CC_REGNUM) (const_int 0))))) + (plus:DI + (zero_extend:DI (match_operand:SI 1 "register_operand" "r")) + (zero_extend:DI (match_operand:SI 2 "register_operand" "r"))) + (ltu:DI (reg:CC_C CC_REGNUM) (const_int 0))) + (zero_extend:DI + (plus:SI (plus:SI (match_dup 1) (match_dup 2)) + (ltu:SI (reg:CC_C CC_REGNUM) (const_int 0)))))) (set (match_operand:SI 0 "register_operand" "=r") - (plus:SI - (plus:SI (match_dup 1) (match_dup 2)) - (ltu:SI (reg:CC_C CC_REGNUM) (const_int 0))))] + (plus:SI (plus:SI (match_dup 1) (match_dup 2)) + (ltu:SI (reg:CC_C CC_REGNUM) (const_int 0))))] "TARGET_32BIT" "adcs%?\\t%0, %1, %2" [(set_attr "conds" "set") @@ -1080,14 +1075,14 @@ "TARGET_32BIT" { emit_insn (gen_sub<mode>3_compare1 (operands[0], operands[1], operands[2])); - arm_gen_unlikely_cbranch (LTU, CCmode, operands[3]); + arm_gen_unlikely_cbranch (EQ, CC_Cmode, operands[3]); DONE; }) (define_insn_and_split "subdi3_compare1" - [(set (reg:CC CC_REGNUM) - (compare:CC + [(set (reg:CC_NCV CC_REGNUM) + (compare:CC_NCV (match_operand:DI 1 "register_operand" "r") (match_operand:DI 2 "register_operand" "r"))) (set (match_operand:DI 0 "register_operand" "=&r") @@ -1098,10 +1093,14 @@ [(parallel [(set (reg:CC CC_REGNUM) (compare:CC (match_dup 1) (match_dup 2))) (set (match_dup 0) (minus:SI (match_dup 1) (match_dup 2)))]) - (parallel [(set (reg:CC CC_REGNUM) - (compare:CC (match_dup 4) (match_dup 5))) - (set (match_dup 3) (minus:SI (minus:SI (match_dup 4) (match_dup 5)) - (ltu:SI (reg:CC_C CC_REGNUM) (const_int 0))))])] + (parallel [(set (reg:CC_NCV_CIC CC_REGNUM) + (compare:CC_NCV_CIC + (zero_extend:DI (match_dup 4)) + (plus:DI (zero_extend:DI (match_dup 5)) + (ltu:DI (reg:CC_C CC_REGNUM) (const_int 0))))) + (set (match_dup 3) + (minus:SI (minus:SI (match_dup 4) (match_dup 5)) + (ltu:SI (reg:CC_C CC_REGNUM) (const_int 0))))])] { operands[3] = gen_highpart (SImode, operands[0]); operands[0] = gen_lowpart (SImode, operands[0]); @@ -1157,13 +1156,15 @@ ) (define_insn "*subsi3_carryin_compare" - [(set (reg:CC CC_REGNUM) - (compare:CC (match_operand:SI 1 "s_register_operand" "r") - (match_operand:SI 2 "s_register_operand" "r"))) + [(set (reg:CC_NCV_CIC CC_REGNUM) + (compare:CC_NCV_CIC + (zero_extend:DI (match_operand:SI 1 "s_register_operand" "r")) + (plus:DI + (zero_extend:DI (match_operand:SI 2 "s_register_operand" "r")) + (ltu:DI (reg:CC_C CC_REGNUM) (const_int 0))))) (set (match_operand:SI 0 "s_register_operand" "=r") - (minus:SI (minus:SI (match_dup 1) - (match_dup 2)) - (ltu:SI (reg:CC_C CC_REGNUM) (const_int 0))))] + (minus:SI (minus:SI (match_dup 1) (match_dup 2)) + (ltu:SI (reg:CC_C CC_REGNUM) (const_int 0))))] "TARGET_32BIT" "sbcs\\t%0, %1, %2" [(set_attr "conds" "set") @@ -1171,13 +1172,15 @@ ) (define_insn "*subsi3_carryin_compare_const" - [(set (reg:CC CC_REGNUM) - (compare:CC (match_operand:SI 1 "reg_or_int_operand" "r") - (match_operand:SI 2 "arm_not_operand" "K"))) + [(set (reg:CC_NCV_CIC CC_REGNUM) + (compare:CC_NCV_CIC + (zero_extend:DI (match_operand:SI 1 "reg_or_int_operand" "r")) + (plus:DI + (zero_extend:DI (match_operand:SI 2 "arm_not_operand" "K")) + (ltu:DI (reg:CC_C CC_REGNUM) (const_int 0))))) (set (match_operand:SI 0 "s_register_operand" "=r") - (minus:SI (plus:SI (match_dup 1) - (match_dup 2)) - (ltu:SI (reg:CC_C CC_REGNUM) (const_int 0))))] + (minus:SI (plus:SI (match_dup 1) (match_dup 2)) + (ltu:SI (reg:CC_C CC_REGNUM) (const_int 0))))] "TARGET_32BIT" "sbcs\\t%0, %1, #%B2" [(set_attr "conds" "set") @@ -4634,8 +4637,8 @@ (define_insn_and_split "negdi2_compare" - [(set (reg:CC CC_REGNUM) - (compare:CC + [(set (reg:CC_NCV CC_REGNUM) + (compare:CC_NCV (const_int 0) (match_operand:DI 1 "register_operand" "0,r"))) (set (match_operand:DI 0 "register_operand" "=r,&r") @@ -4647,8 +4650,12 @@ (compare:CC (const_int 0) (match_dup 1))) (set (match_dup 0) (minus:SI (const_int 0) (match_dup 1)))]) - (parallel [(set (reg:CC CC_REGNUM) - (compare:CC (const_int 0) (match_dup 3))) + (parallel [(set (reg:CC_NCV_CIC CC_REGNUM) + (compare:CC_NCV_CIC + (const_int 0) + (plus:DI + (zero_extend:DI (match_dup 3)) + (ltu:DI (reg:CC_C CC_REGNUM) (const_int 0))))) (set (match_dup 2) (minus:SI (minus:SI (const_int 0) (match_dup 3)) @@ -4707,12 +4714,14 @@ ) (define_insn "*negsi2_carryin_compare" - [(set (reg:CC CC_REGNUM) - (compare:CC (const_int 0) - (match_operand:SI 1 "s_register_operand" "r"))) + [(set (reg:CC_NCV_CIC CC_REGNUM) + (compare:CC_NCV_CIC + (const_int 0) + (plus:DI + (zero_extend:DI (match_operand:SI 1 "s_register_operand" "r")) + (ltu:DI (reg:CC_C CC_REGNUM) (const_int 0))))) (set (match_operand:SI 0 "s_register_operand" "=r") - (minus:SI (minus:SI (const_int 0) - (match_dup 1)) + (minus:SI (minus:SI (const_int 0) (match_dup 1)) (ltu:SI (reg:CC_C CC_REGNUM) (const_int 0))))] "TARGET_ARM" "rscs\\t%0, %1, #0" @@ -7361,12 +7370,15 @@ "#" ; "cmp\\t%Q0, %Q1\;sbcs\\t%2, %R0, %R1" "&& reload_completed" [(set (reg:CC CC_REGNUM) - (compare:CC (match_dup 0) (match_dup 1))) - (parallel [(set (reg:CC CC_REGNUM) - (compare:CC (match_dup 3) (match_dup 4))) - (set (match_dup 2) - (minus:SI (match_dup 5) - (ltu:SI (reg:CC_C CC_REGNUM) (const_int 0))))])] + (compare:CC (match_dup 0) (match_dup 1))) + (parallel [(set (reg:CC_NCV_CIC CC_REGNUM) + (compare:CC_NCV_CIC + (zero_extend:DI (match_dup 3)) + (plus:DI (zero_extend:DI (match_dup 4)) + (ltu:DI (reg:CC_C CC_REGNUM) (const_int 0))))) + (set (match_dup 2) + (minus:SI (match_dup 5) + (ltu:SI (reg:CC_C CC_REGNUM) (const_int 0))))])] { operands[3] = gen_highpart (SImode, operands[0]); operands[0] = gen_lowpart (SImode, operands[0]); Index: gcc/config/arm/arm-modes.def =================================================================== --- gcc/config/arm/arm-modes.def (revision 251752) +++ gcc/config/arm/arm-modes.def (working copy) @@ -38,6 +38,8 @@ (used for DImode unsigned comparisons). CC_NCVmode should be used if only the N, C, and V flags are correct (used for DImode signed comparisons). + CC_NCV_CICmode defines N and V in SImode and C in DImode + (used for carryin_compare patterns). CCmode should be used otherwise. */ CC_MODE (CC_NOOV); @@ -44,6 +46,7 @@ CC_MODE (CC_Z); CC_MODE (CC_CZ); CC_MODE (CC_NCV); +CC_MODE (CC_NCV_CIC); CC_MODE (CC_SWP); CC_MODE (CCFP); CC_MODE (CCFPE); Index: gcc/config/arm/arm.md =================================================================== --- gcc/config/arm/arm.md (revision 251752) +++ gcc/config/arm/arm.md (working copy) @@ -664,17 +664,15 @@ (set (match_dup 0) (plus:SI (match_dup 1) (match_dup 2)))]) (parallel [(set (reg:CC_V CC_REGNUM) (ne:CC_V - (plus:DI (plus:DI - (sign_extend:DI (match_dup 4)) - (sign_extend:DI (match_dup 5))) - (ltu:DI (reg:CC_C CC_REGNUM) (const_int 0))) - (plus:DI (sign_extend:DI - (plus:SI (match_dup 4) (match_dup 5))) - (ltu:DI (reg:CC_C CC_REGNUM) (const_int 0))))) - (set (match_dup 3) (plus:SI (plus:SI - (match_dup 4) (match_dup 5)) - (ltu:SI (reg:CC_C CC_REGNUM) - (const_int 0))))])] + (plus:DI (plus:DI (sign_extend:DI (match_dup 4)) + (sign_extend:DI (match_dup 5))) + (ltu:DI (reg:CC_C CC_REGNUM) (const_int 0))) + (sign_extend:DI + (plus:SI (plus:SI (match_dup 4) (match_dup 5)) + (ltu:SI (reg:CC_C CC_REGNUM) (const_int 0)))))) + (set (match_dup 3) (plus:SI (plus:SI (match_dup 4) (match_dup 5)) + (ltu:SI (reg:CC_C CC_REGNUM) + (const_int 0))))])] " { operands[3] = gen_highpart (SImode, operands[0]); @@ -708,13 +706,13 @@ [(set (reg:CC_V CC_REGNUM) (ne:CC_V (plus:DI - (plus:DI - (sign_extend:DI (match_operand:SI 1 "register_operand" "r")) - (sign_extend:DI (match_operand:SI 2 "register_operand" "r"))) - (ltu:DI (reg:CC_C CC_REGNUM) (const_int 0))) - (plus:DI (sign_extend:DI - (plus:SI (match_dup 1) (match_dup 2))) - (ltu:DI (reg:CC_C CC_REGNUM) (const_int 0))))) + (plus:DI + (sign_extend:DI (match_operand:SI 1 "register_operand" "r")) + (sign_extend:DI (match_operand:SI 2 "register_operand" "r"))) + (ltu:DI (reg:CC_C CC_REGNUM) (const_int 0))) + (sign_extend:DI (plus:SI (plus:SI (match_dup 1) (match_dup 2)) + (ltu:SI (reg:CC_C CC_REGNUM) + (const_int 0)))))) (set (match_operand:SI 0 "register_operand" "=r") (plus:SI (plus:SI (match_dup 1) (match_dup 2)) @@ -743,17 +741,15 @@ (set (match_dup 0) (plus:SI (match_dup 1) (match_dup 2)))]) (parallel [(set (reg:CC_C CC_REGNUM) (ne:CC_C - (plus:DI (plus:DI - (zero_extend:DI (match_dup 4)) - (zero_extend:DI (match_dup 5))) - (ltu:DI (reg:CC_C CC_REGNUM) (const_int 0))) - (plus:DI (zero_extend:DI - (plus:SI (match_dup 4) (match_dup 5))) - (ltu:DI (reg:CC_C CC_REGNUM) (const_int 0))))) - (set (match_dup 3) (plus:SI - (plus:SI (match_dup 4) (match_dup 5)) - (ltu:SI (reg:CC_C CC_REGNUM) - (const_int 0))))])] + (plus:DI (plus:DI (zero_extend:DI (match_dup 4)) + (zero_extend:DI (match_dup 5))) + (ltu:DI (reg:CC_C CC_REGNUM) (const_int 0))) + (zero_extend:DI + (plus:SI (plus:SI (match_dup 4) (match_dup 5)) + (ltu:SI (reg:CC_C CC_REGNUM) (const_int 0)))))) + (set (match_dup 3) (plus:SI (plus:SI (match_dup 4) (match_dup 5)) + (ltu:SI (reg:CC_C CC_REGNUM) + (const_int 0))))])] " { operands[3] = gen_highpart (SImode, operands[0]); @@ -772,17 +768,16 @@ [(set (reg:CC_C CC_REGNUM) (ne:CC_C (plus:DI - (plus:DI - (zero_extend:DI (match_operand:SI 1 "register_operand" "r")) - (zero_extend:DI (match_operand:SI 2 "register_operand" "r"))) - (ltu:DI (reg:CC_C CC_REGNUM) (const_int 0))) - (plus:DI (zero_extend:DI - (plus:SI (match_dup 1) (match_dup 2))) - (ltu:DI (reg:CC_C CC_REGNUM) (const_int 0))))) + (plus:DI + (zero_extend:DI (match_operand:SI 1 "register_operand" "r")) + (zero_extend:DI (match_operand:SI 2 "register_operand" "r"))) + (ltu:DI (reg:CC_C CC_REGNUM) (const_int 0))) + (zero_extend:DI + (plus:SI (plus:SI (match_dup 1) (match_dup 2)) + (ltu:SI (reg:CC_C CC_REGNUM) (const_int 0)))))) (set (match_operand:SI 0 "register_operand" "=r") - (plus:SI - (plus:SI (match_dup 1) (match_dup 2)) - (ltu:SI (reg:CC_C CC_REGNUM) (const_int 0))))] + (plus:SI (plus:SI (match_dup 1) (match_dup 2)) + (ltu:SI (reg:CC_C CC_REGNUM) (const_int 0))))] "TARGET_32BIT" "adcs%?\\t%0, %1, %2" [(set_attr "conds" "set") @@ -1075,14 +1070,14 @@ "TARGET_32BIT" { emit_insn (gen_sub<mode>3_compare1 (operands[0], operands[1], operands[2])); - arm_gen_unlikely_cbranch (LTU, CCmode, operands[3]); + arm_gen_unlikely_cbranch (EQ, CC_Cmode, operands[3]); DONE; }) (define_insn_and_split "subdi3_compare1" - [(set (reg:CC CC_REGNUM) - (compare:CC + [(set (reg:CC_NCV CC_REGNUM) + (compare:CC_NCV (match_operand:DI 1 "register_operand" "r") (match_operand:DI 2 "register_operand" "r"))) (set (match_operand:DI 0 "register_operand" "=&r") @@ -1093,10 +1088,14 @@ [(parallel [(set (reg:CC CC_REGNUM) (compare:CC (match_dup 1) (match_dup 2))) (set (match_dup 0) (minus:SI (match_dup 1) (match_dup 2)))]) - (parallel [(set (reg:CC CC_REGNUM) - (compare:CC (match_dup 4) (match_dup 5))) - (set (match_dup 3) (minus:SI (minus:SI (match_dup 4) (match_dup 5)) - (ltu:SI (reg:CC_C CC_REGNUM) (const_int 0))))])] + (parallel [(set (reg:CC_NCV_CIC CC_REGNUM) + (compare:CC_NCV_CIC + (zero_extend:DI (match_dup 4)) + (plus:DI (zero_extend:DI (match_dup 5)) + (ltu:DI (reg:CC_C CC_REGNUM) (const_int 0))))) + (set (match_dup 3) + (minus:SI (minus:SI (match_dup 4) (match_dup 5)) + (ltu:SI (reg:CC_C CC_REGNUM) (const_int 0))))])] { operands[3] = gen_highpart (SImode, operands[0]); operands[0] = gen_lowpart (SImode, operands[0]); @@ -1152,13 +1151,15 @@ ) (define_insn "*subsi3_carryin_compare" - [(set (reg:CC CC_REGNUM) - (compare:CC (match_operand:SI 1 "s_register_operand" "r") - (match_operand:SI 2 "s_register_operand" "r"))) + [(set (reg:CC_NCV_CIC CC_REGNUM) + (compare:CC_NCV_CIC + (zero_extend:DI (match_operand:SI 1 "s_register_operand" "r")) + (plus:DI + (zero_extend:DI (match_operand:SI 2 "s_register_operand" "r")) + (ltu:DI (reg:CC_C CC_REGNUM) (const_int 0))))) (set (match_operand:SI 0 "s_register_operand" "=r") - (minus:SI (minus:SI (match_dup 1) - (match_dup 2)) - (ltu:SI (reg:CC_C CC_REGNUM) (const_int 0))))] + (minus:SI (minus:SI (match_dup 1) (match_dup 2)) + (ltu:SI (reg:CC_C CC_REGNUM) (const_int 0))))] "TARGET_32BIT" "sbcs\\t%0, %1, %2" [(set_attr "conds" "set") @@ -1166,13 +1167,15 @@ ) (define_insn "*subsi3_carryin_compare_const" - [(set (reg:CC CC_REGNUM) - (compare:CC (match_operand:SI 1 "reg_or_int_operand" "r") - (match_operand:SI 2 "arm_not_operand" "K"))) + [(set (reg:CC_NCV_CIC CC_REGNUM) + (compare:CC_NCV_CIC + (zero_extend:DI (match_operand:SI 1 "reg_or_int_operand" "r")) + (plus:DI + (zero_extend:DI (match_operand:SI 2 "arm_not_operand" "K")) + (ltu:DI (reg:CC_C CC_REGNUM) (const_int 0))))) (set (match_operand:SI 0 "s_register_operand" "=r") - (minus:SI (plus:SI (match_dup 1) - (match_dup 2)) - (ltu:SI (reg:CC_C CC_REGNUM) (const_int 0))))] + (minus:SI (plus:SI (match_dup 1) (match_dup 2)) + (ltu:SI (reg:CC_C CC_REGNUM) (const_int 0))))] "TARGET_32BIT" "sbcs\\t%0, %1, #%B2" [(set_attr "conds" "set") @@ -4686,8 +4689,8 @@ (define_insn_and_split "negdi2_compare" - [(set (reg:CC CC_REGNUM) - (compare:CC + [(set (reg:CC_NCV CC_REGNUM) + (compare:CC_NCV (const_int 0) (match_operand:DI 1 "register_operand" "0,r"))) (set (match_operand:DI 0 "register_operand" "=r,&r") @@ -4699,8 +4702,12 @@ (compare:CC (const_int 0) (match_dup 1))) (set (match_dup 0) (minus:SI (const_int 0) (match_dup 1)))]) - (parallel [(set (reg:CC CC_REGNUM) - (compare:CC (const_int 0) (match_dup 3))) + (parallel [(set (reg:CC_NCV_CIC CC_REGNUM) + (compare:CC_NCV_CIC + (const_int 0) + (plus:DI + (zero_extend:DI (match_dup 3)) + (ltu:DI (reg:CC_C CC_REGNUM) (const_int 0))))) (set (match_dup 2) (minus:SI (minus:SI (const_int 0) (match_dup 3)) @@ -4759,12 +4766,14 @@ ) (define_insn "*negsi2_carryin_compare" - [(set (reg:CC CC_REGNUM) - (compare:CC (const_int 0) - (match_operand:SI 1 "s_register_operand" "r"))) + [(set (reg:CC_NCV_CIC CC_REGNUM) + (compare:CC_NCV_CIC + (const_int 0) + (plus:DI + (zero_extend:DI (match_operand:SI 1 "s_register_operand" "r")) + (ltu:DI (reg:CC_C CC_REGNUM) (const_int 0))))) (set (match_operand:SI 0 "s_register_operand" "=r") - (minus:SI (minus:SI (const_int 0) - (match_dup 1)) + (minus:SI (minus:SI (const_int 0) (match_dup 1)) (ltu:SI (reg:CC_C CC_REGNUM) (const_int 0))))] "TARGET_ARM" "rscs\\t%0, %1, #0" @@ -7438,12 +7447,15 @@ "#" ; "cmp\\t%Q0, %Q1\;sbcs\\t%2, %R0, %R1" "&& reload_completed" [(set (reg:CC CC_REGNUM) - (compare:CC (match_dup 0) (match_dup 1))) - (parallel [(set (reg:CC CC_REGNUM) - (compare:CC (match_dup 3) (match_dup 4))) - (set (match_dup 2) - (minus:SI (match_dup 5) - (ltu:SI (reg:CC_C CC_REGNUM) (const_int 0))))])] + (compare:CC (match_dup 0) (match_dup 1))) + (parallel [(set (reg:CC_NCV_CIC CC_REGNUM) + (compare:CC_NCV_CIC + (zero_extend:DI (match_dup 3)) + (plus:DI (zero_extend:DI (match_dup 4)) + (ltu:DI (reg:CC_C CC_REGNUM) (const_int 0))))) + (set (match_dup 2) + (minus:SI (match_dup 5) + (ltu:SI (reg:CC_C CC_REGNUM) (const_int 0))))])] { operands[3] = gen_highpart (SImode, operands[0]); operands[0] = gen_lowpart (SImode, operands[0]); Index: gcc/doc/rtl.texi =================================================================== --- gcc/doc/rtl.texi (revision 251752) +++ gcc/doc/rtl.texi (working copy) @@ -2252,6 +2252,13 @@ If one of the operands is a constant, it should be placed in the second operand and the comparison code adjusted as appropriate. +There may be exceptions to this rule if the mode @var{m} does not +carry enough information for the swapped comparison operator, or +if we try to detect overflow from the subtraction. That means, while +0-X may overfow X-0 can never overflow. Under these conditions +a compare may have the constant expression at the first operand. +Examples are the ARM negdi2_compare pattern and similar. + A @code{compare} specifying two @code{VOIDmode} constants is not valid since there is no way to know in what mode the comparison is to be performed; the comparison must either be folded during the compilation