I have just switched from gcc 4.9.2 to gcc 5.2, and the code quality does seem
to have improved significantly. For example, it now seems much better at using
ldp/stp, and it seems to have stopped the gratuitous use of the SIMD registers.
However, I still have a few whinges :-)
See the attached copy.c / copy.s (this is a performance-critical function from
OpenJDK):
pd_disjoint_words:
cmp x2, 8 <<< (1)
sub sp, sp, #64 <<< (2)
bhi .L2
cmp w2, 8 <<< (1)
bls .L15
.L2:
add sp, sp, 64 <<< (2)
(1) If count as a 64 bit unsigned is <= 8 then it is probably still <= 8 as a
32 bit unsigned.
Agreed. This could probably be done by the mid-end based on value range
propagation. Please can you file a report in gcc bugzilla?
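
The redundancy in (1) can be sketched in C. This is only an illustration, not
the contents of copy.c: the two helper names are hypothetical, and the
comments map each test to the compare it stands for in the listing above.
The first helper repeats the test on the truncated value, the second does the
single 64-bit test; they agree for every input, because any value <= 8 keeps
its value when truncated to 32 bits.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper: mirrors the two compares in the generated code. */
static bool is_small_two_compares(uint64_t count)
{
    if (count > 8)              /* cmp x2, 8 ; bhi .L2  */
        return false;
    if ((uint32_t)count <= 8)   /* cmp w2, 8 ; bls .L15 */
        return true;
    return false;               /* never reached: count <= 8 here */
}

/* Hypothetical helper: the single compare that would be sufficient. */
static bool is_small_one_compare(uint64_t count)
{
    return count <= 8;
}
```

Note that the 64-bit test has to come first: a count such as 2^32 + 3 has
small low 32 bits but is not a small count.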
Not sure how we can do this in VRP. It seems that the second compare is
generated during RTL expansion, so maybe it has to be handled during expansion.
The optimized tree looks like:
;; Function pd_disjoint_words (pd_disjoint_words, funcdef_no=0,
decl_uid=2763, cgraph_uid=0, symbol_order=0)
Removing basic block 13
pd_disjoint_words (HeapWord * from, HeapWord * to, size_t count)
{
long int t$b;
long int t$a;
struct unit t;
struct unit t;
struct unit t;
struct unit t;
struct unit t;
struct unit t;
long int _5;
<bb 2>:
switch (count_2(D)) <default: <L16>, case 0: <L18>, case 1: <L1>,
case 2: <L2>, case 3: <L4>, case 4: <L6>, case 5: <L8>, case 6: <L10>,
case 7: <L12>, case 8: <L14>>
<L1>:
_5 = *from_4(D);
*to_6(D) = _5;
goto <bb 12> (<L18>);
<L2>:
t$a_8 = MEM[(struct unit *)from_4(D)];
t$b_9 = MEM[(struct unit *)from_4(D) + 8B];
MEM[(struct unit *)to_6(D)] = t$a_8;
MEM[(struct unit *)to_6(D) + 8B] = t$b_9;
goto <bb 12> (<L18>);
<L4>:
t = MEM[(struct unit *)from_4(D)];
MEM[(struct unit *)to_6(D)] = t;
t ={v} {CLOBBER};
goto <bb 12> (<L18>);
<L6>:
t = MEM[(struct unit *)from_4(D)];
MEM[(struct unit *)to_6(D)] = t;
t ={v} {CLOBBER};
goto <bb 12> (<L18>);
<L8>:
t = MEM[(struct unit *)from_4(D)];
MEM[(struct unit *)to_6(D)] = t;
t ={v} {CLOBBER};
goto <bb 12> (<L18>);
<L10>:
t = MEM[(struct unit *)from_4(D)];
MEM[(struct unit *)to_6(D)] = t;
t ={v} {CLOBBER};
goto <bb 12> (<L18>);
<L12>:
t = MEM[(struct unit *)from_4(D)];
MEM[(struct unit *)to_6(D)] = t;
t ={v} {CLOBBER};
goto <bb 12> (<L18>);
<L14>:
t = MEM[(struct unit *)from_4(D)];
MEM[(struct unit *)to_6(D)] = t;
t ={v} {CLOBBER};
goto <bb 12> (<L18>);
<L16>:
_Copy_disjoint_words (from_4(D), to_6(D), count_2(D)); [tail call]
<L18>:
return;
}
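
For readers without the attachment, here is a rough C sketch of source that
could produce a dump of this shape. It is a guess reconstructed from the dump,
not the actual copy.c: the HeapWord typedef, the COPY_SMALL macro, the layout
of struct unit and the generic fallback (standing in for _Copy_disjoint_words)
are all assumptions.

```c
#include <stddef.h>
#include <stdint.h>

typedef uintptr_t HeapWord;   /* assumption: one machine word */

/* Stand-in for _Copy_disjoint_words, which is not shown in this thread. */
static void copy_disjoint_words_generic(const HeapWord *from, HeapWord *to,
                                        size_t count)
{
    for (size_t i = 0; i < count; i++)
        to[i] = from[i];
}

/* Copy exactly n words with one struct assignment. Each expansion declares
   its own block-scoped struct unit, which would explain the repeated
   "struct unit t" declarations in the dump. */
#define COPY_SMALL(n)                                        \
    do {                                                     \
        struct unit { HeapWord w[n]; };                      \
        *(struct unit *)to = *(const struct unit *)from;     \
    } while (0)

static void pd_disjoint_words_sketch(const HeapWord *from, HeapWord *to,
                                     size_t count)
{
    switch (count) {
    case 0:  break;
    case 1:  to[0] = from[0]; break;
    case 2:  COPY_SMALL(2);   break;
    case 3:  COPY_SMALL(3);   break;
    case 4:  COPY_SMALL(4);   break;
    case 5:  COPY_SMALL(5);   break;
    case 6:  COPY_SMALL(6);   break;
    case 7:  COPY_SMALL(7);   break;
    case 8:  COPY_SMALL(8);   break;
    default: copy_disjoint_words_generic(from, to, count); break;
    }
}
```

The switch covers the dense range 0..8 with a default, so a range check on
count comes before any jump-table dispatch, which is consistent with the
compares against 8 in the assembly.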
Thanks,
Kugan