Hi,
We'd like to draw attention to CodeSourcery's patch for the ARM backend,
from which GCC mainline can gain 4% on SPEC2K INT:
http://cgit.openembedded.org/openembedded/plain/recipes/gcc/gcc-4.5/linaro/gcc-4.5-linaro-r99369.patch
(the patch is also attached).
Originally, we noticed that GNU Go works 6
On 06/27/2012 07:55 PM, Ramana Radhakrishnan wrote:
> I must admit that I had been suggesting to Zhenqiang about turning
> this off by tightening the movsi_insn predicates rather than adding a
> split, but given that it appears to produce enough benefit in this
> case I don't have any reasons to
On 06/27/2012 07:53 PM, Richard Earnshaw wrote:
Please update the ChangeLog entry (it's not appropriate to mention
Sourcery G++) and add a comment as Steven has suggested.
Otherwise OK.
Updated.
Ok to commit now?
--
Best regards,
Dmitry
2009-05-29 Julian Brown
gcc/
* config/arm/arm
On 06/29/2012 06:31 PM, Ramana Radhakrishnan wrote:
Ok with this comment?
+;; Split symbol_refs at the later stage (after cprop), instead of generating
+;; movt/movw pair directly at expand. Otherwise corresponding high_sum
+;; and lo_sum would be merged back into memory load at cprop. Howeve
Interesting but I would be a bit defensive and make sure that this
matches only if -ffast-math in the FP case. You are sort of relying on
the fact that vsub wouldn't be generated without ffast-math but I'd
rather be defensive about it. (This is in case it's not clear in the
non-intrinsics case)
Hi,
This series of patches solves a few issues we found with Thumb-2
conditional insns. These fixes include:
1) Splitting if_then_else into cond_execs to generate only the minimum
required number of IT-blocks;
2) Grouping conditional insns of the same INSN_PRIORITY to avoid excessive
splitting of IT-blocks;
3)
This patch adds splits for if_then_else into cond_execs. This helps
generate the minimum number of IT-blocks for two consecutive
if_then_elses, e.g. one ITETE insn instead of the two ITE insns that
would result if each if_then_else were expanded directly into assembly code.
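As a rough illustration (not taken from the patch), C source of the following shape gives the compiler two back-to-back selects on the same condition, which is the kind of input where a single four-insn IT-block could replace two separate ITE blocks. The function name and parameters are hypothetical, and whether a given GCC version actually emits ITETE here depends on flags, register allocation, and earlier passes.

```c
/* Hypothetical example: two consecutive conditional selects that test
   the same condition.  Each select may be represented as an
   if_then_else; the intent is that both can share one IT-block.  */
void select_pair(int cond, int a, int b, int *x, int *y)
{
    int first  = cond ? a : b;   /* first if_then_else */
    int second = cond ? b : a;   /* second if_then_else, same condition */

    *x = first;
    *y = second;
}
```

Compiling such a function for Thumb-2 with and without the patch and comparing the IT instructions in the assembly output is one way to check the effect.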
There are three splitters for the cases when b
more target hooks just to save the correct can_issue_more value.
This has reduced code size by 144 bytes on SPEC2K INT with -O2 (no
regressions).
2011-12-29 Dmitry Melnik
gcc/
* config/arm/arm.c (arm_variable_issue, arm_sched_init, arm_sched_finish,
arm_sched_re
branch insn and code won't grow. This limit is applied to
each of the converted conditional branches.
This reduces code size by 96 bytes on SPEC2K INT with -O2 (with a 4-byte
regression on one test).
2011-12-29 Dmitry Melnik
gcc/
* config/arm/arm.h (MAX_CONDITIONAL_EXECUTE): New macro.
If one of the branches has a significantly greater probability than the
other, then it may be better to rely on the CPU's branch prediction and
block reordering than to put rarely executed instructions into the pipeline.
In this patch we set a 10% frequency ratio as the cutoff.
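The cutoff can be pictured with a small sketch. This is not the code from the patch: the function name, the macro, and the reading of "10% frequency ratio" as (less frequent arm) / (more frequent arm) are all assumptions made for illustration.

```c
#include <stdbool.h>

/* Assumed cutoff: the rarer arm must run at least 10% as often as the
   more frequent arm for if-conversion to be considered worthwhile.  */
#define CUTOFF_PERCENT 10

/* freq_a and freq_b are execution frequencies of the two branch arms,
   in any consistent unit.  Hypothetical helper, not a GCC function.  */
bool worth_if_converting(unsigned freq_a, unsigned freq_b)
{
    unsigned lo = freq_a < freq_b ? freq_a : freq_b;
    unsigned hi = freq_a < freq_b ? freq_b : freq_a;

    if (hi == 0)
        return true;  /* no frequency data: do not block conversion */

    /* lo / hi >= 10%  is the same as  lo * 100 >= hi * 10;
       widen first so the multiplication cannot overflow.  */
    return (unsigned long long)lo * 100
           >= (unsigned long long)hi * CUTOFF_PERCENT;
}
```

Under these assumptions, arms with frequencies 50 and 50 would be converted, while arms with frequencies 5 and 95 would be left as a branch.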
On SPEC2K INT with -O2 this
After Thumb-2's peephole2 adds flag clobbering on suitable insns in
order to generate 16-bit encoding for them, if-conversion can't
transform these insns into cond_execs. In theory, if the instruction
were converted to conditional form, it would also use 16-bit encoding,
so the flag clobbering
This patch fixes a few things in the pipeline description of ARM Cortex-A8.
1) arm_no_early_alu_shift_value_dep() checks early dependence only for
one argument, ignoring the dependence on the register used as the shift amount.
For example, this function is used as a condition in a bypass that sets
dep_cost=0
This patch adds two define_insn patterns for the NEON vabd instruction to
make the combine pass recognize expressions matching (vabs (vsub ...))
patterns as vabd.
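For illustration only, a loop of the following shape computes an element-wise absolute difference, which is the abs-of-subtract form described above. The function name is hypothetical, and whether the loop is vectorized and then combined into vabd depends on the GCC version and on options such as the optimization level and NEON being enabled.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical example: out[i] = |a[i] - b[i]|.
   Inputs are assumed small enough that a[i] - b[i] does not overflow.  */
void abs_diff(const int32_t *a, const int32_t *b, int32_t *out, size_t n)
{
    for (size_t i = 0; i < n; i++)
        out[i] = abs(a[i] - b[i]);   /* abs of a subtraction */
}
```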
This patch reduces the code size of the x264 binary from 649143 to 648343
(800 bytes, or 0.12%) and increases its performance on average by 2.5% on
Hi All,
The attached patch changes the reload class for NEON constant vectors
from GENERAL_REGS to NO_REGS.
The issue was found on this code from libevas:
void
_op_blend_p_caa_dp(unsigned *s, unsigned* e, unsigned *d, unsigned c) {
while (d < e) {
*d = ( (*s) >> 8) & 0x00ff00ff)