There was some discussion a few weeks ago about some apps running slower
with FDO enabled.
I've recently investigated a similar situation using mainline. In my case,
the fact that the loop_optimize pass is disabled during FDO was the cause
of the slowdown. It appears that was recently disabled
>A more likely source of performance degradation is that loop unrolling
>is enabled when profiling, and loop unrolling is almost always a
>pessimization on 32-bit x86 targets.
To clarify, I was compiling with -funroll-loops and -fpeel-loops
enabled in both cases.
The FDO slowdown in my case
> Do you have a specific testcase? It would be interesting to see if the new
> optimizer can catch up, at least on the kill-loop branch.
Here is a simplified version of what I observed. In the non-FDO case,
the loop invariant load of the constant 32 is removed from the loop.
When FDO is enabled, the load r
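
For illustration only, here is a minimal sketch of the kind of loop being
described. It is not the actual test case from this thread: the function
names and the array-scaling body are hypothetical, and only the constant 32
comes from the description above. The first function leaves the invariant
inside the loop body, where the compiler is expected to hoist it; the second
shows by hand what the loop should look like once the load has been moved out.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical example: the constant 32 is loop invariant, so an
   invariant motion pass can materialize it once before the loop. */
long sum_scaled(const int *a, size_t n)
{
    long sum = 0;
    assert(a != NULL || n == 0);
    for (size_t i = 0; i < n; i++) {
        long scale = 32;          /* invariant: same value every iteration */
        sum += a[i] * scale;
    }
    return sum;
}

/* Hand-hoisted equivalent: the shape expected after the invariant
   load has been moved out of the loop. */
long sum_scaled_hoisted(const int *a, size_t n)
{
    long sum = 0;
    const long scale = 32;        /* materialized once, outside the loop */
    assert(a != NULL || n == 0);
    for (size_t i = 0; i < n; i++)
        sum += a[i] * scale;
    return sum;
}
```

Both functions compute the same result; inspecting the assembly output
(gcc -S) for the first one shows whether the load stays inside the loop.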
>You may try adding the -fmove-loop-invariants flag, which enables the new
>invariant motion pass.
That cleaned up both my simplified test case and the code it
originated from. It also cleaned up a few other cases where I
was seeing worse performance with FDO enabled. Thanks!
Perhaps this option sh
Quick question on syntax in .md files, as I can't find documentation
that explains it. If I see the following on an instruction definition:
(set_attr "type" "*")
What does "*" represent in this context as the value assigned to "type"?
Thanks.
Pete
I'm not entirely sure how gcc's CFG structure all fits together yet, so
I'll ask for some input on this one:
While looking through some dumps from a compile using -fprofile-use, I
noticed the following in the "jump" dump file:
Basic block 164 prev 163, next -2, loop_depth 0, count 1672, freq 148
Added a better subject line. Pete.
[EMAIL PROTECTED] wrote on 09/30/2005 11:03:59 AM:
>
> I'm not entirely sure how gcc's CFG structure all fits together yet, so
> I'll ask for some input on this one:
>
> While looking through some dumps from a compile using -fprofile-use, I
> noticed the follo
I'm using store_data_bypass_p from recog.c as the guard for a define_bypass
within a machine description. I'm seeing the following warning/error that
I'd like to clean up.
cc1: warnings being treated as errors
insn-automata.c: In function 'internal_insn_latency':
insn-automata.c:53265: warning:
I've been looking a bit at how haifa-sched.c sorts the ready list and think
there may be some room for added flexibility and/or improvement. I'll
throw out a few ideas for discussion.
Currently, within the ready_sort macro in haifa-sched.c, the call to qsort
is passed "rank_for_schedule" to help
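
As a rough illustration of the mechanism under discussion, the sketch below
sorts a ready list by passing a ranking comparator to qsort. It is not GCC's
code: `struct ready_insn`, `rank_by_priority`, and `sort_ready` are
hypothetical names, and the real ready list holds insns rather than this
simplified struct. The ordering chosen here (highest priority first, ties
broken by lower uid) is an assumption made for the example.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for a ready-list entry. */
struct ready_insn {
    int uid;
    int priority;   /* higher means more urgent */
};

/* Comparator in the form qsort expects: negative if a sorts before b.
   Higher priority sorts first; ties go to the lower uid so the result
   does not depend on qsort's (unstable) handling of equal elements. */
static int rank_by_priority(const void *pa, const void *pb)
{
    const struct ready_insn *a = pa;
    const struct ready_insn *b = pb;
    if (a->priority != b->priority)
        return (a->priority < b->priority) - (a->priority > b->priority);
    return (a->uid > b->uid) - (a->uid < b->uid);
}

/* Sort the ready list in place using the ranking function. */
void sort_ready(struct ready_insn *ready, size_t n)
{
    assert(ready != NULL || n == 0);
    if (n > 1)
        qsort(ready, n, sizeof ready[0], rank_by_priority);
}
```

Because the comparator is just a function pointer, alternative ranking
heuristics can be tried by swapping in a different comparator without
touching the sort call itself.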