--- Comment #12 from changpeng dot fang at amd dot com 2010-08-30 16:41
---
Fixed!
--
changpeng dot fang at amd dot com changed:
What|Removed |Added
Status
--- Comment #11 from changpeng dot fang at amd dot com 2010-08-30 16:40
---
r163286 - in /branches/gcc-4_5-branch/gcc: Chan...
* From: cfang at gcc dot gnu dot org
* To: gcc-cvs at gcc dot gnu dot org
* Date: Mon, 16 Aug 2010 21:02:30 -
* Subject: r163286 - in
--- Comment #10 from changpeng dot fang at amd dot com 2010-08-30 16:39
---
r163207 - in /trunk/gcc: ChangeLog testsuite/Ch...
* From: cfang at gcc dot gnu dot org
* To: gcc-cvs at gcc dot gnu dot org
* Date: Thu, 12 Aug 2010 22:18:34 -
* Subject: r163207 - in
--- Comment #9 from changpeng dot fang at amd dot com 2010-08-30 16:37
---
Review approval for the trunk:
http://gcc.gnu.org/ml/gcc-patches/2010-08/msg00931.html
Review Approval for 4.5 branch:
http://gcc.gnu.org/ml/gcc-patches/2010-08/msg02112.html
--
http://gcc.gnu.org/bugzilla
--- Comment #5 from changpeng dot fang at amd dot com 2010-08-24 22:13
---
For the test case in comment #2, if we don't vectorize the loop, the
unroll_factor is incorrectly determined as 1, and insns-to-prefetch ratio
(4) will then prevent prefetching, and thus no perfor
--- Comment #4 from changpeng dot fang at amd dot com 2010-08-24 00:46
---
Oops, the open64-generated code posted in the last comment is for the
non-vectorized loop; the vectorized one is similar:
.LBB23_f:
        .loc 1 7 0
        movups 0(%r10),%xmm3        # [0] id:65
--- Comment #3 from changpeng dot fang at amd dot com 2010-08-24 00:22
---
I checked with open64 and did not find any regression. And for the above
testcase, open64 generated 3 non-temporal prefetches. As a result, I am
guessing that we are just unlucky that the prefetch kicks out
--- Comment #2 from changpeng dot fang at amd dot com 2010-08-24 00:03
---
float f (float *x, float *y, float *z, unsigned n)
{
  float ret = 0.0;
  unsigned i;
  for (i = 0; i < n; i++)
    {
      float diff = x[i] - y[i];
      ret -= diff * diff * z[i];
    }
  return ret;
}
prefetching of vectorized loop
Product: gcc
Version: 4.6.0
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: target
AssignedTo: unassigned at gcc dot gnu dot org
ReportedBy: changpeng dot fang at amd dot com
http
UNCONFIRMED
Severity: normal
Priority: P3
Component: c
AssignedTo: unassigned at gcc dot gnu dot org
ReportedBy: changpeng dot fang at amd dot com
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=45390
AssignedTo: unassigned at gcc dot gnu dot org
ReportedBy: changpeng dot fang at amd dot com
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=45389
--- Comment #6 from changpeng dot fang at amd dot com 2010-08-23 18:59
---
Committed to trunk as Revision: 163475:
http://gcc.gnu.org/ml/gcc-cvs/2010-08/msg00688.html
Committed to 4.5 branch as Revision: 163483
http://gcc.gnu.org/ml/gcc-cvs/2010-08/msg00696.html
--
http
--- Comment #5 from changpeng dot fang at amd dot com 2010-08-20 22:48
---
I have a fix:
http://gcc.gnu.org/ml/gcc-patches/2010-08/msg01625.html
--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=45260
--- Comment #2 from changpeng dot fang at amd dot com 2010-08-18 19:43
---
http://gcc.gnu.org/ml/gcc-cvs/2010-05/msg00406.html
Verified. If I back out the above change, the bug goes away.
So it is a duplicate of bug 44206
*** This bug has been marked as a duplicate of 44206
--- Comment #3 from changpeng dot fang at amd dot com 2010-08-18 19:43
---
*** Bug 45269 has been marked as a duplicate of this bug. ***
--
changpeng dot fang at amd dot com changed:
What|Removed |Added
--- Comment #4 from changpeng dot fang at amd dot com 2010-08-16 22:39
---
This bug should be related to VIEW_CONVERT_EXPR.
If I use the following statement to filter the prefetch, the bug will go away:
if (contains_view_convert_expr_p (ref))
return false;
Otherwise, the
at gcc dot gnu dot org
ReportedBy: changpeng dot fang at amd dot com
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=45270
06 450.soplex: "verify_cgraph_node failed" with -fprofile-generate
Product: gcc
Version: 4.6.0
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: c++
AssignedTo: unassigned at gcc dot gnu dot org
ReportedBy: changpeng dot fang at amd dot com
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=45269
gram -combine
Product: gcc
Version: 4.6.0
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: c
AssignedTo: unassigned at gcc dot gnu dot org
ReportedBy: changpeng dot fang at amd dot com
http://gcc.gnu
--- Comment #3 from changpeng dot fang at amd dot com 2010-08-12 00:38
---
(In reply to comment #2)
> It was caused by revision 153878:
>
> http://gcc.gnu.org/ml/gcc-cvs/2009-11/msg00094.html
>
I think the same patch was also committed to 4.4 branch.
Maybe some prefetc
--- Comment #7 from changpeng dot fang at amd dot com 2010-08-10 21:44
---
(In reply to comment #5)
> (In reply to comment #1)
> > This patch should be a valid fix, because the recognition of the dot_prod
> > pattern is known to fail at this point if the stmt is o
--- Comment #1 from changpeng dot fang at amd dot com 2010-08-09 17:52
---
This patch should be a valid fix, because the recognition of the dot_prod
pattern is known to fail at this point if the stmt is outside the loop.
(I am not sure whether we should not see this case in the
ssignedTo: unassigned at gcc dot gnu dot org
ReportedBy: changpeng dot fang at amd dot com
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=45241
ssignedTo: unassigned at gcc dot gnu dot org
ReportedBy: changpeng dot fang at amd dot com
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=45239
--- Comment #4 from changpeng dot fang at amd dot com 2010-07-29 19:14
---
(In reply to comment #1)
> The misaligned indirect-refs will vanish soon.
>
I saw your patch that removes ALIGNED_INDIRECT_REF. Do you also plan to remove
MISALIGNED_INDIRECT_REF? Thanks.
--
--- Comment #5 from changpeng dot fang at amd dot com 2010-07-28 18:28
---
Things are a little more complicated if we change the code to:
  a[i] = a[i+1] + beta * b[i];
The prefetch pass wants to group a[i] and a[i+1], i.e. they have
the same base address with an offset of 4 bytes.
--
http
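To make the grouping remark in the comment above concrete, here is a minimal C sketch of that loop and of the byte distance between the two references to a. The function names and the bounds parameter n are hypothetical and are not taken from GCC's prefetch pass; the 4-byte figure assumes sizeof(float) == 4, as on the x86-64 targets discussed here.

```c
#include <assert.h>
#include <stddef.h>

/* Byte distance between a[i + delta] and a[i]; both indices must lie
   inside the n-element array.  Hypothetical helper for illustration. */
ptrdiff_t ref_offset_bytes(const float *a, size_t n, size_t i, size_t delta)
{
    assert(i + delta < n);
    return (const char *)&a[i + delta] - (const char *)&a[i];
}

/* The modified loop from the comment: a[i] is stored and a[i+1] is
   loaded, so both references share the base address of a. */
void shifted_axpy(float *a, const float *b, float beta, size_t n)
{
    for (size_t i = 0; i + 1 < n; i++)
        a[i] = a[i + 1] + beta * b[i];
}
```

Calling ref_offset_bytes(a, n, i, 1) returns sizeof(float), which is the constant offset a pass would rely on when treating the two references as one group.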
--- Comment #4 from changpeng dot fang at amd dot com 2010-07-28 18:22
---
Andrew's example is exactly what the prefetch sees for the test case (in the
bug description). Unfortunately, the prefetch pass could not recognize that
vect_pa.6_24 and vect_pa.20_38 are exactly the
--- Comment #2 from changpeng dot fang at amd dot com 2010-07-22 20:52
---
(In reply to comment #1)
> The misaligned indirect-refs will vanish soon.
>
From the prefetching point of view, is there any reason that we cannot
prefetch for misaligned or indirect refs?
-
--- Comment #23 from changpeng dot fang at amd dot com 2010-07-21 21:30
---
Fixed
--
changpeng dot fang at amd dot com changed:
What|Removed |Added
Status
--- Comment #1 from changpeng dot fang at amd dot com 2010-07-21 18:26
---
The direct reason is that prefetching could not differentiate the base
addresses of the vectorized load and store (of a[i]):
*vect_pa.6_24
*vect_pa.19_37
--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=45021
P3
Component: tree-optimization
AssignedTo: unassigned at gcc dot gnu dot org
ReportedBy: changpeng dot fang at amd dot com
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=45022
Version: 4.6.0
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: tree-optimization
AssignedTo: unassigned at gcc dot gnu dot org
ReportedBy: changpeng dot fang at amd dot com
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=45021
--- Comment #1 from changpeng dot fang at amd dot com 2010-07-15 17:20
---
This is a piece of code that shows the two prefetches for b.
        mulss   %xmm4, %xmm5
        addq    $8, %rdx
        prefetcht0      96(%r11)
        prefetcht0      100(%r11)
        subss   %xmm2, %xmm1
at amd dot com
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=44955
--- Comment #4 from changpeng dot fang at amd dot com 2010-07-15 01:50
---
Created an attachment (id=21205)
--> (http://gcc.gnu.org/bugzilla/attachment.cgi?id=21205&action=view)
Do not unroll pre and post loops
I did a quick test on polyhedron before and after apply
--- Comment #20 from changpeng dot fang at amd dot com 2010-07-09 01:59
---
I submitted a patch for review to completely fix the problem. The patch is an
extension to Christian's speedup.patch. It splits the cost analysis into
three small functions and quits further prefet
--- Comment #19 from changpeng dot fang at amd dot com 2010-07-07 19:00
---
(In reply to comment #18)
> Changpeng, should this PR be closed now?
>
No. I am still looking at the dependence computation cost. I just found that
most of the time is spent in memory allocation and free
--- Comment #3 from changpeng dot fang at amd dot com 2010-07-06 18:35
---
Here is the impact of loop unrolling on the compilation time and code size
on polyhedron test_fpu.f90:
-O3 -ftree-vectorize -fno-prefetch-loop-arrays -fno-unroll-loops:
timing: 12.62s, size: 67069 bytes
-O3
--- Comment #2 from changpeng dot fang at amd dot com 2010-07-06 17:58
---
We also need to handle the post loop of unrolling. Suppose the unroll_factor
is 16, then the post-loop should have up to 15 iterations.
--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=44794
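The arithmetic behind the "up to 15 iterations" remark can be sketched in C as follows. This is only an illustration of an unrolled main loop followed by a post-loop, not the code GCC generates; UNROLL_FACTOR and both function names are assumptions made for the example.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical unroll factor, matching the value used in the comment. */
#define UNROLL_FACTOR 16

/* Iterations left over for the post-loop once the unrolled main loop
   has consumed every full group of UNROLL_FACTOR iterations. */
size_t post_loop_iterations(size_t n)
{
    return n % UNROLL_FACTOR;
}

/* Sum n floats with a main loop that advances UNROLL_FACTOR elements
   per trip, followed by a post-loop for the remainder. */
float sum_unrolled(const float *x, size_t n)
{
    float sum = 0.0f;
    size_t i = 0;
    size_t main_end = n - post_loop_iterations(n);

    for (; i < main_end; i += UNROLL_FACTOR) {
        /* Stands in for UNROLL_FACTOR copies of the loop body. */
        for (size_t k = 0; k < UNROLL_FACTOR; k++)
            sum += x[i + k];
    }

    size_t tail = 0;
    for (; i < n; i++) {          /* post-loop */
        sum += x[i];
        tail++;
    }
    assert(tail < UNROLL_FACTOR); /* at most UNROLL_FACTOR - 1 trips */
    return sum;
}
```

With n = 31 the main loop runs once and the post-loop runs 15 times, the worst case mentioned above. Because the trip count of the post-loop is bounded this tightly, unrolling it again mostly adds code size.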
--- Comment #17 from changpeng dot fang at amd dot com 2010-07-02 23:58
---
(In reply to comment #15)
I have opened PR44794 for the unrolling of pre- and post-loop issue.
--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=44576
dot fang at amd dot com
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=44794
--- Comment #15 from changpeng dot fang at amd dot com 2010-07-01 00:34
---
Unrolling of the peeled loop is partially the reason for the test_fpu.f90
compilation time and code size increase. Vectorization peeled a few iterations
of the loop, the prefetching and unrolling passes do not
--- Comment #14 from changpeng dot fang at amd dot com 2010-06-30 00:36
---
(In reply to comment #7)
> A good chunk of time seems to be spent in the RTL loop unroller, triggered
> by array prefetching (testing with -O3 -funroll-loops). Otherwise it might
> as well be just
--- Comment #13 from changpeng dot fang at amd dot com 2010-06-30 00:23
---
Here is the current status of this work:
patch1: http://gcc.gnu.org/ml/gcc-patches/2010-06/msg02956.html
patch2: http://gcc.gnu.org/ml/gcc-patches/2010-06/msg03049.html
On my system with -O3 zero_sized_1.f90
--- Comment #12 from changpeng dot fang at amd dot com 2010-06-29 00:49
---
Created an attachment (id=21034)
--> (http://gcc.gnu.org/bugzilla/attachment.cgi?id=21034&action=view)
Early return in miss rate computation
The attached patch improves the computation of miss rate.
--- Comment #11 from changpeng dot fang at amd dot com 2010-06-29 00:07
---
I have a patch that partially fixes the problem:
http://gcc.gnu.org/ml/gcc-patches/2010-06/msg02956.html
Note that for this test case, the compile time doubled even though
I don't compute the miss rate a
--- Comment #4 from changpeng dot fang at amd dot com 2010-06-25 17:08
---
(In reply to comment #3)
> Created an attachment (id=21001)
--> (http://gcc.gnu.org/bugzilla/attachment.cgi?id=21001&action=view) [edit]
> Potential fix for compile time regression
>
> Here
--- Comment #4 from changpeng dot fang at amd dot com 2010-06-14 22:22
---
There is nothing wrong in the prefetch itself. The problem is the
__builtin_prefetch call used for the prefetch instruction. Whenever
there is a non-local label in the current function, the __builtin_prefetch
--- Comment #3 from changpeng dot fang at amd dot com 2010-06-14 18:28
---
Actually, the prefetching is for the following loop:
  for (i = 0; i < p[2]; i++)
    q[i] = 0;
I do not understand why unrolling of this loop affects other parts of
the program that has long
--- Comment #2 from changpeng dot fang at amd dot com 2010-06-11 18:45
---
Bug 39398 looks similar, but that one seems to involve exception handling
instead of setjmp.
--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=44503
--- Comment #1 from changpeng dot fang at amd dot com 2010-06-11 16:32
---
Created an attachment (id=20894)
--> (http://gcc.gnu.org/bugzilla/attachment.cgi?id=20894&action=view)
prefetching for the while loop?
--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=44503
oop-arrays
Product: gcc
Version: tree-ssa
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: c
AssignedTo: unassigned at gcc dot gnu dot org
ReportedBy: changpeng dot fang at amd dot com
http://gcc.gnu.org/bugzilla/sh
--- Comment #21 from changpeng dot fang at amd dot com 2010-06-08 16:23
---
Just for the record, non-constant step prefetching improves 459.GemsFDTD
by 5.5% (under -O3 + prefetch) on amd-linux64 systems. And the gains are
from the following set of loops:
NFT.fppized.f90:1268
--- Comment #19 from changpeng dot fang at amd dot com 2010-06-07 22:30
---
Created an attachment (id=20862)
--> (http://gcc.gnu.org/bugzilla/attachment.cgi?id=20862&action=view)
Account prefetch_mod and unroll_factor for the computation of the prefetch
count
Ooops. Attached
--- Comment #17 from changpeng dot fang at amd dot com 2010-06-07 18:37
---
(In reply to comment #15)
> Created an attachment (id=20860)
--> (http://gcc.gnu.org/bugzilla/attachment.cgi?id=20860&action=view) [edit]
> Don't consider effect of unrolling in the comp
--- Comment #16 from changpeng dot fang at amd dot com 2010-06-07 18:32
---
Created an attachment (id=20861)
--> (http://gcc.gnu.org/bugzilla/attachment.cgi?id=20861&action=view)
Limit non-constant step prefetching only to the innermost loops
--
http://gcc.gnu.org/b
--- Comment #15 from changpeng dot fang at amd dot com 2010-06-07 18:30
---
Created an attachment (id=20860)
--> (http://gcc.gnu.org/bugzilla/attachment.cgi?id=20860&action=view)
Don't consider effect of unrolling in the computation of insn-to-prefetch ratio
--
http:/
--- Comment #14 from changpeng dot fang at amd dot com 2010-06-07 18:27
---
Here is the current status of my investigation:
(1) 465.tonto regression (~9%):
The regressions mainly come from loops which have array references with both
constant (prefetch_mod = 8) and non-constant
--- Comment #3 from changpeng dot fang at amd dot com 2010-06-04 23:29
---
(In reply to comment #2)
> Interesting! What's the difference between 17 and 18?
>
> int main()
> {
> double i;
> for(i=0; i<18; i+=1); /* gcc -O3, empty loop not
--- Comment #2 from changpeng dot fang at amd dot com 2010-06-04 23:15
---
Interesting! What's the difference between 17 and 18?
int main()
{
double i;
for(i=0; i<18; i+=1); /* gcc -O3, empty loop not removed */
}
int main()
{
double i;
fo
--- Comment #13 from changpeng dot fang at amd dot com 2010-06-01 19:59
---
(In reply to comment #12)
> Ok. So I will let you continue to look into that and wait for your results?
>
> Do you have any feedback on separate.patch and its influence on performance?
>
+ f
--- Comment #11 from changpeng dot fang at amd dot com 2010-06-01 17:40
---
(In reply to comment #10)
> Created an attachment (id=20783)
--> (http://gcc.gnu.org/bugzilla/attachment.cgi?id=20783&action=view) [edit]
> experimental patch to have separa
--- Comment #9 from changpeng dot fang at amd dot com 2010-05-28 18:36
---
(In reply to comment #8)
> Looks like this is a fix to the regressions. That is, the regressions are
> actually caused by the wrong calculation. This bug could be considered fixed,
> even though pe
--- Comment #8 from changpeng dot fang at amd dot com 2010-05-28 18:30
---
(In reply to comment #4)
> Created an attachment (id=20767)
--> (http://gcc.gnu.org/bugzilla/attachment.cgi?id=20767&action=view) [edit]
> Patch that makes loop invariant prefetches backend specf
--- Comment #7 from changpeng dot fang at amd dot com 2010-05-28 16:56
---
(In reply to comment #5)
> An alternative approach might be have different values for
> prefetch-min-insn-to-mem-ratio and min-insn-to-prefetch-ratio
> depending on constant/non-constant step size.
>
--- Comment #6 from changpeng dot fang at amd dot com 2010-05-28 16:46
---
(In reply to comment #4)
> Created an attachment (id=20767)
--> (http://gcc.gnu.org/bugzilla/attachment.cgi?id=20767&action=view) [edit]
> Patch that makes loop invariant prefetches backend specfic
--- Comment #3 from changpeng dot fang at amd dot com 2010-05-27 23:51
---
I did a quick look at 434.zeusmp and found that prefetching for the following
simple loop is responsible:
linpck.f: 131:
c
c        code for increment not equal to 1
c
         ix = 1
smax = abs(sx(1
--- Comment #2 from changpeng dot fang at amd dot com 2010-05-27 20:55
---
To me, non-constant step prefetching does not seem to fit into the existing
prefetching framework. A non-constant stride prevents any reuse analysis, and
thus prefetching is done more or less blindly.
--
http://gcc.gnu.org
--- Comment #1 from changpeng dot fang at amd dot com 2010-05-27 20:49
---
The regressions are most likely from the patch that added non-constant step
prefetching:
* From: Andreas Krebbel
* To: Christian Borntraeger
* Cc: gcc-patches
* Date: Wed, 19 May 2010 12:40
dot org
ReportedBy: changpeng dot fang at amd dot com
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=44297
--- Comment #9 from changpeng dot fang at amd dot com 2010-05-24 22:47
---
(In reply to comment #8)
> -fgraphite-identity does iteration splitting for this case.
Do you know why it could not be vectorized after iteration
range splitting?
--
http://gcc.gnu.org/bugzi
--- Comment #6 from changpeng dot fang at amd dot com 2010-05-21 21:36
---
(In reply to comment #5)
> The fix introduced:
>
> FAIL: gcc.dg/tree-ssa/prefetch-7.c scan-assembler-times movnti 18
> FAIL: gcc.dg/tree-ssa/prefetch-7.c scan-tree-dump-times optimized "={nt}&
--- Comment #2 from changpeng dot fang at amd dot com 2010-05-18 19:39
---
I have a patch to fix the test cases:
http://gcc.gnu.org/ml/gcc-patches/2010-05/msg01359.html
For prefetch-6.c, patch http://gcc.gnu.org/ml/gcc-cvs/2010-05/msg00567.html
applies the insn to prefetch ratio
--- Comment #7 from changpeng dot fang at amd dot com 2010-05-07 21:41
---
(In reply to comment #4)
> (In reply to comment #3)
> > Subject: Re: gcc should vectorize this loop
> > through "iteration range splitting"
> > You mean that the prob
--- Comment #3 from changpeng dot fang at amd dot com 2010-05-07 21:33
---
I just found that the test case is the same as (similar to) bug 35229.
The subject of this bug is wrong. Scalar expansion is not appropriate
for this case.
Actually the loop can be transformed to:
void foo(int n
5.0
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: tree-optimization
AssignedTo: unassigned at gcc dot gnu dot org
ReportedBy: changpeng dot fang at amd dot com
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=43543
chf...@pathscale:~/gcc$ cat foo.c
float a[100], b[100], c[100];
void foo(int n)
{
  int i;
  for(i=1; i
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=43428
chf...@pathscale:~/gcc$ cat foo.c
float a[100][100], b[100][100];
void foo(int n)
{
  int i, j;
  for(j=0; j
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=43427
chf...@pathscale:~/gcc$ cat foo.c
int a[100], b[100];
void foo(int n, int mid)
{
  int i, t = 0;
  for(i=0; i
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=43425
chf...@pathscale:~/gcc$ cat foo.c
int a[100], b[100], c[100];
void foo(int n, int mid)
{
  int i;
  for(i=0; i
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=43423
MED
Severity: normal
Priority: P3
Component: c
AssignedTo: unassigned at gcc dot gnu dot org
ReportedBy: changpeng dot fang at amd dot com
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=43422
--- Comment #20 from changpeng dot fang at amd dot com 2010-03-18 17:24
---
(In reply to comment #19)
> Splitting critical edges for CDDCE will probably also solve this problem.
>
> Richard.
>
Yes, splitting critical edges is an enhancement to CDDCE and can solve this
pr
--- Comment #8 from changpeng dot fang at amd dot com 2010-03-17 21:22
---
Created an attachment (id=20133)
--> (http://gcc.gnu.org/bugzilla/attachment.cgi?id=20133&action=view)
patch with the testcase
--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=32824
--- Comment #18 from changpeng dot fang at amd dot com 2010-03-17 00:22
---
(In reply to comment #16)
> > In this case, the loop itself is "empty" and we can replace every use of the
> > phi with "n" (exit value of the iv).
>
> I don't thin
--- Comment #17 from changpeng dot fang at amd dot com 2010-03-17 00:18
---
(In reply to comment #8)
> And
>
> int foo (int b, int j)
> {
> if (b)
> {
> int i;
> for (i = 0; i<1000; ++i)
> ;
> j = b;
> }
> r
--- Comment #4 from changpeng dot fang at amd dot com 2010-03-02 21:56
---
I have verified that the patch proposed in bug 43209 did
fix this problem. I am going to check in the change soon.
Thanks.
--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=43238
--- Comment #5 from changpeng dot fang at amd dot com 2010-03-01 18:02
---
I have a fix for this problem. We should not decrease the cost if the cost is
infinite.
diff --git a/gcc/tree-ssa-loop-ivopts.c b/gcc/tree-ssa-loop-ivopts.c
index 74dadf7..9accda9 100644
--- a/gcc/tree-ssa-loop
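The rule stated in the comment above, that an infinite cost must not be decreased, can be illustrated with the following C sketch. It is not the patch itself and does not reproduce the data structures in tree-ssa-loop-ivopts.c; the COST_INFINITE sentinel and the decrease_cost helper are hypothetical names chosen for the example.

```c
#include <assert.h>
#include <limits.h>

/* Hypothetical sentinel standing in for an "infinite" cost. */
#define COST_INFINITE UINT_MAX

/* Apply a discount to a cost, leaving an infinite cost untouched.
   Subtracting from the sentinel would silently turn an unusable
   candidate into a merely expensive one. */
unsigned decrease_cost(unsigned cost, unsigned discount)
{
    if (cost == COST_INFINITE)
        return COST_INFINITE;
    assert(discount <= cost);   /* precondition of this sketch */
    return cost - discount;
}
```

The guard keeps the sentinel absorbing: any later comparison against COST_INFINITE still recognizes the candidate as infeasible.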
--- Comment #6 from changpeng dot fang at amd dot com 2010-02-26 19:06
---
>
> Actually it is a totally different case. Please file a new bug with that
> case;
> though there might already be a bug about that one.
>
I could not see the difference even though j i
--- Comment #4 from changpeng dot fang at amd dot com 2010-02-26 18:53
---
Here is another similar case but more general. We know that a(j) and a(i)
never access the same memory location. Intel ifort can vectorize this
triangular loop:
      do 10 j = 1,n
        do 20 i = j+1, n
--- Comment #2 from changpeng dot fang at amd dot com 2010-02-26 00:28
---
Subject: RE: gcc could not vectorize floating point
reduction statements
Thanks for pointing this out. Actually I am working on a fortran program and
found the
the reduction statement. The fortran code can
Product: gcc
Version: 4.5.0
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: middle-end
AssignedTo: unassigned at gcc dot gnu dot org
ReportedBy: changpeng dot fang at amd dot com
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=43184
Product: gcc
Version: 4.5.0
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: middle-end
AssignedTo: unassigned at gcc dot gnu dot org
ReportedBy: changpeng dot fang at amd dot com
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=43182
--- Comment #15 from changpeng dot fang at amd dot com 2010-02-16 19:54
---
Hello,
I am not sure whether CD-DCE can fully replace remove_empty_loop. However,
I would prefer to keep remove_empty_loop pass. There are two reasons for
this proposal:
(1) remove_empty_loop was at level -O1