--- Comment #23 from changpeng dot fang at amd dot com 2010-07-21 21:30
---
Fixed
--
changpeng dot fang at amd dot com changed:
What|Removed |Added
Status|NE
--
steven at gcc dot gnu dot org changed:
What|Removed |Added
Status|UNCONFIRMED |NEW
Ever Confirmed|0 |1
Last reconfir
--- Comment #22 from borntraeger at de dot ibm dot com 2010-06-08 19:42
---
I bootstrapped with patches 0002 and 0003.
The results are also good.
--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=44297
--- Comment #21 from changpeng dot fang at amd dot com 2010-06-08 16:23
---
Just for the record, non-constant step prefetching improves 459.GemsFDTD
by 5.5% (under -O3 + prefetch) on amd-linux64 systems. And the gains are
from the following set of loops:
NFT.fppized.f90:1268
NFT.fppized
--- Comment #20 from borntraeger at de dot ibm dot com 2010-06-08 05:51
---
Both patches look sane. I will test both.
Thank you for your work.
--- Comment #19 from changpeng dot fang at amd dot com 2010-06-07 22:30
---
Created an attachment (id=20862)
--> (http://gcc.gnu.org/bugzilla/attachment.cgi?id=20862&action=view)
Account prefetch_mod and unroll_factor for the computation of the prefetch
count
Oops. Attached a wrong "
--- Comment #18 from rakdver at kam dot mff dot cuni dot cz 2010-06-07 20:24
---
Subject: Re: Big spec cpu2006 prefetch regressions on gcc 4.6 on x86
> --- Comment #14 from changpeng dot fang at amd dot com 2010-06-07 18:27
> ---
> Here is the current status of my in
--- Comment #17 from changpeng dot fang at amd dot com 2010-06-07 18:37
---
(In reply to comment #15)
> Created an attachment (id=20860)
--> (http://gcc.gnu.org/bugzilla/attachment.cgi?id=20860&action=view) [edit]
> Don't consider effect of unrolling in the computation of insn-to-prefe
--- Comment #16 from changpeng dot fang at amd dot com 2010-06-07 18:32
---
Created an attachment (id=20861)
--> (http://gcc.gnu.org/bugzilla/attachment.cgi?id=20861&action=view)
Limit non-constant step prefetching only to the innermost loops
--- Comment #15 from changpeng dot fang at amd dot com 2010-06-07 18:30
---
Created an attachment (id=20860)
--> (http://gcc.gnu.org/bugzilla/attachment.cgi?id=20860&action=view)
Don't consider effect of unrolling in the computation of insn-to-prefetch ratio
--- Comment #14 from changpeng dot fang at amd dot com 2010-06-07 18:27
---
Here is the current status of my investigation:
(1) 465.tonto regression (~9%):
The regressions mainly come from loops which have array references with both
constant (prefetch_mod = 8) and non-constant (prefet
--- Comment #13 from changpeng dot fang at amd dot com 2010-06-01 19:59
---
(In reply to comment #12)
> Ok. So I will let you continue to look into that and wait for your results?
>
> Do you have any feedback on separate.patch and its influence on performance?
>
+ for (; groups; gr
--- Comment #12 from borntraeger at de dot ibm dot com 2010-06-01 19:30
---
Ok. So I will let you continue to look into that and wait for your results?
Do you have any feedback on separate.patch and its influence on performance?
--- Comment #11 from changpeng dot fang at amd dot com 2010-06-01 17:40
---
(In reply to comment #10)
> Created an attachment (id=20783)
--> (http://gcc.gnu.org/bugzilla/attachment.cgi?id=20783&action=view) [edit]
> experimental patch to have separate values for min_insn_to_prefetch_r
--- Comment #10 from borntraeger at de dot ibm dot com 2010-05-31 08:58
---
Created an attachment (id=20783)
--> (http://gcc.gnu.org/bugzilla/attachment.cgi?id=20783&action=view)
experimental patch to have separate values for min_insn_to_prefetch_ratio
Changpeng,
thank you for the f
--- Comment #9 from changpeng dot fang at amd dot com 2010-05-28 18:36
---
(In reply to comment #8)
> Looks like this is a fix to the regressions. That is, the regressions are
> actually caused by the wrong calculation. This bug could be considered fixed,
> even though performance tuni
--- Comment #8 from changpeng dot fang at amd dot com 2010-05-28 18:30
---
(In reply to comment #4)
> Created an attachment (id=20767)
--> (http://gcc.gnu.org/bugzilla/attachment.cgi?id=20767&action=view) [edit]
> Patch that makes loop invariant prefetches backend specific
>
> Three ob
--- Comment #7 from changpeng dot fang at amd dot com 2010-05-28 16:56
---
(In reply to comment #5)
> An alternative approach might be to have different values for
> prefetch-min-insn-to-mem-ratio and min-insn-to-prefetch-ratio
> depending on constant/non-constant step size.
>
It may be
--- Comment #6 from changpeng dot fang at amd dot com 2010-05-28 16:46
---
(In reply to comment #4)
> Created an attachment (id=20767)
--> (http://gcc.gnu.org/bugzilla/attachment.cgi?id=20767&action=view) [edit]
> Patch that makes loop invariant prefetches backend specific
>
Actually,
--- Comment #5 from borntraeger at de dot ibm dot com 2010-05-28 07:41
---
An alternative approach might be to have different values for
prefetch-min-insn-to-mem-ratio and min-insn-to-prefetch-ratio
depending on constant/non-constant step size.
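
The idea above can be sketched in C as follows. This is only an illustration
of choosing a threshold pair by step kind, not code from GCC or from any of
the attached patches. The struct, the function names, and all numeric values
are hypothetical and were picked purely for the example.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical pair of thresholds; names mirror the two parameters
   mentioned above, values are illustrative assumptions only. */
struct prefetch_thresholds {
    int min_insn_to_mem_ratio;
    int min_insn_to_prefetch_ratio;
};

/* Assumed example values: stricter limits for non-constant steps. */
static const struct prefetch_thresholds CONST_STEP_LIMITS = {3, 9};
static const struct prefetch_thresholds NONCONST_STEP_LIMITS = {6, 18};

/* Return true if a loop with the given counts passes both ratio
   checks for the chosen step kind (integer division, as a sketch). */
bool prefetch_is_profitable(int ninsns, int mem_ref_count,
                            int prefetch_count, bool constant_step)
{
    assert(ninsns >= 0 && mem_ref_count >= 0 && prefetch_count >= 0);

    /* Nothing to prefetch, or no memory references at all. */
    if (mem_ref_count == 0 || prefetch_count == 0)
        return false;

    struct prefetch_thresholds t =
        constant_step ? CONST_STEP_LIMITS : NONCONST_STEP_LIMITS;

    if (ninsns / mem_ref_count < t.min_insn_to_mem_ratio)
        return false;
    if (ninsns / prefetch_count < t.min_insn_to_prefetch_ratio)
        return false;
    return true;
}
```

With these made-up limits, a loop of 40 instructions, 4 memory references
and 4 prefetches would pass for constant steps but be rejected for
non-constant steps.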
--- Comment #4 from borntraeger at de dot ibm dot com 2010-05-28 07:24
---
Created an attachment (id=20767)
--> (http://gcc.gnu.org/bugzilla/attachment.cgi?id=20767&action=view)
Patch that makes loop invariant prefetches backend specific
Three observations:
1. the patch had a bug whic
--- Comment #3 from changpeng dot fang at amd dot com 2010-05-27 23:51
---
I took a quick look at 434.zeusmp and found that prefetching for the following
simple loop is responsible:
linpck.f: 131:
c
c code for increment not equal to 1
c
ix = 1
smax = abs(sx(1))
--- Comment #2 from changpeng dot fang at amd dot com 2010-05-27 20:55
---
To me, non-constant step prefetching does not seem to fit into the existing
prefetching framework. A non-constant stride prevents any reuse analysis, so
prefetches are issued more or less blindly.
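
To make the reuse argument concrete, here is a simplified model in C. It is
an assumption-laden sketch, not GCC's actual analysis: the function name and
its parameters are hypothetical. With a known step, several consecutive
iterations touch the same cache line, so one prefetch per group of iterations
is enough. With an unknown step, no such grouping can be proven.

```c
#include <assert.h>
#include <stdlib.h>

/* Simplified model: how many iterations can share one prefetch.
   cache_line_bytes: cache line size (assumed, e.g. 64).
   step_known:       nonzero if the step is a compile-time constant.
   step_bytes:       the step in bytes; ignored when step_known is 0. */
long prefetch_every_n_iterations(long cache_line_bytes, int step_known,
                                 long step_bytes)
{
    assert(cache_line_bytes > 0);

    /* Unknown stride: no reuse can be proven, so the only safe
       choice in this model is a prefetch on every iteration. */
    if (!step_known)
        return 1;

    assert(step_bytes != 0);
    long stride = labs(step_bytes);

    /* Each iteration lands on a new cache line. */
    if (stride >= cache_line_bytes)
        return 1;

    /* Several iterations fall into one line; prefetch once per line. */
    return cache_line_bytes / stride;
}
```

For example, a 64-byte line and a constant 8-byte step give one prefetch
every 8 iterations in this model, while a non-constant step gives one
prefetch in every iteration.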
--- Comment #1 from changpeng dot fang at amd dot com 2010-05-27 20:49
---
The regressions are most likely from the patch that added non-constant step
prefetching:
* From: Andreas Krebbel
* To: Christian Borntraeger
* Cc: gcc-patches
* Date: Wed, 19 May 2010 12:40:51