--- Comment #16 from jv244 at cam dot ac dot uk 2009-09-29 18:59 ---
Since graphite should be able to fix this PR, I tried it, without luck:
> gfortran -ffast-math -O3 -march=native -fgraphite -floop-interchange
> -floop-block test.f90
test.f90: In function MAIN__:
test.f90:1:0: sorry
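
For readers unfamiliar with the flags above: -floop-interchange asks GCC to swap the nesting order of loops so that the innermost loop walks memory with unit stride. Below is a minimal sketch in C of that transformation done by hand. It is not the Fortran test case from this PR (which is not shown in this thread); it assumes a dense square matrix multiply in row-major storage, and the function names matmul_ijk and matmul_ikj are made up for illustration.

```c
#include <stddef.h>

/* Naive i-j-k order: the innermost loop reads b with stride n. */
void matmul_ijk(size_t n, const double *a, const double *b, double *c)
{
    for (size_t i = 0; i < n; i++) {
        for (size_t j = 0; j < n; j++) {
            double sum = 0.0;
            for (size_t k = 0; k < n; k++)
                sum += a[i * n + k] * b[k * n + j];
            c[i * n + j] = sum;
        }
    }
}

/* Interchanged i-k-j order: the innermost loop reads b and updates c
   with unit stride, which is friendlier to the cache and the vectorizer. */
void matmul_ikj(size_t n, const double *a, const double *b, double *c)
{
    for (size_t i = 0; i < n * n; i++)
        c[i] = 0.0;

    for (size_t i = 0; i < n; i++) {
        for (size_t k = 0; k < n; k++) {
            double aik = a[i * n + k];
            for (size_t j = 0; j < n; j++)
                c[i * n + j] += aik * b[k * n + j];
        }
    }
}
```

Both versions compute the same product; only the order in which the terms are visited differs. Fortran arrays are column-major, so the profitable loop order in the original test case would be the mirror image of the one shown here.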
--- Comment #15 from jv244 at cam dot ac dot uk 2007-07-03 18:09 ---
Current gfortran trunk is still about a factor of 8 slower than ifort:
> gfortran -O3 -ffast-math -ftree-vectorize -march=native test.f90
> ./a.out
12.98081100010.23998
> ifort -xT -O2 test.f90
>
--- Comment #14 from steven at gcc dot gnu dot org 2005-10-07 21:21 ---
I don't have time to work on these (new job), so unassigning.
--
steven at gcc dot gnu dot org changed:
What|Removed |Added
--- Additional Comments From dberlin at gcc dot gnu dot org 2005-01-28
17:22 ---
Subject: Re: missing transformations lead to
poorly optimized code
On Fri, 28 Jan 2005, jv244 at cam dot ac dot uk wrote:
>
> --- Additional Comments From jv244 at cam dot ac dot uk 2005-01-28 16:
--- Additional Comments From jv244 at cam dot ac dot uk 2005-01-28 16:31
---
> You could try "gfortran -O3 -mtune=pentium4 -ffast-math -mfpmath=sse
> -ftree-loop-linear -ftree-vectorize yourcode.f90" and see if it helps.
Unhappily, seems to make things slower:
multgen/basic_mult> g
--- Additional Comments From steven at gcc dot gnu dot org 2005-01-28
16:23 ---
The -xN you add makes ifort specialize the code for Pentium 4. So far,
nobody has cared to make GCC produce good code for the good old Pentium 4
so I would not be terribly surprised if we lose a lot just on t
--- Additional Comments From jv244 at cam dot ac dot uk 2005-01-28 15:59
---
Hi Steven, I now (gcc version 4.0.0 20050128 (experimental)) get the following,
where the first number is the timing.
multgen/basic_mult> gfortran -O3 -ffast-math mult.f90
multgen/basic_mult> ./a.out
59.030
--- Additional Comments From steven at gcc dot gnu dot org 2005-01-23
20:00 ---
Joost, could you try this with CVS head? We should do a lot better
now. Could you also show the code ifc produces for your test case?
Maybe they have some option enabled by default that we have disabled
--
Bug 14741 depends on bug 19464, which changed state.
Bug 19464 Summary: [3.3/3.4/4.0 Regression] gcse causes poor register allocation
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=19464
What|Old Value |New Value
--- Additional Comments From steven at gcc dot gnu dot org 2005-01-23
13:31 ---
My patch for PR19464 will fix this.
--
What|Removed |Added
BugsThisDependsOn|
--- Additional Comments From pinskia at gcc dot gnu dot org 2005-01-19
23:38 ---
We now get:
L33:
lfd f13,0(r11)
add r11,r11,r8
lfd f0,0(r10)
addi r10,r10,8
fmadd f0,f13,f0,f12
fmr f12,f0
bdnz L33
Which is much better, thanks Zdene
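
Reading the assembly above: each iteration loads one double through r11 (which then advances by the stride in r8), loads one double through r10 (which advances by 8 bytes, i.e. one element), and accumulates their product into f12 with fmadd. A rough C equivalent is sketched below. The function name dot_strided and its parameters are hypothetical, and the stride is counted in elements here rather than in bytes as in the assembly.

```c
#include <stddef.h>

/* Dot product of a strided vector a with a contiguous vector b. */
double dot_strided(const double *a, ptrdiff_t stride,
                   const double *b, size_t n)
{
    double acc = 0.0;                /* plays the role of f12 */

    for (size_t i = 0; i < n; i++) { /* bdnz counts the iterations down */
        acc = (*a) * (*b) + acc;     /* fmadd f0,f13,f0,f12; fmr f12,f0 */
        a += stride;                 /* add  r11,r11,r8 */
        b += 1;                      /* addi r10,r10,8  */
    }
    return acc;
}
```

Note that fmadd is a fused multiply-add, so the hardware result can differ in the last bit from a C compiler that emits a separate multiply and add.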
--- Additional Comments From rakdver at gcc dot gnu dot org 2005-01-18
11:35 ---
The relevant part of the code looks like this:
do
{
k_2 = phi(...,k_1);
k_1 = k_2 + 1;
} while (k_2 != endvalue);
/* k_1 unused outside of the loop */
Ivopts decides that it makes more sense to perform