Re: Fold and integer types with sub-ranges

2007-02-25 Thread Duncan Sands
On Saturday 24 February 2007 14:27:36 Richard Kenner wrote:
> > Sure - I wonder if there is a reliable way of testing whether we face
> > a non-base type in the middle-end.  I suppose TREE_TYPE (type) != NULL
> > won't work in all cases... (?)
> 
> That's the right way as far as I know.

Note that having TREE_TYPE(type)!=NULL does not imply that the type and the
base type are inequivalent.  For example, if you declare a type Int as follows:
subtype Int is Integer;
then TREE_TYPE(type_for_Int)=type_for_Integer, but the types are equivalent,
in particular they have the same TYPE_MIN_VALUE and TYPE_MAX_VALUE.
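
(So a middle-end test along these lines arguably needs to compare the bounds
as well.  A minimal sketch, assuming GCC's tree.h macros; the helper name is
made up:)

  /* True if TYPE is a genuine sub-range: it has a base type AND
     bounds that differ from those of the base type.  */
  static bool
  real_subrange_p (tree type)
  {
    tree base = TREE_TYPE (type);
    if (base == NULL_TREE)
      return false;
    return !(tree_int_cst_equal (TYPE_MIN_VALUE (type), TYPE_MIN_VALUE (base))
             && tree_int_cst_equal (TYPE_MAX_VALUE (type), TYPE_MAX_VALUE (base)));
  }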

Ciao,

Duncan.


Re: Fold and integer types with sub-ranges

2007-02-25 Thread Richard Kenner
> Note that having TREE_TYPE(type)!=NULL does not imply that the type and the
> base type are inequivalent.  For example, if you declare a type Int as
> follows:
>   subtype Int is Integer;
> then TREE_TYPE(type_for_Int)=type_for_Integer, but the types are equivalent,
> in particular they have the same TYPE_MIN_VALUE and TYPE_MAX_VALUE.

True, but there's still no harm in using the base type in that case (and I
think the front end will).

I think there are two very different things that need to be tested:

(1) If fold wants to determine if it can safely remove a conversion, it needs
to explicitly test whether the constant can fit into the (sub)type; it
doesn't care whether it's a subtype of something or not.

(2) If we're *generating* arithmetic, we can check TREE_TYPE (type) and,
if nonzero, generate the conversions.  Then fold can get rid of them if they
turn out to be unnecessary.

I think the above is the simplest mechanism that is also precisely correct;
yes, we could do further checks in (2), but that seems unnecessary work: the
issue isn't whether the bounds are the same, but whether the constant fits
within the bounds.
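
(As a sketch, using middle-end helpers that exist for this purpose -- not the
actual fold code:)

  /* (1) In fold: a conversion of constant CST to TYPE is removable only
     if the constant fits the (sub)type's own bounds.  */
  if (TREE_CODE (cst) == INTEGER_CST && int_fits_type_p (cst, type))
    { /* ... safe to strip the conversion ... */ }

  /* (2) When generating arithmetic: compute in the base type when there
     is one, and let fold delete the conversions that turn out to be
     unnecessary.  */
  tree optype = TREE_TYPE (type) ? TREE_TYPE (type) : type;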


Re: spec2k comparison of gcc 4.1 and 4.2 on AMD K8

2007-02-25 Thread Daniel Berlin

On 2/24/07, Serge Belyshev <[EMAIL PROTECTED]> wrote:

I have compared the 4.1.2 release (r121943) with three revisions of 4.2 on spec2k
on a 2GHz AMD Athlon64 box (in 64-bit mode); detailed results are below.

In short, current 4.2 performs just as well as 4.1 on this target,
with the exception of a huge 80% win on 178.galgel.  All other differences
lie almost entirely in the noise.

results:

first number in each column is the runtime difference in %
between the corresponding 4.2 revision and 4.1.2 (+ is better, - is worse).

second number is a +- confidence interval, i.e. according to my results,
current 4.2 does (82.0+-1.7)% better than 4.1.2 on 178.galgel.

(note some results are clearly noisy, but I've tried hard to avoid this --
I did three runs on a completely idle machine, wasting 14 hours of machine time
in total).

r117890 -- 4.2 just before DannyB's aliasing fixes
r117891 -- 4.2 with aliasing fixes.
r122236 -- 4.2 current.



Uh, these are the wrong revisions.
117890 is correct, but 117891 was superseded by 117922, which makes
things worse than 117891 does.

That would be my guess as to why the current numbers are worse than the
second column.

In particular, 117922 is goin


Re: spec2k comparison of gcc 4.1 and 4.2 on AMD K8

2007-02-25 Thread Daniel Berlin

On 2/25/07, Daniel Berlin <[EMAIL PROTECTED]> wrote:

On 2/24/07, Serge Belyshev <[EMAIL PROTECTED]> wrote:
> I have compared the 4.1.2 release (r121943) with three revisions of 4.2 on spec2k
> on a 2GHz AMD Athlon64 box (in 64-bit mode); detailed results are below.
>
> In short, current 4.2 performs just as well as 4.1 on this target,
> with the exception of a huge 80% win on 178.galgel.  All other differences
> lie almost entirely in the noise.
>
> results:
>
> first number in each column is the runtime difference in %
> between the corresponding 4.2 revision and 4.1.2 (+ is better, - is worse).
>
> second number is a +- confidence interval, i.e. according to my results,
> current 4.2 does (82.0+-1.7)% better than 4.1.2 on 178.galgel.
>
> (note some results are clearly noisy, but I've tried hard to avoid this --
> I did three runs on a completely idle machine, wasting 14 hours of machine time
> in total).
>
> r117890 -- 4.2 just before DannyB's aliasing fixes
> r117891 -- 4.2 with aliasing fixes.
> r122236 -- 4.2 current.


Uh, these are the wrong revisions.
117890 is correct, but 117891 was superseded by 117922, which makes
things worse than 117891 does.

That would be my guess as to why the current numbers are worse than the
second column.

In particular, 117922 is goin


grrr.
117922 is going to make all nonlocal loads and stores link together,
and 117891 will not.




Successful GCC 4.1.2 i686-pc-mingw32 build+install

2007-02-25 Thread JohnE / TDM

$ config.guess
i686-pc-mingw32

$ gcc -v
Using built-in specs.
Target: i686-pc-mingw32
Configured with: ../gcc-4.1.2/configure --prefix=/mingw --enable-threads
--disable-nls --enable-languages=c,c++ --disable-win32-registry
Thread model: win32
gcc version 4.1.2

$ uname -a
MINGW32_NT-5.1 1.0.10(0.46/3/2) 2004-03-15 07:17 i686 unknown


Build command line:
make CFLAGS="-O2 -fomit-frame-pointer" CXXFLAGS="-mthreads 
-fno-omit-frame-pointer -O2" LDFLAGS=-s bootstrap 2>errlog.txt


Host system:
Win XP SP2, MinGW GCC 3.4.5, mSYS 1.0.10

With MinGW/GCC 3.4.5 as the host, ORIGINAL_LD_FOR_TARGET was incorrectly 
set in gcc/Makefile; once changed, the build was successful. (This error 
also occurs when building GCC 4.1.1 with MinGW/GCC 3.4.5.) Using GCC 
4.1.1 as the host, this error does not occur.


Re: spec2k comparison of gcc 4.1 and 4.2 on AMD K8

2007-02-25 Thread Vladimir N. Makarov

Serge Belyshev wrote:


I have compared the 4.1.2 release (r121943) with three revisions of 4.2 on spec2k
on a 2GHz AMD Athlon64 box (in 64-bit mode); detailed results are below.

In short, current 4.2 performs just as well as 4.1 on this target,
with the exception of a huge 80% win on 178.galgel.  All other differences
lie almost entirely in the noise.

results:

first number in each column is the runtime difference in %
between the corresponding 4.2 revision and 4.1.2 (+ is better, - is worse).

second number is a +- confidence interval, i.e. according to my results,
current 4.2 does (82.0+-1.7)% better than 4.1.2 on 178.galgel.

(note some results are clearly noisy, but I've tried hard to avoid this --
I did three runs on a completely idle machine, wasting 14 hours of machine time
in total).

I run SPEC2000 several times per week and always look at 3 runs (to be
sure that nothing went wrong), but I have never seen such big
"confidence" intervals (as I understand it, that is the difference between
the max and min of 3 runs divided by the score).  Although I should
acknowledge that I have never run SPEC2000 on AMD machines, and some
processors generate less "confident" intervals.  There are tests like art
for which the difference between min and max can be big, but the geometric
mean makes the effect of such differences smaller in the overall score.
If the machine has only 512 MB of memory (even though they write that it
is enough for SPEC2000), the scores for some benchmark programs may be
unstable.  Also, if the middle score (of 3 runs) for base or peak is
bigger on one program even though the best (max) scores for peak or base
are the same, usually the opposite happens on another benchmark program,
so that also makes the overall score smoother.


So I trust the overall SPEC2000 score, and by my evaluation the measurement
error of the overall score for Core2 Duo (which I usually use for SPEC2000)
is +-0.3%.  It would be better if you posted them (but probably something
went wrong on your machine during the run).


In any case, I must say you did a really big job -- thank you.


r117890 -- 4.2 just before DannyB's aliasing fixes
r117891 -- 4.2 with aliasing fixes.
r122236 -- 4.2 current.

CINT2000        r117890     r117891     r122236

164.gzip        -4.2 1.7    -4.2 1.2    -4.0 1.3
175.vpr          1.7 2.6     1.4 2.3     1.1 2.5
176.gcc         -0.5 0.8    -0.8 1.1    -1.2 4.0
181.mcf         -0.4 2.0    -0.1 2.1    -0.6 2.7
186.crafty      -0.4 6.4    -1.3 7.0     0.8 4.4
197.parser       0.7 1.3     0.8 1.5    -0.3 1.6
252.eon          8.8 3.7    10.6 9.4     6.9 4.7
253.perlbmk      2.7 1.0     3.4 1.4     3.0 1.9
254.gap         -0.6 0.5    -0.5 0.4    -0.4 0.6
255.vortex       1.3 0.9     1.2 1.2     1.4 1.1
256.bzip2        0.6 1.6     0.9 1.6     0.4 1.7
300.twolf        0.1 4.5     0.8 1.4    -0.6 2.0


CFP2000         r117890     r117891     r122236

168.wupwise      0.2 22.0    0.1 22.1    2.2 13.6
171.swim        -0.1 0.7    -0.3 0.1    -0.3 0.2
172.mgrid       -6.3 0.4    -6.1 0.4    -6.6 0.3
173.applu       -0.1 0.8     0.1 0.9    -0.4 0.1
177.mesa         6.9 15.1    7.2 15.1    3.9 5.3
178.galgel      80.8 1.7    80.9 2.0    82.0 1.7
179.art          0.8 8.9    -1.6 8.1    -0.3 5.1
183.equake      -0.9 1.0    -0.8 0.9    -0.9 0.9
187.facerec      2.7 0.7     2.9 0.8     3.0 0.6
188.ammp        -0.4 0.5    -0.1 1.0    -0.5 0.7
189.lucas       -0.8 0.5    -0.7 0.6    -0.4 0.6
191.fma3d        1.1 2.1    -0.9 2.3    -1.0 2.2
200.sixtrack    -0.7 0.4    -0.7 0.5    -1.3 0.4
301.apsi        -3.0 1.4    -2.7 1.1    -3.1 0.3


remarks:

1. The big jump on 178.galgel can be seen here too:
   http://www.suse.de/~aj/SPEC/amd64/CFP/sandbox-britten/178_galgel_big.png

2. Even though I did three runs, most of the difference is noise,
   which means that one should treat single-run spec results with a *big*
   grain of salt.

3. On this AMD K8 machine the difference between 4.2 with aliasing fixes
   and 4.2 w/o aliasing fixes lies completely in the noise (modulo a small
   2% 191.fma3d regression).





Re: spec2k comparison of gcc 4.1 and 4.2 on AMD K8

2007-02-25 Thread Serge Belyshev
"Vladimir N. Makarov" <[EMAIL PROTECTED]> writes:

> I run SPEC2000 several times per week and always look at 3 runs (to be
> sure that nothing went wrong), but I have never seen such big
> "confidence" intervals (as I understand it, that is the difference between
> the max and min of 3 runs divided by the score). [...]

No, it is much more complex than that.  I've used the generally accepted
definition of a confidence interval, see
http://en.wikipedia.org/wiki/Confidence_interval
which basically says that with 95% probability (the confidence level I've
chosen) the true value lies in this interval.

I've used a conservative estimate of the confidence intervals in this case
because I didn't assume a Gaussian distribution of the numbers which I
reported as the difference between two run times, and this estimate is
somewhat bigger than the difference between max and min of 3 runs :)
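
(For comparison -- this is just the textbook Student-t version, not the
estimator I actually used: with n = 3 runs, sample mean m and sample
standard deviation s, the 95% interval would be

    m +- t(0.975, df=2) * s / sqrt(3)  ~=  m +- 4.30 * s / sqrt(3),

and dropping the Gaussian assumption can only widen it.)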

> [...] If the machine has only 512 MB of memory (even though they
> write that it is enough for SPEC2000), the scores for some benchmark
> programs may be unstable.  [...]

My box is equipped with 2 gigs of RAM, so I believe this is not the case.
Also, the computer was *absolutely* idle when it was running spec2k
(booted with init=/bin/sh, with no other processes running).

And no,
> [...] acknowledge that I have never run SPEC2000 on AMD machines, and some
> processors generate less "confident" intervals. [...]
this is not the case, I'm absolutely sure.


Re: GCC 4.3 Stage 1 project survey

2007-02-25 Thread Dorit Nuzman
>
> I've been trying to keep the GCC_4.3_Release_Planning wiki page up to
> date, and I'd like to be sure I haven't missed anything going in.  I've
> moved several projects to the Completed section, and if I've done
> anything in error there, please correct it.
>
> So here's a survey of what's left:
>
> * AutovectBranchOptimizations
>
>   The first two subprojects seem to have been checked in, and the other
>   seven haven't.  It's hard to tell which of these subprojects are
>   targeted at stage 1 and which at stage 2.
>

The first three subprojects (1.1, 1.2, 2.1) have been checked in.
The following three subprojects (2.2, 2.3, 2.4) were targeted at stage 1,
AFAIK.
  Victor, Mircea - what is the status of those?
The last three subprojects (3.1, 3.2, 3.3) were targeted at (no earlier
than :-) stage 2.

thanks,
dorit



Re: spec2k comparison of gcc 4.1 and 4.2 on AMD K8

2007-02-25 Thread Jan Hubicka
> "Vladimir N. Makarov" <[EMAIL PROTECTED]> writes:
> 
> > I run SPEC2000 several times per week and always look at 3 runs (to be
> > sure that is nothing wrong happened) but I never saw such big
> > "confidence" intervals (as I understand that is difference between max
> > and min of 3 runs divided by the score). [...]
> 
> No, it is much more complex than that, I've used generally accepted
> definition of a confidence interval, see 
> http://en.wikipedia.org/wiki/Confidence_interval
> which basically tells that with 95% probabilty (the confidence level I've 
> choosed)
> true value lies in this interval.
> 
> I've used conservative estimate of confidence intervals in this case
> because I didn't assume gaussian distribution of numbers which I
> reported as difference between two run times, and this estimate is somewhat
> bigger than difference between max and min of 3 runs :)
> 
> > [...] If the machine has only 512 Mb memory (even they
> > write that it is enough for SPEC2000), the scores for some benchmark
> > programs may be unstable.  [...]
> 
> My box is equipped with 2Gigs of RAM so I believe this is not the case,
> Also the computer was *absolutely* idle when it was running spec2k.
> (booted with init=/bin/sh and no other processes were running).
> 
> And no,
> > [...] acknowledge that I never ran SPEC2000 on AMD machines and some
> > processors generates less "confident intervals". [...]
> this is not the case, I'm absolutely sure.

I am running SPEC on both AMD and Intel machines quite commonly and I
must say that there seems to be a difference between the two.  For P4
and Core I get results within something like 1-2 SPEC points (0.1%) of the
overall SPEC score; for Athlon I was never able to get so close -- the
difference tends to be up to one percent, which is often more than the
expected speedup I am looking for.

Of course it might be a property of the boxes I have, but there is no
difference in the setup of those machines; it just seems to happen
this way.  Running the tests more times in sequence tends to stabilize
the Athlon results, so what I often do is simply configure the peak runs to
do something interesting and use the same base runs, since peak scores tend
to be slightly better than base scores even for identical binaries.
(That makes development easier, but not GCC better :)

Honza


Re: spec2k comparison of gcc 4.1 and 4.2 on AMD K8

2007-02-25 Thread Vladimir N. Makarov

Serge Belyshev wrote:

> "Vladimir N. Makarov" <[EMAIL PROTECTED]> writes:
>
> > I run SPEC2000 several times per week and always look at 3 runs (to be
> > sure that nothing went wrong), but I have never seen such big
> > "confidence" intervals (as I understand it, that is the difference between
> > the max and min of 3 runs divided by the score). [...]
>
> No, it is much more complex than that.  I've used the generally accepted
> definition of a confidence interval, see
> http://en.wikipedia.org/wiki/Confidence_interval
> which basically says that with 95% probability (the confidence level I've
> chosen) the true value lies in this interval.
>
> I've used a conservative estimate of the confidence intervals in this case
> because I didn't assume a Gaussian distribution of the numbers which I
> reported as the difference between two run times, and this estimate is
> somewhat bigger than the difference between max and min of 3 runs :)

Well, you should have written all this the first time -- that you chose a
95% confidence level (although it is the most widely used probability for
confidence intervals) and did not assume a normal distribution -- to let
people interpret all these numbers better.  Not all people know statistics,
or they studied it long ago.

Still, the numbers are a bit useless, at least for me.  What I (and I guess
most people) wanted was just the overall score (with the confidence
interval if you want, although it would take more time).


On the other hand, it is good that you calculated and wrote the confidence
intervals and that I asked about them.  Now I understand that the machine
(or maybe all AMD machines, according to Jan) cannot be used to check gcc
performance progress.  According to Proebsting's Law (a pseudo-law which is
the analog of Moore's Law for compilers), compilers generate 2x better code
every 18 years, i.e. less than 4% on average every year.  Actually our
progress from one release to another is sometimes less.  So I need a better
tool to measure progress over a small interval of time (a few months).
Fortunately I have it (a Core2 machine; my itanium and ppc machines are
also accurate, but not as accurate as the Core2).

I know a lot of people say -- and I agree with them -- that there are
different benchmarks and that benchmarking is evil.  But it is better to
have some rules than none.  I like SPEC because it is the most acknowledged
benchmark in the compiler world and it is not easy to cheat (as opposed to
choosing one random benchmark like Whetstone and reporting an improvement).



> > [...] If the machine has only 512 MB of memory (even though they
> > write that it is enough for SPEC2000), the scores for some benchmark
> > programs may be unstable.  [...]
>
> My box is equipped with 2 gigs of RAM, so I believe this is not the case.
> Also, the computer was *absolutely* idle when it was running spec2k
> (booted with init=/bin/sh, with no other processes running).

It might be the motherboard, the chipset, or any number of other parameters
that make this machine unsuitable for tracking gcc performance progress.




Re: spec2k comparison of gcc 4.1 and 4.2 on AMD K8

2007-02-25 Thread Vladimir N. Makarov

Jan Hubicka wrote:


> I am running SPEC on both AMD and Intel machines quite commonly and I
> must say that there seems to be a difference between the two.  For P4
> and Core I get results within something like 1-2 SPEC points (0.1%) of the
> overall SPEC score; for Athlon I was never able to get so close -- the
> difference tends to be up to one percent, which is often more than the
> expected speedup I am looking for.
>
> Of course it might be a property of the boxes I have, but there is no
> difference in the setup of those machines; it just seems to happen
> this way.  Running the tests more times in sequence tends to stabilize
> the Athlon results, so what I often do is simply configure the peak runs to
> do something interesting and use the same base runs, since peak scores tend
> to be slightly better than base scores even for identical binaries.
> (That makes development easier, but not GCC better :)

Interesting, Jan.  I did not know that AMD machines are so unstable.



Porting GCC to new architecture

2007-02-25 Thread Alexandre Tzannes
Hi,
I'm Alex Tzannes and I am porting GCC 4.0.2 to a new experimental parallel
architecture.  Here's one issue I don't know how to approach.  The ISA of
our machine is based on MIPS (so I made a copy of the MIPS back end and
have been modifying that).  One conceptual difference is that the machine
can be either in serial mode or in parallel mode.  The switch from serial
to parallel happens with a 'spawn' instruction, and the switch from
parallel to serial with a 'join' instruction.  This is important because
the ISA instructions that can be used in serial and in parallel mode are
different!

For example, let's say mvtg is an instruction that can only be used in
'serial mode'.  I currently have a template in the back end (.md file) that
generates that instruction when it is needed.  The only thing I don't know
how to check is that it is only generated in serial mode.

Instruction attributes do not seem like the solution, because there seems
to be no way to set them other than in the RTL rules in the .md file
(which is too late; I would want to read the information at that point).

Is there a way to annotate each RTL instruction and to check that during
code generation?  Has a similar problem been solved in some other
back end?
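
(One direction I've been considering -- just a sketch with made-up names;
the pattern, UNSPEC_MVTG and the mymips_serial_mode_p flag don't exist in
the real MIPS port: keep a backend global that tracks the current mode,
flip it where spawn/join are emitted, and gate the insn on it through the
insn condition.)

;; Hypothetical sketch; assumes "int mymips_serial_mode_p;" defined in
;; the C side of the port and flipped when spawn/join are generated.
(define_insn "*mvtg_serial"
  [(set (match_operand:SI 0 "register_operand" "=d")
        (unspec:SI [(match_operand:SI 1 "register_operand" "d")]
                   UNSPEC_MVTG))]
  "mymips_serial_mode_p"
  "mvtg\t%0,%1")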

Thank you,
Alex


Re: spec2k comparison of gcc 4.1 and 4.2 on AMD K8

2007-02-25 Thread Jan Hubicka
> Jan Hubicka wrote:
>
> > I am running SPEC on both AMD and Intel machines quite commonly and I
> > must say that there seems to be a difference between the two.  For P4
> > and Core I get results within something like 1-2 SPEC points (0.1%) of the
> > overall SPEC score; for Athlon I was never able to get so close -- the
> > difference tends to be up to one percent, which is often more than the
> > expected speedup I am looking for.
> >
> > Of course it might be a property of the boxes I have, but there is no
> > difference in the setup of those machines; it just seems to happen
> > this way.  Running the tests more times in sequence tends to stabilize
> > the Athlon results, so what I often do is simply configure the peak runs to
> > do something interesting and use the same base runs, since peak scores tend
> > to be slightly better than base scores even for identical binaries.
> > (That makes development easier, but not GCC better :)
>
> Interesting, Jan.  I did not know that AMD machines are so unstable.

Well, rather than unstable, they seem to be more memory-layout sensitive,
I would say.  (The differences are more or less reproducible, not
completely random, but independent of the binary itself; I can't think of
much else than memory layout to cause it.)  I always wondered whether
things like page coloring would have a chance of reducing this noise, but
I never actually got around to trying it.

Honza


RS6000 call pattern clobbers

2007-02-25 Thread David Edelsohn
Richard,

While fixing ports in preparation for the new dataflow
infrastructure, we found a problem with the way that the rs6000 port
represents clobbers and uses of registers in call and sibcall patterns.
The patterns clobber and use the rs6000 link register as a match_scratch
with constraint of the link register class:

   (clobber (match_scratch:SI 0 "=l"))

instead of clobbering the link register hard register directly in the
early insn generation.  This style dates to the original rs6000 port.  A
naked use that starts as a pseudo causes problems for dataflow.
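
(Concretely, we would have expected the pattern to clobber the hard
register directly, something like the following -- LR_REGNO here is just a
placeholder for the link register's hard register number, not necessarily
the macro the port uses:)

   (clobber (reg:SI LR_REGNO))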

Do you remember why you wrote the call patterns this way?  Was
there a problem with reload and clobbers of hard registers in a register
class containing a single register or some other historical quirk?

Thanks, David



Strange behavior for scan-tree-dump testing

2007-02-25 Thread Mohamed Shafi

Hello all,

I added a few testcases to the existing testsuite in gcc 4.1.1 for a
private target.
After running the testsuite I found out that all my test cases with
scan-tree-dump testing failed in one particular situation.
The values are scanned from the gimple tree dump, and it fails for cases like

b4 = 6.3e+1
c1 = 1.345286102294921875e+0

but it was not failing for other values in the same tree dump, which
has values like

some_identifier = x.xe-1
some_identifier = x.xe0

The failures occur only when the tree dump values are positive and
represented in the above format.  I checked the tree dumps manually and
found that all the values are proper, and the scan lines in the test cases
are also proper.  This is the way I used them:

/* { dg-final { scan-tree-dump "b4 = 6.3e+1" "gimple" } } */

Why this behavior?  For positive values, should I be writing them in
some other way?
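
(For reference, a likely cause: scan-tree-dump patterns are Tcl regular
expressions, so '.' and '+' are special characters -- as a regex, "e+1"
means "one or more 'e' followed by '1'", which never matches the literal
text "e+1".  If so, escaping the pattern should presumably fix it:)

/* { dg-final { scan-tree-dump "b4 = 6\\.3e\\+1" "gimple" } } */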

One other question: I am getting "test for excess errors" FAILs for some
cases which produce a lot of warnings but are otherwise proper.

Can anyone help me?

Thanks in advance.

Regards,
Shafi.