Re: Rant about ChangeLog entries and commit messages
On Sun, 2007-12-02 at 21:36 +0100, Eric Botcazou wrote:
> > I'd go even further, and say if the GNU coding standards say we
> > shouldn't be putting descriptions of why we are changing things in the
> > ChangeLog, then they should be changed and should be ignored on this
> > point until they do. Pointing to them as if they are The One True
> > Way seems very suspect to me. After all, how else would they ever
> > improve if nobody tries anything different?
>
> The people who wrote them presumably thought about these issues, too.

Unfortunately they didn't document the "why", just the "what"!

Tim Josling
Re: Rant about ChangeLog entries and commit messages
On Sun, 2007-12-02 at 09:26 -0500, Robert Kiesling wrote:
> > I guess nobody really loves writing ChangeLog entries, but in my opinion
> > they are quite effective "executive summaries" for the patches and
> > helpful to the reader/reviewer. Please let's not throw the baby out
> > with the bathwater.
>
> If there's a mechanism to filter checkin messages to ChangeLog summaries,
> I would be happy to use it - in cases of multiple packages, especially,
> it's important to know what changes were made, when, and when the changes
> propagated through packages and releases, and where they got to,
> occasionally. Anybody know of a useful, built-in mechanism for this task?

Personally I find it slow and inefficient tracing through why a given change was made. It is just a slow process searching, and sometimes I don't bother because it is so inconvenient. The ChangeLog entries provide little help, and there does not seem to be a good alternative. If there is a good alternative, no-one has said what it is so far.

As people have pointed out, the RCSs pretty well cover the "what" these days. And writing ChangeLog entries, which largely duplicate this information, is time-consuming and tedious. And they are of little to no value, to me at least.

The coding standards do allow, in some cases, that giving some context would be useful:

> See also what the GNU Coding Standards have to say about what goes in
> ChangeLogs; in particular, descriptions of the purpose of code and
> changes should go in comments rather than the ChangeLog, though a
> single line overall description of the changes may be useful above the
> ChangeLog entry for a large batch of changes.

I personally would strongly favour each ChangeLog entry having a single line of context. This could be the PR number, or a single line giving the purpose of the change or what bigger change it is part of.
As pointed out by Zack Weinberg in his paper "A Maintenance Programmer's View of GCC", there are many impediments to contributing to GCC. http://www.linux.org.uk/~ajh/gcc/gccsummit-2003-proceedings.pdf Things are not much better than they were when Zack wrote his paper. This small change would be one positive step in the right direction, IMHO. Tim Josling
Re: Build failure in dwarf2out
Paul Thomas wrote: I am being hit by this:

rf2out.c -o dwarf2out.o
../../trunk/gcc/dwarf2out.c: In function `file_name_acquire':
../../trunk/gcc/dwarf2out.c:7672: error: `files' undeclared (first use in this function)
../../trunk/gcc/dwarf2out.c:7672: error: (Each undeclared identifier is reported only once
../../trunk/gcc/dwarf2out.c:7672: error: for each function it appears in.)
../../trunk/gcc/dwarf2out.c:7672: error: `i' undeclared (first use in this function)

My guess is that the #define activating that region of code is erroneously triggered. I am running the 2-day (on cygwin with a substandard BIOS) testsuite now.
Re: Call to arms: testsuite failures on various targets
FX Coudert wrote: Hi all, I reviewed this afternoon the postings from the gcc-testresults mailing-list for the past month, and we have a couple of gfortran testsuite failures showing up on various targets. Could people with access to said targets (possibly maintainers) please file PRs in bugzilla for each testcase, reporting the error message and/or backtrace? (I'd be happy to be added to the Cc list of these)

* ia64-suse-linux-gnu: gfortran.dg/vect/vect-4.f90
FAIL: gfortran.dg/vect/vect-4.f90 -O scan-tree-dump-times Alignment of access forced using peeling 1
FAIL: gfortran.dg/vect/vect-4.f90 -O scan-tree-dump-times Vectorizing an unaligned access 1

This happens on all reported ia64 targets, including mine. What is expected here? There is no vectorization on ia64, no reason for peeling. The compilation has no problem, and there is no report generated. As far as I know, the vectorization options are ignored. Without unrolling, of course, gfortran doesn't optimize the loop at all, but I assume that's a different question.
Re: Call to arms: testsuite failures on various targets
-tree-dump-times vectorized 1 loops 1
FAIL: gcc.dg/vect/vect-iv-4.c scan-tree-dump-times vectorized 1 loops 1
FAIL: gcc.dg/vect/vect-iv-9.c scan-tree-dump-times vectorized 1 loops 2
FAIL: gcc.dg/vect/vect-reduc-dot-s16b.c scan-tree-dump-times vectorized 1 loops 0
FAIL: gcc.dg/vect/vect-reduc-dot-u16b.c scan-tree-dump-times vectorized 1 loops 1
FAIL: gcc.dg/vect/vect-reduc-pattern-1a.c scan-tree-dump-times vectorized 1 loops 0
FAIL: gcc.dg/vect/vect-reduc-pattern-1c.c scan-tree-dump-times vectorized 1 loops 0
FAIL: gcc.dg/vect/vect-reduc-pattern-2a.c scan-tree-dump-times vectorized 1 loops 0
FAIL: gcc.dg/vect/vect-widen-mult-u16.c scan-tree-dump-times vectorized 1 loops 1
FAIL: gcc.dg/vect/wrapv-vect-reduc-pattern-2c.c scan-tree-dump-times vectorized 1 loops 0
FAIL: gcc.dg/vect/no-section-anchors-vect-69.c scan-tree-dump-times Alignment of access forced using peeling 3

=== gcc Summary ===
# of expected passes 42231
# of unexpected failures 23
# of unexpected successes 2
# of expected failures 155
# of unresolved testcases 2
# of untested testcases 28
# of unsupported tests 374
/home/tim/src/gcc-4.3-20070413/ia64/gcc/xgcc version 4.3.0 20070413 (experimental)

=== gfortran tests ===
Running target unix

=== gfortran Summary ===
# of expected passes 17438
# of expected failures 13
# of unsupported tests 20
/home/tim/src/gcc-4.3-20070413/ia64/gcc/testsuite/gfortran/../../gfortran version 4.3.0 20070413 (experimental)

=== g++ tests ===
Running target unix
FAIL: g++.dg/tree-prof/indir-call-prof.C scan-tree-dump Indirect call -> direct call.* AA transformation on insn
FAIL: g++.dg/tree-prof/indir-call-prof.C scan-tree-dump Indirect call -> direct call.* AA transformation on insn
FAIL: g++.dg/tree-prof/indir-call-prof.C scan-tree-dump Indirect call -> direct call.* AA transformation on insn
FAIL: g++.dg/tree-prof/indir-call-prof.C scan-tree-dump Indirect call -> direct call.* AA transformation on insn
FAIL: g++.dg/tree-prof/indir-call-prof.C scan-tree-dump Indirect call ->
direct call.* AA transformation on insn
FAIL: g++.dg/tree-prof/indir-call-prof.C scan-tree-dump Indirect call -> direct call.* AA transformation on insn
FAIL: g++.dg/tree-prof/indir-call-prof.C scan-tree-dump Indirect call -> direct call.* AA transformation on insn

=== g++ Summary ===
# of expected passes 13739
# of unexpected failures 7
# of expected failures 79
# of unsupported tests 119
/home/tim/src/gcc-4.3-20070413/ia64/gcc/testsuite/g++/../../g++ version 4.3.0 20070413 (experimental)

=== objc tests ===
Running target unix

=== objc Summary ===
# of expected passes 1810
# of expected failures 7
# of unsupported tests 25
/home/tim/src/gcc-4.3-20070413/ia64/gcc/xgcc version 4.3.0 20070413 (experimental)

=== libgomp tests ===
Running target unix

=== libgomp Summary ===
# of expected passes 1566

=== libstdc++ tests ===
Running target unix
XPASS: 26_numerics/headers/cmath/c99_classification_macros_c.cc (test for excess errors)
XPASS: 27_io/fpos/14320-1.cc execution test

=== libstdc++ Summary ===
# of expected passes 4859
# of unexpected successes 2
# of expected failures 27

Compiler version: 4.3.0 20070413 (experimental)
Platform: ia64-unknown-linux-gnu
configure flags: --enable-languages='c c++ fortran objc' --enable-bootstrap --enable-maintainer-mode --disable-libmudflap --prefix=/usr/local/gcc43
EOF
Mail -s "Results for 4.3.0 20070413 (experimental) testsuite on ia64-unknown-linux-gnu" [EMAIL PROTECTED] &&
Re: Where is gstdint.h
[EMAIL PROTECTED] wrote: Where is gstdint.h? Does it actually exist? libdecnumber seems to use it. decimal32|64|128.h's include decNumber.h, which includes deccontext.h, which includes gstdint.h. When you configure libdecnumber (e.g. by running the top-level gcc configure), gstdint.h should be created, by modifying . Since you said nothing about the conditions where you had a problem, you can't expect anyone to fix it for you. If you do want it fixed, you should at least file a complete PR. As it is more likely to happen with a poorly supported target, you may have to look into it in more detail than that. When this happened to me, I simply made a copy of stdint.h to get over the hump.
Re: Where is gstdint.h
[EMAIL PROTECTED] wrote: [EMAIL PROTECTED] wrote: Where is gstdint.h? Does it actually exist? libdecnumber seems to use it. decimal32|64|128.h's include decNumber.h, which includes deccontext.h, which includes gstdint.h. When you configure libdecnumber (e.g. by running the top-level gcc configure), gstdint.h should be created, by modifying . Since you said nothing about the conditions where you had a problem, you can't expect anyone to fix it for you. If you do want it fixed, you should at least file a complete PR. As it is more likely to happen with a poorly supported target, you may have to look into it in more detail than that. When this happened to me, I simply made a copy of stdint.h to get over the hump. Thanks for the prompt reply. I am doing a 386 build. I could not find it in my build directory, but it is there after all. Sorry, not used to finding files in Linux. Aaron You can't expect people to guess which 386 build you are doing. Certain 386 builds clearly are not in the "poorly supported" category; others may be.
Re: Where is gstdint.h
[EMAIL PROTECTED] wrote: Tim Prince wrote: [EMAIL PROTECTED] wrote: Where is gstdint.h? Does it actually exist? libdecnumber seems to use it. decimal32|64|128.h's include decNumber.h, which includes deccontext.h, which includes gstdint.h. When you configure libdecnumber (e.g. by running the top-level gcc configure), gstdint.h should be created, by modifying . Since you said nothing about the conditions where you had a problem, you can't expect anyone to fix it for you. If you do want it fixed, you should at least file a complete PR. As it is more likely to happen with a poorly supported target, you may have to look into it in more detail than that. When this happened to me, I simply made a copy of stdint.h to get over the hump. This might happen when you run the top level gcc configure in its own directory. You may want to try to make a new directory elsewhere and run configure there:

  pwd
  .../my-gcc-source-tree
  mkdir ../build
  cd ../build
  ../my-gcc-source-tree/configure
  make

If you're suggesting trying to build in the top level directory to see if the same problem occurs, I would expect other problems to arise. If it would help diagnose the problem, and the problem persists for a few weeks, I'd be willing to try it.
Re: Effects of newly introduced -mpcX 80387 precision flag
[EMAIL PROTECTED] wrote: I just (re-)discovered these tables giving maximum known errors in some libm functions when extended precision is enabled: http://people.inf.ethz.ch/gonnet/FPAccuracy/linux/summary.html and when the precision of the mantissa is set to 53 bits (double precision): http://people.inf.ethz.ch/gonnet/FPAccuracy/linux64/summary.html This is from 2002, and indeed, some of the errors in double-precision results are hundreds or thousands of times bigger when the precision is set to 53 bits. This isn't very helpful. I can't find an indication of whose libm is being tested, it appears to be an unspecified non-standard version of gcc, and a lot of digging would be needed to find out what the tests are. It makes no sense at all for sqrt() to break down with change in precision mode. Extended precision typically gives a significant improvement in accuracy of complex math functions, as shown in the Celefunt suite from TOMS. The functions shown, if properly coded for SSE2, should be capable of giving good results, independent of x87 precision mode. I understand there is continuing academic research. Arguments have been going on for some time on whether to accept approximate SSE2 math libraries. I personally would not like to see new libraries without some requirement for readable C source and testing. I agree that it would be bad to set 53-bit mode blindly for a library which expects 64-bit mode, but it seems a serious weakness if such a library doesn't take care of precision mode itself. The whole precision mode issue seems somewhat moot, now that years have passed since the last CPUs were made which do not support SSE2, or the equivalent in other CPU families.
Re: Effects of newly introduced -mpcX 80387 precision flag
[EMAIL PROTECTED] wrote: On Apr 29, 2007, at 1:01 PM, Tim Prince wrote: It makes no sense at all for sqrt() to break down with change in precision mode. If you do an extended-precision (80-bit) sqrt and then round the result again to a double (64-bit) then those two roundings will increase the error, sometimes to > 1/2 ulp. To give current results on a machine I have access to, I ran the tests there on vendor_id : AuthenticAMD cpu family : 15 model : 33 model name : Dual Core AMD Opteron(tm) Processor 875 using euler-59% gcc -v Using built-in specs. Target: x86_64-unknown-linux-gnu Configured with: ../configure --prefix=/pkgs/gcc-4.1.2 Thread model: posix gcc version 4.1.2 on an up-to-date RHEL 4.0 server (so whatever libm is offered there), and, indeed, the only differences that it found were in 1/x, sqrt(x), and Pi*x because of double rounding. In other words, the code that went through libm gave identical answers whether running on sse, x87 (extended precision), or x87 (double precision). I don't know whether there are still math libraries for which Gonnet's 2002 results prevail. Double rounding ought to be avoided by -mfpmath=sse and permitting builtin_sqrt to do its thing, or by setting 53-bit precision. The latter disables long double. The original URL showed total failure of sqrt(); double rounding only brings error of .5 ULP, as usually assessed. I don't think the 64-/53-bit double rounding of sqrt can be detected, but of course such double rounding of * can be measured. With Pi, you have various possibilities, according to precision of the Pi value (including the possibility of the one supplied by the x87 instruction) as well as the 2 choices of arithmetic precision mode.
Re: Successful Build of gcc on Cygwin WinXp SP2
[EMAIL PROTECTED] wrote: Cygcheck version 1.90 Compiled on Jan 31 2007 How do I get a later version of Cygwin? 1.90 is the current release version. It seems unlikely that later trial versions have a patch for the stdio.h conflict with C99, or changed headers to avoid warnings which by default are fatal. If you want a newer cygwin.dll, read the cygwin mail list archive for hints, but it doesn't appear to be relevant.
Re: Successful Build of gcc on Cygwin WinXp SP2
[EMAIL PROTECTED] wrote: James, On 5/1/07, Aaron Gray <[EMAIL PROTECTED]> wrote: Hi James,

> Successfully built latest gcc on Win XP SP2 with cvs built cygwin.

I was wondering whether you could help to get me to the same point please. You will need to use Dave Korn's patch for newlib. http://sourceware.org/ml/newlib/2007/msg00292.html I am getting the following:-

$ patch newlib/libc/include/stdio.h fix-gcc-bootstrap-on-cygwin-patch.diff
patching file newlib/libc/include/stdio.h
Hunk #1 succeeded at 475 (offset 78 lines).
Hunk #2 FAILED at 501.
Hunk #3 FAILED at 521.
2 out of 3 hunks FAILED -- saving rejects to file newlib/libc/include/stdio.h.rej

I had to apply the relevant changes manually to the cygwin . It doesn't appear to match the version for which Dave made the patch.
Re: What happened to bootstrap-lean?
Gabriel Dos Reis wrote: Andrew Pinski <[EMAIL PROTECTED]> writes: | > | > On Fri, 16 Dec 2005, Paolo Bonzini wrote: | > > Yes. "make bubblestrap" is now called simply "make". | > | > Okay, how is "make bootstrap-lean" called these days? ;-) | > | > In fact, bootstrap-lean is still documented in install.texi and | > makefile.texi, but it no longer seems to be present in the Makefile | > machinery. Could we get this back? | | bootstrap-lean is done by doing the following (which I feel is the wrong way): | Configure with --enable-bootstrap=lean | and then do a "make bootstrap" Hmm, does that mean that I would have to reconfigure GCC if I wanted to do "make bootstrap-lean" after a previous configuration and build? I think the answer must be "no", but I'm not sure. -- Gaby I've not been able to find another way to rebuild (on SuSE 9.2, for example) after applying the weekly patch file. I'm hoping that suggestion works.
Re: Fwd: Windows support dropped from gcc trunk
On 10/14/2015 11:36 AM, Steve Kargl wrote:
> On Wed, Oct 14, 2015 at 11:32:52AM -0400, Tim Prince wrote:
>> Sorry if someone sees this multiple times; I think it may have been
>> stopped by ISP or text mode filtering:
>>
>> Since Sept. 26, the partial support for Windows 64-bit has been dropped
>> from gcc trunk:
>> winnt.c apparently has problems with seh, which prevent bootstrapping,
>> and prevent the new gcc from building libraries.
>> libgfortran build throws a fatal error on account of lack of support for
>> __float128, even if a working gcc is used.
>> I didn't see any notification about this; maybe it wasn't a consensus
>> decision?
>> There are satisfactory pre-built gfortran 5.2 compilers (including
>> libgomp, although that is off by default and the testsuite wants acc as
>> well as OpenMP) available in cygwin64 (test version) and (apparently)
>> mingw-64.
>>
> The last comment to winnt.c is
>
> 2015-10-02  Kai Tietz
>
> PR target/51726
> * config/i386/winnt.c (ix86_handle_selectany_attribute): Handle
> selectany within this function without need to keep attribute.
> (i386_pe_encode_section_info): Remove selectany-code.
>
> Perhaps, contact Kai.
>
> I added gcc@gcc.gnu.org as this technically isn't a Fortran issue.

The test suite reports hundreds of new ICE instances, all referring to this seh_unwind_emit function:

/cygdrive/c/users/tim/tim/tim/src/gnu/gcc1/gcc/testsuite/gcc.c-torture/compile/2127-1.c: In function 'foo':
/cygdrive/c/users/tim/tim/tim/src/gnu/gcc1/gcc/testsuite/gcc.c-torture/compile/2127-1.c:7:1: internal compiler error: in i386_pe_seh_unwind_emit, at config/i386/winnt.c:1137
Please submit a full bug report,

I will file a bugzilla if that is what is wanted, but I wanted to know if there is a new configure option required. As far as I know there were always problems with long double for Windows targets, but the refusal of libgfortran to build on account of it is new. Thanks, Tim
New CA mirror
Hey, We have added a new mirror in Canada. IP address is being geolocated in the US but it is actually Canadian. If it has to be listed as a US mirror please let me know. Could you please add it to the list?
---
Canada, Quebec: http://ca.mirror.babylon.network/gcc/ | ftp://ca.mirror.babylon.network/gcc/ | rsync://ca.mirror.babylon.network/gcc/, thanks to Tim Semeijn (noc@babylon.network) at Babylon Network.
---
Thanks in advance!
-- Tim Semeijn Babylon Network PGP: 0x2A540FA5 / 3DF3 13FA 4B60 E48A E755 9663 B187 0310 2A54 0FA5
Maintenance ca.mirror.babylon.network
Dear, The storage of ca.mirror.babylon.network is not functioning properly and we will rebuild the storage platform in the following days. The mirror might become temporarily unavailable during this process. Once the storage has been rebuilt I will inform you straight away. I hope to have informed you sufficiently, and if you have any questions please let me know. Best regards,
-- Tim Semeijn Babylon Network PGP: 0x2A540FA5 / 3DF3 13FA 4B60 E48A E755 9663 B187 0310 2A54 0FA5
Re: question about -ffast-math implementation
On 6/2/2014 3:00 AM, Andrew Pinski wrote: On Sun, Jun 1, 2014 at 11:09 PM, Janne Blomqvist wrote: On Sun, Jun 1, 2014 at 9:52 AM, Mike Izbicki wrote: I'm trying to copy gcc's behavior with the -ffast-math compiler flag into haskell's ghc compiler. The only documentation I can find about it is at: https://gcc.gnu.org/onlinedocs/gcc/Optimize-Options.html I understand how floating point operations work and have come up with a reasonable list of optimizations to perform. But I doubt it is exhaustive. My question is: where can I find all the gory details about what gcc will do with this flag? I'm perfectly willing to look at source code if that's what it takes. In addition to the official documentation, a nice overview is at https://gcc.gnu.org/wiki/FloatingPointMath Useful, thanks for the pointer Though for the gory details and authoritative answers I suppose you'd have to look into the source code. Also, are there any optimizations that you wish -ffast-math could perform, but for various architectural reasons they don't fit into gcc? There are of course a (nearly endless?) list of optimizations that could be done but aren't (lack of manpower, impractical, whatnot). I'm not sure there are any interesting optimizations that would be dependent on loosening -ffast-math further? I find it difficult to remember how to reconcile differing treatments by gcc and gfortran under -ffast-math; in particular, with respect to -fprotect-parens and -freciprocal-math. The latter appears to comply with Fortran standard. (One thing I wish wouldn't be included in -ffast-math is -fcx-limited-range; the naive complex division algorithm can easily lead to comically poor results.) Which is kinda interesting because the Google folks have been trying to turn on -fcx-limited-range for C++ a few times now. Intel tried to add -complex-limited-range as a default under -fp-model fast=1 but that was shown to be unsatisfactory. 
Now, with the introduction of omp simd directives and pragmas, we have disagreement among various compilers on the relative roles of the directives and the fast-math options. I've submitted PR60117 hoping to get some insight on whether omp simd should disable optimizations otherwise performed by -ffast-math. Intel made the directives over-ride the command-line fast (or "no-fast") settings locally, so that complex-limited-range might be in effect inside the scope of the directive (whether or not you want it). They made changes in the current beta compiler, so it's no longer practical to set standard-compliant options but discard them by pragma in individual for loops. -- Tim Prince
New French mirror
Hi, I have set up a French gcc mirror. It is located in Roubaix, France. It is reachable through http, ftp and rsync: http://mirror.bbln.nl/gcc ftp://mirror.bbln.nl/gcc rsync://mirror.bbln.nl/gcc This mirror is provided by BBLN. Could you add it to the mirrorlist? If you have any questions please let me know! Best regards, Tim Semeijn BBLN
Rearrangement mirror servers
Dear GCC Mirror Admin, I have rearranged our mirror setup, which means we offer additional mirror servers mirroring GCC. Could you please make the following changes:

- Remove the current 'mirror.bbln.org' entry as mirror
- Please add the following three mirrors:

Located in Gravelines, France
http://mirror-fr1.bbln.org/gcc
https://mirror-fr1.bbln.org/gcc
ftp://mirror-fr1.bbln.org/gcc
rsync://mirror-fr1.bbln.org/gcc

Located in Roubaix, France
http://mirror-fr2.bbln.org/gcc
https://mirror-fr2.bbln.org/gcc
ftp://mirror-fr2.bbln.org/gcc
rsync://mirror-fr2.bbln.org/gcc

Located in Amsterdam, The Netherlands
http://mirror-nl1.bbln.org/gcc
https://mirror-nl1.bbln.org/gcc
ftp://mirror-nl1.bbln.org/gcc
rsync://mirror-nl1.bbln.org/gcc

As contact for these mirrors you can list: BBLN (n...@bbln.org)

Thanks in advance!
-- Tim Semeijn pgp 0x08CE9B4D
Partial inline on recursive functions?
Hi there, I found the C++11 code below:

  int Fib(int n) {
    if (n <= 1) return n;
    return [&] { return Fib(n-2) + Fib(n-1); }();
  }

is ~2x faster than the normal one:

  int Fib(int n) {
    if (n <= 1) return n;
    return Fib(n-2) + Fib(n-1);
  }

I tested them with "-std=c++11 -O3/-O2" using trunk, and the first version is ~2x (1.618x in theory?) faster. However, the first version has a larger binary size (101k compared to 3k for the second version). Clang produces 4k for the first version (with a similar speed improvement) though.

My guess is that the first `if (n <= 1) return n;` is easier to inline into the caller side, since the returned expression is a call to a separate function. It's translated to something like (ignoring linkage differences):

  int foo(int n);
  int Fib(int n) {
    if (n <= 1) return n;
    return foo(n);
  }
  int foo(int n) { return Fib(n-2) + Fib(n-1); }

After inline optimizations, it's translated to:

  int foo(int n);
  int Fib(int n) {
    if (n <= 1) return n;
    return foo(n);
  }
  int foo(int n) {
    return ((n-2 <= 1) ? n-2 : foo(n-2)) + ((n-1 <= 1) ? n-1 : foo(n-1));
  }

As a result, the maximum depth of the stack is reduced by 1, since all boundary checks (if (n <= 1) return n;) are done on the caller side, which may eliminate unnecessary function call overhead.

To me the optimization should be: for a given recursive function A, split it into functions B and C, so that A is equivalent to { B(); return C(); }, where B should be easy to inline (e.g. no recursive calls) and C may not be.

Is it possible/reasonable to do such an optimization? I hope it can help. :) Thanks! -- Regards, Tim Shen
Mirror Changes
Hey, We have changed our company name, hostnames and contact information. Please remove the current BBLN mirror (mirror.bbln.org) and replace it with our three new ones:

---
http://mirror0.babylon.network/gcc/
https://mirror0.babylon.network/gcc/
ftp://mirror0.babylon.network/gcc/
rsync://mirror0.babylon.network/gcc/
Location: Gravelines, France
Contact: Tim Semeijn (noc@babylon.network) at Babylon Network
---
http://mirror1.babylon.network/gcc/
https://mirror1.babylon.network/gcc/
ftp://mirror1.babylon.network/gcc/
rsync://mirror1.babylon.network/gcc/
Location: Roubaix, France
Contact: Tim Semeijn (noc@babylon.network) at Babylon Network
---
http://mirror2.babylon.network/gcc/
https://mirror2.babylon.network/gcc/
ftp://mirror2.babylon.network/gcc/
rsync://mirror2.babylon.network/gcc/
Location: Amsterdam, The Netherlands
Contact: Tim Semeijn (noc@babylon.network) at Babylon Network
---

I will also send this e-mail from noc@babylon.network to confirm the request. Thanks in advance!
-- Tim Semeijn pgp 0x08CE9B4D
Confirmation Mirror Changes
Hey, [[[ This is a confirmation of the request sent from n...@bbln.org ]]] We have changed our company name, hostnames and contact information. Please remove the current BBLN mirror (mirror.bbln.org) and replace it with our three new ones:

---
http://mirror0.babylon.network/gcc/
https://mirror0.babylon.network/gcc/
ftp://mirror0.babylon.network/gcc/
rsync://mirror0.babylon.network/gcc/
Location: Gravelines, France
Contact: Tim Semeijn (noc@babylon.network) at Babylon Network
---
http://mirror1.babylon.network/gcc/
https://mirror1.babylon.network/gcc/
ftp://mirror1.babylon.network/gcc/
rsync://mirror1.babylon.network/gcc/
Location: Roubaix, France
Contact: Tim Semeijn (noc@babylon.network) at Babylon Network
---
http://mirror2.babylon.network/gcc/
https://mirror2.babylon.network/gcc/
ftp://mirror2.babylon.network/gcc/
rsync://mirror2.babylon.network/gcc/
Location: Amsterdam, The Netherlands
Contact: Tim Semeijn (noc@babylon.network) at Babylon Network
---

Thanks in advance!
-- Tim Semeijn Babylon Network pgp 0x5B8A4DDF
Re: [wwwdocs] PATCH for Re: Confirmation Mirror Changes
Dear Gerald, Thanks for processing the patch! Best regards,

On 4/23/15 11:49 PM, Gerald Pfeifer wrote:
> On Fri, 17 Apr 2015, Tim Semeijn wrote:
>> We have changed our company name, hostnames and contact
>> information. Please remove the current BBLN mirror
>> (mirror.bbln.org) and replace it with our three new ones:
>
> The patch below implements those changes:
>
> - Replace mirror.bbln.org by mirror1.babylon.network.
> - Add mirror0.babylon.network, France, Gravelines.
> - Add mirror2.babylon.network, The Netherlands, Amsterdam.
>
> Applied.
>
> If you have any further changes, suggesting a patch against
> https://gcc.gnu.org/mirrors.html would be great.
>
> Gerald
>
> Index: mirrors.html
> ===
> RCS file: /cvs/gcc/wwwdocs/htdocs/mirrors.html,v
> retrieving revision 1.229
> diff -u -r1.229 mirrors.html
> --- mirrors.html 7 Apr 2015 18:17:46 - 1.229
> +++ mirrors.html 23 Apr 2015 21:39:08 -
> @@ -19,11 +19,16 @@
>  Canada: href="http://gcc.skazkaforyou.com">http://gcc.skazkaforyou.com, thanks to Sergey Ivanov (mirrors at skazkaforyou.com)
>  France (no snapshots): href="ftp://ftp.lip6.fr/pub/gcc/">ftp.lip6.fr, thanks to ftpmaint at lip6.fr
>  France, Brittany: href="ftp://ftp.irisa.fr/pub/mirrors/gcc.gnu.org/gcc/">ftp.irisa.fr, thanks to ftpmaint at irisa.fr
> +France, Gravelines:
> +href="http://mirror0.babylon.network/gcc/">http://mirror0.babylon.network/gcc/ |
> +href="ftp://mirror0.babylon.network/gcc/">ftp://mirror0.babylon.network/gcc/ |
> +href="rsync://mirror0.babylon.network/gcc/">rsync://mirror0.babylon.network/gcc/,
> +thanks to Tim Semeijn (noc@babylon.network) at Babylon Network.
>  France, Roubaix:
> -href="http://mirror.bbln.org/gcc/">http://mirror.bbln.org/gcc/ |
> -href="ftp://mirror.bbln.org/gcc">ftp://mirror.bbln.org/gcc |
> -href="rsync://mirror.bbln.org/gcc">rsync://mirror.bbln.org/gcc,
> -thanks to Tim Semeijn (n...@bbln.org) and BBLN.
> +href="http://mirror1.babylon.network/gcc/">http://mirror1.babylon.network/gcc/ |
> +href="ftp://mirror1.babylon.network/gcc/">ftp://mirror1.babylon.network/gcc/ |
> +href="rsync://mirror1.babylon.network/gcc/">rsync://mirror1.babylon.network/gcc/,
> +thanks to Tim Semeijn (noc@babylon.network) at Babylon Network.
>  France, Versailles: href="ftp://ftp.uvsq.fr/pub/gcc/">ftp.uvsq.fr, thanks to ftpmaint at uvsq.fr
>  Germany, Berlin: href="ftp://ftp.fu-berlin.de/unix/languages/gcc/">ftp.fu-berlin.de, thanks to ftp at fu-berlin.de
>  Germany: href="ftp://ftp.gwdg.de/pub/misc/gcc/">ftp.gwdg.de, thanks to emoenke at gwdg.de
> @@ -34,6 +39,11 @@
>  Japan: href="ftp://ftp.dti.ad.jp/pub/lang/gcc/">ftp.dti.ad.jp, thanks to IWAIZAKO Takahiro (ftp-admin at dti.ad.jp)
>  Japan: href="http://ftp.tsukuba.wide.ad.jp/software/gcc/">ftp.tsukuba.wide.ad.jp, thanks to Kohei Takahashi (tsukuba-ftp-servers at tsukuba.wide.ad.jp)
>  Latvia, Riga: href="http://mirrors.webhostinggeeks.com/gcc/">mirrors.webhostinggeeks.com/gcc/, thanks to Igor (whg.igp at gmail.com)
> +The Netherlands, Amsterdam:
> +href="http://mirror2.babylon.network/gcc/">http://mirror2.babylon.network/gcc/ |
> +href="ftp://mirror2.babylon.network/gcc/">ftp://mirror2.babylon.network/gcc/ |
> +href="rsync://mirror2.babylon.network/gcc/">rsync://mirror2.babylon.network/gcc/,
> +thanks to Tim Semeijn (noc@babylon.network) at Babylon Network.
>  The Netherlands, Nijmegen: href="ftp://ftp.nluug.nl/mirror/languages/gcc">ftp.nluug.nl, thanks to Jan Cristiaan van Winkel (jc at ATComputing.nl)
>  Russia: href="http://mirrors-ru.go-parts.com/gcc/">http://mirrors-ru.go-parts.com/gcc

-- Tim Semeijn Babylon Network pgp 0x5B8A4DDF
add command line option to gcc
I have a use case where I would like gcc to accept -Kthread and act as if it was passed -pthread. So -Kthread would be a synonym for -pthread. I am having trouble figuring out how the option processing is handled. Possibly in gcc/gcc.c, but I am stumped here. Any pointers would be welcome. Thanks.

--
Tim Rice
Multitalents
t...@multitalents.net
Re: add command line option to gcc
On Fri, 6 Sep 2019, Jonathan Wakely wrote:
> On Fri, 6 Sep 2019 at 04:26, Tim Rice wrote:
> >
> > I have a use case where I would like gcc to accept -Kthread
> > and act as if it was passed -pthread. So -Kthread would
> > be a synonym for -pthread.
>
> For a specific target, or universally?

Likely only useful for UnixWare (and OpenServer 6).

> > I am having trouble figuring out how the option processing is handled.
> > Possibly in gcc/gcc.c but I am stumped here.
>
> You could use "specs" to tell the driver to use -pthread when -Kthread
> is given e.g.
>
> %{Kthread: -pthread}
>
> This can either be hardcoded into the 'gcc' driver program (which
> would be done in gcc/gcc.c or in a per-target file under gcc/config)
> or provided in a specs file with the -specs option (see the manual).

Ok, I'll go down this path and see how it works out. Thanks.

> The quick and dirty way to test that would be to dump the current
> specs to a file with 'gcc -dumpspecs > kthread.spec' and then edit the
> file so that everywhere you see %{pthread: xxx} you add %{Kthread: xxx}
> to make it do the same thing. Then you can run
> gcc -specs=kthread.spec -Kthread ...

--
Tim Rice
Multitalents
(707) 456-1146
t...@multitalents.net
Remove ca.mirror.babylon.network
We will soon decommission our Canadian mirror due to restructuring. Please remove the following server from the mirror list: ca.mirror.babylon.network/gcc Our French mirrors will remain active. Thanks! -- Tim Semeijn Babylon Network PGP: 0x2A540FA5 / 3DF3 13FA 4B60 E48A E755 9663 B187 0310 2A54 0FA5
Remove *.mirror.babylon.network
Dear, For the foreseeable future we will not be able to provide our mirrors anymore. Could you please remove: nl.mirror.babylon.network fr.mirror.babylon.network Thanks! -- Tim Semeijn Babylon Network PGP: 0x2A540FA5 / 3DF3 13FA 4B60 E48A E755 9663 B187 0310 2A54 0FA5
Re: Vector permutation only deals with # of vector elements same as mask?
On 2/11/2011 7:30 AM, Bingfeng Mei wrote:

Thanks. Another question. Is there any plan to vectorize loops like the following ones?

    for (i = 127; i >= 0; i--) {
        x[i] = y[i] + z[i];
    }

When I last tried, the Sun compilers could vectorize such loops efficiently (for fairly short loops), with appropriate data definitions. The Sun compilers didn't peel for alignment, to improve performance on longer loops, as gcc and others do. For a case with no data overlaps (float * __restrict__ x, y, z, or Fortran), loop reversal can do the job. gcc has some loop reversal machinery, but I haven't seen it used for vectorization. In a simple case like this, some might argue there's no reason to write a backward loop when it could easily be reversed in source code, and compilers have been seen to make mistakes in reversal.

-- Tim Prince
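To make the no-overlap case concrete, here is a hedged sketch (function and variable names are mine, not from the thread): with restrict-qualified pointers the backward loop carries no dependence between iterations, so a compiler is free to reverse its direction and vectorize it.

```c
/* Backward loop over non-overlapping arrays: every iteration is
   independent, so traversal order is immaterial and the loop is
   legal to reverse (and thus to vectorize). */
void add_rev(float *restrict x, const float *restrict y,
             const float *restrict z, int n)
{
    for (int i = n - 1; i >= 0; i--)
        x[i] = y[i] + z[i];
}
```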
Re: numerical results differ after irrelevant code change
On 5/8/2011 8:25 AM, Michael D. Berger wrote: -Original Message- From: Robert Dewar [mailto:de...@adacore.com] Sent: Sunday, May 08, 2011 11:13 To: Michael D. Berger Cc: gcc@gcc.gnu.org Subject: Re: numerical results differ after irrelevant code change [...] This kind of result is quite expected on an x86 using the old style (default) floating-point (because of extra precision in intermediate results). How does the extra precision lead to the variable result? Also, is there a way to prevent it? It is a pain in regression testing. If you don't need to support CPUs over 10 years old, consider -march=pentium4 -mfpmath=sse or use the 64-bit OS and gcc. Note the resemblance of your quoted differences to DBL_EPSILON from <float.h>. That's 1 ULP relative to 1.0. I have a hard time imagining the nature of real applications which don't need to tolerate differences of 1 ULP. -- Tim Prince
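A small helper (my own illustration, not from the thread) makes the "1 ULP" observation checkable in a regression harness: express the difference between two nearly-equal results in units of DBL_EPSILON at the magnitude of the expected value.

```c
#include <float.h>

/* Difference between two nearly-equal doubles, measured in units of
   DBL_EPSILON scaled by |a| - roughly ulps at a's magnitude.  The
   x87 extra-precision effects discussed above typically show up as
   values near 1 on this scale. */
double ulp_diff(double a, double b)
{
    double d = a - b;
    if (d < 0) d = -d;
    if (a < 0) a = -a;
    return d / (a * DBL_EPSILON);
}
```

A test that tolerates a few ulps on this scale, instead of demanding bitwise equality, survives the x87-vs-SSE differences described above.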
ARM abort() core files
About 3 years ago (August, 2008) there was a discussion here about (not) getting a backtrace from abort(3) on ARM: http://gcc.gnu.org/ml/gcc/2008-08/msg00060.html That thread discussed why core files generated from a call to abort() do not have a stack to review and some possible approaches for modifying the compiler. I can find no resolution to that discussion nor any other discussion of the topic. Our toolchain vendor is currently only providing GCC 4.4 and I have verified the issue still exists. They informed me today that GCC 4.6 "may be better about it". Can someone confirm that a change has been made and where I can find more information about it? Thanks! -- .Tim Tim D. Hammer Software Developer Global Business & Services Group Xerox Corporation M/S 0111-01A 800 Phillips Road Webster, NY 14580 Phone: 585/427-1684 Fax: 585/231-5596 Mail: tim.ham...@xerox.com
Re: Profiling gcc itself
On 11/20/2011 11:10 AM, Basile Starynkevitch wrote: On Sun, 20 Nov 2011 03:43:20 -0800 Jeff Evarts wrote: I posted this question at irc://irc.oftc.net/#gcc and they suggested that I pose it here instead. I do some "large-ish" builds (linux, gcc itself, etc) on a too-regular basis, and I was wondering what could be done to speed things up. A little printf-style checking hints to me that I might be spending the majority of my time in CPP rather than g++, gasm, ld, etc. Has anyone (ever, regularly, or recently) built gcc (g++, gcpp) with profiling turned on? Is it hard? Did you get good results? I'm not sure the question belongs to gcc@gcc.gnu.org, perhaps gcc-h...@gcc.gnu.org might be a better place. If you choose to follow such advice, explaining whether other facilities already in gcc, e.g. http://gcc.gnu.org/onlinedocs/gcc/Precompiled-Headers.html apply to your situation may be useful. -- Tim Prince
Re: C Compiler benchmark: gcc 4.6.3 vs. Intel v11 and others
On 1/19/2012 9:27 AM, willus.com wrote: On 1/19/2012 2:59 AM, Richard Guenther wrote: On Thu, Jan 19, 2012 at 7:37 AM, Marc Glisse wrote: On Wed, 18 Jan 2012, willus.com wrote: For those who might be interested, I've recently benchmarked gcc 4.6.3 (and 3.4.2) vs. Intel v11 and Microsoft (in Windows 7) here: http://willus.com/ccomp_benchmark2.shtml http://en.wikipedia.org/wiki/Microsoft_Windows_SDK#64-bit_development For the math functions, this is normally more a libc feature, so you might get very different results on different OS. Then again, by using -ffast-math, you allow the math functions to return any random value, so I can think of ways to make it even faster ;-) Also for math functions you can simply substitute the Intel compilers one (GCC uses the Microsoft ones) by linking against libimf. You can also make use of their vectorized variants from GCC by specifying -mveclibabi=svml and link against libimf (the GCC autovectorizer will then use the routines from the Intel compiler math library). That makes a huge difference for code using functions from math.h. Richard. -- Marc Glisse Thank you both for the tips. Are you certain that with the flags I used Intel doesn't completely in-line the math2.h functions at the compile stage? gcc? I take it to use libimf.a (legally) I would have to purchase the Intel compiler? In-line math functions, beyond what gcc does automatically (sqrt...) are possible only with x87 code; those aren't vectorizable nor remarkably fast, although quality can be made good (with care). As Richard said, the icc svml library is the one supporting the fast vector math functions. There is also an arch-consistency version of svml (different internal function names) which is not as fast but may give more accurate results or avoid platform-dependent bugs. 
Yes, the Intel library license makes restrictions on usage: http://software.intel.com/en-us/articles/faq-intel-parallel-composer-redistributable-package/?wapkw=%28redistributable+license%29 You might use it for personal purposes under terms of this linux license: http://software.intel.com/en-us/articles/Non-Commercial-license/?wapkw=%28non-commercial+license%29 It isn't supported in the gcc context. Needless to say, I don't speak for my employer. -- Tim Prince
Re: C Compiler benchmark: gcc 4.6.3 vs. Intel v11 and others
On 1/19/2012 9:24 PM, willus.com wrote: On 1/18/2012 10:37 PM, Marc Glisse wrote: On Wed, 18 Jan 2012, willus.com wrote: For those who might be interested, I've recently benchmarked gcc 4.6.3 (and 3.4.2) vs. Intel v11 and Microsoft (in Windows 7) here: http://willus.com/ccomp_benchmark2.shtml http://en.wikipedia.org/wiki/Microsoft_Windows_SDK#64-bit_development For the math functions, this is normally more a libc feature, so you might get very different results on different OS. Then again, by using -ffast-math, you allow the math functions to return any random value, so I can think of ways to make it even faster ;-) I use -ffast-math all the time and have always gotten virtually identical results to when I turn it off. The speed difference is important for me. The default for the Intel compiler is more aggressive than gcc -ffast-math -fno-cx-limited-range, as long as you don't use one of the old buggy mathinline.h header files. For a fair comparison, you need detailed attention to comparable options. If you don't set gcc -ffast-math, you will want icc -fp-model-source. It's good to have in mind what you want from the more aggressive options, e.g. auto-vectorization of sum reduction. If you do want gcc -fcx-limited range, icc spells it -complex-limited-range. -- Tim Prince
Re: weird optimization in sin+cos, x86 backend
On 02/05/2012 11:08 AM, James Courtier-Dutton wrote:

Hi, I looked at this a bit closer. sin(1.0e22) is outside the +-2^63 range, so FPREM1 is used to bring it inside the range. So, I looked at FPREM1 a bit closer.

#include <stdio.h>
#include <math.h>

int main (void)
{
  long double x, r, m;
  x = 1.0e22;
  // x = 5.26300791462049950360708478127784; <- This is what the answer should be, give or take 2*PI.
  m = M_PIl * 2.0;
  r = remainderl(x, m);  // Utilizes FPREM1
  printf ("x = %.17Lf\n", x);
  printf ("m = %.17Lf\n", m);
  printf ("r = %.17Lf\n", r);
  return 1;
}

This outputs:
x = 10000000000000000000000.00000000000000000
m = 6.28318530717958648
r = 2.66065232182161996

But r should be 5.26300791462049950360708478127784... or -1.020177392559086973318201985281... according to Wolfram Alpha and most arbitrary-precision maths libs I tried. I need to do a bit more digging, but this might point to a bug in the cpu instruction FPREM1.

Kind Regards

James

As I recall, the remaindering instruction was documented as using a 66-bit rounded approximation of PI, in case that is what you refer to.

-- Tim Prince
How to figure out the gcc -dP output?
Hello there. I am trying to track down a problem with gcc 4.1 which has to do with inlining and templates on PowerPC. Is there any documentation I can look at related to the output generated with -fdump? I am getting extraneous lwz (load word and zero extend) instructions inserted when calling various methods - after $toc (r2) has been switched to the destination method's global data, just before the method call with the bctrl instruction. This lwz instruction causes a crash on IBM AIX when 32-bit shared libraries are loaded non-contiguously in memory. It looks like various code blocks are not being combined correctly when code is inlined - the extra lwz is being left behind. I have figured out that turning off gcse optimizations will stop this behavior, but doing this causes a performance hit. I would prefer not to upgrade the compiler at this time. With the compiler dump using -fdump, I am looking for a better way to work around this problem. Tim Crook.
RE: How to figure out the gcc -dP output?
Thanks David. I thought -mminimal-toc might have been a better workaround as well :-) . Is there a Bugzilla number for this issue? -Original Message- From: David Edelsohn [mailto:dje@gmail.com] Sent: Tuesday, July 28, 2009 9:46 AM To: Tim Crook Subject: Re: How to figure out the gcc -dP output? Tim, I do not fully understand the complete explanation of the original problem. You mention extraneous lwz and TOC. I think you are referring to a bug in GCC 4.1 that incorrectly emitted loads after the TOC already had been changed for an indirect call. GCSE probably is producing code that requires a constant and GCC needs to place that constant in the TOC. The late creation of the TOC reference is not scheduled correctly. GCSE is an optimization. -mminimal-toc is an option to avoid TOC overflow. Both of these are work-arounds to the problem. Disabling GCSE probably will slow down the application. -mminimal-toc probably will have less of a performance impact. As I mentioned to Chris when I spoke with him last week, I would recommend upgrading to a newer version of GCC because GCC 4.1 no longer is maintained. Many bug fixes, such as one for the problem you are encountering, are incorporated into newer releases. David > I found a possible compiler workaround, compiling with -mminimal-toc. Would > I get better performance by using this, instead of turning off gcse? On Fri, Jul 24, 2009 at 4:34 PM, Tim Crook wrote: > Hello there. > > I am trying to track down a problem with gcc 4.1 which has to do with > inlining and templates on PowerPC. Is there any documentation I can look > related to the output generated with -fdump? I am getting extraneous lwz > (load word and zero extend) instructions inserted when calling various > methods - after $toc (r2) has been switched to the destination method's > global data, just before the method call with the bctrl instruction. 
This lwz > instruction causes a crash on IBM AIX when 32-bit shared libraries are loaded > non-contiguously in memory. It looks like various code blocks are not being > combined correctly when code is inlined - the extra lwz is being left behind. > > I have figured out that turning off gcse optimizations will stop this > behavior, but doing this causes a performance hit. I would prefer not to > upgrade the compiler at this time. With the compiler dump using -fdump, I am > looking for a better way to work around this problem. > > Tim Crook. >
Re: Failure building current 4.5 snapshot on Cygwin
Eric Niebler wrote: Angelo Graziosi wrote: Eric Niebler wrote: I am running into the same problem (cannot build latest snapshot on cygwin). I have built and installed the latest binutils from head (see attached config.log for details). But still the build fails. Any help? This is strange! Recent snapshots (4.3, 4.4, 4.5) build OK both on Cygwin-1.5 and 1.7. In 1.5 I have built the same binutils as in 1.7. I've attached objdir/intl/config.log. It says you have triggered cross compilation mode, without complete setup. Also, it says you are building in a directory below your source code directory, which I always used to do myself, but stopped on account of the number of times I've seen this criticized. The only new build-blocking problem I've run into in the last month is the unsupported autoconf test, which has a #FIXME comment. I had to comment it out.
Re: [4.4] Strange performance regression?
Joern Rennecke wrote: Quoting Mark Tall : Joern Rennecke wrote: But at any rate, the subject does not agree with the content of the original post. When we talk about a 'regression' in a particular gcc version, we generally mean that this version is in some way worse than a previous version of gcc. Didn't the original poster indicate that gcc 4.3 was faster than 4.4 ? In my book that is a regression. He also said that it was a different machine, Core 2 Q6600 vs some kind of Xeon Core 2 system with a total of eight cores. As different memory subsystems are likely to affect the code, it is not an established regression till he can reproduce a performance drop going from an older to a current compiler on the same or sufficiently similar machines, under comparable load conditions - which generally means that the machine must be idle apart from the benchmark. Ian's judgment in diverting to gcc-help was borne out when it developed that -funroll-loops was wanted. This appeared to confirm his suggestion that it might have had to do with loop alignments. As long as everyone is editorializing, I'll venture to say this case raises the suspicion that gcc might benefit from better default loop alignments, at least for that particular CPU. However, I've played a lot of games on Core i7 with varying unrolling etc. I find the behavior of current gcc entirely satisfactory, aside from the verbosity of the options required.
Re: Whole program optimization and functions-only-called-once.
Toon Moene wrote: Richard Guenther wrote: On Sun, Nov 15, 2009 at 8:07 AM, Toon Moene wrote: Steven Bosscher wrote: At least CPROP, LCM-PRE, and HOIST (i.e. all passes in gcse.c), and variable tracking. Are they covered by a --param ? At least that way I could teach them to go on indefinitely ... I think most of them are. Maybe we should diagnose the cases where we hit these limits. That would be a good idea. One other compiler I work with frequently (the Intel Fortran compiler) does just that. However, either it doesn't have or their marketing department doesn't want you to know about knobs to tweak these decisions :-) Both gfortran and ifort have a much longer list of adjustable limits on in-lining than most customers are willing to study or test.
Re: On the x86_64, does one have to zero a vector register before filling it completely ?
Toon Moene wrote: H.J. Lu wrote: On Sat, Nov 28, 2009 at 3:21 AM, Toon Moene wrote:

L.S., Due to the discussion on register allocation, I went back to a hobby of mine: studying the assembly output of the compiler. For this Fortran subroutine (note: unless otherwise told to the Fortran front end, reals are 32 bit floating point numbers):

      subroutine sum(a, b, c, n)
      integer i, n
      real a(n), b(n), c(n)
      do i = 1, n
         c(i) = a(i) + b(i)
      enddo
      end

with -O3 -S (GCC: (GNU) 4.5.0 20091123), I get this (vectorized) loop:

        xorps   %xmm2, %xmm2
.L6:
        movaps  %xmm2, %xmm0
        movaps  %xmm2, %xmm1
        movlps  (%r9,%rax), %xmm0
        movlps  (%r8,%rax), %xmm1
        movhps  8(%r9,%rax), %xmm0
        movhps  8(%r8,%rax), %xmm1
        incl    %ecx
        addps   %xmm1, %xmm0
        movaps  %xmm0, 0(%rbp,%rax)
        addq    $16, %rax
        cmpl    %ebx, %ecx
        jb      .L6

I'm not a master of x86_64 assembly, but this strongly looks like %xmm{0,1} have to be zero'd (%xmm2 is set to zero by xor'ing it with itself), before they are completely filled with the mov{l,h}ps instructions?

I think it is used to avoid partial SSE register stall.

You mean there's no movaps (%r9,%rax), %xmm0 (and mutatis mutandis for %xmm1) instruction (to copy 4*32 bits to the register)?

If you want those, you must request them with -mtune=barcelona.
Re: On the x86_64, does one have to zero a vector register before filling it completely ?
Richard Guenther wrote: On Sat, Nov 28, 2009 at 4:26 PM, Tim Prince wrote: Toon Moene wrote: H.J. Lu wrote: On Sat, Nov 28, 2009 at 3:21 AM, Toon Moene wrote: L.S., Due to the discussion on register allocation, I went back to a hobby of mine: Studying the assembly output of the compiler. For this Fortran subroutine (note: unless otherwise told to the Fortran front end, reals are 32 bit floating point numbers): subroutine sum(a, b, c, n) integer i, n real a(n), b(n), c(n) do i = 1, n c(i) = a(i) + b(i) enddo end with -O3 -S (GCC: (GNU) 4.5.0 20091123), I get this (vectorized) loop: xorps %xmm2, %xmm2 .L6: movaps %xmm2, %xmm0 movaps %xmm2, %xmm1 movlps (%r9,%rax), %xmm0 movlps (%r8,%rax), %xmm1 movhps 8(%r9,%rax), %xmm0 movhps 8(%r8,%rax), %xmm1 incl%ecx addps %xmm1, %xmm0 movaps %xmm0, 0(%rbp,%rax) addq$16, %rax cmpl%ebx, %ecx jb .L6 I'm not a master of x86_64 assembly, but this strongly looks like %xmm{0,1} have to be zero'd (%xmm2 is set to zero by xor'ing it with itself), before they are completely filled with the mov{l,h}ps instructions ? I think it is used to avoid partial SSE register stall. You mean there's no movaps (%r9,%rax), %xmm0 (and mutatis mutandis for %xmm1) instruction (to copy 4*32 bits to the register) ? If you want those, you must request them with -mtune=barcelona. Which would then get you movups (%r9,%rax), %xmm0 (unaligned move). generic tuning prefers the split moves, AMD Fam10 and above handle unaligned moves just fine. Correct, the movaps would have been used if alignment were recognized. The newer CPUs achieve full performance with movups. Do you consider Core i7/Nehalem as included in "AMD Fam10 and above?"
Re: On the x86_64, does one have to zero a vector register before filling it completely ?
Toon Moene wrote: Toon Moene wrote: Tim Prince wrote: > If you want those, you must request them with -mtune=barcelona.

OK, so it is an alignment issue (with -mtune=barcelona):

.L6:
        movups  0(%rbp,%rax), %xmm0
        movups  (%rbx,%rax), %xmm1
        incl    %ecx
        addps   %xmm1, %xmm0
        movaps  %xmm0, (%r8,%rax)
        addq    $16, %rax
        cmpl    %r10d, %ecx
        jb      .L6

Once this problem is solved (well, determined how it could be solved), we go on to the next, the extraneous induction variable %ecx. There are two ways to deal with it:

1. Eliminate it with respect to the other induction variable that counts in the same direction (upwards, with steps 16) and remember that induction variable's (%rax) limit.

2. Count %ecx down from %r10d to zero (which eliminates %r10d as a loop-carried register).

g77 avoided this by coding counted do loops with a separate loop counter counting down to zero - not so with gfortran (quoting):

/* Translate the simple DO construct.  This is where the loop variable
   has integer type and step +-1.  We can't use this in the general case
   because integer overflow and floating point errors could give
   incorrect results.
   We translate a do loop from:

   DO dovar = from, to, step
      body
   END DO

   to:

   [Evaluate loop bounds and step]
   dovar = from;
   if ((step > 0) ? (dovar <= to) : (dovar >= to))
     {
       for (;;)
         {
           body;
cycle_label:
           cond = (dovar == to);
           dovar += step;
           if (cond) goto end_label;
         }
     }
end_label:

   This helps the optimizers by avoiding the extra induction variable
   used in the general case.  */

So either we teach the Fortran front end this trick, or we teach the loop optimization the trick of flipping the sense of a(n otherwise unused) induction variable.

This would have paid off more frequently in i386 mode, where there is a possibility of integer register pressure in loops small enough for such an optimization to succeed. This seems to be among the types of optimizations envisioned for run-time binary interpretation systems.
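Toon's option 2 - counting a separate trip counter down to zero - can be sketched at the C level like this (my illustration with invented names, not compiler output):

```c
/* A separate trip counter runs down to zero, so the loop-exit test
   compares against the constant 0 (often just the flags set by the
   decrement) instead of tying up a register holding the loop bound. */
void sum_vec(float *c, const float *a, const float *b, long n)
{
    long i = 0;
    for (long count = n; count > 0; count--) {
        c[i] = a[i] + b[i];
        i++;
    }
}
```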
Re: Graphite and Loop fusion.
Toon Moene wrote:

      REAL, ALLOCATABLE :: A(:,:), B(:,:), C(:,:), D(:,:), E(:,:), F(:,:)
! ... READ IN EXTENT OF ARRAYS
      READ*,N
! ... ALLOCATE ARRAYS
      ALLOCATE(A(N,N),B(N,N),C(N,N),D(N,N),E(N,N),F(N,N))
! ... READ IN ARRAYS
      READ*,A,B
      C = A + B
      D = A * C
      E = B * EXP(D)
      F = C * LOG(E)

where the four assignments all have the structure of loops like:

      DO I = 1, N
         DO J = 1, N
            X(J,I) = OP(A(J,I), B(J,I))
         ENDDO
      ENDDO

Obviously, this could benefit from loop fusion, by combining the four assignments in one loop.

Provided that it were still possible to vectorize suitable portions, or N is known to be so large that cache locality outweighs vectorization. This raises the question of progress on vector math functions, as well as the one about relative alignments (or ignoring them in view of recent CPU designs).
GCC 4.3.3 Configure and Build for DDRescue
Hello, I'll begin by stating my knowledge of Unix is almost non-existent. Using the basic skills that I learned many years ago, I'm currently trying to rescue a near dead hard drive with DDRescue. First, I need to install a C++ compiler, which I have downloaded (v4.3.3) and unzipped to my Mac. I've read through the instructions to configure and build, but am unable to decipher what I need. Is there a command that will perform a very basic configure and build of GCC 4.3.3? Thanks much.
Re: Need an assembler consult!
FX wrote:

Hi all, I have picked up what seems to be a simple patch from PR36399, but I don't know enough assembler to tell whether it's fixing it completely or not. The following function:

#include <emmintrin.h>

__m128i r(__m128 d1, __m128 d2, __m128 d3, __m128i r, int t, __m128i s)
{ return r+s; }

is compiled by Apple's GCC into:

        pushl   %ebp
        movl    %esp, %ebp
        subl    $72, %esp
        movaps  %xmm0, -24(%ebp)
        movaps  %xmm1, -40(%ebp)
        movaps  %xmm2, -56(%ebp)
        movdqa  %xmm3, -72(%ebp)
        movdqa  24(%ebp), %xmm0    #
        paddq   -72(%ebp), %xmm0   #
        leave
        ret

Instead of the lines marked with #, FSF's GCC gives:

        movdqa  40(%ebp), %xmm1
        movdqa  8(%ebp), %xmm0
        paddq   %xmm1, %xmm0

By fixing SSE_REGPARM_MAX in config/i386/i386.h (following Apple's compiler value), I get GCC now generates:

        movdqa  %xmm3, -72(%ebp)
        movdqa  24(%ebp), %xmm0
        movdqa  -72(%ebp), %xmm1
        paddq   %xmm1, %xmm0

The first two lines are identical to Apple, but the last two don't. They seem OK to me, but I don't know enough assembler to be really sure. Could someone confirm the two are equivalent?

Apparently the same as far as what is returned in xmm0.
Re: The "right way" to handle alignment of pointer targets in the compiler?
Benjamin Redelings I wrote:

Hi, I have been playing with the GCC vectorizer and examining assembly code that is produced for dot products that are not for a fixed number of elements. (This comes up surprisingly often in scientific codes.) So far, the generated code is not faster than non-vectorized code, and I think that it is because I can't find a way to tell the compiler that the target of a double* is 16-byte aligned. From PR 27827 - http://gcc.gnu.org/bugzilla/show_bug.cgi?id=27827 : "I just quickly glanced at the code, and I see that it never uses "movapd" from memory, which is a key to getting decent performance."

How many people would take advantage of special machinery for some old CPU, if that's your goal? Simplifying your example to

double f3(const double* p_, const double* q_, int n)
{
    double sum = 0;
    for (int i = 0; i < n; i++)
        sum += p_[i] * q_[i];
    return sum;
}

On CPUs introduced in the last 2 years, movupd should be as fast as movapd, and -mtune=barcelona should work well in general, not only in this example. The bigger difference in performance, for longer loops, would come with further batching of sums, favoring loop lengths of multiples of 4 (or 8, with unrolling). That alignment already favors a fairly long loop. As you're using C++, it seems you could have used inner_product() rather than writing out a function. My Core i7 showed matrix multiply 25x25 times 25x100 producing 17 Gflops with gfortran in-line code. g++ produces about 80% of that.
Re: The "right way" to handle alignment of pointer targets in the compiler?
Benjamin Redelings I wrote: Thanks for the information!

Here are several reasons (there are more) why gcc uses 64-bit loads by default:

1) For a single dot product, the rate of 64-bit data loads roughly balances the latency of adds to the same register. Parallel dot products (using 2 accumulators) would take advantage of faster 128-bit loads.

2) Run-time checks to adjust alignment, if possible, don't pay off for loop counts < about 40.

3) Several obsolete CPU architectures implemented 128-bit loads by pairs of 64-bit loads.

4) 64-bit loads were generally more efficient than movupd, prior to Barcelona.

In the case you quote, with parallel dot products, 128-bit loads would be required so as to show much performance gain over x87.
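The "parallel dot products (using 2 accumulators)" in reason 1 can be sketched as follows (my illustration, not gcc output): splitting the sum into two partial accumulators breaks the serial add dependence, so paired loads and two independent add chains can be in flight at once.

```c
#include <stddef.h>

/* Two partial sums break the dependence chain on a single
   accumulator; pairs of elements (candidates for 128-bit loads)
   feed two independent add chains. */
double dot2(const double *p, const double *q, size_t n)
{
    double s0 = 0.0, s1 = 0.0;
    size_t i;
    for (i = 0; i + 1 < n; i += 2) {
        s0 += p[i] * q[i];
        s1 += p[i + 1] * q[i + 1];
    }
    if (i < n)                      /* odd trailing element */
        s0 += p[i] * q[i];
    return s0 + s1;
}
```

Note the caveat from elsewhere in this thread: regrouping the sum this way changes rounding slightly, which is why gcc only does it under options like -ffast-math.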
Re: adding -fnoalias ... would a patch be accepted ?
torbenh wrote: Can you please explain why you reject the idea of -fnoalias? MSVC has __declspec(noalias), icc has -fnoalias.

MSVC needs it because it doesn't implement restrict and supports violation of typed aliasing rules as a default. ICL needs it for msvc compatibility, but has better alternatives. gcc can't copy the worst features of msvc.
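For contrast with a global -fnoalias switch, the restrict qualifier mentioned above states the same no-aliasing promise per pointer (a sketch with invented names):

```c
/* restrict promises the compiler that *dst and *src never overlap
   during a call, licensing the same reordering and vectorization a
   global no-alias flag would impose everywhere - but scoped to this
   one function, where the programmer can actually guarantee it. */
void scale2(float *restrict dst, const float *restrict src, int n)
{
    for (int i = 0; i < n; i++)
        dst[i] = 2.0f * src[i];
}
```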
Re: speed of double-precision divide
Steve White wrote: I was under the misconception that each of these SSE operations was meant to be accomplished in a single clock cycle (although I knew there are various other issues). Current CPU architectures permit an SSE scalar or parallel multiply and add instruction to be issued on each clock cycle. Completion takes at least 4 cycles for add, significantly more for multiply. The instruction timing tables quote throughput (how many cycles between issue) and latency (number of cycles to complete an individual operation). An even more common misconception than yours is that the extra time taken to complete multiply, compared with the time of add, would disappear with fused multiply-add instructions. SSE divide, as has been explained, is not pipelined. The best way to speed up a loop with divide is with vectorization, barring situations such as the one you brought up where divide may not actually be a necessary part of the algorithm.
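One concrete way to keep the unpipelined divide out of a loop, when the divisor is loop-invariant, is the standard reciprocal transform (my sketch with invented names; gcc applies the same idea under -ffast-math). Note the result can differ in the last bit from dividing each element directly.

```c
/* One unpipelined divide up front, then n pipelined (and
   vectorizable) multiplies, instead of n serialized divides. */
void div_all(float *x, int n, float d)
{
    float rd = 1.0f / d;
    for (int i = 0; i < n; i++)
        x[i] *= rd;
}
```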
Re: Support for export keyword to use with C++ templates ?
On 2/2/10 7:19 PM, Richard Kenner wrote: I see that what I need is an assignment for all future changes. If my employer is not involved with any contributions of mine, the employer disclaimer is not needed, right ? It's safest to have it. The best way to prove that your employer is not involved with any contributions of yours is with such a disclaimer. Some employers have had a formal process for approving assignment of own-time contributions, as well as assignments as part of their business, and lack of either form of assignment indicates the employer has forbidden them. -- Tim Prince
Re: Starting an OpenMP parallel section is extremely slow on a hyper-threaded Nehalem
On 2/11/2010 2:00 AM, Edwin Bennink wrote: Dear gcc list, I noticed that starting an OpenMP parallel section takes a significant amount of time on Nehalem cpu's with hyper-threading enabled. If you think a question might be related to gcc, but don't know which forum to use, gcc-help is more appropriate. As your question is whether there is a way to avoid anomalous behaviors when an old Ubuntu is run on a CPU released after that version of Ubuntu, an Ubuntu forum might be more appropriate. A usual way is to shut off HyperThreading in the BIOS when running on a distro which has trouble with it. I do find your observation interesting. As far as I know, the oldest distro which works well on Core I7 is RHEL5.2 x86_64, which I run, with updated gcc and binutils, and HT disabled, as I never run applications which could benefit from HT. -- Tim Prince
Re: Change x86 default arch for 4.5?
On 2/18/2010 4:54 PM, Joe Buck wrote: But maybe I didn't ask the right question: can any x86 experts comment on recently made x86 CPUs that would not function correctly with code produced by --with-arch=i486? Are there any? All CPUs still in production are at least SSE3 capable, unless someone can come up with one of which I'm not aware. Intel compilers made the switch last year to requiring SSE2 capability for the host, as well as in the default target options, even for 32-bit. All x86_64 or X64 CPUs for which any compiler was produced had SSE2 capability, so it is required for those 64-bit targets. -- Tim Prince
Re: [RFH] A simple way to figure out the number of bits used by a long double
On 2/26/2010 5:44 AM, Ed Smith-Rowland wrote: Huh. I would have *sworn* that sizeof(long double) was 10 not 16 even though we know it was 80 bits. As you indicated before, sizeof gives the amount of memory displaced by the object, including padding. In my experience with gcc, sizeof(long double) is likely to be 12 on 32-bit platforms, and 16 on 64-bit platforms. These choices are made to preserve alignment for 32-bit and 128-bit objects respectively, and to improve performance in the 64-bit case, for hardware which doesn't like to straddle cache lines. It seems the topic would have been more appropriate for gcc-help, if related to gcc, or maybe comp.lang.c, if a question about implementation in accordance with standard C. -- Tim Prince
Re: legitimate parallel make check?
On 3/9/2010 4:28 AM, IainS wrote: It would be nice to allow the apparently independent targets [e.g. gcc-c,fortran,c++ etc.] to be (explicitly) make-checked in parallel. On certain targets, it has been necessary to do this explicitly for a long time, submitting make check-gcc, make check-fortran, make check-g++ separately. Perhaps a script could be made which would detect when the build is complete, then submit the separate make check serial jobs together. -- Tim Prince
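A minimal sketch of such a wrapper, with make stubbed out by a shell function so the sketch is self-contained (drop the stub to drive the real testsuite; target names are the ones mentioned in the message):

```shell
make() { echo "make $*"; }   # stub for illustration only; remove in real use

run_checks() {
    # launch each independent check target as a background job...
    for tgt in check-gcc check-fortran check-g++; do
        make -k "$tgt" &
    done
    wait    # ...then block until all of them have finished
}

run_checks
```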
Re: GCC vs ICC
On 3/22/2010 7:46 PM, Rayne wrote: Hi all, I'm interested in knowing how GCC differs from Intel's ICC in terms of the optimization levels and catering to specific processor architecture. I'm using GCC 4.1.2 20070626 and ICC v11.1 for Linux. How does ICC's optimization levels (O1 to O3) differ from GCC, if they differ at all? The ICC is able to cater specifically to different architectures (IA-32, intel64 and IA-64). I've read that GCC has the -march compiler option which I think is similar, but I can't find a list of the options to use. I'm using Intel Xeon X5570, which is 64-bit. Are there any other GCC compiler options I could use that would cater my applications for 64-bit Intel CPUs? Some of that seems more topical on the Intel software forum for icc, and the following more topical on either that forum or gcc-help, where you should go for follow-up. If you are using gcc on Xeon 5570, gcc -mtune=barcelona -ffast-math -O3 -msse4.2 might be a comparable level of optimization to icc -xSSE4.2 For gcc 4.1, you would have to set also -ftree-vectorize, but you would be better off with a current version. But, if you are optimizing for early Intel 64-bit Xeon, -mtune=barcelona would not be consistently good, and you could not use -msse4 or -xSSE4.2. For optimization which observes standards and also disables vectorized sum reduction, you would omit -ffast-math for gcc, and set icc -fp-model source. -- Tim Prince
Re: Compiler option for SSE4
On 3/23/2010 11:02 PM, Rayne wrote: I'm using GCC 4.1.2 20070626 on a server with Intel Xeon X5570. How do I turn on the compiler option for SSE4? I've tried -msse4, -msse4.1 and -msse4.2, but they all returned the error message cc1: error: unrecognized command line option "-msse4.1" (for whichever option I tried). You would need a gcc version which supports sse4. As you said yourself, your version is approaching 3 years old. Actually, the more important option for Xeon 55xx, if you are vectorizing, is the -mtune=barcelona, which has been supported for about 2 years. Whether vectorizing or not, on an 8 core CPU, the OpenMP introduced in gcc 4.2 would be useful. This looks like a gcc-help mail list question, which is where you should submit any follow-up. -- Tim Prince
Re: Optimizing floating point *(2^c) and /(2^c)
On 3/29/2010 10:51 AM, Geert Bosch wrote: On Mar 29, 2010, at 13:19, Jeroen Van Der Bossche wrote: I've recently written a program where taking the average of 2 floating point numbers was a real bottleneck. I've looked into the assembly generated by gcc -O3 and apparently gcc treats multiplication and division by a hard-coded 2 like any other multiplication with a constant. I think, however, that *(2^c) and /(2^c) for floating points, where c is known at compile-time, should be able to be optimized with the following pseudo-code: e = exponent bits of the number; if (e > c && e < (0b111...11) - c) { e += c or e -= c } else { do regular multiplication } Even further optimizations may be possible, such as bit-shifting the significand when e=0. However, that would require checking for a lot of special cases and so many conditional jumps that it's most likely not going to be any faster. I'm not skilled enough with assembly to write this myself and test whether it actually performs faster than the current implementation. Its performance will most likely also depend on the processor architecture, and I could only test this code on one machine. Therefore I ask those who are familiar with gcc's optimization routines to give this 2 seconds of thought, as this is probably rather easy to implement and many programs could benefit from it. For any optimization suggestions, you should start by showing some real, compilable code with a performance problem that you think the compiler could address. Please include details about compilation options, GCC versions and target hardware, as well as observed performance numbers. How do you see that averaging two floating point numbers is a bottleneck? This should only be a single addition and multiplication, and will execute in a nanosecond or so on a moderately modern system. Your particular suggestion is flawed. Floating-point multiplication is very fast on most targets. 
It is hard to see how, on any target with floating-point hardware, manual mucking with the representation can be a win. In particular, your sketch doesn't address underflow and overflow at all. A complete implementation would likely be many times slower than a floating-point multiply. -Geert gcc used to have the ability to replace division by a power of 2 with an fscale instruction, for appropriate targets (and maybe still does). Such targets have nearly disappeared from everyday usage. What remains is the possibility of replacing division by a constant power of 2 with multiplication, but it's generally considered that the programmer should have done that in the first place. icc has such a facility, but it's subject to -fp-model=fast (equivalent to gcc -ffast-math -fno-cx-limited-range), even though it's a totally safe conversion. As Geert indicated, it's almost inconceivable that a correct implementation which takes care of exceptions could match the floating-point hardware performance, even for a case which starts with operands in memory (though you mention the case following an addition). -- Tim Prince
Re: GCC primary/secondary platforms?
On 4/7/2010 9:17 AM, Gary Funck wrote: On 04/07/10 11:11:05, Diego Novillo wrote: Additionally, make sure that the branch bootstraps and tests on all primary/secondary platforms with all languages enabled. Diego, thanks for your prompt reply and suggestions. Regarding the primary/secondary platforms. Are those listed here? http://gcc.gnu.org/gcc-4.5/criteria.html Will there be a notification if and when C++ run-time will be ready to test on secondary platforms, or will platforms like cygwin be struck from the secondary list? I'm 26 hours into testsuite for 4.5 RC for cygwin gcc/gfortran, didn't know of any other supported languages worth testing. My ia64 box died a few months ago, but suse-linux surely was at least as popular as unknown-linux in recent years. -- Tim Prince
Re: GCC primary/secondary platforms?
On 4/8/2010 2:40 PM, Dave Korn wrote: On 07/04/2010 19:47, Tim Prince wrote: Will there be a notification if and when C++ run-time will be ready to test on secondary platforms, or will platforms like cygwin be struck from the secondary list? What exactly are you talking about? Libstdc++-v3 builds just fine on Cygwin. Our release criteria for the secondary platforms is: * The compiler bootstraps successfully, and the C++ runtime library builds. * The DejaGNU testsuite has been run, and a substantial majority of the tests pass. We pass both those criteria with flying colours. What are you worrying about? cheers, DaveK No one answered questions about why libstdc++ configure started complaining about mis-match in style of wchar support a month ago. Nor did I see anyone give any changes in configure procedure. Giving it another try at a new download today. -- Tim Prince
Re: GCC primary/secondary platforms?
On 4/8/2010 6:24 PM, Dave Korn wrote: Nor did I see anyone give any changes in configure procedure. Giving it another try at a new download today. Well, nothing has changed, but then again I haven't seen anyone else complaining about this, so there's probably some problem in your build environment; let's see what happens with your fresh build. (I've built the 4.5.0-RC1 candidate without any complications and am running the tests right now.) Built OK this time around, no changes here either, except for cygwin1 update. testsuite results in a couple of days. Thanks. -- Tim Prince
Re: Why not contribute? (to GCC)
On 4/23/2010 1:05 PM, HyperQuantum wrote: On Fri, Apr 23, 2010 at 9:58 PM, HyperQuantum wrote: On Fri, Apr 23, 2010 at 8:39 PM, Manuel López-Ibáñez wrote: What reasons keep you from contributing to GCC? The lack of time, for the most part. I submitted a feature request once. It's now four years old, still open, and the last message it received was two years ago. (PR26061) The average time for acceptance of a PR with a patch submission from an outsider such as ourselves is over 2 years, and by then the patch no longer fits, has to be reworked, and is about to become moot. I still have the FSF paperwork in force, as far as I know, from over a decade ago, prior to my current employment. Does it become valid again upon termination of employment? My current employer has no problem with the FSF paperwork for employees whose primary job is maintenance of gnu software (with committee approval), but this does not extend to those of us for whom it is a secondary role. There once was a survey requesting responses on how our FSF submissions compared before and after current employment began, but no summary of the results. -- Tim Prince
Re: Autovectorizing does not work with classes
Georg Martius wrote: > Dear gcc developers, > > I am new to this list. > I tried to use the auto-vectorization (4.2.1 (SUSE Linux)) but unfortunately > with limited success. > My code is basically a matrix library in C++. The vectorizer does not like > the member variables. Consider this code compiled with > gcc -ftree-vectorize -msse2 -ftree-vectorizer-verbose=5 > -funsafe-math-optimizations > that gives basically "not vectorized: unhandled data-ref" > > class P{ > public: > P() : m(5),n(3) { > double *d = data; > for (int i=0; i<m*n; i++) d[i] = i/10.2; > } > void test(const double& sum); > private: > int m; > int n; > double data[15]; > }; > > void P::test(const double& sum) { > double *d = this->data; > for(int i=0; i<m*n; i++) { > d[i]+=sum; > } > } > > whereas the more or less equivalent C version works just fine: > > int m=5; > int n=3; > double data[15]; > > void test(const double& sum) { > int mn = m*n; > for(int i=0; i<mn; i++) { > data[i]+=sum; > } > } > > > Is there a fundamental problem in using the vectorizer in C++? > I don't see any C code above. As another reply indicated, the most likely C idiom would be to pass sum by value. Alternatively, you could use a local copy of sum, in cases where that is a problem. The only fundamental vectorization problem I can think of which is specific to C++ is the lack of a standard restrict keyword. In g++, __restrict__ is available. A local copy (or value parameter) of sum avoids the need for the compiler to recognize const or restrict as an assurance of no value modification. The loop has to have known fixed bounds at entry in order to vectorize. If your C++ style doesn't support that, e.g. by calculating the end value outside the loop as you do in the latter version, then you do have a problem with vectorization.
Re: question. type long long
Александр Струняшев wrote: > Good afternoon. > I need some help. As from what versions your compiler understand that > "long long" is 64 bits ? > > Best regards, Alexander > > P.S. Sorry for my mistakes, I know English bad. No need to be sorry about English, but the topic is OK for gcc-help, not gcc development. gcc was among the first compilers to support long long (always as 64-bit), the only problem being that it was a gnu extension for g++. In that form, the usage may not have settled down until g++ 4.1. The warnings for attempting long long constants in 32-bit mode, without the LL suffix, have been a subject of discussion: http://gcc.gnu.org/bugzilla/show_bug.cgi?id=13358 The warning doesn't mean that long long could be less than 64 bits; it means the constant without the LL suffix is less than 64 bits.
Re: need to find functions definitions
On Tuesday 21 October 2008 20:07:10 `VL wrote: > Hello, ALL. > > I recently started to actively program using C and found that tools like > ctags or cscope do not work properly for big projects. Quite often they > can't find a function or symbol definition. The problem here is that they > don't use full code parsing, but just some sort of regular expressions. > > I need a tool for automatic code exploration that can at least find the > definition of every symbol without problems. If you know any - please > tell. > > Now, to gcc. It seems to me that an existing and working compiler is an ideal > place to embed such a tool - since it already knows all the things required. > > I have one idea: i'm almost sure that inside gcc somewhere there is a place > (a function i believe) that is called each time during compilation when the > definition of something (type, variable, function...) is found, and in this > place gcc has context information - the source file and line where this > definition came from. So if i add something like printf("DEFINITION: > %s,%s,%d\n", info->object_type,info->src_file,info->line) into that place, > i will get information about every thing that the compiler found. > > What i like more about this way of getting information about symbol > definitions is that i get references only to the parts of the source that were > actually compiled. I.e. if there are a lot of #ifdef's, it is hard to > know what part of the code will be used. > > So, my questions are: > > 1) Is it possible? Is there a single place where all the information is easily > accessible? 2) If yes - where is it, and where can i find details about the > internals of gcc? 3) Any good alternatives to cscope/ctags? It seemed to > me that > eclipse has some good framework, but it looks to be too tightly integrated > with it... > > Thank you! Hi, wouldn't it be easier to just compile with debug symbols (-g) and then look into the symbol table or into the DWARF debug information? 
Both can be done with the tool objdump contained in the binutils (normally installed on each linux), and there are libraries for both tasks to read and use the information in own applications. You'll get symbol (functions/methods, arguments, variables) names, addresses, types, etc. Tim
Re: Backward Compatibility of RHEL Advanced Server and GCC
Steven Bosscher wrote: On Wed, Oct 29, 2008 at 6:19 AM, S. Suhasini <[EMAIL PROTECTED]> wrote: We would like to know whether the new version of the software (compiled with the new GCC) can be deployed and run on the older setup with RHEL AS 3 and GCC 2.96. We need not compile again on the older setup. Will there be any run-time libraries dependency? Would be very grateful if we get a response for this query. It seems to me that this kind of question is best asked on a RedHat support list, not on a list where compiler development is discussed. FWIW, there is no "official" GCC 2.96, see http://gcc.gnu.org/gcc-2.96.html. This might be partially topical on the gcc-help list. If dynamic libraries are in use, there will be trouble.
Re: change to gcc from lcc
On Friday 14 November 2008 10:09:22 Anna Sidera wrote: > Hello, > > The following code works in lcc in windows but it does not work in gcc in > unix. I think it is a memory problem. In lcc there is an option to use more > temporary memory than the default. Is there something similar in gcc? > > #include <stdio.h> > #include > #include > #include > int main() > { > int i, j; > int buffer1[250][100]; > for (i=0; i<250; i++) { > for (j=0; j<100; j++) { > buffer1[i][j]=0; > } > } > printf("\nThe program finished successfully\n"); > return 0; > } > > Many Thanks, > Anna Anna, the code you provided tries to allocate a huge chunk of memory on the stack. This is not the way things should be done. Even if the compiler allows for "using more temporary memory than the default", the solution is by no means portable. A more elegant solution is to use memory on the heap: #include <stdio.h> #include <stdlib.h> int main() { int i, j; int *buf = (int*) malloc (250 * 100 * sizeof(int)); for (i=0; i<250; i++) { for (j=0; j<100; j++) { buf[i*100 + j]=0; } } free (buf); printf("\nYay! :D\n"); return 0; } Tim
Re: Cygwin support
Brian Dessent wrote: > Cygwin has been a secondary target for a number of years. MinGW has > been a secondary target since 4.3. This generally means that they > should be in fairly good shape, more or less. To quote the docs: > >> Our release criteria for the secondary platforms is: >> >> * The compiler bootstraps successfully, and the C++ runtime library >> builds. >> * The DejaGNU testsuite has been run, and a substantial majority of the >> tests pass. > > > More recently I've seen Danny Smith report that the IRA merge broke > MinGW (and presumably Cygwin, since they share most of the same code) > bootstrap. I haven't tested this myself recently so I don't know if > it's still broken or not. > I've run the bootstrap and testsuite twice in the last month. The bootstrap failures are due to a broken #ifdef specific to cygwin in the headers provided with cygwin, the requirement for a specific version of autoconf (not available in setup), and the need to remove the -werror in libstdc++ build (because of minor discrepancies in cygwin headers). All of those are easy to rectify, but fixes seem unlikely to be considered by the decision makers. However, the C++ testsuite results are unacceptable, with many internal errors. For some time now, gfortran has been broken for practical purposes, even when it passes testsuite, as it seems to have a memory leak. This shows up in the public wiki binaries. So, there are clear points for investigation of cygwin problems, and submission of PRs, should you be interested. > Running the dejagnu testsuite on Cygwin is > excruciatingly slow due to the penalty incurred from emulating fork. It runs over a weekend on a Pentium D which I brought back to life by replacing the CPU cooler system. 
I have no problem with running this if I am in the office when the snapshot is released, but I think there is little interest in fixing the problems which are specific to g++ on cygwin, yet working gcc and gfortran aren't sufficient for gcc upgrades to be accepted. Support for 64-bit native looks like it will be limited to mingw, so I no longer see a future for gcc on cygwin.
GCC 3.4.6 on x86_64: __builtin_frame_address(1) of topmost frame doesn't return 0x0
Hi, in binaries compiled with gcc 3.4.6 on an x86_64 machine, I get the following behaviour. I wrote a little testcase: int main(int argc, char **argv) { unsigned long addr; if ( (addr = (unsigned long)(__builtin_frame_address(0))) ) { printf ("0x%08lx\n", addr); if ( (addr = (unsigned long)(__builtin_frame_address(1))) ) { printf ("0x%08lx\n", addr); if ( (addr = (unsigned long)(__builtin_frame_address(2))) ) { printf ("0x%08lx\n", addr); // ... some more scopes ... } } } return 0; } This code is a bit ugly; I made it that way because of this part of gcc's manpages: "GCC also has two builtins that can assist you, but which may or may not be implemented fully on your architecture, and those are __builtin_frame_address and __builtin_return_address. Both of which want an immediate integer level (by immediate, I mean it can't be a variable)." - but it doesn't change the outcome of the test, anyway. I ran the test on three machines with the following results: 1) [EMAIL PROTECTED] ~]$ uname -m i686 [EMAIL PROTECTED] ~]$ gcc -v Reading specs from /usr/lib/gcc/i386-redhat-linux/3.4.6/specs Configured with: ../configure --prefix=/usr --mandir=/usr/share/man --infodir=/usr/share/info --enable-shared --enable-threads=posix --disable-checking --with-system-zlib --enable-__cxa_atexit --disable-libunwind-exceptions --enable-java-awt=gtk --host=i386-redhat-linux Thread model: posix gcc version 3.4.6 20060404 (Red Hat 3.4.6-10) [EMAIL PROTECTED] ~]$ gcc -o test test.c && ./test 0xbfefc048 0xbfefc0a8 [EMAIL PROTECTED] ~]$ 2) [EMAIL PROTECTED] ~]$ uname -m x86_64 [EMAIL PROTECTED] ~]$ gcc -v Using built-in specs. 
Target: x86_64-redhat-linux Configured with: ../configure --prefix=/usr --mandir=/usr/share/man --infodir=/usr/share/info --enable-shared --enable-threads=posix --enable-checking=release --with-system-zlib --enable-__cxa_atexit --disable-libunwind-exceptions --enable-libgcj-multifile --enable-languages=c,c++,objc,obj-c++,java,fortran,ada --enable-java-awt=gtk --disable-dssi --enable-plugin --with-java-home=/usr/lib/jvm/java-1.4.2-gcj-1.4.2.0/jre --with-cpu=generic --host=x86_64-redhat-linux Thread model: posix gcc version 4.1.2 20070626 (Red Hat 4.1.2-14) [EMAIL PROTECTED] ~]$ gcc -o test test.c && ./test 0x7fffc400c8c0 [EMAIL PROTECTED] ~]$ 3) [EMAIL PROTECTED] ~]$ uname -m x86_64 [EMAIL PROTECTED] ~]$ gcc -v Reading specs from /usr/lib/gcc/x86_64-redhat-linux/3.4.6/specs Configured with: ../configure --prefix=/usr --mandir=/usr/share/man --infodir=/usr/share/info --enable-shared --enable-threads=posix --disable-checking --with-system-zlib --enable-__cxa_atexit --disable-libunwind-exceptions --enable-java-awt=gtk --host=x86_64-redhat-linux Thread model: posix gcc version 3.4.6 20060404 (Red Hat 3.4.6-10) [EMAIL PROTECTED] ~]$ gcc -o test test.c && ./test 0x7fb8c0 0x35952357a8 0x7fb9a8 0x7fbb7b 0x454d414e54534f48 Segmentation fault [EMAIL PROTECTED] ~]$ So, on the 32bit machine and on the 64bit machine running gcc 4.x, the end of the stack is found (__builtin_frame_address(n) returned 0x0). The output is a bit different, apparently on the 32bit machine, the stackframe of the caller of main() is also found, but that is not important for the error. On the 64bit machine using gcc 3.4.6 however, at some point, garbage (?) is returned. My questions now are, is this known behaviour / a known issue? I didn't find a fitting patch. Or, can the fix for it be backported from gcc 4.x to 3.4.x? I cannot switch to gcc 4.x for some other reasons. If all this doesn't result in a solution, is there maybe another way for me to determine which stackframe is the topmost one? 
(Should I just compare the function name with "main"? That'd be a bit dirty, wouldn't it?) Thanks, Tim München
Re: Purpose of GCC Stack Padding?
Andrew Tomazos wrote: I've been studying the x86 compiled form of the following function: void function() { char buffer[X]; } where X = 0, 1, 2 .. 100 Naively, I would expect to see: pushl %ebp movl %esp, %ebp subl $X, %esp leave ret Instead, the stack appears to be padded: For a buffer size of 0 the stack size is 0 For a buffer size of 1 to 7 the stack size is 16 For a buffer size of 8 to 12 the stack size is 24 For a buffer size of 13 to 28 the stack size is 40 For a buffer size of 29 to 44 the stack size is 56 For a buffer size of 45 to 60 the stack size is 72 For a buffer size of 61 to 76 the stack size is 88 For a buffer size of 77 to 92 the stack size is 104 For a buffer size of 93 to 100 the stack size is 120 When X >= 8 gcc adds a stack corruption check (__stack_chk_fail), which accounts for an extra 4 bytes of stack space in these cases. This does not explain the rest of the padding. Can anyone explain the purpose of the rest of the padding? This looks like more of a gcc-help question; trying to move the thread there. Unless you over-ride the defaults with -mpreferred-stack-boundary (or -Os, which probably implies a change in stack boundary), or ask for a change on the basis of making a leaf function, you are generating alignment compatible with the use of SSE parallel instructions. The stack, then, must be 16-byte aligned before entry and at exit, and a buffer of 16 bytes or more must also be 16-byte aligned. I believe there is a move afoot to standardize the treatment for the most common x86 32-bit targets; that was done from the beginning for 64-bit. I don't know if you are using x86 to imply 32-bit, in accordance with Windows terminology.
Re: Upgrade to GCC.4.3.2
Philipp Thomas wrote: > On Sun, 28 Dec 2008 14:24:22 -0500, you wrote: > >> I have SLES9 and Linux-2.6.5-7.97 kernel install on i586 intel 32 bit >> machine. The compiler is gcc-c++3.3.3-43.24. I want to upgrade to >> GCC4.3.2. My question are: Would this upgrade work with >> SLES9? > > This is the wrong list for such questions. You should try a SUSE > specific list like opens...@opensuse.org or > opensuse-programm...@opensuse.org gcc-help is a reasonable choice as well.
Re: gcc binary download
Tobias Burnus wrote: > > Otherwise, you could consider building GCC yourself, cf. > http://gcc.gnu.org/install/. (Furthermore, some gfortran developers > offer regular GCC builds, which are linked at > http://gcc.gnu.org/wiki/GFortranBinaries; those are all unofficial > builds, come without any warranty/support, and due to, e.g., library > issues they may not work on your system.) > I believe the wiki builds include C and Fortran, but not C++, in view of the additional limitations in supporting a new g++ on a reasonable range of targets. Even so, there may be minimum requirements on glibc and binutils versions.
Re: Binary Autovectorization
Rodrigo Dominguez wrote: > I am looking at binary auto-vectorization or taking a binary and rewriting > it to use SIMD instructions (either statically or dynamically). That's a tall order, considering how much source level dependency information is needed. I don't know whether proprietary binary translation projects currently under way promise to add vectorization, or just to translate SIMD vector code to new ISA.
Re: -mfpmath=sse,387 is experimental ?
Zuxy Meng wrote: > Hi, > > "Timothy Madden" wrote in message: >> I am sure having twice the number of registers (sse+387) would make a >> big difference. You're not counting the rename registers, you're talking about 32-bit mode only, and you're discounting the different mode of accessing the registers. >> >> How would I know if my AMD Sempron 2200+ has separate execution units >> for SSE and >> FPU instructions, with independent registers ? > > Most CPUs use the same FP unit for both x87 and SIMD operations, so it > wouldn't give you double the performance. The only exception I know of > is the K6-2/3, whose x87 and 3DNow! units are separate. > -march=pentium-m observed the preference of those CPUs for mixing the types of code. This was due more to the limited issue rate for SSE instructions than to the expanded number of registers in use. You are welcome to test it on your CPU; however, AMD CPUs were designed to perform well with SSE alone, particularly in 64-bit mode.
Re: GCC 4.4.0 Status Report (2009-03-13)
Chris Lattner wrote: > > On Mar 23, 2009, at 8:02 PM, Jeff Law wrote: > >> Chris Lattner wrote: >>>>> >>>> These companies really don't care about FOSS in the same way GCC >>>> developers do. I'd be highly confident that this would still be a >>>> serious issue for the majority of the companies I've interacted with >>>> through the years. >>> >>> Hi Jeff, >>> >>> Can you please explain the differences you see between how GCC >>> developers and other people think about FOSS? I'm curious about your >>> perception here, and what basis it is grounded on. >>> >> I'd divide customers into two broad camps. Both camps are extremely >> pragmatic, but they're focused on two totally different goals. > > Thanks Jeff, I completely agree with you. Those camps are very common > in my experience as well. Do you consider GCC developers to fall into > one of these two categories, or do you see them as having a third > perspective? I know that many people have their own motivations and > personal agenda (and it is hard to generalize) but I'm curious what you > meant above. > > Thanks! > > -Chris > >> >> >> The first camp sees FOSS toolkits as a means to help them sell more >> widgets, typically processors & embedded development kits. Their >> belief is that a FOSS toolkit helps build a developer eco-system >> around their widget, which in turn spurs development of consumable >> devices which drive processor & embedded kit sales. The key for >> these guys is free, as in beer, widely available tools. The fact that >> the compiler & assorted utilities are open-source is largely irrelevant. >> >> The second broad camp I run into regularly are software developers >> themselves building applications, most often for internal use, but >> occasionally they're building software that is then licensed to their >> customers. 
They'd probably describe the compiler & associated >> utilities as a set of hammers, screwdrivers and the like -- they're >> just as happy using GCC as any other compiler so long as it works. >> The fact that the GNU tools are open source is completely irrelevant >> to these guys. They want to see standards compliance, abi >> interoperability, and interoperability with other tools (such as >> debuggers, profilers, guis, etc). They're more than willing to swap >> out one set of tools for another if it gives them some advantage. >> Note that an advantage isn't necessarily compile-time or runtime >> performance -- it might be ease of use, which they believe allows >> their junior level engineers to be more effective (this has come up >> consistently over the last few years). >> >> Note that in neither case do they really care about the open-source >> aspects of their toolchain (or for the most part the OS either). >> They may (and often do) like the commoditization of software that FOSS >> tends to drive, but don't mistake that for caring about the open >> source ideals -- it's merely cost-cutting. >> >> Jeff >> >> > Software developers I deal with use gcc because it's a guaranteed included part of the customer platforms they are targeting. They're generally looking for a 20% gain in performance plus support before adopting commercial alternatives. The GUIs they use don't live up to the advertisements about ease of use. This doesn't necessarily put them in either of Jeff's camps. Tim
Re: Minimum GMP/MPFR version bumps for GCC-4.5
Kaveh R. Ghazi wrote: > What versions of GMP/MPFR do you get on > your typical development box and how old are your distros? > OpenSuSE 10.3 (originally released Oct. 07): gmp-devel-4.2.1-58 gmp-devel-32bit-4.2.1-58 mpfr-2.2.1-45
Re: heise.de comment on 4.4.0 release
Tobias Burnus wrote: > Toon Moene wrote: Can somebody with access to SPEC sources confirm / deny and file a bug report, if appropriate? I just started working on SPEC CPU2006 issues this week. > Seemingly yes. To a certain extent this was by accident, as "-msse3" was > used, but on i586 it is only effective with -mfpmath=sse (which is not > completely obvious). By the way, my tests using the Polyhedron benchmark > show that for 32bit, x87 and SSE are similarly fast, depending a lot on > the test case, so it does not slow down the benchmark too much. Certain AMD CPUs had shorter latencies for scalar single precision sse, but generally the advantage of sse comes from vectorization. > > If I understood correctly, the 32bit mode was used since the 64bit mode > needs more than the available 2GB memory. Certain commercial compilers make an effort to switch to 32-bit mode automatically on several CPU2006 benchmarks, as they are too small to run as fast in 64-bit mode. > > Similarly, the option -funroll-loops was avoided as they expect that > unrolling interacts badly with the small cache Atom processors have. > (That CPU2006 runs that long does not make testing different options > easy.) I'm surprised that SPEC 2006 is considered relevant to Atom. The entire thing (base only) has been running under 10 hours on a dual quad core system. I've heard several times the sentiment that there ought to be an "official" harness to run a single test, trying various options. > I would have liked the options to be reported. For instance, > -ffast-math was not used out of fear that it produces too-imprecise > results, causing SPEC to abort. (Admittedly, I'm also careful with that > option, though I assume that -ffast-math works for SPEC.) On the other > hand, certain flags implied by -ffast-math are already applied at -O1 > by some commercial compilers. SPEC probably has been the biggest driver for inclusion of insane options at default in commercial compilers. 
It's certainly not an example of acceptable practice in writing portable code. I have yet to find a compiler which didn't fail at least one SPEC test, and I don't blame the compilers. There are dependencies on unusual C++ extensions, which somehow weren't noticed before, examples of using "f77" as an excuse for hiding one's intentions, and expectations of optimizations which have little relevance for serious applications. > > David Korn wrote: >> They accused us of a too-hasty release. My irony meter exploded! Anyway, a fault in support for a not-so-open benchmark application seems even less relevant in an open source effort than it is to compilers which depend on ranking for sales success.
Re: Bootstrap broken by ppl/cloog config problem: finds non-system/non-standard "/include" dir
Dave Korn wrote: > > Heh, I was just about to post that, only I was looking at $clooginc rather > than $pplinc! The same problem exists for both; I'm pretty sure we should > fall back on $prefix if the --with option is empty. > When I bootstrapped gcc 4.5 on cygwin yesterday, configure recognized the newly installed ppl, but not the cloog. The bootstrap completed successfully, and I'm not looking a gift horse in the mouth.
Re: Bootstrap broken by ppl/cloog config problem: finds non-system/non-standard "/include" dir
Dave Korn wrote: > Tim Prince wrote: >> Dave Korn wrote: >> >>> Heh, I was just about to post that, only I was looking at $clooginc rather >>> than $pplinc! The same problem exists for both; I'm pretty sure we should >>> fall back on $prefix if the --with option is empty. >>> >> When I bootstrapped gcc 4.5 on cygwin yesterday, configure recognized the >> newly installed ppl, but not the cloog. The bootstrap completed >> successfully, and I'm not looking a gift horse in the mouth. > > You don't have a bogus /include dir, but I bet you'll find -I/include in > PPLINC. > > It would be interesting to know why it didn't spot cloog. What's in your > top-level $objdir/config.log? > #include no such file -I/include was set by configure. As you say, there is something bogus here. setup menu shows cloog installed in development category, but I can't find any such include file. Does this mean the cygwin distribution of cloog is broken?
Re: Bootstrap broken by ppl/cloog config problem: finds non-system/non-standard "/include" dir
Dave Korn wrote: Tim Prince wrote: #include no such file -I/include was set by configure. As you say, there is something bogus here. setup menu shows cloog installed in development category, but I can't find any such include file. Does this mean the cygwin distribution of cloog is broken? Did you make sure to get the -devel packages as well as the libs? That's the usual cause of this kind of problem. I highly recommend the new version of setup.exe that has a package-list search box :-) cheers, DaveK OK, I see there is a libcloog-devel in addition to the cloog Dev selection, guess that will fix it for cygwin. I tried to build cloog for IA64 linux as well, gave up on include file parsing errors.
Re: [Fwd: Failure in bootstrapping gfortran-4.5 on Cygwin]
Ian Lance Taylor wrote: Angelo Graziosi writes: The current snapshot 4.5-20090507 fails to bootstrap on Cygwin: It did bootstrap effortlessly for me, once I logged off to clear hung processes, with the usual disabling of strict warnings. I'll let testsuite run over the weekend.
Re: Link error ....redefinition of......
On Tuesday 02 June 2009 08:16:35 Alex Luya wrote: > I downloaded the source code for the book <<Data Structures and Algorithm Analysis in C++ (Second Edition), by Mark Allen Weiss>> > from http://users.cs.fiu.edu/~weiss/dsaa_c++/code/, tried to compile > it, but got many errors, most of them saying: > ... previously declared here > ...: redefinition of ... > > I think templates cause these errors, but how to fix it? This is not the correct mailing list for such questions! Nevertheless, the reason for your compile errors is a simple one. Just drop the line #include "StackAr.cpp" from your header file. Why are you trying to include the implementation in the header? The other way round is how things work! (And you do have the header include in your implementation - why both directions?) > --- > My configuration: > Ubuntu 9.04 > GCC version 4.3.3 (Ubuntu 4.3.3-5ubuntu4) > Eclipse 3.4 > CDT: 5.0.2 > > Files and error message are following:
>
> StackAr.h
> ---------
> #ifndef STACKAR_H
> #define STACKAR_H
>
> #include "../vector.h"
> #include "../dsexceptions.h"
>
> template <class Object>
> class Stack
> {
> public:
>   explicit Stack( int capacity = 10 );
>   bool isEmpty( ) const;
>   ...
> #include "StackAr.cpp"
> #endif
>
> StackAr.cpp
> -----------
> #include "StackAr.h"
>
> template <class Object>
> Stack<Object>::Stack( int capacity ) : theArray( capacity )
> {
>   topOfStack = -1;
> }
>
> template <class Object>
> bool Stack<Object>::isEmpty( ) const
> {
>   return topOfStack == -1;
> }
> ...
> Test.cpp
> --------
> #include <iostream>
> #include "StackAr.h"
> using namespace std;
>
> int main()
> {
>   Stack<int> s;
>
>   for (int i = 0; i < 10; i++)
>     s.push(i);
>
>   while (!s.isEmpty())
>     cout << s.topAndPop() << endl;
>   return 0;
> }
>
> Error message:
>
> Build of configuration Debug for project DACPP
>
> make all
> Building file: ../src/stack/StackAr.cpp
> Invoking: GCC C++ Compiler
> g++ -O0 -g3 -Wall -c -fmessage-length=0 -MMD -MP -MF"src/stack/StackAr.d" -MT"src/stack/StackAr.d" -o"src/stack/StackAr.o" "../src/stack/StackAr.cpp"
> ../src/stack/StackAr.cpp:7: erreur: redefinition of 'Stack<Object>::Stack(int)'
> ../src/stack/StackAr.cpp:7: erreur: 'Stack<Object>::Stack(int)' previously declared here
> ../src/stack/StackAr.cpp:17: erreur: redefinition of 'bool Stack<Object>::isEmpty() const'
> ../src/stack/StackAr.cpp:17: erreur: 'bool Stack<Object>::isEmpty() const' previously declared here
> ...
-- Tim München, M.Sc. muenc...@physik.uni-wuppertal.de Bergische Universitaet FB C - Physik Tel.: +49 (0)202 439-3521 Gaussstr. 20 Fax: +49 (0)202 439-2811 42097 Wuppertal
Re: Failure building current 4.5 snapshot on Cygwin
Angelo Graziosi wrote: > I want to flag the following failure I have seen on Cygwin 1.5 trying to > build current 4.5-20090625 gcc snapshot: > checking whether the C compiler works... configure: error: in > `/tmp/build/intl': > configure: error: cannot run C compiled programs. > If you meant to cross compile, use `--host'. > See `config.log' for more details. I met the same failure on Cygwin 1.7 with yesterday's and last week's snapshots. I didn't notice that it refers to intl/config.log, so will go back and look, as you didn't show what happened there. On a slightly related subject, I have shown that the libgfortran.dll.a and libgomp.dll.a are broken on cygwin builds, including those released for cygwin, as shown by the test case I submitted on the cygwin list earlier this week. --enable-shared has never been satisfactory for gfortran on cygwin.
Re: Failure building current 4.5 snapshot on Cygwin
Dave Korn wrote: Angelo Graziosi wrote: I want to flag the following failure I have seen on Cygwin 1.5 trying to build current 4.5-20090625 gcc snapshot: So what's in config.log? And what binutils are you using? cheers, DaveK In my case, it says no permission to execute a.exe. However, I can run the intl configure and make from the command line. When I do that and attempt to restart stage 2, it stops in libiberty, and again I have to execute steps from the command line.
Re: Failure building current 4.5 snapshot on Cygwin
Kai Tietz wrote: 2009/6/26 Seiji Kachi : Angelo Graziosi wrote: Dave Korn ha scritto: Angelo Graziosi wrote: I want to flag the following failure I have seen on Cygwin 1.5 trying to build current 4.5-20090625 gcc snapshot: So what's in config.log? And what binutils are you using? The config logs are attached, while binutils is the current in Cygwin-1.5, i.e. 20080624-2. Cheers, Angelo. I have also seen a similar failure, and the reason in my environment is as follows. (1) In my case, the gcc build completes successfully, but an a.exe compiled by the new compiler fails. The error message is $ ./a.exe bash: ./a.exe: Permission denied Source code of a.exe is quite simple: main() { printf("Hello\n"); } (2) This failure occurs from gcc trunk r148408. r148407 is OK. (3) r148408 removed "#ifdef DEBUG_PUBTYPES_SECTION". r148407 does not generate a debug_pubtypes section, but r148408 and later versions generate a debug_pubtypes section in the object when we set a debug option. (4) The gcc build sequence usually uses a debug option. (5) My cygwin environment seems not to accept the debug_pubtypes section, and pops up a "Permission denied" error. When I reverted "#ifdef DEBUG_PUBTYPES_SECTION" in dwarf2out.c, the failure disappeared. Does this failure occur only on cygwin? Regards, Seiji Kachi No, this bug appeared on all windows pe-coff targets. A fix for this was already checked in yesterday on binutils. Could you try it with the current binutils head version? Cheers, Kai Is this supposed to be sufficient information for us to find that binutils? I may be able to find an insider colleague, otherwise I would have no chance.
Re: Failure building current 4.5 snapshot on Cygwin
Kai Tietz wrote: 2009/6/26 Tim Prince : Kai Tietz wrote: 2009/6/26 Seiji Kachi : Angelo Graziosi wrote: Dave Korn ha scritto: Angelo Graziosi wrote: I want to flag the following failure I have seen on Cygwin 1.5 trying to build current 4.5-20090625 gcc snapshot: So what's in config.log? And what binutils are you using? The config logs are attached, while binutils is the current in Cygwin-1.5, i.e. 20080624-2. Cheers, Angelo. I have also seen a similar failure, and the reason in my environment is as follows. (1) In my case, the gcc build completes successfully, but an a.exe compiled by the new compiler fails. The error message is $ ./a.exe bash: ./a.exe: Permission denied Source code of a.exe is quite simple: main() { printf("Hello\n"); } (2) This failure occurs from gcc trunk r148408. r148407 is OK. (3) r148408 removed "#ifdef DEBUG_PUBTYPES_SECTION". r148407 does not generate a debug_pubtypes section, but r148408 and later versions generate a debug_pubtypes section in the object when we set a debug option. (4) The gcc build sequence usually uses a debug option. (5) My cygwin environment seems not to accept the debug_pubtypes section, and pops up a "Permission denied" error. When I reverted "#ifdef DEBUG_PUBTYPES_SECTION" in dwarf2out.c, the failure disappeared. Does this failure occur only on cygwin? Regards, Seiji Kachi No, this bug appeared on all windows pe-coff targets. A fix for this was already checked in yesterday on binutils. Could you try it with the current binutils head version? Cheers, Kai Is this supposed to be sufficient information for us to find that binutils? I may be able to find an insider colleague, otherwise I would have no chance. Hello, you can find the binutils project as usual at http://sources.redhat.com/binutils/ . You can find on that page how to get the current cvs version of binutils. This project contains the gnu tools, like dlltool, as, objcopy, ld, etc.
The issue you are running into is caused by a failure in binutils to set correct section flags for debugging sections. It was exposed by the last change in gcc, the output of the .debug_pubtypes section. There is a patch already applied to binutils's repository head which should solve the issue described in this thread. We at mingw-w64 already ran into this issue and have taken care of it. Cheers, Kai My colleague suggested building and installing last week's binutils release. I did so, but it didn't affect the requirement to run each stage 2 configure individually from the command line. Thanks, Tim
Re: random numbers
ecrosbie wrote: how do I generate random numbers in an f77 program? Ed Crosbie
Re: random numbers
ecrosbie wrote: how do I generate random numbers in an f77 program? Ed Crosbie This subject isn't topical on the gcc development forum. If you wish to use a gnu Fortran random number generator, please consider gfortran, which implements the language standard random number facility. http://gcc.gnu.org/onlinedocs/gcc-4.4.0/gfortran/ Questions might be asked on the gfortran list (follow-up set) or comp.lang.fortran. In addition, you will find plenty of other advice by using your web browser.
Re: optimizing a DSO
On 5/28/2010 11:14 AM, Ian Lance Taylor wrote: Quentin Neill writes: A little off topic, but by what facility does the compiler know the linker (or assembler for that matter) is gnu? When you run configure, you can specify --with-gnu-as and/or --with-gnu-ld. If you do, the compiler will assume the GNU assembler or linker. If you do not, the compiler will assume that you are not using the GNU assembler or linker. In this case the compiler will normally use the common subset of command line options supported by the native assembler and the GNU assembler. In general that only affects the compiler behaviour on platforms which support multiple assemblers and/or linkers. E.g., on GNU/Linux, we always assume the GNU assembler and linker. There is an exception. If you use --with-ld, the compiler will run the linker with the -v option and grep for GNU in the output. If it finds it, it will assume it is the GNU linker. The reason for this exception is that --with-ld gives a linker which will always be used. The assumption when no specific linker is specified is that you might wind up using any linker available on the system, depending on the value of PATH when running the compiler. Ian Is it reasonable to assume when the configure test reports using GNU linker, it has taken that "exception," even without a --with-ld specification? -- Tim Prince
Re: gcc command line exceeds 8191 when building in XP
On 7/19/2010 4:13 PM, IceColdBeer wrote: Hi, I'm building a project using GNU gcc, but the command line used to build each source file sometimes exceeds 8191 characters, which is the maximum supported command line length under Win XP. Even worse under Win 2000, where the maximum command line length is limited to 2047 characters. Can GNU gcc read the build options from a file instead? I have searched, but cannot find an option in the documentation. Thanks in advance, ICB Redirecting to gcc-help. The gcc builds for Windows themselves use a scheme for splitting the link into multiple steps in order to deal with command line length limits. I would suggest adapting that. Can't study it myself now while travelling. -- Tim Prince
Re: x86 assembler syntax
On 8/8/2010 10:21 PM, Rick C. Hodgin wrote: All, Is there an Intel-syntax compatible option for GCC or G++? And if not, why not? It's so much cleaner than AT&T's. - Rick C. Hodgin I don't know how you get along without a search engine. What about http://tldp.org/HOWTO/Assembly-HOWTO/gas.html ? -- Tim Prince
Re: food for optimizer developers
On 8/10/2010 9:21 PM, Ralf W. Grosse-Kunstleve wrote: Most of the time is spent in this function...

void dlasr( str_cref side, str_cref pivot, str_cref direct, int const& m, int const& n, arr_cref c, arr_cref s, arr_ref a, int const& lda)

in this loop:

FEM_DOSTEP(j, n - 1, 1, -1) {
  ctemp = c(j);
  stemp = s(j);
  if ((ctemp != one) || (stemp != zero)) {
    FEM_DO(i, 1, m) {
      temp = a(i, j + 1);
      a(i, j + 1) = ctemp * temp - stemp * a(i, j);
      a(i, j) = stemp * temp + ctemp * a(i, j);
    }
  }
}

a(i, j) is implemented as

T* elems_; // member
T const& operator()( ssize_t i1, ssize_t i2) const { return elems_[dims_.index_1d(i1, i2)]; }

with

ssize_t all[Ndims]; // member
ssize_t origin[Ndims]; // member
size_t index_1d( ssize_t i1, ssize_t i2) const { return (i2 - origin[1]) * all[0] + (i1 - origin[0]); }

The array pointer is buried as the elems_ member in the arr_ref<> class template. How can I apply __restrict in this case? Do you mean you are adding an additional level of functions and hoping for efficient in-lining? Your programming style is elusive, and your insistence on top posting will make this thread difficult to deal with. The conditional inside the loop is likely even more difficult for C++ to optimize than Fortran. As already discussed, if you don't optimize otherwise, you will need __restrict to overcome aliasing concerns among a, c, and s. If you want efficient C++, you will need a lot of hand optimization, and verification of the effect of each level of obscurity which you add. How is this topic appropriate to the gcc mailing list? -- Tim Prince
Re: End of GCC 4.6 Stage 1: October 27, 2010
On 9/6/2010 9:21 AM, Richard Guenther wrote: On Mon, Sep 6, 2010 at 6:19 PM, NightStrike wrote: On Mon, Sep 6, 2010 at 5:21 AM, Richard Guenther wrote: On Mon, 6 Sep 2010, Tobias Burnus wrote: Gerald Pfeifer wrote: Do you have a pointer to testresults you'd like us to use for reference? From our release criteria, for secondary platforms we have: • The compiler bootstraps successfully, and the C++ runtime library builds. • The DejaGNU testsuite has been run, and a substantial majority of the tests pass. See for instance: http://gcc.gnu.org/ml/gcc-testresults/2010-09/msg00295.html There are no libstdc++ results in that. Richard. This is true. I always run make check-gcc. What should I be doing instead? make -k check make check-c++ runs both g++ and libstdc++-v3 testsuites. -- Tim Prince
Re: Turn on -funroll-loops at -O3?
On 1/21/2011 10:43 AM, H.J. Lu wrote: Hi, Since -O3 turns on the vectorizer, should it also turn on -funroll-loops? Only if a conservative default value for max-unroll-times is set, e.g. 2 <= value <= 4. -- Tim Prince
Re: Why doesn't the vectorizer skip loop peeling/versioning for targets supporting hardware misaligned access?
On 1/24/2011 5:21 AM, Bingfeng Mei wrote: Hello, Some of our target processors support complete hardware misaligned memory access. I implemented movmisalignm patterns, and found TARGET_SUPPORT_VECTOR_MISALIGNMENT (TARGET_VECTORIZE_SUPPORT_VECTOR_MISALIGNMENT On 4.6) hook is based on checking these patterns. Somehow this hook doesn't seem to be used. vect_enhance_data_refs_alignment is called regardless whether the target has HW misaligned support or not. Shouldn't using HW misaligned memory access be better than generating extra code for loop peeling/versioning? Or at least if for some architectures it is not the case, we should have a compiler hook to choose between them. BTW, I mainly work on 4.5, maybe 4.6 has changed. Thanks, Bingfeng Mei Peeling for alignment still presents a performance advantage on longer loops for the most common current CPUs. Skipping the peeling is likely to be advantageous for short loops. I've noticed that 4.6 can vectorize loops with multiple assignments, presumably taking advantage of misalignment support. There's even a better performing choice of instructions for -march=corei7 misaligned access than is taken by other compilers, but that could be an accident. At this point, I'd like to congratulate the developers for the progress already evident in 4.6. -- Tim Prince
supporting finer grained -Wextra
hi there. i just enabled -Wextra to catch broken if statements, i.e. to enable warnings on: * An empty body occurs in an if or else statement. however this unfortunately triggers other warnings that i can't reasonably get rid of. here's a test snippet:

== test.c ==
typedef enum {
  VAL0 = 0,
  VAL1 = 1,
  VAL2 = 2
} PositiveEnum;

int main (int argc, char *argv[])
{
  PositiveEnum ev = VAL1;
  unsigned int limit = VAL2;
  if (ev >= 0) {}
  if (ev <= limit) {}
  if (1)
    ;
  return 0;
}
== test.c ==

compiled as C code, this will produce (with a 4.2 snapshot):
$ gcc -Wall -Wextra -Wno-unused -x c test.c
test.c: In function 'main':
test.c:6: warning: comparison of unsigned expression >= 0 is always true
test.c:11: warning: empty body in an if-statement
and compiled as C++ code:
$ gcc -Wall -Wextra -Wno-unused -x c++ test.c
test.c: In function 'int main(int, char**)':
test.c:8: warning: comparison between signed and unsigned integer expressions
test.c:11: warning: empty body in an if-statement

that means, for a header file that is used for C and C++ code, i simply can't avoid one of those enum signedness warnings, simply because the enum is treated as unsigned in C and as signed in C++. now, apparently the enum related signedness warnings are unconditionally enabled by -Wextra, as are the if-related warnings, from the man-page: -Wextra [...] Print extra warning messages for these events: * An unsigned value is compared against zero with < or >=. * An empty body occurs in an if or else statement. since the enum related signedness warnings are clearly bogus [1], i'd like to request new warning options that allow enabling/disabling of the empty-body vs. unsigned-zero-cmp warnings independently. that way i can preserve "empty body in an if-statement" while getting rid of the useless enum comparison warnings. [1] i'm aware that the enum signedness comparison warnings could be worked around by explicit casts or by adding a negative enum value.
this would just create new problems though, because aside from worsening readability, casts tend to "blind" the compiler with regards to whole classes of other bugs, and adding a dummy enum value would affect API and auto-generated documentation. --- ciao TJ
Re: Problem with type safety and the "sentinel" attribute
thanks for the quick response Kaveh. On Fri, 9 Jun 2006, Kaveh R. Ghazi wrote: > void print_string_array (const char *array_name, > const char *string, ...) __attribute__ > ((__sentinel__)); > > print_string_array ("empty_array", NULL); /* gcc warns, but shouldn't */ > > The only way out for keeping the sentinel attribute and avoiding the > warning is using > > static void print_string_array (const char *array_name, ...) > __attribute__ ((__sentinel__)); I think you could maintain typesafety and silence the warning by keeping the more specific prototype and adding an extra NULL, e.g.: print_string_array ("empty_array", NULL, NULL); Doesn't seem elegant, but it does the job. this is an option for a limited set of callers, yes. > By the way, there is already an existing gcc bug, which is about the > same thing (NULL passed within named args), but wants to have it the > way it works now: > > http://gcc.gnu.org/bugzilla/show_bug.cgi?id=21911 Correct, the feature as I envisioned it expected the sentinel to appear in the variable arguments only. This PR reflects when I found out it didn't do that and got around to fixing it. Note the "buggy" behavior wasn't exactly what you wanted either because GCC got fooled by a sentinel in *any* of the named arguments, not just the last one. > so if it gets changed, then gcc might need to support both > - NULL termination within the last named parameter allowed > - NULL termination only allowed within varargs parameters (like it is >now) I'm not against this enhancement, but you need to specify a syntax that allows the old behavior but also allows doing it your way. Hmm, perhaps we could check for attribute "nonnull" on the last named argument, if it exists then that can't be the sentinel, if it's not there then it does what you want. This is not completely backwards compatible because anyone wanting the existing behavior has to add the attribute nonnull. 
But there's precedent for this when attribute printf split out whether the format specifier could be null... We could also create a new attribute name for the new behavior. This would preserve backwards compatibility. I like this idea better. i agree here. as far as the majority of the GLib and Gtk+ APIs are concerned, we don't really need the flexibility of the sentinel attribute but rather a compiler check on whether the last argument used in a function call is NULL or 0 (regardless of whether it's the last named arg or already part of the varargs list). that's also why the actual sentinel wrapper in GLib looks like this: #define G_GNUC_NULL_TERMINATED __attribute__((__sentinel__)) so, if i was to make a call on this issue, i'd either introduce __attribute__((__null_terminated__)) with the described semantics, or have __attribute__((__sentinel__(-1))) work essentially like __attribute__((__sentinel__(0))) while also accepting 0 in the position of the last named argument. Next you need to recruit someone to implement this enhancement, or submit a patch. :-) Although given that you can silence the warning by adding an extra NULL at the call site, I'm not sure it's worth it. i would say this is definitely worth it, because the issue also shows up in other code that is widely used: gpointer g_object_new (GType object_type, const gchar *first_property_name, ...); that's for instance a function which is called in many projects. putting the burden on the caller is clearly the wrong trade-off here. so please take this as a vote for the worthiness of a fix ;) --Kaveh -- Kaveh R. Ghazi [EMAIL PROTECTED] --- ciao TJ
Re: Are 8-byte ints guaranteed?
Thomas Koenig wrote: Hello world, are there any platforms where gcc doesn't support 8-byte ints? Can a front end depend on this? This would make life easier for Fortran, for example, because we could use INTEGER(KIND=8) for a lot of interfaces without having to bother with checks for the presence of KIND=8 integers. No doubt there are such platforms, although I doubt there is sufficient interest in running gfortran on them. Support for 64-bit integers on common 32-bit platforms is rather inefficient when it is done with pairs of 32-bit integers.
Re: g77 problem for octave
[EMAIL PROTECTED] wrote: Dear Sir/Madame, I have switched my OS to SuSE Linux 10.1 and for a while trying to install "Octave" to my computer. Unfortunately, the error message below is the only thing that i got. Installing octave-2.1.64-3.i586[Local packages] There are no installable providers of gcc-g77 for octave-2.1.64-3.i586[Local packages] On my computer, the installed version of gcc is 4.1.0-25 and i could not find any compatible version of g77 to install. For the installation of octave, i need exactly gcc-g77 not gcc-fortran. Can you please help me to deal with this problem? If you are so interested in using g77 rather than gfortran, it should be easy enough to grab gcc-3.4.x sources and build g77. One would wonder why you dislike gfortran so much.
libgomp: Thread creation failed: Invalid argument
I am very happy to see that gfortran from current gcc snapshots can successfully compile an 18000 lines Fortran 77 numerics program I wrote. Results are indeed the same as obtained with other compilers (g77, PGI, ifort), and also execution speed seems roughly comparable, although I haven't yet done any precise measurements. A big thank you to the developers for that! Now I am trying to get the program to run with OpenMP, which works (although slower than anticipated) with PGI and ifort compilers. While I can successfully build and execute small OpenMP test programs, starting my large program fails with the message libgomp: Thread creation failed: Invalid argument resulting from a failing call to pthread_create() in libgomp/team.c. Using gdb I see that pthread_create() is called with the same gomp_thread_attr argument as for the smaller, succeeding testcases. strace shows that pthread_create() fails without trying to call clone(), while the clone() call of course does happen for the succeeding testcases. How to further debug this problem? I am currently using gcc-4.2-20060812 on i686 and x86_64 SuSE 10.0 Linux systems. Thank you, Tim