Re: (new) Failure building GFortran (Cygwin)

2008-07-11 Thread Raksit Ashok
On Fri, Jul 11, 2008 at 1:09 AM, Angelo Graziosi
<[EMAIL PROTECTED]> wrote:
> Gabriel Dos Reis wrote:
>>
>> On Tue, Jul 8, 2008 at 3:48 AM, Angelo Graziosi
>> <> wrote:
>>>
>>> Ian Lance Taylor wrote:
>>>>
>>>> This is OK, with a ChangeLog entry, if it passes bootstrap with the
>>>> appropriate configure option.
>>>
>>> The following bootstraps rev. 137613, having configured as
>>>
>>> ${gcc_dir}/configure --prefix="${prefix_dir}" \
>>>     --exec-prefix="${eprefix_dir}" \
>>>     --sysconfdir="${sysconf_dir}" \
>>>     --libdir="${lib_dir}" \
>>>     --libexecdir="${libexec_dir}" \
>>>     --mandir="${man_dir}" \
>>>     --infodir="${info_dir}" \
>>>     --enable-languages=c,fortran \
>>>     --enable-bootstrap \
>>>     --enable-decimal-float=bid \
>>>     --enable-libgomp \
>>>     --enable-threads \
>>>     --enable-sjlj-exceptions \
>>>     --enable-version-specific-runtime-libs \
>>>     --enable-nls \
>>>     --enable-checking=release \
>>>     --disable-fixed-point \
>>>     --disable-libmudflap \
>>>     --disable-shared \
>>>     --disable-win32-registry \
>>>     --with-system-zlib \
>>>     --without-included-gettext \
>>>     --without-x
>>>
>>> ===
>>> gcc/ChangeLog:
>>> 2008-07-08  Angelo Graziosi  <>
>>>
>>>
>>>   * ggc-page.c (alloc_page): Substitute xmalloc and xcalloc
>>>   with the XNEWVEC and XCNEWVAR macros, which add the
>>>   needed casts.
>>>
>>>
>>>
>>> --- gcc.orig/gcc/ggc-page.c  2008-06-29 06:39:16.0 +0200
>>> +++ gcc/gcc/ggc-page.c   2008-07-08 09:00:20.90625 +0200
>>> @@ -799,7 +799,7 @@
>>>   alloc_size = GGC_QUIRE_SIZE * G.pagesize;
>>>  else
>>>   alloc_size = entry_size + G.pagesize - 1;
>>> -  allocation = xmalloc (alloc_size);
>>> +  allocation = XNEWVEC (char, alloc_size);
>>
>> OK.
>>
>>>  page = (char *) (((size_t) allocation + G.pagesize - 1) & -G.pagesize);
>>>  head_slop = page - allocation;
>>> @@ -842,7 +842,7 @@
>>> struct page_entry *e, *f = G.free_pages;
>>> for (a = enda - G.pagesize; a != page; a -= G.pagesize)
>>>   {
>>> - e = xcalloc (1, page_entry_size);
>>> + e = XCNEWVAR (struct page_entry, page_entry_size);
>>
>> OK.
>>
>>
>>> e->order = order;
>>> e->bytes = G.pagesize;
>>> e->page = a;
>>> ===
>
> If the patch is OK, could someone apply/commit it so we can also test it
> with the next 4.4 snapshot?

Committed on Angelo's behalf as revision 137722. (I was running into
the same build failure on Cygwin, and I verified that this patch
fixes it and that it bootstraps on i686-pc-cygwin and i686-linux.)

Thanks,
Raksit



>
>
> Thanks,
>   Angelo.
>
> ---
> Pace è gioia!
>
>
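
For readers who have not met the macros named in the ChangeLog entry, here is a minimal sketch of how cast-adding allocation macros of this kind work. The two macro definitions are modeled on libiberty's XNEWVEC and XCNEWVAR; the xmalloc/xcalloc stand-ins, struct page_entry_demo and new_entry_demo are hypothetical names made up for this illustration and are not GCC code.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Stand-ins (assumption) for libiberty's xmalloc/xcalloc: they never
   return NULL, aborting on allocation failure instead. */
static void *xmalloc(size_t size)
{
    void *p = malloc(size ? size : 1);
    if (p == NULL) {
        fputs("out of memory\n", stderr);
        abort();
    }
    return p;
}

static void *xcalloc(size_t nelem, size_t elsize)
{
    void *p = calloc(nelem ? nelem : 1, elsize ? elsize : 1);
    if (p == NULL) {
        fputs("out of memory\n", stderr);
        abort();
    }
    return p;
}

/* Modeled on libiberty.h: the macro supplies the cast from void *, so
   call sites stay valid where an implicit conversion is rejected. */
#define XNEWVEC(T, N)  ((T *) xmalloc(sizeof(T) * (N)))
#define XCNEWVAR(T, S) ((T *) xcalloc(1, (S)))

/* Hypothetical, simplified record with a variable-length tail; the real
   struct page_entry in ggc-page.c has more fields. */
struct page_entry_demo {
    size_t bytes;
    char *page;
    unsigned long in_use[];   /* sized at allocation time */
};

/* Allocate a zeroed entry with room for `words` trailing bitmap words. */
static struct page_entry_demo *new_entry_demo(size_t words)
{
    size_t size = sizeof(struct page_entry_demo)
                  + words * sizeof(unsigned long);
    assert(size >= sizeof(struct page_entry_demo));   /* overflow guard */
    return XCNEWVAR(struct page_entry_demo, size);
}
```

XNEWVEC (char, alloc_size) therefore expands to a char * expression, and XCNEWVAR takes an explicit byte count because the object is larger than sizeof of its declared type.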


Re: gcc will become the best optimizing x86 compiler

2008-07-24 Thread Raksit Ashok
On Thu, Jul 24, 2008 at 1:03 AM, Agner Fog <[EMAIL PROTECTED]> wrote:
> Dennis Clarke wrote:
>>The Sun Studio 12 compiler with Solaris 10 on AMD Opteron or
>>UltraSparc beats GCC in almost every single test case that I have
>>seen.
>
> This is memcpy on Solaris:
> http://src.opensolaris.org/source/xref/onnv/onnv-gate/usr/src/lib/libc/i386/gen/memcpy.s
>
> It uses exactly the same method as memcpy on gcc libc, with only minor
> differences that have no influence on performance.

There is a more optimized version for 64-bit:
http://src.opensolaris.org/source/xref/onnv/onnv-gate/usr/src/lib/libc/amd64/gen/memcpy.s

I think this looks similar to your implementation, Agner.

-raksit

>
>> Also, you have provided no data at all.
>
> I have linked to the data rather than copying it here to save space on the
> mailing list. Here is the link again:
> http://www.agner.org/optimize/optimizing_cpp.pdf  section 2.6, page 12.
>
>> So your assertions are those of a marketing person at the moment.
>
> Who sounds like a marketing person, you or me? :-)
>
>> Please post some code that can be compiled and then tested with high
>> resolution timers and perhaps
>> we can compare notes.
>
> Here is my code, again:
> http://www.agner.org/optimize/asmlib.zip
> My test results, referred to above, use the "core clock cycles" performance
> counter on Intel and RDTSC on AMD. It's the highest resolution you can get.
> Feel free to do your own tests, it's as simple as linking my library into
> your test program.
>
> Tim Prince wrote:
>>you identify the library you tested only as "ubuntu g++ 4.2.3."
> Where can I see the libc version?
>
>>The corresponding 64-bit linux will see vastly different levels of
>> performance, depending on the
>>glibc version, as it doesn't use a builtin string move.
> Yes, this is exactly what my tests show. 64-bit libc is better than 32-bit
> libc, but still 3-4 times slower than the best library for unaligned
> operands on an Intel.
>
>>Certain newer CPUs aim to improve performance of the 32-bit gcc builtin
>> string moves, but don't
>> entirely eliminate the situations where it isn't optimum.
>
> The Intel manuals are not clear about this. Intel Optimization reference
> manual says:
>>In most cases, applications should take advantage of the default memory
>> routines provided by Intel compilers.
> What excellent advice - the Intel compiler puts in a library with an
> automatic run-slowly-on-AMD feature!
> The Intel library does not use rep movs when running on an Intel CPU.
>
> The AMD software optimization guide mentions specific situations where rep
> movs is optimal. However, my tests on an Opteron (K8) show that rep movs is
> never optimal on AMD either. I have no access to test it on the new AMD K10,
> but I expect the XMM register code to run much faster on K10 than on K8
> because K10 has 128-bit data paths where K8 has only 64-bit.
>
> Evidently, the problem with memcpy has been ignored for years, see
> http://softwarecommunity.intel.com/Wiki/Linux/719.htm
>
>
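
Since the thread asks for code that can be compiled and timed, here is a minimal sketch of an aligned-versus-unaligned memcpy timing helper. It is not Agner's test harness: it uses POSIX clock_gettime rather than the performance counters or RDTSC mentioned above, and the names align64 and time_memcpy_ns are made up for this illustration.

```c
#define _POSIX_C_SOURCE 199309L
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>
#include <time.h>

/* Round a pointer up to the next 64-byte boundary. */
static unsigned char *align64(unsigned char *p)
{
    return (unsigned char *)(((uintptr_t)p + 63) & ~(uintptr_t)63);
}

/* Average nanoseconds per memcpy of `len` bytes, with source and
   destination displaced by src_off and dst_off bytes (each below 64)
   from a 64-byte-aligned base. Returns a negative value on bad
   arguments, allocation failure, or a miscopy. */
static double time_memcpy_ns(size_t len, size_t src_off, size_t dst_off,
                             int reps)
{
    if (src_off >= 64 || dst_off >= 64 || reps <= 0)
        return -1.0;

    /* Up to 63 bytes of alignment slack plus up to 63 bytes of offset. */
    unsigned char *src_raw = malloc(len + 128);
    unsigned char *dst_raw = malloc(len + 128);
    if (src_raw == NULL || dst_raw == NULL) {
        free(src_raw);
        free(dst_raw);
        return -1.0;
    }
    unsigned char *src = align64(src_raw) + src_off;
    unsigned char *dst = align64(dst_raw) + dst_off;
    memset(src, 0xA5, len);

    /* Call through a volatile pointer so the compiler cannot inline or
       drop the copy; the sink keeps the result observably used. */
    void *(*volatile copy)(void *, const void *, size_t) = memcpy;
    volatile unsigned char sink = 0;

    struct timespec t0, t1;
    clock_gettime(CLOCK_MONOTONIC, &t0);
    for (int i = 0; i < reps; i++) {
        copy(dst, src, len);
        if (len > 0)
            sink ^= dst[len - 1];
    }
    clock_gettime(CLOCK_MONOTONIC, &t1);
    (void)sink;

    int ok = memcmp(dst, src, len) == 0;
    free(src_raw);
    free(dst_raw);
    if (!ok)
        return -1.0;

    double ns = (double)(t1.tv_sec - t0.tv_sec) * 1e9
                + (double)(t1.tv_nsec - t0.tv_nsec);
    return ns / reps;
}
```

Comparing time_memcpy_ns(len, 0, 0, reps) against time_memcpy_ns(len, 1, 3, reps) over a range of lengths gives a rough picture of the unaligned penalty for whichever memcpy the program links against. Wall-clock timing like this is much noisier than cycle counters, so treat the numbers as indicative only.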