Re: build failure? (libgfortran)

2007-02-05 Thread Dorit Nuzman
Grigory Zagorodnev <[EMAIL PROTECTED]> wrote on 05/02/2007
08:18:34:

> Dorit Nuzman wrote:
> > I'm seeing this bootstrap failure on i686-pc-linux-gnu (revision 121579) -
> > something I'm doing wrong, or is anyone else seeing this?
>
> Yes. I see the same at x86_64-redhat-linux.
>

Thanks.

Turns out I see the same problem on ppc64-yellowdog-linux

(with system compiler configured as follows:
> gcc -v
Reading specs from /usr/lib/gcc-lib/ppc64-yellowdog-linux/3.3.3/specs
Configured with: ../configure --prefix=/usr --mandir=/usr/share/man
--infodir=/usr/share/info --enable-shared --enable-threads=posix
--disable-checking --with-system-zlib --enable-__cxa_atexit
--enable-symvers=gnu --host=ppc64-yellowdog-linux
--build=ppc64-yellowdog-linux --target=ppc64-yellowdog-linux
--with-cpu=default32 --enable-biarch
Thread model: posix
gcc version 3.3.3 (Yellow Dog Linux 3.3.3-16.ydl.8)
)

Any idea which patch might be causing this?


/home/dorit/mainline_dn/build/./gcc/gfortran
-B/home/dorit/mainline_dn/build/./gcc/
-B/home/dorit/mainline_dn/ppc64-yellowdog-linux/bin/
-B/home/dorit/mainline_dn/ppc64-yellowdog-linux/lib/ -isystem
/home/dorit/mainline_dn/ppc64-yellowdog-linux/include -isystem
/home/dorit/mainline_dn/ppc64-yellowdog-linux/sys-include -DHAVE_CONFIG_H
-I. -I../../../gcc/libgfortran -I. -iquote../../../gcc/libgfortran/io
-I../../../gcc/libgfortran/../gcc -I../../../gcc/libgfortran/../gcc/config
-I../.././gcc -D_GNU_SOURCE -I . -Wall -fno-repack-arrays -fno-underscoring
-fallow-leading-underscore -g -O2 -c
../../../gcc/libgfortran/intrinsics/f2c_specifics.F90 -o f2c_specifics.o
>/dev/null 2>&1

/home/dorit/mainline_dn/build/./gcc/xgcc
-B/home/dorit/mainline_dn/build/./gcc/
-B/home/dorit/mainline_dn/ppc64-yellowdog-linux/bin/
-B/home/dorit/mainline_dn/ppc64-yellowdog-linux/lib/ -isystem
/home/dorit/mainline_dn/ppc64-yellowdog-linux/include -isystem
/home/dorit/mainline_dn/ppc64-yellowdog-linux/sys-include -shared
.libs/compile_options.o .libs/environ.o .libs/error.o .libs/fpu.o
.libs/main.o .libs/memory.o .libs/pause.o .libs/stop.o .libs/string.o
.libs/select.o .libs/all_l4.o .libs/all_l8.o .libs/all_l16.o .libs/any_l4.o
.libs/any_l8.o .libs/any_l16.o .libs/count_4_l4.o .libs/count_8_l4.o
.libs/count_16_l4.o .libs/count_4_l8.o .libs/count_8_l8.o
.libs/count_16_l8.o .libs/count_4_l16.o .libs/count_8_l16.o
..

.libs/environ.o(.text+0x0): In function `gnu_dev_major':
/home/dorit/mainline_dn/build/./gcc/include/sys/sysmacros.h:56: multiple
definition of `gnu_dev_major'
.libs/compile_options.o(.text+0x0):/home/dorit/mainline_dn/build/./gcc/include/sys/sysmacros.h:56:
 first defined here
.libs/environ.o(.text+0x10): In function `gnu_dev_minor':
/home/dorit/mainline_dn/build/./gcc/include/sys/sysmacros.h:66: multiple
definition of `gnu_dev_minor'
.libs/compile_options.o(.text+0x10):/home/dorit/mainline_dn/build/./gcc/include/sys/sysmacros.h:66:
 first defined here
.libs/environ.o(.text+0x30): In function `gnu_dev_makedev':
/home/dorit/mainline_dn/build/./gcc/include/sys/sysmacros.h:76: multiple
definition of `gnu_dev_makedev'
...

collect2: ld returned 1 exit status
make[3]: *** [libgfortran.la] Error 1
make[3]: Leaving directory
`/home/dorit/mainline_dn/build/ppc64-yellowdog-linux/libgfortran'
make[2]: *** [all] Error 2
make[2]: Leaving directory
`/home/dorit/mainline_dn/build/ppc64-yellowdog-linux/libgfortran'
make[1]: *** [all-target-libgfortran] Error 2
make[1]: Leaving directory `/home/dorit/mainline_dn/build'
make: *** [all] Error 2


thanks,
dorit

> - Grigory



Re: gcc-4.1.2 RC1 build problem

2007-02-05 Thread Paolo Bonzini



The macro $(SYSTEM_HEADER_DIR) is used in a double-quoted context,
leading to nonportable "...`..."..."...`...",  see
.

Proposed untested patch.  (I also haven't checked whether there are
other instances of this issue in 'make install' code.)


If you or anybody can test this (better on an affected system, but even 
i686-pc-linux-gnu would be ok), ok for mainline and all affected release 
branches.


Paolo


Re: bugzilla error

2007-02-05 Thread Daniel Berlin

Clear your cookie, try again, and it should fix it.

(Sorry, I'm working on the cookie issues. There is something very odd going on.)

On 2/5/07, Matthias Klose <[EMAIL PROTECTED]> wrote:

Got this page, trying to add an attachment to #30706.

  Matthias


This is GCC Bugzilla

This is GCC Bugzilla Version 2.20+
Internal Error

GCC Bugzilla has suffered an internal error. Please save this page and
send it to [EMAIL PROTECTED] with details of what you were doing at
the time this message appeared.

URL: http://gcc.gnu.org/bugzilla/attachment.cgi
undef error - Undefined subroutine Fh::slice at
data/template/template/en/default/global/hidden-fields.html.tmpl line 58
Actions:
Home | New | Search | bug # | Reports | Requests | New Account | Log In



Re: build failure? (libgfortran)

2007-02-05 Thread Richard Guenther

On 2/5/07, Dorit Nuzman <[EMAIL PROTECTED]> wrote:

> Grigory Zagorodnev <[EMAIL PROTECTED]> wrote on 05/02/2007 08:18:34:
>
> > Dorit Nuzman wrote:
> > > I'm seeing this bootstrap failure on i686-pc-linux-gnu (revision 121579) -
> > > something I'm doing wrong, or is anyone else seeing this?
> >
> > Yes. I see the same at x86_64-redhat-linux.
>
> Thanks.
>
> Turns out I see the same problem on ppc64-yellowdog-linux


This is because we now fixinclude sysmacros.h and libgfortran is built with
-std=gnu99.

Caused by:

2007-02-03  Bruce Korb <[EMAIL PROTECTED]>

   * inclhack.def (glibc_c99_inline_4): replace "extern" only if
   surrounded by space characters.
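
A minimal sketch of why the inline rules matter here, assuming nothing about the actual glibc header beyond what the linker output shows. Under C99 inline rules, a function defined `extern inline` in a header yields an external definition in every translation unit that includes it, which is consistent with the "multiple definition" errors reported for `gnu_dev_major` and friends. The helpers below (`demo_dev_major`, `demo_dev_minor`) are hypothetical stand-ins with a simplified bit layout; they use `static inline`, which is not the fix fixincludes applies, only a portable spelling that avoids the symbol clash.

```c
/* Simplified stand-ins for header-defined helpers such as gnu_dev_major.
   The names and the bit layout are hypothetical, not glibc's. */

/* With GNU89 rules, "extern inline" in a header never emits a symbol.
   With C99 rules the same spelling emits one external definition per
   translation unit, so two objects that include the header collide at
   link time.  "static inline" gives each translation unit a private
   copy under either rule set. */
static inline unsigned int demo_dev_major(unsigned long long dev)
{
    /* Bits 8..19 hold the major number in this simplified layout. */
    return (unsigned int)((dev >> 8) & 0xfffu);
}

static inline unsigned int demo_dev_minor(unsigned long long dev)
{
    /* Bits 0..7 hold the minor number in this simplified layout. */
    return (unsigned int)(dev & 0xffu);
}
```

With this layout, `demo_dev_major(0x0803ULL)` is 8 and `demo_dev_minor(0x0803ULL)` is 3, and linking several objects that include such a header produces no duplicate symbols.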

Richard.


Re: GCC 4.1.2 Status Report

2007-02-05 Thread Richard Guenther
On Sun, 4 Feb 2007, Mark Mitchell wrote:

> [Danny, Richard G., please see below.]
> 
> Thanks to all who have helped test GCC 4.1.2 RC1 over the last week.
> 
> I've reviewed the list traffic and Bugzilla.  Sadly, there are a fair
> number of bugs.  Fortunately, most seem not to be new in 4.1.2, and
> therefore I don't consider them showstoppers.
> 
> The following issues seem to be the 4.1.1 regressions:
> 
>   http://gcc.gnu.org/wiki/GCC_4.1.2_Status
> 
> PR 28743 is only an ICE-on-invalid, so I'm not terribly concerned.
> 
> Daniel, 30088 is another aliasing problem.  IIRC, you've in the past
> said that these were (a) hard to fix, and (b) uncommon.  Is this the
> same problem?  If so, do you still feel that (b) is true?  I'm
> suspicious, and I am afraid that we need to look for a conservative hack.

PR30708 popped up as well and may be related (and easier to analyze as
it's C-only).  Neither seems to be a regression on the branch, though.

> Richard, 30370 has a patch, but David is concerned that we test it on
> older GNU/Linux distributions, and suggested SLES9.  Would you be able
> to test that?

The patch bootstrapped and tested ok on SLES9-ppc.

> Richard, 29487 is an issue raised on HP-UX 10.10, but I'm concerned that
> it may reflect a bad decision about optimization of C++ functions that
> don't throw exceptions.  Would you please comment?

I believe reverting this patch on the branch is the only option (other
than fixing the HP-UX assembler).  I don't know of any real-life issue
it fixes.

Richard.

-- 
Richard Guenther <[EMAIL PROTECTED]>
Novell / SUSE Labs


Re: build failure? (libgfortran)

2007-02-05 Thread Bruce Korb

On 2/5/07, Richard Guenther <[EMAIL PROTECTED]> wrote:

> > > I'm seeing this bootstrap failure on i686-pc-linux-gnu (revision 121579) -
> > > something I'm doing wrong, or is anyone else seeing this?


I *didn't* see it or I would not have committed.


> This is because we now fixinclude sysmacros.h and libgfortran is built with
> -std=gnu99.
>
> Caused by:
>
> 2007-02-03  Bruce Korb <[EMAIL PROTECTED]>
>
> * inclhack.def (glibc_c99_inline_4): replace "extern" only if
> surrounded by space characters.


Which means there are cases where "extern" was suppressed and is no longer
suppressed.  Can someone please post a sysmacros.h fragment that should have
been fixed, but was not?  Thank you. - Bruce


Scheduling an early complete loop unrolling pass?

2007-02-05 Thread Richard Guenther

Hi,

currently with -ftree-vectorize we generate for

  for (i=0; i<3; ++i)
  # SFT.4346_507 = VDEF 
  # SFT.4347_508 = VDEF 
  # SFT.4348_509 = VDEF 
d[i] = 0.0;

  for (j=0; j<...)
    x[j] = d;

(that is, zero a small vector and use that to initialize an array
of vectors)

:;
  vect_cst_.4501_723 = { 0.0, 0.0 };
  vect_p.4506_724 = (vector double *) &D.76822;
  vect_p.4502_725 = vect_p.4506_724;

  # ivtmp.4508_728 = PHI <0(6), ivtmp.4508_729(11)>
  # ivtmp.4507_726 = PHI 
  # ivtmp.4461_601 = PHI <3(6), ivtmp.4461_485(11)>
  # SFT.4348_612 = PHI 
  # SFT.4347_611 = PHI 
  # SFT.4346_610 = PHI 
  # i_582 = PHI <0(6), i_118(11)>
:;
  # SFT.4346_507 = VDEF 
  # SFT.4347_508 = VDEF 
  # SFT.4348_509 = VDEF 
  *ivtmp.4507_726 = vect_cst_.4501_723;
  i_118 = i_582 + 1;
  ivtmp.4461_485 = ivtmp.4461_601 - 1;
  ivtmp.4507_727 = ivtmp.4507_726 + 16B;
  ivtmp.4508_729 = ivtmp.4508_728 + 1;
  if (ivtmp.4508_729 < 1) goto ; else goto ;

  # i_722 = PHI 
  # ivtmp.4461_717 = PHI 
:;

  # ivtmp.4461_706 = PHI 
  # SFT.4348_707 = PHI 
  # SFT.4347_708 = PHI 
  # SFT.4346_709 = PHI 
  # i_710 = PHI 
:;
  # SFT.4346_711 = VDEF 
  # SFT.4347_712 = VDEF 
  # SFT.4348_713 = VDEF 
  D.76822.D.44378.values[i_710] = 0.0;
  i_714 = i_710 + 1;
  ivtmp.4461_715 = ivtmp.4461_706 - 1;
  if (ivtmp.4461_715 != 0) goto ; else goto ;

...

and we are later not able to do constant propagation to the
second loop which we can do if we first unroll such small loops.

As we also only vectorize innermost loops I believe doing a
complete unrolling pass early will help in general (I pushed
for this some time ago).
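
A minimal sketch of the source-level effect, assuming a three-element aggregate like the one in the dump (the field name `values` appears there; `struct vec3`, `init_rolled` and `init_unrolled` are hypothetical names). Once the zeroing loop is completely unrolled, every element of `d` is a known constant, so the second loop can store constants directly, as the hand-written second function does.

```c
#include <stddef.h>

struct vec3 { double values[3]; };

/* As written: zero a small vector in a loop, then copy it into each
   element of x.  The contents of d are only known after the first loop. */
void init_rolled(struct vec3 *x, size_t n)
{
    struct vec3 d;
    size_t i, j;

    for (i = 0; i < 3; ++i)
        d.values[i] = 0.0;

    for (j = 0; j < n; ++j)
        x[j] = d;
}

/* What complete unrolling of the first loop plus constant propagation
   can reduce it to: the second loop stores constants directly. */
void init_unrolled(struct vec3 *x, size_t n)
{
    size_t j;

    for (j = 0; j < n; ++j) {
        x[j].values[0] = 0.0;
        x[j].values[1] = 0.0;
        x[j].values[2] = 0.0;
    }
}
```

Both functions leave `x` in the same state; the difference is only in how much the compiler has to prove to get there.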

Thoughts?

Thanks,
Richard.

-- 
Richard Guenther <[EMAIL PROTECTED]>
Novell / SUSE Labs


Re: Scheduling an early complete loop unrolling pass?

2007-02-05 Thread Paolo Bonzini



> As we also only vectorize innermost loops I believe doing a
> complete unrolling pass early will help in general (I pushed
> for this some time ago).
>
> Thoughts?


It might also hurt, though, since we don't have a basic block 
vectorizer.  IIUC the vectorizer is able to turn


  for (i = 0; i < 4; i++)
    v[i] = 0.0;

into

  *(vector double *)v = (vector double){0.0, 0.0, 0.0, 0.0};
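
For readers who want to try the two forms side by side, here is a sketch using the `vector_size` attribute from GCC's vector extensions (also accepted by Clang). The type name `v4df` and both function names are hypothetical, and the store goes through `memcpy` so that the sketch makes no assumption about the alignment of `v`; it illustrates the transformation, it is not what the vectorizer emits.

```c
#include <string.h>

/* A vector of four doubles, declared with the GCC vector extension. */
typedef double v4df __attribute__((vector_size(4 * sizeof(double))));

/* The scalar loop as written in the source. */
void zero_scalar(double *v)
{
    int i;

    for (i = 0; i < 4; i++)
        v[i] = 0.0;
}

/* The same effect as a single vector-sized store. */
void zero_vector(double *v)
{
    const v4df zeros = {0.0, 0.0, 0.0, 0.0};

    memcpy(v, &zeros, sizeof zeros);
}
```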

Paolo


Re: Scheduling an early complete loop unrolling pass?

2007-02-05 Thread Richard Guenther
On Mon, 5 Feb 2007, Paolo Bonzini wrote:

> 
> > As we also only vectorize innermost loops I believe doing a
> > complete unrolling pass early will help in general (I pushed
> > for this some time ago).
> > 
> > Thoughts?
> 
> It might also hurt, though, since we don't have a basic block vectorizer.
> IIUC the vectorizer is able to turn
> 
>   for (i = 0; i < 4; i++)
>     v[i] = 0.0;
> 
> into
> 
>   *(vector double *)v = (vector double){0.0, 0.0, 0.0, 0.0};

That's true.  But we can not do constant propagation out of this
(and the vectorizer leaves us with a lot of cruft which is only
removed much later).

The above case would also ask for an early vectorization pass if the
loop was wrapped into another.

Finding a good heuristic for which loops to completely unroll early
is not easy, though for odd small numbers of iterations it is
probably always profitable.

Richard.

-- 
Richard Guenther <[EMAIL PROTECTED]>
Novell / SUSE Labs


Re: Scheduling an early complete loop unrolling pass?

2007-02-05 Thread Jan Hubicka
> 
> Hi,
> 
> currently with -ftree-vectorize we generate for
> 
>   for (i=0; i<3; ++i)
>   # SFT.4346_507 = VDEF 
>   # SFT.4347_508 = VDEF 
>   # SFT.4348_509 = VDEF 
> d[i] = 0.0;

Also Tomas' patch is supposed to catch this special case and convert it
into memset that should be subsequently optimized into assignment that
should be good enough (which reminds me that I forgot to merge the
memset part of stringop optimizations).
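
A small sketch of the equivalence such a conversion relies on, assuming a target where `double` uses IEEE 754 format, so that 0.0 is the all-zero bit pattern. The function names are hypothetical and this is not taken from the patch under discussion.

```c
#include <string.h>

/* The loop form from the original example. */
void zero_loop(double *d)
{
    int i;

    for (i = 0; i < 3; ++i)
        d[i] = 0.0;
}

/* The memset form: valid because 0.0 is represented by all-zero bytes
   on IEEE 754 targets. */
void zero_memset(double *d)
{
    memset(d, 0, 3 * sizeof d[0]);
}
```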

Perhaps this can be made a bit more generic and construct INIT_EXPRs for
small arrays directly from Tomas's pass (going from memset to assignment
works just in special cases).  Tomas, what is the status of your patch?
> 
>   for (j=0; j<...)
>     x[j] = d;
> 
> (that is, zero a small vector and use that to initialize an array
> of vectors)
> 
> :;
>   vect_cst_.4501_723 = { 0.0, 0.0 };
>   vect_p.4506_724 = (vector double *) &D.76822;
>   vect_p.4502_725 = vect_p.4506_724;
> 
>   # ivtmp.4508_728 = PHI <0(6), ivtmp.4508_729(11)>
>   # ivtmp.4507_726 = PHI 
>   # ivtmp.4461_601 = PHI <3(6), ivtmp.4461_485(11)>
>   # SFT.4348_612 = PHI 
>   # SFT.4347_611 = PHI 
>   # SFT.4346_610 = PHI 
>   # i_582 = PHI <0(6), i_118(11)>
> :;
>   # SFT.4346_507 = VDEF 
>   # SFT.4347_508 = VDEF 
>   # SFT.4348_509 = VDEF 
>   *ivtmp.4507_726 = vect_cst_.4501_723;
>   i_118 = i_582 + 1;
>   ivtmp.4461_485 = ivtmp.4461_601 - 1;
>   ivtmp.4507_727 = ivtmp.4507_726 + 16B;
>   ivtmp.4508_729 = ivtmp.4508_728 + 1;
>   if (ivtmp.4508_729 < 1) goto ; else goto ;
> 
>   # i_722 = PHI 
>   # ivtmp.4461_717 = PHI 
> :;
> 
>   # ivtmp.4461_706 = PHI 
>   # SFT.4348_707 = PHI 
>   # SFT.4347_708 = PHI 
>   # SFT.4346_709 = PHI 
>   # i_710 = PHI 
> :;
>   # SFT.4346_711 = VDEF 
>   # SFT.4347_712 = VDEF 
>   # SFT.4348_713 = VDEF 
>   D.76822.D.44378.values[i_710] = 0.0;
>   i_714 = i_710 + 1;
>   ivtmp.4461_715 = ivtmp.4461_706 - 1;
>   if (ivtmp.4461_715 != 0) goto ; else goto ;
> 
> ...
> 
> and we are later not able to do constant propagation to the
> second loop which we can do if we first unroll such small loops.
> 
> As we also only vectorize innermost loops I believe doing a
> complete unrolling pass early will help in general (I pushed
> for this some time ago).

Did you run some benchmarks?

Honza
> 
> Thoughts?
> 
> Thanks,
> Richard.
> 
> -- 
> Richard Guenther <[EMAIL PROTECTED]>
> Novell / SUSE Labs


Re: Scheduling an early complete loop unrolling pass?

2007-02-05 Thread Richard Guenther
On Mon, 5 Feb 2007, Jan Hubicka wrote:

> > 
> > Hi,
> > 
> > currently with -ftree-vectorize we generate for
> > 
> >   for (i=0; i<3; ++i)
> >   # SFT.4346_507 = VDEF 
> >   # SFT.4347_508 = VDEF 
> >   # SFT.4348_509 = VDEF 
> > d[i] = 0.0;
> 
> Also Tomas' patch is supposed to catch this special case and convert it
> into memset that should be subsequently optimized into assignment that
> should be good enough (which reminds me that I forgot to merge the
> memset part of stringop optimizations).
> 
> Perhaps this can be made a bit more generic and construct INIT_EXPRs for
> small arrays directly from Tomas's pass (going from memset to assignment
> works just in special cases).  Tomas, what is the status of your patch?

It would certainly be interesting to make constant propagation work in 
this case (though after vectorization aliasing is in an "interesting" 
state).

> Did you run some benchmarks?

Not yet - I'm looking at the C++ SPEC 2006 benchmarks at the moment
and using vectorization there seems to do a lot of collateral damage
(maybe not measurable though).

Richard.

-- 
Richard Guenther <[EMAIL PROTECTED]>
Novell / SUSE Labs


Re: GCC 4.1.2 Status Report

2007-02-05 Thread Daniel Berlin

On 2/4/07, Mark Mitchell <[EMAIL PROTECTED]> wrote:

> [Danny, Richard G., please see below.]
>
> Thanks to all who have helped test GCC 4.1.2 RC1 over the last week.
>
> I've reviewed the list traffic and Bugzilla.  Sadly, there are a fair
> number of bugs.  Fortunately, most seem not to be new in 4.1.2, and
> therefore I don't consider them showstoppers.
>
> The following issues seem to be the 4.1.1 regressions:
>
>   http://gcc.gnu.org/wiki/GCC_4.1.2_Status
>
> PR 28743 is only an ICE-on-invalid, so I'm not terribly concerned.
>
> Daniel, 30088 is another aliasing problem.  IIRC, you've in the past
> said that these were (a) hard to fix, and (b) uncommon.  Is this the
> same problem?  If so, do you still feel that (b) is true?  I'm
> suspicious, and I am afraid that we need to look for a conservative hack.


It's certainly true that people will discover more and more aliasing
bugs the harder they work 4.1 :)
There is always the possibility of turning off the pruning, which will
drop our performance, but will hide most of the latent bugs we later
fixed through rewrites well enough that they can't be triggered (the
4.1 optimizers aren't aggressive enough).


Re: GCC 4.1.2 Status Report

2007-02-05 Thread Mark Mitchell
Daniel Berlin wrote:

>> Daniel, 30088 is another aliasing problem.  IIRC, you've in the past
>> said that these were (a) hard to fix, and (b) uncommon.  Is this the
>> same problem?  If so, do you still feel that (b) is true?  I'm
>> suspicious, and I am afraid that we need to look for a conservative hack.
> 
> It's certainly true that people will discover more and more aliasing
> bugs the harder they work 4.1 :)

Do you think that PR 30088 is another instance of the same problem, and
that therefore turning off the pruning will fix it?

> There is always the possibility of turning off the pruning, which will
> drop our performance, but will hide most of the latent bugs we later
> fixed through rewrites well enough that they can't be triggered (the
> 4.1 optimizers aren't aggressive enough).

Is it convenient for you (or Richard?) to measure that on SPEC?
(Richard, thank you very much for stepping up to help with the various
issues that I've raised for 4.1.2!)  Or, have we already done so, and
I've just forgotten?  I'm very mindful of the import of performance, but
if we think that these aliasing bugs are going to affect reasonably
large amounts of code (which I'm starting to think), then shipping the
compiler as is seems like a bad idea.

(Yes, there's a slippery slope argument whereby we turn off all
optimization, since all optimization passes may have bugs.  But, if I
understand correctly, the aliasing algorithm in 4.1 has relatively
fundamental problems, which is rather different.)

Thanks,

-- 
Mark Mitchell
CodeSourcery
[EMAIL PROTECTED]
(650) 331-3385 x713


Re: Scheduling an early complete loop unrolling pass?

2007-02-05 Thread Dorit Nuzman
Richard Guenther <[EMAIL PROTECTED]> wrote on 05/02/2007 18:16:05:

> On Mon, 5 Feb 2007, Jan Hubicka wrote:
...
> > Did you run some benchmarks?
>
> Not yet - I'm looking at the C++ SPEC 2006 benchmarks at the moment
> and using vectorization there seems to do a lot of collateral damage
> (maybe not measurable though).
>

Interesting. In SPEC 2000 there is also a hot small loop in the only C++
benchmark (eon), which gets vectorized, and as a result degrades
performance. We really should not vectorize such loops, and the solution
is:
1. FORNOW: use --param min-vect-loop-bound=2 (or some value greater than
0).
2. SOON: rely on the vectorizer to do the cost analysis and decide not to
vectorize such loops, using a cost model - this is in the works.

dorit

> Richard.
>
> --
> Richard Guenther <[EMAIL PROTECTED]>
> Novell / SUSE Labs



Re: Scheduling an early complete loop unrolling pass?

2007-02-05 Thread Dorit Nuzman
Richard Guenther <[EMAIL PROTECTED]> wrote on 05/02/2007 17:59:00:

> On Mon, 5 Feb 2007, Paolo Bonzini wrote:
>
> >
> > > As we also only vectorize innermost loops I believe doing a
> > > complete unrolling pass early will help in general (I pushed
> > > for this some time ago).
> > >
> > > Thoughts?
> >
> > It might also hurt, though, since we don't have a basic block vectorizer.
> > IIUC the vectorizer is able to turn
> >
> >   for (i = 0; i < 4; i++)
> >     v[i] = 0.0;
> >
> > into
> >
> >   *(vector double *)v = (vector double){0.0, 0.0, 0.0, 0.0};
>
> That's true.

That's going to change once this project goes in: "(3.2) Straight-line code
vectorization" from http://gcc.gnu.org/wiki/AutovectBranchOptimizations. In
fact, I think in autovect-branch, if you unroll the above loop it should
get vectorized already. Ira - is that really the case?

> But we can not do constant propagation out of this
> (and the vectorizer leaves us with a lot of cruft which is only
> removed much later).
>

this calls for improving/extending constant propagation, and also improving
the vectorizer to convey information that it has on constants and ranges
that are otherwise hard to figure out. For example, when the VF is 4 and we
peel before the loop to align memory-accesses, and after the loop to align
the number of iterations, we know that the maximum number of iterations of
the peeled loops is 3, but I don't think we let the rest of the compiler
know about that.
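
To make the bound concrete, here is a sketch of the iteration counts for such a peeling scheme, assuming VF = 4 as in the example above. The struct and function names are hypothetical and the arithmetic is a simplified model, not the vectorizer's code: the prologue runs until the access is aligned, the epilogue runs the leftover iterations, and neither can exceed VF - 1 = 3.

```c
#include <stddef.h>

enum { VF = 4 };  /* vectorization factor assumed in the example above */

struct peel_plan {
    size_t prologue;      /* scalar iterations peeled to reach alignment */
    size_t vector_iters;  /* iterations of the vectorized loop */
    size_t epilogue;      /* scalar iterations left over at the end */
};

/* misalign_elems: how many elements past an aligned boundary the first
   access is; n: total number of scalar iterations. */
struct peel_plan plan_peeling(size_t misalign_elems, size_t n)
{
    struct peel_plan p;
    size_t pro = (VF - misalign_elems % VF) % VF;  /* always 0..VF-1 */

    if (pro > n)
        pro = n;  /* the loop may be shorter than the peel count */

    p.prologue = pro;
    p.vector_iters = (n - pro) / VF;
    p.epilogue = (n - pro) % VF;  /* always 0..VF-1 */
    return p;
}
```

For a first access one element past alignment and ten iterations, this gives a prologue of 3, one vector iteration and an epilogue of 3.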

> The above case would also ask for an early vectorization pass if the
> loop was wrapped into another.
>

not sure I understand what you mean?

> Finding a good heuristic for which loops to completely unroll early
> is not easy, though for odd small numbers of iterations it is
> probably always profitable.
>

In general I prefer to leave such decisions to the vectorizer - the
vectorizer (with a proper cost model, that is now being built) should be
able to decide not to vectorize too small loops, leaving them to the
subsequent complete-loop-unrolling pass

dorit

> Richard.
>
> --
> Richard Guenther <[EMAIL PROTECTED]>
> Novell / SUSE Labs



Re: Scheduling an early complete loop unrolling pass?

2007-02-05 Thread Dorit Nuzman
Hi Richard,

Richard Guenther <[EMAIL PROTECTED]> wrote on 05/02/2007 17:27:03:

...
>
> ...
>
> and we are later not able to do constant propagation to the
> second loop which we can do if we first unroll such small loops.
>
> As we also only vectorize innermost loops

by the way, we are working on vectorization of outer-loops

> I believe doing a
> complete unrolling pass early will help in general (I pushed
> for this some time ago).
>
> Thoughts?
>

My initial thought was that it's probably not the right thing to do
because:

1. the problems with constant propagation and aliasing really call for
improving/extending these optimizations (and probably also having the
vectorizer convey more of the information it has), rather than working
around the problem.
2. the vectorizer should be able to decide, based on a cost model, whether
it is profitable to vectorize a loop, and if not - as in the case above -
leave it unvectorized.
3. In the meantime, you can use --param min-vect-loop-bound=2 to disallow
vectorization of this loop. In fact, maybe we should make that the default
(instead of the current default - 0).

However...,

I have seen cases in which complete unrolling before vectorization enabled
constant propagation, which in turn enabled significant simplification of
the code, thereby in fact turning a previously unvectorizable loop (at
least on some targets, due to the presence of divisions, unsupported in the
vector unit) into a loop (in which the divisions were replaced with
constants) that can be vectorized.
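
A hypothetical illustration of that effect (not the loop referred to above; the table `divisors` and both function names are made up for the sketch). The inner loop divides by table entries, so the outer loop contains divisions. Completely unrolling the inner loop makes every divisor a known constant; for unsigned values, dividing by 2, 4, 8 and 16 in turn equals dividing by 1024, i.e. a shift by 10, which leaves a simple loop with no division at all.

```c
#include <stddef.h>

static const unsigned int divisors[4] = {2u, 4u, 8u, 16u};

/* As written: a small inner loop with a division per iteration. */
void scale_rolled(unsigned int *out, const unsigned int *in, size_t n)
{
    size_t i;
    int k;

    for (i = 0; i < n; ++i) {
        unsigned int acc = in[i];

        for (k = 0; k < 4; ++k)
            acc /= divisors[k];
        out[i] = acc;
    }
}

/* After complete unrolling and constant propagation of the inner loop:
   2 * 4 * 8 * 16 == 1024 == 1 << 10, so the divisions fold to a shift. */
void scale_unrolled(unsigned int *out, const unsigned int *in, size_t n)
{
    size_t i;

    for (i = 0; i < n; ++i)
        out[i] = in[i] >> 10;
}
```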

Also, given that we are working on "SLP" kind of technology (straight line
code vectorization), which would make vectorization less sensitive to
unrolling, I think maybe it's not such a bad idea after all... One option
is to increase the default value of --param min-vect-loop-bound for now,
and when SLP is incorporated, go ahead and schedule early complete
unrolling. However, since SLP implementation may take some time (hopefully
within the time frame of 4.3 though) - we could just go ahead and schedule
early complete unrolling right now. (I can't believe I'm in favor of this
idea, but that loop I was talking about before improved by a factor of
over 20 when early complete unrolling + subsequent vectorization were
applied...)

dorit


> Thanks,
> Richard.
>
> --
> Richard Guenther <[EMAIL PROTECTED]>
> Novell / SUSE Labs



Re: GCC 4.1.2 Status Report

2007-02-05 Thread Richard Guenther

On 2/5/07, Mark Mitchell <[EMAIL PROTECTED]> wrote:

> Daniel Berlin wrote:
>
> >> Daniel, 30088 is another aliasing problem.  IIRC, you've in the past
> >> said that these were (a) hard to fix, and (b) uncommon.  Is this the
> >> same problem?  If so, do you still feel that (b) is true?  I'm
> >> suspicious, and I am afraid that we need to look for a conservative hack.
> >
> > It's certainly true that people will discover more and more aliasing
> > bugs the harder they work 4.1 :)
>
> Do you think that PR 30088 is another instance of the same problem, and
> that therefore turning off the pruning will fix it?


Disabling pruning will also increase memory-usage and compile-time.


> > There is always the possibility of turning off the pruning, which will
> > drop our performance, but will hide most of the latent bugs we later
> > fixed through rewrites well enough that they can't be triggered (the
> > 4.1 optimizers aren't aggressive enough).
>
> Is it convenient for you (or Richard?) to measure that on SPEC?
> (Richard, thank you very much for stepping up to help with the various
> issues that I've raised for 4.1.2!)  Or, have we already done so, and
> I've just forgotten?  I'm very mindful of the import of performance, but
> if we think that these aliasing bugs are going to affect reasonably
> large amounts of code (which I'm starting to think), then shipping the
> compiler as is seems like a bad idea.
>
> (Yes, there's a slippery slope argument whereby we turn off all
> optimization, since all optimization passes may have bugs.  But, if I
> understand correctly, the aliasing algorithm in 4.1 has relatively
> fundamental problems, which is rather different.)


I don't think we need to go this way - there is a workaround available
(use -fno-strict-aliasing) and there are not enough problems to warrant
this.

Richard.


Re: GCC 4.1.2 Status Report

2007-02-05 Thread Mark Mitchell
Richard Guenther wrote:

>> > It's certainly true that people will discover more and more aliasing
>> > bugs the harder they work 4.1 :)
>>
>> Do you think that PR 30088 is another instance of the same problem, and
>> that therefore turning off the pruning will fix it?
> 
> Disabling pruning will also increase memory-usage and compile-time.

You indicated earlier that you didn't think 30088 was a regression on
the branch.  That's an excellent point.  I had it on my list of
regressions from 4.1.[01], but perhaps I was misinformed when I put it
on the list.  Given that, I don't think we need to worry about it for
4.1.2; it's just one of several wrong-code regressions...

> I don't think we need to go this way - there is a workaround available
> (use -fno-strict-aliasing) and there are not enough problems to warrant
> this.

For the record, I don't think the workaround argument is as strong,
though.  When the user compiles a large application and it doesn't work,
there's no hint that -fno-strict-aliasing is the work-around.  It's not
like an ICE that makes you think "Hmm, maybe I should turn off that
pass, or compile this file with -O0".

Thanks,

-- 
Mark Mitchell
CodeSourcery
[EMAIL PROTECTED]
(650) 331-3385 x713


ANNOUNCE: Gelato ICE GCC track, San Jose, CA, April 16-18, 2007

2007-02-05 Thread Mark K. Smith
The following GCC track is part of the Gelato ICE (Itanium Conference
& Expo) technical program, April 16-18, 2007, San Jose, CA. All
interested GCC developers are invited to attend .
A working list of speakers and topics can be found here:


This year there is a strong focus on Linux. Andrew Morton and Wim
Coekaerts, Senior Director for Linux Engineering at Oracle, are
keynote speakers. In addition to the GCC track, there are tracks
covering the Linux IA-64 kernel, virtualization, tools and tuning,
multi-core programming, and research.

GCC Track at Gelato ICE:

- Update on Scheduler Work & Discussion of New Software Pipelining
Work, Arutyun Avetisyan, Russian Academy of Science
- GPL2 and GPL3, Dan Berlin, Google
- Update on the Gelato GCC Build Farm, Matthieu Delahaye, 
Gelato Central Operations
- Update on Prefetching Work, Zdenek Dvorak, SuSE
- Interprocedural Optimization Framework, Jan Hubicka, SuSE
- Update on Superblock Work, Bob Kidd, University of Illinois
- GCC and Osprey Update, Shin-Ming Liu, HP
- Compiling Debian Using GCC 4.2 and Osprey, Martin Michlmayr, Debian
- Update on Alias Analysis Work, Diego Novillo, Red Hat
- Update on LTO, Kenneth Zadeck, NaturalBridge



Re: Scheduling an early complete loop unrolling pass?

2007-02-05 Thread Zdenek Dvorak
Hello,

> >As we also only vectorize innermost loops I believe doing a
> >complete unrolling pass early will help in general (I pushed
> >for this some time ago).
> >
> >Thoughts?
> 
> It might also hurt, though, since we don't have a basic block 
> vectorizer.  IIUC the vectorizer is able to turn
> 
>   for (i = 0; i < 4; i++)
>     v[i] = 0.0;
> 
> into
> 
>   *(vector double *)v = (vector double){0.0, 0.0, 0.0, 0.0};

I intentionally put the cunroll pass after the vectorizer to enable this.  I
guess we might choose to rather unroll the loop, and leave creation of
the vector operations to some kind of later combine/straight-line code
vectorization pass.

Zdenek


RE: testing GCC 4.2 on IA64 using Debian as a test suite <--- correction

2007-02-05 Thread Mark K. Smith
> Eight IA64 specific and 10 generic GCC defects previously unknown
> were identified. All these bugs have been reported to the GCC bug 
> tracker together with test cases and have all been fixed.

Correction/clarification: All IA-64 specific bugs have been fixed.



gcc-4.1-20070205 is now available

2007-02-05 Thread gccadmin
Snapshot gcc-4.1-20070205 is now available on
  ftp://gcc.gnu.org/pub/gcc/snapshots/4.1-20070205/
and on various mirrors, see http://gcc.gnu.org/mirrors.html for details.

This snapshot has been generated from the GCC 4.1 SVN branch
with the following options: svn://gcc.gnu.org/svn/gcc/branches/gcc-4_1-branch 
revision 121620

You'll find:

gcc-4.1-20070205.tar.bz2  Complete GCC (includes all of below)

gcc-core-4.1-20070205.tar.bz2 C front end and core compiler

gcc-ada-4.1-20070205.tar.bz2  Ada front end and runtime

gcc-fortran-4.1-20070205.tar.bz2  Fortran front end and runtime

gcc-g++-4.1-20070205.tar.bz2  C++ front end and runtime

gcc-java-4.1-20070205.tar.bz2 Java front end and runtime

gcc-objc-4.1-20070205.tar.bz2 Objective-C front end and runtime

gcc-testsuite-4.1-20070205.tar.bz2  The GCC testsuite

Diffs from 4.1-20070129 are available in the diffs/ subdirectory.

When a particular snapshot is ready for public consumption the LATEST-4.1
link is updated and a message is sent to the gcc list.  Please do not use
a snapshot before it has been announced that way.


Re: Scheduling an early complete loop unrolling pass?

2007-02-05 Thread Dorit Nuzman
> Hi Richard,
>
>
...
> However...,
>
> I have seen cases in which complete unrolling before vectorization enabled
> constant propagation, which in turn enabled significant simplification of
> the code, thereby, in fact making a previously unvectorizable loop (at
> least on some targets, due to the presence of divisions, unsupported in the
> vector unit), into a loop (in which the divisions were replaced with
> constants), that can be vectorized.
>
> Also, given that we are working on "SLP" kind of technology (straight line
> code vectorization), which would make vectorization less sensitive to
> unrolling, I think maybe it's not such a bad idea after all... One option
> is to increase the default value of --param min-vect-loop-bound for now,
> and when SLP is incorporated, go ahead and schedule early complete
> unrolling. However, since SLP implementation may take some time (hopefully
> within the time frame of 4.3 though) - we could just go ahead and schedule
> early complete unrolling right now. (I can't believe I'm in favor of this
> idea, but that loop I was talking about before - improved by a factor over
> 20x when early complete unrolling + subsequent vectorization were
> applied...)
>

After sleeping on it, it actually makes a lot of sense to me to schedule
complete loop unrolling before vectorization - I think it would either
simplify loops (sometimes creating more opportunities for vectorization),
or prevent vectorization of loops we probably don't want to vectorize
anyhow, and even that - only temporarily - until we have straight-line-code
vectorization in place. So I'm all for it.

dorit

> dorit
>
>
> > Thanks,
> > Richard.
> >
> > --
> > Richard Guenther <[EMAIL PROTECTED]>
> > Novell / SUSE Labs
>