How to add a NOP instruction in each basic block for obj code that gcc generates
Hi all, I'm doing an experiment with gcc. I need to modify gcc so that a NOP instruction is inserted into each basic block of the object code that gcc generates. I know this sounds weird, but it is part of my experiment. As I'm unfamiliar with gcc internals, is someone willing to help me out with a quick hack? Thanks in advance, Jeff
Re: How to add a NOP instruction in each basic block for obj code that gcc generates
2006/8/10, Paolo Bonzini <[EMAIL PROTECTED]>:

> jeff jeff wrote:
> > Hi all, I'm doing an experiment with gcc. I need to modify gcc so that
> > a NOP instruction will be inserted into each basic block in binary code
> > that gcc generates. I know this sounds weird but it is part of my
> > experiment. As I'm unfamiliar with gcc, is there someone willing to
> > help me out and do a quick hack for me?
>
> You can probably do this much more easily by modifying the assembly
> language. That is, instead of letting the compiler produce a .o, you
> produce a .s file (using the -S option), run some perl or awk script on
> the assembly, and compile it again (gcc accepts .s as well as .c). If
> you care, you can write a shell script that does these three steps
> automatically and receives the same command line as gcc (or a suitable
> subset).
>
> Otherwise, you can use the "-B" option to replace "as" with your own
> executable or shell script. This shell script would run the perl or awk
> script on the input and call the system "as" on the output.
>
> To understand what's going on (i.e. for debugging), the "-###" option
> shows you which commands the gcc driver is executing ("cc1" is the
> compiler proper, "as" is the assembler, etc.).
>
> What's in the perl or awk script? To find basic block boundaries, search
> for things like "L123". If you need the nop at the beginning, you need
> to look for jump tables and not insert the nop there. If you need the
> nop at the end, you can blindly insert one at the end of a jump table,
> but on the other hand you will have to insert it before jump
> instructions. If you need more information, please ask on
> Perl/awk/shell-scripting newsgroups or mailing lists.
>
> Paolo

Paolo, Thanks a lot for your reply. You provided a good alternative. I need to use gcc to compile a big project in my experiment. In the makefile, some source files are compiled to .o files and others are compiled to .S files. I'm wondering how much effort I need, provided I have the script. I'll look into it further.

It would be cleaner if I knew how to modify the gcc source code and let it insert a nop into each basic block. This shouldn't be a hard job for an experienced gcc developer, should it?

Jeff
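[For anyone trying Paolo's suggestion, a minimal sketch of such a filter, written in C rather than perl/awk. Assumptions: GNU-style numeric local labels such as "L123:" or ".L123:" at the start of a line, and none of the jump-table complications Paolo warns about.]

  #include <stdio.h>
  #include <ctype.h>

  /* Filter a .s file from stdin to stdout, appending a nop after every
     line that starts with a numeric local label ("L123:" or ".L123:").
     Jump tables would need extra care, as noted above.  */
  int
  main (void)
  {
    char line[4096];

    while (fgets (line, sizeof line, stdin))
      {
        const char *p = line;

        fputs (line, stdout);
        if (*p == '.')
          p++;
        if (*p == 'L')
          {
            const char *q = p + 1;
            while (isdigit ((unsigned char) *q))
              q++;
            /* At least one digit, immediately followed by ':'.  */
            if (q > p + 1 && *q == ':')
              fputs ("\tnop\n", stdout);
          }
      }
    return 0;
  }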
Re: How to add a NOP instruction in each basic block for obj code that gcc generates
If you don't want to change the generated code other than inserting the nops, and you can restrict yourself to a processor which does not need to track addresses to avoid out-of-range branches, then you could approximate what you want by emitting a nop in final_scan_insn when you see a CODE_LABEL, after the label. You'll need to also emit a nop in the first basic block in the function. That probably won't be the precise set of basic blocks as the compiler sees them, but basic block is a somewhat nebulous concept and that may be good enough for your purposes, whatever they are.

Thanks, Ian. I don't want the NOPs to affect gcc's optimization. I've found the function final_scan_insn, which is in ./gcc/final.c. Here is the code snippet related to CODE_LABEL:

    case CODE_LABEL:
      /* The target port might emit labels in the output function for
         some insn, e.g. sh.c output_branchy_insn.  */
      if (CODE_LABEL_NUMBER (insn) <= max_labelno)
        {
          int align = LABEL_TO_ALIGNMENT (insn);
  #ifdef ASM_OUTPUT_MAX_SKIP_ALIGN
          int max_skip = LABEL_TO_MAX_SKIP (insn);
  #endif

          if (align && NEXT_INSN (insn))
            {
  #ifdef ASM_OUTPUT_MAX_SKIP_ALIGN
              ASM_OUTPUT_MAX_SKIP_ALIGN (file, align, max_skip);
  #else
  #ifdef ASM_OUTPUT_ALIGN_WITH_NOP
              ASM_OUTPUT_ALIGN_WITH_NOP (file, align);
  #else
              ASM_OUTPUT_ALIGN (file, align);
  #endif
  #endif
            }
        }

Which function should I use in order to emit a nop? Thanks, Jeff
Re: How to add a NOP instruction in each basic block for obj code that gcc generates
The simplest way is going to be something like

    fprintf (asm_out_file, "\tnop\n");

I added fprintf (asm_out_file, "\tnop\n"); to the end of case CODE_LABEL. Then I recompiled gcc. Unfortunately, it doesn't seem that a NOP was inserted. Any ideas?

    case CODE_LABEL:
      /* The target port might emit labels in the output function for
         some insn, e.g. sh.c output_branchy_insn.  */
      if (CODE_LABEL_NUMBER (insn) <= max_labelno)
        {
          int align = LABEL_TO_ALIGNMENT (insn);
  #ifdef ASM_OUTPUT_MAX_SKIP_ALIGN
          int max_skip = LABEL_TO_MAX_SKIP (insn);
  #endif

          if (align && NEXT_INSN (insn))
            {
  #ifdef ASM_OUTPUT_MAX_SKIP_ALIGN
              ASM_OUTPUT_MAX_SKIP_ALIGN (file, align, max_skip);
  #else
  #ifdef ASM_OUTPUT_ALIGN_WITH_NOP
              ASM_OUTPUT_ALIGN_WITH_NOP (file, align);
  #else
              ASM_OUTPUT_ALIGN (file, align);
  #endif
  #endif
            }
        }
      fprintf (asm_out_file, "\tnop\n");
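[A sketch of a placement that should be visible in the output, with two caveats: this is against a 4.x-era final.c from memory, and CODE_LABEL insns only exist for jump targets, so a test function with no branches will contain no CODE_LABELs at all (and, per Ian's note above, the function's first basic block has to be handled separately). Also worth double-checking, via the -### option mentioned earlier, that the driver is really running the rebuilt cc1.]

  case CODE_LABEL:
    /* ... existing alignment and jump-table handling ... */

    /* Near the bottom of the case, the label itself is printed,
       roughly like so: */
    targetm.asm_out.internal_label (file, "L", CODE_LABEL_NUMBER (insn));
    /* Emitting the nop here, after the label, puts it inside the
       basic block the label starts:
             L2:
                     nop
       whereas a nop printed before the label text lands at the end of
       the previous block.  */
    fprintf (asm_out_file, "\tnop\n");
    break;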
Solaris 9, GCC 4.1.1 and GCC 3.3.5 ... --disable-shared at build time?
I'm backed into a corner here and really not sure what the proper path out is.

Our production GCC is 3.3.5. It was built with default args. Previously we ran 2.95.3. You can perhaps imagine my surprise when I found that a lot of apps we had built with this GCC 3.3.5 had libgcc_s.so linked dynamically to them. You can perhaps also imagine my surprise when I came to this conclusion after a lot of stuff broke when I expired our GCC 3.3.5 install in favor of 4.1.1.

But okay. This process at least made one thing clear. We need to offer our users multiple GCC versions. Some want 3.3.x, some want to test 4.1.1's pedantic nature, etc.

So I says to myself, "Self, when you go to build the new multiple GCCs in the new production language area, build them with --disable-shared so N tens of apps are not depending on your GCC staying put in order for them to function (!??)." I build GCC 3.3.5 and 4.1.1, both with --disable-shared. I do this for Solaris 9, 10, Linux 2.4 for i686, and Linux 2.4 for AMD64. Yes, a hoot. Weeks pass during this time and the leaves begin to fall. Oh, and the Solaris ones were built to reference the Sun 'as' and 'ld' (/usr/ccs/bin).

In order to redo all of the "broken because they're linked to libgcc_s.so" apps, I set my PATH to use the new compilers (the ones that were built with --disable-shared). I find that my life is hell, as just about half of everything I try to build under Solaris 9 does not build. I get text relocation errors from our built libz.a, I fail to build subversion for mysterious reasons/errors, I get Python 2.4.x to build fine without libgcc_s.so linked to it, then I drop a Modules/Setup.local in place, make again to build the modules, and everything goes to hell with a new ./python that is now magically linked to libgcc_s.so (the old one we have to keep around until our apps are rebuilt).

It would seem that GCC 3.3.5 + Sun as + Sun ld do not play nice at all with libraries previously created with GNU binutils. So... could someone elaborate on what it is I am doing that is so wrong? What is the successful recipe for using GCC 3.3.5 + 4.1.1 and/or binutils under Solaris?
Re: Newlib _ctype_ alias kludge now invalid due to PR middle-end/15700 fix.
Giovanni Bajo wrote:

> Hans-Peter Nilsson <[EMAIL PROTECTED]> wrote:
> > So, the previously-questionable newlib alias-to-offset-in-table kludge
> > is finally judged invalid. This is a heads-up for newlib users. IMHO
> > it's not a GCC bug, though there's surely going to be some commotion.
> > Maybe a NEWS item is called for, I dunno.
>
> It will be in NEWS, since RTH already updated
> http://gcc.gnu.org/gcc-4.0/changes.html. I hope newlib will be promptly
> fixed.
>
> Giovanni Bajo

I have just checked in a patch to newlib that changes the ctype macros to use __ctype_ptr instead of _ctype_. In addition, a configuration check is made to see whether the array aliasing trick can be used or not. The code allows for backward compatibility except in the case where the old code is using negative offsets and the current version of newlib is built with a compiler that does not support the array aliasing trick.

Corinna, if this causes any Cygwin issues, please let me know.

-- Jeff J.
Successful gcc4.0.0 build (Redhat 9, Kernel 2.4.25)
make bootstrap successful build info:

config.guess states: i686-pc-linux-gnu

gcc -v states:
Using built-in specs.
Target: i686-pc-linux-gnu
Configured with: /tmp/downloads/gcc-4.0.0/configure --prefix=/apps/Linux/gcc400 --program-suffix=400 --with-local-prefix=/usr/include --enable-threads
Thread model: posix
gcc version 4.0.0

/etc/issue states:
Red Hat Linux release 9 (Shrike)
Kernel \r on an \m

uname -a states:
Linux parsley 2.4.25-3dneg #1 SMP Tue Sep 7 11:16:44 BST 2004 i686 i686 i386 GNU/Linux
(This is essentially the 2.4.25 kernel with some nfs patches)

rpm -q glibc states: glibc-2.3.2-11.9

Notes: This build worked for installing sitewide over nfs fine. --prefix and --program-suffix worked fine. --with-local-prefix=/usr/include had to be specified as the default of /usr/local/include is not right for certain Linux distributions (like Redhat).

Jeff Clifford
Successful gcc4.0.1 build (Redhat 9, Kernel 2.4.25)
make bootstrap successful build info:

config.guess states: i686-pc-linux-gnu

gcc -v states:
Using built-in specs.
Target: i686-pc-linux-gnu
Configured with: /tmp/downloads/gcc-4.0.1/configure -prefix=/apps/Linux/gcc401 --program-suffix=401
Thread model: posix
gcc version 4.0.1

/etc/issue states:
Red Hat Linux release 9 (Shrike)
Kernel \r on an \m

uname -a states:
Linux parsley 2.4.25-3dneg #1 SMP Tue Sep 7 11:16:44 BST 2004 i686 i686 i386 GNU/Linux
(This is essentially the 2.4.25 kernel with some nfs patches)

rpm -q glibc states: glibc-2.3.2-11.9

Notes: This build worked for installing sitewide over nfs fine. --prefix and --program-suffix worked fine.

Jeff Clifford
Successful gcc4.0.2 build (Redhat 9, Kernel 2.4.25)
make bootstrap successful build info:

config.guess states: i686-pc-linux-gnu

gcc -v states:
Using built-in specs.
Target: i686-pc-linux-gnu
Configured with: /tmp/downloads/gcc-4.0.2/configure -prefix=/apps/Linux/gcc402 --program-suffix=402
Thread model: posix
gcc version 4.0.2

/etc/issue states:
Red Hat Linux release 9 (Shrike)
Kernel \r on an \m

uname -a states:
Linux parsley 2.4.25-3dneg #1 SMP Tue Sep 7 11:16:44 BST 2004 i686 i686 i386 GNU/Linux
(This is essentially the 2.4.25 kernel with some nfs patches)

rpm -q glibc states: glibc-2.3.2-11.9

Notes: This build worked for installing sitewide over nfs fine. --prefix and --program-suffix worked fine.

Jeff Clifford
glibc compilation error
I am trying to cross compile GCC for an AMCC 440SP platform (powerpc-linux). Binutils and bootstrap GCC compile fine, but when I make glibc it errors out with the following:

--- snippet ---
if test -r /opt/luan2/toolchain/build/glibc/csu/abi-tag.h.new; then mv -f /opt/luan2/toolchain/build/glibc/csu/abi-tag.h.new /opt/luan2/toolchain/build/glibc/csu/abi-tag.h; \
else echo >&2 'This configuration not matched in ../abi-tags'; exit 1; fi
gawk -f ../scripts/gen-as-const.awk ../linuxthreads/sysdeps/powerpc/tcb-offsets.sym \
| powerpc-linux-gcc -S -o /opt/luan2/toolchain/build/glibc/tcb-offsets.hT3 -std=gnu99 -O2 -Wall -Winline -Wstrict-prototypes -Wwrite-strings -g -mnew-mnemonics -I../include -I. -I/opt/luan2/toolchain/build/glibc/csu -I.. -I../libio -I/opt/luan2/toolchain/build/glibc -I../sysdeps/powerpc/powerpc32/elf -I../sysdeps/powerpc/elf -I../linuxthreads/sysdeps/unix/sysv/linux/powerpc/powerpc32 -I../linuxthreads/sysdeps/unix/sysv/linux/powerpc -I../linuxthreads/sysdeps/unix/sysv/linux -I../linuxthreads/sysdeps/pthread -I../sysdeps/pthread -I../linuxthreads/sysdeps/unix/sysv -I../linuxthreads/sysdeps/unix -I../linuxthreads/sysdeps/powerpc/powerpc32 -I../linuxthreads/sysdeps/powerpc -I../sysdeps/unix/sysv/linux/powerpc/powerpc32 -I../sysdeps/unix/sysv/linux/powerpc -I../sysdeps/unix/sysv/linux -I../sysdeps/gnu -I../sysdeps/unix/common -I../sysdeps/unix/mman -I../sysdeps/unix/inet -I../sysdeps/unix/sysv -I../sysdeps/unix/powerpc -I../sysdeps/unix -I../sysdeps/posix -I../sysdeps/powerpc/powerpc32/fpu -I../sysdeps/powerpc/powerpc32 -I../sysdeps/wordsize-32 -I../sysdeps/powerpc/soft-fp -I../sysdeps/powerpc/fpu -I../sysdeps/powerpc -I../sysdeps/ieee754/flt-32 -I../sysdeps/ieee754/dbl-64 -I../sysdeps/ieee754 -I../sysdeps/generic/elf -I../sysdeps/generic -nostdinc -isystem /opt/luan2/toolchain/bin/lib/gcc/powerpc-linux/4.0.2/include -isystem /opt/luan2/toolchain/source/linux-2.6.13/include/ -D_LIBC_REENTRANT -include ../include/libc-symbols.h -DHAVE_INITFINI -x c - \
-MD -MP -MF /opt/luan2/toolchain/build/glibc/tcb-offsets.h.dT -MT '/opt/luan2/toolchain/build/glibc/tcb-offsets.h.d /opt/luan2/toolchain/build/glibc/tcb-offsets.h'
<stdin>: In function 'dummy':
<stdin>:11: warning: asm operand 0 probably doesn't match constraints
<stdin>:11: error: impossible constraint in 'asm'
make[2]: *** [/opt/luan2/toolchain/build/glibc/tcb-offsets.h] Error 1
make[2]: Leaving directory `/opt/luan2/toolchain/source/glibc-2.3.5/csu'
make[1]: *** [csu/subdir_lib] Error 2
make[1]: Leaving directory `/opt/luan2/toolchain/source/glibc-2.3.5'
make: *** [all] Error 2
--- snippet ---

Here is the configuration I run from a separate build/glibc directory:

  ../../source/glibc-2.3.5/configure --prefix=/opt/luan2/toolchain/bin --target=powerpc-linux --host=powerpc-linux --enable-add-ons=linuxthreads --with-headers=/opt/luan2/toolchain/source/linux-2.6.13/include/ --with-binutils=/opt/luan2/toolchain/bin/powerpc-linux/bin

This seems to complete without any issues. It seems that gcc is having issues with the following line in gen-as-const.awk:

  printf "asm (\"@@@name@@@%s@@@value@@@%%0@@@end@@@\" : : \"i\" (%s));\n", name, $0;

Is my configure line incorrect, or have I maybe incorrectly configured bootstrap gcc prior to building glibc?

Thanks, Jeff Stevens
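[For context, here is roughly what that awk rule generates for each SYMBOL/EXPR pair in tcb-offsets.sym; struct tcb and both names below are made up for illustration. The asm is never executed: glibc only greps the @@@value@@@ markers back out of the resulting .s file, and the "i" constraint demands a compile-time integer constant, so "impossible constraint" suggests the powerpc-linux-gcc being invoked could not reduce the operand to a constant, which points more at the bootstrap compiler and its flags than at the glibc configure line.]

  #include <stddef.h>

  /* What gen-as-const.awk effectively feeds the compiler for a .sym
     line like "MYFIELD_OFFSET offsetof (struct tcb, myfield)".  */
  struct tcb { int flags; long myfield; };

  void
  dummy (void)
  {
    /* The asm text is only scanned back out of the generated .s; the
       "i" constraint requires a compile-time integer constant.  */
    asm ("@@@name@@@MYFIELD_OFFSET@@@value@@@%0@@@end@@@"
         : : "i" (offsetof (struct tcb, myfield)));
  }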
HowTo Cross Compile GCC on x86 Host for PowerPC Target
Is there a HowTo out there on how to cross compile GCC to run on another platform? I have an x86 host running linux, and an embedded PowerPC 440SP target running linux. I would like to compile GCC to run on the target but am having some difficulties. I have compiled the cross compiler fine, but when I try to compile a native compiler, it acts just like the cross compiler (runs on the host and not the target). All I did was re-run gcc configure and "make all install". Here is the configuration I ran:

  ../../source/gcc-3.4.4/configure --target=powerpc-linux --host=powerpc-linux --prefix=/opt/luan2/toolchain/bin --enable-shared --enable-threads --enable-languages=c

I'm obviously missing something, but can't seem to find anything on the internet that explains cross-compiling gcc for another target.

Thanks, Jeff Stevens
RE: HowTo Cross Compile GCC on x86 Host for PowerPC Target
Yes I added the cross-compiler to the path and created a separate build directory (ppc_gcc). Thanks, Jeff Stevens

--- Dave Korn <[EMAIL PROTECTED]> wrote:

> Dave Korn wrote:
> > Jeff Stevens wrote:
> > > Is there a HowTo out there on how to cross compile GCC to run on
> > > another platform? I have an x86 host running linux, and an embedded
> > > PowerPC 440SP target running linux. I would like to compile GCC to
> > > run on the target but am having some difficulties. I have compiled
> > > the cross compiler fine, but when I try to compile a native compiler,
> > > it acts just like the cross compiler (runs on the host and not the
> > > target). All I
> >
> > *All* compilers "run on the host"; the term "host" is defined as "the
> > machine on which the compiler runs". The target is the machine on which
> > the _generated_ code runs. So for a native compiler, host==target, and
> > for a cross-compiler, host!=target.
>
> Doh. I misread this; I see now that what you mean is you wanted a native
> compiler on the target.
>
> > > did was re-run gcc configure and "make all install". Here is the
> > > configuration I ran:
> > >
> > > ../../source/gcc-3.4.4/configure --target=powerpc-linux
> > > --host=powerpc-linux --prefix=/opt/luan2/toolchain/bin
> > > --enable-shared --enable-threads --enable-languages=c
>
> So, this should have worked. Did you perhaps re-build in the same
> directory that you had already configured the cross-compiler in without
> first running "make clean" perhaps? Was the powerpc-linux cross compiler
> placed in your $PATH setting, so that configure could find the
> powerpc-linux-gcc executable?
>
> [ This is OT for this list really; we really should take it to crossgcc ]
>
> cheers,
> DaveK
> --
> Can't think of a witty .sigline today
Re: Howto Cross Compile GCC to run on PPC Platform
I am using the AMCC 440SP processor. I went and bought "Building Embedded Linux Systems" by Karim Yaghmour. It seems to be a pretty complete book, and I have gotten the cross-compiler completely installed, but it doesn't get into installing a native compiler. However, I tried cross compiling gcc by first running this configure line:

  ../gcc-3.4.4/configure --build=`../gcc-3.4.4/config.guess` --target=powerpc-linux --host=powerpc-linux --prefix=${PREFIX} --enable-languages=c

and then a make all. The make went fine, and completed without any errors. However, when I ran 'make install' I got the following error:

  powerpc-linux-gcc: installation problem, cannot exec `/opt/recorder/tools/libexec/gcc/powerpc-linux/3.4.4/collect2': Exec format error
  make[2]: *** [nof/libgcc_s_nof.so] Error 1
  make[2]: Leaving directory `/opt/recorder/build-tools/build-native-gcc/gcc'
  make[1]: *** [stmp-multilib] Error 2
  make[1]: Leaving directory `/opt/recorder/build-tools/build-native-gcc/gcc'
  make: *** [install-gcc] Error 2

How do I install the native compiler? Thanks, Jeff Stevens

--- Clemens Koller <[EMAIL PROTECTED]> wrote:

> Hello, Jeff!
>
> > I am trying to compile GCC on an x86 platform to run natively on an
> > embedded PPC platform.
>
> What CPU do you use? I am currently working on an mpc8540 natively (from
> harddisk) and have a current toolchain up and running. I can recommend
> the latest Linux-From-Scratch documentation to get an idea of what to do.
>
> > I am able to compile gcc as a cross compiler (to run on x86), but can't
> > seem to get it to cross compile gcc (to run on ppc). Does anyone know
> > of a good HowTo to do this? I'm currently downloading the source distro
> > of ELDK, so if it's already in there I'll find it, but if there is one
> > elsewhere online please let me know.
>
> I've started with the ELDK 3.1 too. And updated it step by step to the
> latest versions.
>
> Greets,
> Clemens Koller
> ___
> R&D Imaging Devices
> Anagramm GmbH
> Rupert-Mayer-Str. 45/1
> 81379 Muenchen
> Germany
> http://www.anagramm.de
> Phone: +49-89-741518-50
> Fax: +49-89-741518-19
Re: Howto Cross Compile GCC to run on PPC Platform
I am creating the target tree on my host, so that I can later transfer it to a USB storage device. I was going to manually move everything, but only saw one binary, xgcc. Is that all, or aren't there some other utilities that go along with it? I just didn't know exactly what to copy and where to copy it to.

When I built glibc, those were built for the target system, but installed to the target directory structure that I am creating. The 'make install' command that I ran for glibc was:

  make install_root=${TARGET_PREFIX} prefix="" install

where TARGET_PREFIX is the target filesystem tree. I used the same make install command for the native gcc that I compiled.

Thanks, Jeff Stevens

--- Kai Ruottu <[EMAIL PROTECTED]> wrote:

> Jeff Stevens wrote:
> > ../gcc-3.4.4/configure --build=`../gcc-3.4.4/config.guess`
> > --target=powerpc-linux --host=powerpc-linux --prefix=${PREFIX}
> > --enable-languages=c
> >
> > and then a make all. The make went fine, and completed without any
> > errors. However, when I ran 'make install' I got the following error:
> >
> > powerpc-linux-gcc: installation problem, cannot exec
> > `/opt/recorder/tools/libexec/gcc/powerpc-linux/3.4.4/collect2':
> > Exec format error
> >
> > How do I install the native compiler?
>
> You shouldn't ask how but where!
>
> You cannot install alien binaries into the native places on your host!
> This is not sane at all...
>
> Ok, one solution is to collect the components from the produced stuff,
> pack them into a '.tar.gz' or something and then ftp or something the
> stuff into the native system.
>
> If you really want to install the stuff into your host, you should know
> the answer to the "where" first, and should read from the "GCC Install"
> manual the chapter 7, "Final installation", and see what option to 'make'
> you should use in order to get the stuff into your chosen "where"...
Re: Need sanity check on DSE vs expander issue
On Fri, 2019-12-20 at 12:08 +0100, Richard Biener wrote:

> On December 20, 2019 8:25:18 AM GMT+01:00, Jeff Law wrote:
> > On Fri, 2019-12-20 at 08:09 +0100, Richard Biener wrote:
> > > On December 20, 2019 3:20:40 AM GMT+01:00, Jeff Law wrote:
> > > > I need a sanity check here.
> > > >
> > > > Given this code:
> > > >
> > > >   typedef union { long double value; unsigned int word[4]; }
> > > >     memory_long_double;
> > > >   static unsigned int ored_words[4];
> > > >   static void add_to_ored_words (long double x)
> > > >   {
> > > >     memory_long_double m;
> > > >     size_t i;
> > > >     memset (&m, 0, sizeof (m));
> > > >     m.value = x;
> > > >     for (i = 0; i < 4; i++)
> > > >       {
> > > >         ored_words[i] |= m.word[i];
> > > >       }
> > > >   }
> > > >
> > > > DSE is removing the memset as it thinks the assignment to m.value is
> > > > going to set the entire union.
> > > >
> > > > But when we translate that into RTL we use XFmode:
> > > >
> > > >   ;; m.value ={v} x_6(D);
> > > >
> > > >   (insn 7 6 0 (set (mem/v/j/c:XF (plus:DI (reg/f:DI 77 virtual-stack-vars)
> > > >               (const_int -16 [0xfff0])) [2 m.value+0 S16 A128])
> > > >           (reg/v:XF 86 [ x ])) "j.c":13:11 -1
> > > >        (nil))
> > > >
> > > > That (of course) only writes 80 bits of data because of XFmode,
> > > > leaving 48 bits uninitialized. We then read those bits, or-ing the
> > > > uninitialized data into ored_words, and all hell breaks loose later.
> > > >
> > > > Am I losing my mind? ISTM that dse and the expander have to agree on
> > > > how much data is written by the store to m.value.
> > >
> > > It looks like MEM_SIZE is wrong here, so you need to figure how we
> > > arrive at this (I guess TYPE_SIZE vs. MODE_SIZE mismatch is biting us
> > > here?) That is, either the MEM should have BLKmode or the mode size
> > > should match MEM_SIZE. Maybe DSE can avoid looking at MEM_SIZE for
> > > non-BLKmode MEMs?
>
> It's gimple DSE that removes the memset, so it shouldn't be mucking
> around with modes at all. stmt_kills_ref_p seems to think the assignment
> to m.value sets all of m.
>
> The ao_ref for memset looks reasonable:
>
>   (gdb) p *ref
>   $14 = {ref = 0x0, base = 0x77ffbea0,
>     offset = {<poly_int_pod<1, long>> = {coeffs = {0}}, <No data fields>},
>     size = {<poly_int_pod<1, long>> = {coeffs = {128}}, <No data fields>},
>     max_size = {<poly_int_pod<1, long>> = {coeffs = {128}}, <No data fields>},
>     ref_alias_set = 0, base_alias_set = 0, volatile_p = false}
>
> 128 bits with a base of VAR_DECL m.
> We're looking to see if this statement will kill the ref:
>
>   (gdb) p debug_gimple_stmt (stmt)
>   # .MEM_8 = VDEF <.MEM_6>
>   m.value ={v} x_7(D);
>   $21 = void
>   (gdb) p debug_tree (lhs)
>   type > volatile XF size unit-size
>     align:128 warn_if_not_align:0 symtab:0 alias-set -1 canonical-type 0x7fffea988690 precision:80>
>   side-effects volatile
>   arg:0 type > sizes-gimplified volatile type_0 BLK size 128> unit-size
>     align:128 warn_if_not_align:0 symtab:0 alias-set -1 canonical-type 0x7fffea988348 fields context
>     pointer_to_this
>   side-effects addressable volatile used read BLK j.c:10:31
>   size unit-size 0x7fffea7f3d38 16>
>     align:128 warn_if_not_align:0 context 0x7fffea97bd00 add_to_ored_words>
>   chain 0x7fffea9430a8 size_t>
>   used unsigned read DI j.c:11:10
>   size
>   unit-size
>     align:64 warn_if_not_align:0 context 0x7fffea97bd00 add_to_ored_words>>>
>   arg:1 type XF size unit-size 0x7fffea7f3d38 16>
>     align:128 warn_if_not_align:0 symtab:0 alias-set -
Re: Git ChangeLog policy for GCC Testsuite inquiry
On Fri, 2020-01-24 at 13:49 -0500, David Edelsohn wrote:

> > > > On 1/24/20 8:45 AM, David Edelsohn wrote:
> > > > > There is no ChangeLog entry for the testsuite changes.
> > > >
> > > > I don't believe in ChangeLog entries for testcases, but I'll add one
> > > > for the target-supports.exp change, thanks.
> > >
> > > Is this a general policy change that we want to make? Currently we
> > > still have gcc/testsuite/ChangeLog and developers are updating that
> > > file.
> >
> > I would support formalizing that as policy; currently there is no
> > policy.
> >
> > https://gcc.gnu.org/codingconventions.html#ChangeLogs
> >
> > "There is no established convention on when ChangeLog entries are to be
> > made for testsuite changes."
>
> Do we want to continue with ChangeLog entries for testsuite changes or
> only rely on Git log?

I strongly prefer to move towards relying on the git log.

jeff
Re: Git ChangeLog policy for GCC Testsuite inquiry
On Fri, 2020-01-24 at 20:32 +0100, Eric Botcazou wrote:

> > I strongly prefer to move towards relying on the git log.
>
> In my experience the output of git log is a total mess so cannot replace
> ChangeLogs. But we can well decide to drop ChangeLog for the testsuite.

Well, glibc has moved to extracting them from git, building policies and scripts around that. I'm pretty sure other significant projects are also extracting their ChangeLogs from git.

We could do the same, selecting some magic date as the cutover point after which future ChangeLogs are extracted from git. In fact, that's precisely what I'd like to see us do.

jeff
Re: Git ChangeLog policy for GCC Testsuite inquiry
On Sat, 2020-01-25 at 10:50 -0500, Nathan Sidwell wrote:

> On 1/24/20 4:36 PM, Jeff Law wrote:
> > On Fri, 2020-01-24 at 20:32 +0100, Eric Botcazou wrote:
> > > > I strongly prefer to move towards relying on the git log.
> > >
> > > In my experience the output of git log is a total mess so cannot
> > > replace ChangeLogs. But we can well decide to drop ChangeLog for the
> > > testsuite.
> >
> > Well, glibc has moved to extracting them from git, building policies
> > and scripts around that. I'm pretty sure other significant projects are
> > also extracting their ChangeLogs from git.
> >
> > We could do the same, selecting some magic date as the cutover point
> > after which future ChangeLogs are extracted from git. In fact, that's
> > precisely what I'd like to see us do.
>
> The GCC10 release date would seem a good point to do this. That gives us
> around 3 months to figure the details (and get stakeholder buy-in).

Yup. That would be what I'd recommend we shoot for. As you say, it gives time to work on the details and for folks to start changing their habits.

jeff
Re: Git push account
On Sat, 2020-01-25 at 12:39 +, Feng Xue OS wrote: > Which account should I use to push my local patch to git repo of gcc? > I have a sourceware account that works for svn, but now it doesn't for git. > Actually both below commands were tried, but failed. > git push ssh://f...@sourceware.org/gcc/gcc.git ..., > git push ssh://f...@gcc.gnu.org/gcc/gcc.git ... It shouldn't matter. Under the hood sourceware.org and gcc.gnu.org are the same machine. jeff
Re: SSA Question related to Dominator Trees
On Mon, 2020-01-27 at 10:18 -0500, Nicholas Krause wrote:

> Greetings,
>
> Sorry if this question has been asked before, but do we extend out the
> core tree type for SSA, or is there an actual dominator tree type? It
> seems to me we just extend or override the core tree type parameters,
> but I was unable to verify it by looking in the manual.

There is no type or class for the dominator tree. Having one would be useful.

jeff
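[For reference, a sketch of the existing functional interface from dominance.c, as I remember it: dominance info is computed per-function and queried through these calls rather than through a tree class.]

  /* Sketch only: compute, query and release dominance info for the
     current function.  */
  calculate_dominance_info (CDI_DOMINATORS);

  basic_block bb;
  FOR_EACH_BB_FN (bb, cfun)
    {
      /* Immediate-dominator links are the "tree"; walking them gives
         a path back to the entry block.  */
      basic_block idom = get_immediate_dominator (CDI_DOMINATORS, bb);

      /* Membership queries go through dominated_by_p.  */
      if (idom)
        gcc_assert (dominated_by_p (CDI_DOMINATORS, bb, idom));
    }

  free_dominance_info (CDI_DOMINATORS);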
Re: Aliasing rules for unannotated SYMBOL_REFs
On Sat, 2020-01-25 at 09:31 +, Richard Sandiford wrote:

> TL;DR: if we have two bare SYMBOL_REFs X and Y, neither of which have an
> associated source-level decl and neither of which are in an anchor block:
>
> (Q1) can a valid byte access at X+C alias a valid byte access at Y+C?
>
> (Q2) can a valid byte access at X+C1 alias a valid byte access at Y+C2,
> C1 != C2?
>
> Also:
>
> (Q3) If X has a source-level decl and Y doesn't, and neither of them are
> in an anchor block, can valid accesses based on X alias valid accesses
> based on Y?

So what are the cases where Y won't have a source-level decl but we have a decl in RTL? anchors, other cases?

> (well, OK, that wasn't too short either...)

I would have thought the answer would be "no" across the board. But the code clearly indicates otherwise. Interposition clearly complicates things, as do explicit aliases.

> This part seems obvious enough. But then, apart from the special case of
> forced address alignment, we use an offset-based check even for cmp==-1:
>
>   /* Assume a potential overlap for symbolic addresses that went
>      through alignment adjustments (i.e., that have negative
>      sizes), because we can't know how far they are from each
>      other.  */
>   if (maybe_lt (xsize, 0) || maybe_lt (ysize, 0))
>     return -1;
>   /* If decls are different or we know by offsets that there is no
>      overlap, we win.  */
>   if (!cmp || !offset_overlap_p (c, xsize, ysize))
>     return 0;
>
> So we seem to be taking cmp==-1 to mean that although we don't know
> the relationship between the symbols, it must be the case that either
> (a) the symbols are equal (e.g. via aliasing) or (b) the accesses are
> to non-overlapping objects. In other words, one of the situations
> described by cmp==1 or cmp==0 must be true, but we don't know which
> at compile time.

Right. That was the conclusion I came to. If a SYMBOL_REF has an alias, the alias must have the same value as the SYMBOL_REF. So they're either equal or there's no valid case for overlap.

> This means that in practice, the answer to (Q1) appears to be "yes"
> but the answer to (Q2) appears to be "no".

That would be my understanding once aliases/interpositioning come into play.

> This somewhat contradicts:
>
>   /* In general we assume that memory locations pointed to by different
>      labels may overlap in undefined ways.  */
>   return -1;
>
> at the end of compare_base_symbol_refs, which seems to be saying
> that the answer to (Q2) ought to be "yes" instead. Which is right?

I'm not sure how we could get to yes in that case. A symbol alias or interposition ultimately still results in two symbols having the same final address. Thus for a byte access, if C1 != C2, then we can't have an overlap.

> In PR92294 we have a symbol X at ANCHOR+OFFSET that's preemptible.
> Under the (Q1)==yes/(Q2)==no assumption, cmp==-1 means that either
> (a) X = ANCHOR+OFFSET or (b) X and ANCHOR reference non-overlapping
> objects. So we should take the offset into account when doing:
>
>   if (!cmp || !offset_overlap_p (c, xsize, ysize))
>     return 0;
>
> Let's call this FIX1.

So this is a really interesting wrinkle. Doesn't this change Q2 to a yes? In particular it changes the "invariant" that the symbols have the same address in the event of a symbol alias or interposition. Of course one could ask the question of whether or not we should handle cases with anchors specially.
> But that then brings us to: why does memrefs_conflict_p return -1
> when one symbol X has a decl and the other symbol Y doesn't, and neither
> of them are block symbols? Is the answer to (Q3) that we allow equality
> but not overlap here too? E.g. a linker script could define Y to X but
> not to a region that contains X at a nonzero offset?

Does digging into the history provide any insights here? I'm not sure, given the issues you've introduced, if I could actually fill out the matrix of answers without more underlying information. I.e., when can we get symbols without source-level decls, anchors+interposition issues, etc.

Jeff
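[A concrete way to see the cmp==-1 situation in plain C: an explicit alias. A linker script or ELF interposition can set up the same thing with no decl in sight.]

  int x[4] = { 1, 2, 3, 4 };

  /* y is another name for the same address as x.  A byte access at
     x+C and one at y+C overlap (Q1 == yes), but accesses at x+C1 and
     y+C2 with C1 != C2 cannot: the two symbols are either equal, as
     here, or name disjoint objects (Q2 == no).  */
  extern int y[4] __attribute__ ((alias ("x")));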
Re: Git ChangeLog policy for GCC Testsuite inquiry
On Mon, 2020-02-03 at 18:55 +, Richard Sandiford wrote:

> "H.J. Lu" writes:
> > On Fri, Jan 24, 2020 at 2:39 PM Paul Smith wrote:
> > > On Fri, 2020-01-24 at 22:45 +0100, Jakub Jelinek wrote:
> > > > > > In my experience the output of git log is a total mess so cannot
> > > > > > replace ChangeLogs. But we can well decide to drop ChangeLog for
> > > > > > the testsuite.
> > > > >
> > > > > Well, glibc has moved to extracting them from git, building
> > > > > policies and scripts around that. I'm pretty sure other
> > > > > significant projects are also extracting their ChangeLogs from
> > > > > git.
> > > > >
> > > > > We could do the same, selecting some magic date as the cutover
> > > > > point after which future ChangeLogs are extracted from GIT. In
> > > > > fact, that's precisely what I'd like to see us do.
> > > >
> > > > We don't have a tool that can do it, not even get the boilerplate
> > > > right. Yes, mklog helps, but it very often gets stuff wrong. Not to
> > > > mention that the text of what actually changed can't be generated
> > > > very easily.
> > >
> > > I don't know if it counts as a significant project, but GNU make has
> > > been doing this for years.
> > >
> > > What I did was take the existing ChangeLogs and rename them to
> > > ChangeLog.1 or whatever, then started with a new ChangeLog generated
> > > from scratch from Git messages.
> > >
> > > I use the gnulib build-aux/gitlog-to-changelog script to do it. It
> > > requires a little bit of discipline to get right; in particular you
> > > have to remember that the Git commit message will be indented 8 spaces
> > > in the ChangeLog, so you have to be careful that your commit messages
> > > wrap at char 70 (or less) in your Git commit.
> > >
> > > If you have Git hooks you could enforce a bit of formatting; for
> > > example any line not indented by space must be <=70 chars long; this
> > > allows people to use long lines for formatted content if they indent
> > > it with a space or two.
> > >
> > > Otherwise, it's the same as writing the ChangeLog and you only have to
> > > do it once.
> > >
> > > Just to note, the above script simply transcribes the commit message
> > > into ChangeLog format. It does NOT try to auto-generate
> > > ChangeLog-style content (files that changed, functions, etc.) from the
> > > Git diff or whatever.
> > >
> > > There are a few special tokens you can add to your Git commit message
> > > that get reformatted to special changelog tokens like "(tiny change)"
> > > etc.
> > >
> > > As mentioned previously, it's very important that the commit message
> > > be provided as part of the code review, and it is very much fair game
> > > for review comments. This is common practice, and a good idea because
> > > bad commit messages are always a bummer, ChangeLog or not.
> >
> > Libgcrypt includes ChangeLog entries in git commit messages:
> >
> > http://git.gnupg.org/cgi-bin/gitweb.cgi?p=libgcrypt.git
> >
> > In each patch, the commit log starts with ChangeLog entries without
> > leading TABs, followed by a separator line with -- and then the commit
> > message. They have a script to extract the ChangeLog for a release.
>
> How many people would we be catering for by generating changelogs at
> release time though? It seems too low-level to be useful to users,
> and people wanting to track gcc development history at the source level
> would surely be better off using git (which e.g. makes it much easier to
> track changes to particular pieces of code).
> Perhaps there are practical or policy reasons for not requiring everyone
> who wants to track gcc development history to build or install git.
> But if so, why not just include the output of "git log", with whatever
> options seem best? (Probably --stat at least, to show the affected
> files.)
>
> Like with the svn-to-git conversion, the less we change the way the
> history is presented, the less chance there is of something going wrong.
> And the idea is that git log should be informative enough for upstream
> developers, so surely it should be enough for others too.

I believe the ChangeLog is primarily an FSF requirement, hence generating it from the SCM at release time seems reasonable. And yes, even though I have been a regular ChangeLog user, I rely more and more on the git log these days.

jeff
Fwd: February sourceware.org transition to new server!
This affects gcc.gnu.org as well... Expect weekend outages...

--- Begin Message ---

Community,

The sourceware.org server will be transitioning to a new server over the next 2-4 weeks. The new server will be CentOS 8-based with more CPU and more RAM. Please keep this in mind when planning out your work. Starting in 2 weeks time we might see some weekend outages as Frank Eigler and the overseers team work out the bugs.

Thanks to Frank and all of overseers for their tireless efforts!

Cheers, Carlos.

--- End Message ---
Re: Git ChangeLog policy for GCC Testsuite inquiry
On Wed, 2020-02-05 at 15:18 -0600, Segher Boessenkool wrote:

> On Mon, Feb 03, 2020 at 01:24:04PM -0700, Jeff Law wrote:
> > And yes, even though I have been a regular ChangeLog user, I rely more
> > and more on the git log these days.
>
> As a reviewer, the changelog is priceless still. We shouldn't drop the
> changelog before people write *good* commit messages (and we are still
> quite far from that goal).

I believe the current proposal is not to switch immediately, but to do so after gcc-10 is released. Feel free to suggest improvements to ChangeLogs or summaries in the meantime to get folks to start rethinking what they write in 'em.

And FWIW, we're talking about the ChangeLog *file* here. If folks continued writing the same log messages and put them into git, I personally think that's sufficient to transition away from having a ChangeLog file in the source tree. I don't want to make perfect the enemy of the very very good here, and moving away from a ChangeLog file in the source tree is, IMHO, very very good.

jeff
Re: [EXTERNAL] Re: GCC selftest improvements
On Thu, 2020-02-13 at 22:18 +, Modi Mo wrote:

> > On 2/12/20 8:53 PM, David Malcolm wrote:
> > > Thanks for the patch.
> > >
> > > Some nitpicks:
> > >
> > > Timing-wise, the GCC developer community is focusing on gcc 10
> > > bugfixing right now (aka "stage 4" of the release cycle). So this
> > > patch won't be suitable to commit to master until stage 1 of the
> > > release cycle for gcc 11 (in April, hopefully).
> >
> > Ah, I should've looked a bit harder for timelines before asking:
> > https://gcc.gnu.org/develop.html. Appreciate the response here!
> >
> > > But yes, it's probably a good idea to get feedback on the patch given
> > > the breadth of platforms we support.
> > >
> > > The patch will need an update to the docs; search for "Tools/packages
> > > necessary for building GCC" in gcc/doc/install.texi, which currently
> > > has some paragraphs labelled:
> > >   @item ISO C++98 compiler
> > > that will need changing.
> > >
> > > I think Richi mentioned that the minimum gcc version should be 4.8.2
> > > as he recalled issues with .1, so maybe the error message and docs
> > > should reflect that?
> > >
> > > https://gcc.gnu.org/ml/gcc/2019-10/msg00180.html
> >
> > Segher here suggests 4.8.5 instead of 4.8.2:
> > https://gcc.gnu.org/ml/gcc/2019-11/msg00192.html
> >
> > Looking at release dates, 4.8.5 was in June 2015 while 4.8.2 was in
> > October 2013, which is a pretty big gap. I'm for moving the needle as
> > far as we reasonably can since this is a leap anyways. @Segher do you
> > have a reason in mind for the higher versioning?

I doubt there's a lot of functional difference between 4.8.5 and 4.8.2. It really should just be bugfixes. While I'd prefer 4.8.5 over 4.8.2, I could live with either.

Jeff
Re: Branch instructions that depend on target distance
On Mon, 2020-02-24 at 12:36 +0100, Petr Tesarik wrote:

> On Mon, 24 Feb 2020 11:14:44 + Jozef Lawrynowicz wrote:
> > On Mon, 24 Feb 2020 12:05:28 +0100 Petr Tesarik wrote:
> > > Hi all,
> > >
> > > I'm looking into reviving the efforts to port gcc to VideoCore IV
> > > [1]. One issue I've run into is the need to find out target branch
> > > distance at compile time. I looked around, and it's not the first
> > > architecture with such a requirement, but AFAICS it has never been
> > > solved properly.
> > >
> > > For example, AVR tracks instruction length. Later, ret_cond_branch()
> > > selects between a branch instruction and an inverted branch followed
> > > by an unconditional jump based on these calculated lengths.
> > >
> > > This works great ... until there's some inline asm() statement, for
> > > which gcc cannot keep track of the length attribute, so it is
> > > probably taken as zero. The linker then fails with a cryptic message:
> > >
> > >   relocation truncated to fit: R_AVR_7_PCREL against `no symbol'
> >
> > The MSP430 backend just always generates maximum range branch
> > instructions, except for some special cases. We then rely on the linker
> > to relax branch instructions to shorter range "jump" instructions when
> > the destination is within range.
> >
> > So the compiler output will always work, but not be the smallest
> > possible code size.
> >
> > For that "relocation truncated to fit" error message you want to check
> > that the linker has the ability to relax whatever branch instruction it
> > is failing on to a longer range branch.
>
> But that would change the instruction length, so not really an option
> AFAICS (unless I also switch to LTO).
>
> Anyway, the situation is much worse on the VideoCore IV. The
> alternatives here are:
>
> 1.
>     addcmpbCC rx, 0, imm, target
>     ; usually written as bCC rx, imm, target
>
> 2.
>     cmp rx, imm
>     bCC .+2
>     j target

Yea, this isn't that uncommon. You can describe both of these to the branch shortening pass.

> The tricky part is that the addcmpbCC instruction does NOT modify
> condition codes, while the cmp instruction does. Nothing you could solve
> in the linker...
>
> OK, it seems I'll have to go with the worst-case variant.

You can support both. You output the short case when the target is close enough and the longer variant otherwise.

Jeff
Re: Fwd: Legal Prerequisites contributions
On Sun, 2020-03-01 at 14:37 +0100, Michael de Lang wrote:

> Dear Sir/Madam,
>
> I'm working on implementing pr0980r1 and people in the #gcc channel told
> me to get the legal process started asap. I am willing to sign copyright
> assignments as laid out on https://gcc.gnu.org/contribute.html

Contact ass...@gnu.org to get your paperwork started.

Thanks, Jeff
Away on PTO, expect some delays in GCC patch reviews
I'm away on PTO for the next couple weeks, which likely means that patch review times will suffer. Normally I'd ask Richard Henderson to help cover, but he's going to be on PTO as well. It's probably safe to assume that when I return there will be a bit of a patch backlog and I'll have a higher than usual backlog of non-patch-review things to be doing for Red Hat. So, at least for the next few weeks, patch review will be slower than usual. Please be patient :-) Jeff
Re: ifcvt limitations?
On 06/10/2015 07:36 AM, Kyrill Tkachov wrote:

> Thanks, I've made some progress towards making it more aggressive.
> A question since I'm in the area... noce_try_cmove_arith that I've been
> messing around with has this code:
>
>   /* A conditional move from two memory sources is equivalent to a
>      conditional on their addresses followed by a load.  Don't do this
>      early because it'll screw alias analysis.  Note that we've
>      already checked for no side effects.  */
>   /* ??? FIXME: Magic number 5.  */
>   if (cse_not_expected
>       && MEM_P (a) && MEM_P (b)
>       && MEM_ADDR_SPACE (a) == MEM_ADDR_SPACE (b)
>       && if_info->branch_cost >= 5)
>
> Any ideas on where the rationale for that 5 came from? I see it's been
> there since the very introduction of ifcvt.c. I'd like to replace it
> with something more sane, maybe even remove it?

Richard was working on Itanic at the time. So I can speculate that the transformation wasn't generally profitable on other targets, so he picked a value that was high enough for the code to only trigger on Itanic (and perhaps Alphas, since he was still doing a lot of work on them and knew their properties quite well).

Richard is currently on PTO, so I don't think you're likely to get a quick response from him with further details.

jeff
Re: set_src_cost lying comment
On 06/21/2015 11:57 PM, Alan Modra wrote:

> set_src_cost says it is supposed to
>
>   /* Return the cost of moving X into a register, relative to the
>      cost of a register move.  SPEED_P is true if optimizing for
>      speed rather than size.  */
>
> Now, set_src_cost of a register move (set (reg1) (reg2)) is zero. Why?
> Well, set_src_cost is used just on the right hand side of a SET, so the
> cost is that of (reg2), which is zero according to rtlanal.c rtx_cost.
> targetm.rtx_costs doesn't get a chance to modify this.
>
> Now consider (set (reg1) (ior (reg2) (reg3))), for which set_src_cost on
> rs6000 currently returns COSTS_N_INSNS (1). It seems to me that this
> also ought to return zero, if the set_src_cost comment is to be
> believed. I'd claim the right hand side of this expression costs the
> same as a register move. A register move machine insn "mr reg1,reg2" is
> encoded as "or reg1,reg2,reg2" on rs6000!

Certainly seems inconsistent -- all the costing stuff should be revisited. The basic design for costing dates back to the m68k/vax era.

I certainly agree that the cost of a move, logicals and arithmetic is essentially the same at the chip level for many processors. But a copy has other properties that make it "cheaper" -- namely we can often propagate it away or arrange for the source & dest of the copy to have the same hard register, which achieves the same effect. So one could argue that a copy should have cost 0, as it has a reasonable chance of just going away, while logicals and alu operations on the appropriate chips should have a cost of 1.

jeff
Re: set_src_cost lying comment
On 06/24/2015 03:18 AM, Alan Modra wrote:

> On Tue, Jun 23, 2015 at 11:05:45PM -0600, Jeff Law wrote:
> > I certainly agree that the cost of a move, logicals and arithmetic is
> > essentially the same at the chip level for many processors. But a copy
> > has other properties that make it "cheaper" -- namely we can often
> > propagate it away or arrange for the source & dest of the copy to have
> > the same hard register which achieves the same effect. So one could
> > argue that a copy should have cost 0 as it has a reasonable chance of
> > just going away, while logicals, alu operations on the appropriate
> > chips should have a cost of 1.
>
> That's an interesting point, and perhaps true for rtl expansion. I'm not
> so sure it is correct for later rtl passes where you'd like to
> discourage register moves.

It was the best I could come up with :-) I certainly don't know the history behind the choices.

> Case in point: The rs6000 backend happens to use zero for the cost of
> setting registers to simple constants. That might be an accident, but
> when I fixed this by making (set (reg) (const_int)) cost one insn as it
> actually does for a range of constants, I found some call sequences
> regressed. A call like foo(0,0) is better as
>
>   (set (reg r3) (const_int 0))     li 3,0
>   (set (reg r4) (const_int 0))     li 4,0
>   (call ...)                       bl foo
>
> rather than
>
>   (set (reg r3) (const_int 0))     li 3,0
>   (set (reg r4) (reg r3))          mr 4,3
>   (call ...)                       bl foo
>
> CSE will say the second sequence is cheaper if loading a constant is
> more expensive than a copy. In reality the second sequence is less
> preferable since you have a register dependency.

Agreed 100%.

> A similar problem happens with foo(x+1,x+1) which currently emits
>
>   (set (reg r3) (plus (reg x) (const_int 1)))
>   (set (reg r4) (reg r3))
>
> for the arg setup insns. On modern processors it would be better as
>
>   (set (reg r3) (plus (reg x) (const_int 1)))
>   (set (reg r4) (plus (reg x) (const_int 1)))
>
> So in these examples we'd really like register moves to cost one insn.
> Hmm, at least, moves from hard regs ought to cost something.

Agreed again. These are good examples of things the costing model simply wasn't ever designed to consider -- because they weren't significant issues on the m68k, vax and other ports in the gcc-1 era.

So I don't really know how to tell you to proceed -- I've considered the costing models fundamentally flawed for many years, but haven't ever tried to come up with something that works better.

Jeff
Re: set_src_cost lying comment
On 06/24/2015 03:18 AM, Alan Modra wrote:

> So in these examples we'd really like register moves to cost one insn.
> Hmm, at least, moves from hard regs ought to cost something.

The more I think about it, the more I think that's a reasonable step. Nothing should have cost 0.

Jeff
Re: set_src_cost lying comment
On 06/25/2015 06:28 AM, Richard Earnshaw wrote:

> On 24/06/15 17:47, Jeff Law wrote:
> > On 06/24/2015 03:18 AM, Alan Modra wrote:
> > > So in these examples we'd really like register moves to cost one
> > > insn. Hmm, at least, moves from hard regs ought to cost something.
> >
> > The more I think about it, the more I think that's a reasonable step.
> > Nothing should have cost 0.
>
> It really depends on what you mean by cost here. I think rtx_cost is
> really talking about delta costs much of the time, so when recursing down
> plus (op1, op2) there's a cost from the plus, a cost from op1 and a cost
> from op2.
>
> I believe the idea behind reg being cost 0 is that when op1 and op2 are
> both registers in the above expression the overall cost is just the cost
> of the plus, with no additional cost coming from the operands.

Perhaps, but it's also the case on the PPC and a variety of other architectures that there's no difference between a reg and some constants. And on some architectures the PLUS is no different than a logical operation or even a copy.

> This leaves only one problem: if the entire expression is just reg then
> the overall cost becomes zero. We hit this problem because we only look
> at the source cost, not the overall insn cost. Mostly that's ok, but in
> the specific case of a move instruction it doesn't really generate the
> desired result. Perhaps the best thing to do is to use the OUTER code to
> spot the specific case where you've got a SET and return non-zero in
> that case.

Seems like it's worth an experiment.

jeff
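[A sketch of that experiment. The hook shape follows the GCC 5-era TARGET_RTX_COSTS signature; everything else here is illustrative rather than actual rs6000 code.]

  /* A bare REG as the source operand of a SET (i.e. a plain register
     move) gets a cost of one insn, while a REG nested inside PLUS,
     IOR, etc. still contributes nothing extra.  */
  static bool
  example_rtx_costs (rtx x, int code, int outer_code, int opno,
                     int *total, bool speed)
  {
    if (code == REG && outer_code == SET && opno == 1)
      {
        *total = COSTS_N_INSNS (1);
        return true;   /* Cost is final; don't recurse further.  */
      }
    return false;      /* Fall back to the generic costing.  */
  }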
Re: Proposal for merging scalar-storage-order branch into mainline
On 06/09/2015 04:52 AM, Richard Biener wrote:

> On Tue, Jun 9, 2015 at 12:39 PM, Eric Botcazou wrote:
> > > What's the reason to not expose the byte swapping operations earlier,
> > > like on GIMPLE? (or even on GENERIC?)
> >
> > That would be too heavy, every load and store in GENERIC/GIMPLE would
> > have an associated byte swapping operation, although you don't know if
> > they will be needed in the end. For example, if the structure is
> > scalarized, they are not.
>
> Yes, but I'd expect them to be optimized away (well, hopefully). Anyway,
> I thought use of the feature would be rare, so that "every load and
> store" is still very few?

Seems like it'd be a great way to test the effectiveness of our bswap pass :-)

jeff
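[For anyone who wants to try exactly that: the canonical open-coded byte swap the bswap pass is meant to recognize looks like this.]

  #include <stdint.h>

  /* Open-coded big-endian 32-bit load: the bswap pass should turn
     this into a single load plus (on little-endian targets) a byte
     swap.  */
  uint32_t
  load_be32 (const unsigned char *p)
  {
    return ((uint32_t) p[0] << 24) | ((uint32_t) p[1] << 16)
         | ((uint32_t) p[2] << 8)  |  (uint32_t) p[3];
  }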
Re: Proposal for merging scalar-storage-order branch into mainline
On 06/09/2015 10:20 AM, Eric Botcazou wrote:

> > Because some folks don't want to audit their code to where to add
> > byteswaps. I am serious people have legacy big-endian code they want to
> > run little endian. There is a reason this is around in the first place.
> > Developers are lazy.
>
> That's a little rough, but essentially correct in our experience.

Agreed on both points. These legacy codebases can be large and full auditing may not really be that feasible.

jeff
Re: C++ coding style inconsistencies
On 06/25/2015 12:28 PM, Richard Sandiford wrote:

> Sorry in advance for inviting a bikeshed discussion, but while making the
> hashing changes that I just committed, I noticed that the C++ification
> has been done in a variety of different styles. I ended up having to
> follow the "do what the surrounding code does" principle that some code
> bases have, but to me that's always seemed like an admission of failure.
> One of the strengths of the GCC code base was always that it was written
> in a very consistent style. Regardless of what you think of that style
> (I personally like it, but I know others don't at all), it was always
> easy to work on a new area of the compiler without having to learn how
> the surrounding code preferred to format things. It would be a shame if
> we lost that in the rush to make everything "more C++".
>
> The three main inconsistencies I saw were:
>
> (1) Should inline member functions be implemented inside the class or
> outside the class? If inside, should they be formatted like this:
>
>   void foo (args...)
>   {
>     ...;
>   }
>
> or like this:
>
>   void
>   foo (args...)
>   {
>     ...;
>   }
>
> (both have been used). The coding standard is pretty clear about this
> one: Define all members outside the class definition. That is, there are
> no function bodies or member initializers inside the class definition.
> But in-class definitions have become very common. Do we want to revisit
> this? Or do we just need more awareness of what the rule is supposed to
> be? [Personally I like the rule. The danger with in-class definitions
> is that it becomes very hard to see the interface at a glance. It
> obviously makes things more verbose though.]

I'd say let's go with the existing rule. I know that in-class definitions are relatively common, particularly if they are trivial, so do we want an exception for anything that fits on a single line?

> (2) Is there supposed to be a space before a template parameter list?
> I.e. is it:
>
>   foo <T>
>
> or:
>
>   foo<T>
>
> ? Both are widely used. The current coding conventions don't say
> explicitly, but all the examples use the second style. It's also more in
> keeping with convention for function parameters. On the other hand, it
> could be argued that the space in:
>
>   foo <T>::thing
>
> makes the binding confusing and looks silly compared to:
>
>   foo<T>::thing
>
> But there again, the second one might look like two unrelated blobs at
> first glance.

I'd go with whatever gnu-indent does with these things. I'd hate for us to settle on a style that requires non-default behaviour from gnu-indent.

> (3) Do we allow non-const references to be passed and returned by
> non-operator functions? Some review comments have pushed back on that,
> but some uses have crept in. [IMO non-const references are too easy to
> misread as normal parameters.]
>
> In all three cases, whether the answer is A or B is less important than
> whether the answer is the same across the code base.

I could make an argument either way on this one...

jeff
Re: C++ coding style inconsistencies
On 06/26/2015 03:50 AM, Martin Jambor wrote:

> Hi,
>
> On Thu, Jun 25, 2015 at 04:59:51PM -0400, David Malcolm wrote:
> > On Thu, 2015-06-25 at 19:28 +0100, Richard Sandiford wrote:
> > > Sorry in advance for inviting a bikeshed discussion, but while making
> > > the hashing changes that I just committed, I noticed that the
> > > C++ification has been done in a variety of different styles. [...snip...]
> >
> > If we're bike-shedding (sorry, I'm waiting for a bootstrap), do we have
> > a coding standard around the layout of member initialization in
> > constructors?
>
> Yes, https://gcc.gnu.org/codingconventions.html#Member_Form
>
> > i.e. should it be:
> >
> >   foo::foo (int x, int y)
> >   : m_x (x), m_y (y)
> >   {
> >   }
> >
> > vs
> >
> >   foo::foo (int x, int y)
> >     : m_x (x), m_y (y)
> >   {
> >   }
>
> according to the document, the colon should be on the first column if all
> initializers do not fit on one line with the definition. Emacs gnu-style
> indentation does not do that and produces your second case above, which,
> according to some simple grepping, also greatly prevails in the codebase
> now. So perhaps we should change the rule? (how much indentation?)
>
> > https://gcc.gnu.org/wiki/CppConventions
>
> I'd be wary of citing and using this document, IIRC it sometimes
> contradicts the official one and was meant as a basis for discussion when
> we were discussing whether to switch to C++ in the first place.

But it (CppConventions in the wiki) also has a note at the top that makes it explicit that the information on that page is obsolete and refers the reader to the official conventions.

Jeff
Re: Proposal for merging scalar-storage-order branch into mainline
On 06/26/2015 01:56 AM, Richard Biener wrote: On Thu, Jun 25, 2015 at 7:03 PM, Jeff Law wrote: On 06/09/2015 10:20 AM, Eric Botcazou wrote: Because some folks don't want to audit their code to where to add byteswaps. I am serious people have legacy big-endian code they want to run little endian. There is a reason this is around in the first place. Developers are lazy. That's a little rough, but essentially correct in our experience. Agreed on both points. These legacy codebases can be large and full auditing may not really be that feasible. Well - they need a full audit anyway to slap those endian attributes on the appropriate structures. We are not, after all, introducing a -fbig-endian switch. The cases I'm aware of would *love* a -fbig-endian switch :-) "Legacy code base" and "new compiler feature" don't mix in my mind. Assume the legacy platform didn't use GCC (because GCC wasn't a viable option way back when the now legacy platform was state-of-the-art), while the new platform will use GCC and will have an endianness change. In that case features to ease the pain of migration make a goodly amount of sense. jeff
Re: C++ coding style inconsistencies
On 06/27/2015 01:18 AM, Richard Sandiford wrote: Mikhail Maltsev writes: Perhaps one disappointing exception is mixed space/tab indentation. It is often inconsistent (i.e. some parts use space-only indentation). Yeah. Been hitting that recently too. Commit hooks are the solution to this class of problem, IMHO. Either by rejecting code which violates the formatting standards or automagically formatting it for us (via gnu-indent presumably). I believe glibc (for example) is using commit hooks to reject commits that violate certain formatting rules. No reason we couldn't do the same. Not sure what granularity that hook uses -- ie, does it flag preexisting formatting nits or just new ones. Either way works. The former is a bit more burdensome initially, but does get the code base into shape WRT formatting stuff more quickly. Jeff
Re: Multi-version IF-THEN-ELSE conditional
On 06/27/2015 06:10 AM, Ajit Kumar Agarwal wrote: All: The presence of aliases disables many optimizations like CCP (conditional constant propagation), PRE (partial redundancy elimination), and scalar replacement for conditional IF-THEN-ELSE. The presence of aliasing also disables if-conversion. I am proposing multi-versioning of IF-THEN-ELSE: one version without aliasing and the other versions with aliasing. Versioning the IF-THEN-ELSE this way enables CCP, PRE, scalar replacement and if-conversion for the version whose pointer variables do not alias, while the versions that carry alias information are left alone by such optimizations. I don't have examples on hand currently, but I am working to provide them. Probably the hardest part will be the heuristics around this. The optimizations you want to improve happen as separate passes, so there won't necessarily be a good way to predict if the multi-version if-then-else will enable further optimizations. Jeff
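For concreteness, a minimal C sketch of the kind of transform being proposed (the function and the explicit runtime no-alias check are illustrative assumptions, not Ajit's actual implementation):

  void
  f (int *p, int *q, int n)
  {
    if (p + n <= q || q + n <= p)   /* runtime check: p and q don't overlap */
      {
        /* No-alias version: CCP, PRE, scalar replacement and
           if-conversion are free to treat *p and *q as independent.  */
        for (int i = 0; i < n; i++)
          if (q[i] > 0)
            p[i] = q[i];
      }
    else
      {
        /* May-alias version: kept conservative.  */
        for (int i = 0; i < n; i++)
          if (q[i] > 0)
            p[i] = q[i];
      }
  }

The hard part, as Jeff notes, is predicting when the duplication pays for itself.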
Re: gcc feature request / RFC: extra clobbered regs
On 06/30/2015 04:02 PM, H. Peter Anvin wrote: On 06/30/2015 02:55 PM, Andy Lutomirski wrote: On Tue, Jun 30, 2015 at 2:52 PM, H. Peter Anvin wrote: On 06/30/2015 02:48 PM, Andy Lutomirski wrote: On Tue, Jun 30, 2015 at 2:41 PM, H. Peter Anvin wrote: On 06/30/2015 02:37 PM, Jakub Jelinek wrote: I'd say the most natural API for this would be to allow f{fixed,call-{used,saved}}-REG in target attribute. Either that or __attribute__((fixed(rbp,rcx),used(rax,rbx),saved(r11))) ... just to be shorter. Either way, I would consider this to be desirable -- I have myself used this to good effect in a past life (*cough* Transmeta *cough*) -- but not a high priority feature. I think I mean the per-function equivalent of -fcall-used-reg, so hpa's "used" suggestion would do the trick. I guess that clobbering the frame pointer is a non-starter, but five out of six isn't so bad. It would be nice to error out instead of producing "disastrous results", though, if another bad reg is chosen. (Presumably the PIC register on PIC builds would be an example of that.) Clobbering the frame pointer is perfectly fine, as is the PIC register. However, gcc might need to handle them as "fixed" rather than "clobbered". Hmm. True, I guess, although I wouldn't necessarily expect gcc to be able to generate code to call a function like that. No, but you need to be able to call other functions, or you just push the issue down one level. For ia32, the PIC register really isn't special anymore. I'd be surprised if you couldn't clobber it. jeff
Re: Allocation of hotness of data structure with respect to the top of stack.
On 07/05/2015 05:11 AM, Ajit Kumar Agarwal wrote: All: I am wondering whether allocating hot data structures closer to the top of the stack increases the performance of the application. The data structures are identified as hot or cold, all of them are sorted in decreasing order of hotness, and the hot data structures are allocated closer to the top of the stack. Loads and stores accessing stack data will then be faster, with the hot data structures allocated closer to the top of the stack. Based on the above, code is generated with load and store offsets assigned on the stack in decreasing order of hotness. You might want to look at this paper from an old gcc summit conference. Basically they were trying to reorder stack slots to minimize offsets in reg+d addressing for the SH port. It should touch on a number of common issues/goals. ftp://gcc.gnu.org/pub/gcc/summit/2003/Optimal%20Stack%20Slot%20Assignment.pdf I can't recall if they ever tried to submit that work for inclusion. Jeff
Re: rl78 vs cse vs memory_address_addr_space
On 07/01/2015 10:14 PM, DJ Delorie wrote: In this bit of code in explow.c:

  /* By passing constant addresses through registers
     we get a chance to cse them.  */
  if (! cse_not_expected && CONSTANT_P (x) && CONSTANT_ADDRESS_P (x))
    x = force_reg (address_mode, x);

On the rl78 it results in code that's a bit too complex for later passes to optimize fully. Is there any way to indicate that the above force_reg() is bad for a particular target? I believe this used to be conditional on -fforce-mem or -fforce-reg or some such option that we deprecated long ago. It'd be helpful if you could be more specific about what can't be handled. combine, for example, was extended to handle larger chains of insns not terribly long ago. jeff
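For what it's worth, the escape hatch DJ is asking about might look something like this (TARGET_WANT_CONST_ADDRESS_CSE is a made-up macro, not an existing target hook; this only sketches the shape of a fix):

  /* explow.c sketch: let a target opt out of CSE-ing constant addresses.
     TARGET_WANT_CONST_ADDRESS_CSE is hypothetical; it would default to 1
     in defaults.h, and rl78 could define it to 0.  */
  if (! cse_not_expected
      && TARGET_WANT_CONST_ADDRESS_CSE
      && CONSTANT_P (x)
      && CONSTANT_ADDRESS_P (x))
    x = force_reg (address_mode, x);

Whether such a knob is acceptable upstream is exactly what Jeff's question about the unhandled code is probing.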
Re: Uninitialized registers handling in the REE pass
On 07/06/2015 09:42 AM, Pierre-Marie de Rodat wrote: Hello, The attached reproducer[1] seems to trigger a code generation issue at least on x86_64-linux: $ gnatmake -q p -O3 -gnatn $ ./p raised PROGRAM_ERROR : p.adb:9 explicit raise Can you please file this as a bug in bugzilla so that it can get tracked? http://gcc.gnu.org/bugzilla jeff
Re: Can shrink-wrapping ever move prologue past an ASM statement?
On 07/07/2015 11:53 AM, Martin Jambor wrote: Hi, I've been asked to look into item one of http://permalink.gmane.org/gmane.linux.kernel/1990397 and found out that at least shrink-wrapping happily moves the prologue past an asm statement, which can be bad if the asm statement contains a call instruction. Am I right in concluding that this is a bug? Looking into the manual and at requires_stack_frame_p() in shrink-wrap.c, I do not see any obvious way of marking the asm statement as requiring the stack frame (but I will not mind being proven wrong). Do we want to create one, such as only disallowing moving the prologue past volatile asm statements? Any other ideas? Shouldn't this be driven by dataflow? jeff
Re: s390: larl for SImode on 64-bit
On 07/08/2015 02:33 PM, DJ Delorie wrote: Is there any reason that LARL can't be used to load a 32-bit symbolic value, in 64-bit mode? On TPF (64-bit) the app has the option of being loaded in the first 4Gb so that all symbols are also valid 32-bit addresses, for backward compatibility. (and if not, the linker would complain) It would seem that we'd want the compiler to know when the app is going to be loaded into that first 4G so that it can use the more efficient addressing modes. Jeff
Re: Can shrink-wrapping ever move prologue past an ASM statement?
On 07/08/2015 02:51 PM, Josh Poimboeuf wrote: On Wed, Jul 08, 2015 at 11:22:34AM -0500, Josh Poimboeuf wrote: On Wed, Jul 08, 2015 at 05:36:31AM -0500, Segher Boessenkool wrote: On Wed, Jul 08, 2015 at 11:23:09AM +0200, Martin Jambor wrote: For other archs, e.g. x86-64, you can do

  register void *sp asm("%sp");
  asm volatile("call func" : "+r"(sp));

I've found that putting "sp" in the clobber list also seems to work:

  asm volatile("call func" : : : "sp");

This syntax is nicer because it doesn't need a local variable associated with the register. Do you see any issues with this approach? Given that SP isn't subject to register allocation, I'd expect it's fine. Note that some folks have (loudly) requested that GCC issue an error if an asm tries to clobber sp. The call doesn't actually clobber the stack pointer, does it? ISTM that a use of sp makes more sense and is better "future proof'd" than clobbering sp. Jeff
Re: s390: larl for SImode on 64-bit
On 07/08/2015 03:05 PM, DJ Delorie wrote: In the TPF case, the software has to explicitly mark such pointers as SImode (such things happen only when structures that contain addresses can't change size, for backwards compatibility reasons[1]):

  int * __attribute__((mode(SImode))) ptr;
  ptr = &some_var;

So in effect, we have two pointer sizes, 64 being the default, but we can also get a 32-bit pointer via the syntax above? Wow, I'm surprised that works. And the only time we'd be able to use larl is a dereference of a pointer declared with the syntax above. Right. OK for the trunk with a simple testcase. I think you can just scan the assembler output for the larl instruction. So I wouldn't consider this the "default" case for those apps, just *a* case that needs to be handled "well enough", and the user is already telling the compiler that they assume those addresses are 32-bit (that either the whole app, or at least the part with that object, will be linked below 4Gb). The majority of the addresses are handled as 64-bit. [1] /me refrains from commenting on the worth of such practices, just that they exist and need to be (and have been) supported. Understood, but we also need to make sure that we don't do something that breaks things. Thus I needed to know the tidbit about explicitly declaring those pointers as SImode. jeff
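A sketch of what that simple testcase might look like (the target selector and options are untested guesses; only the attribute syntax and the scan for larl come from the thread):

  /* { dg-do compile { target s390*-*-* } } */
  /* { dg-options "-O2 -m64" } */

  int some_var;
  int * __attribute__((mode(SImode))) ptr;

  void
  foo (void)
  {
    ptr = &some_var;
  }

  /* { dg-final { scan-assembler "larl" } } */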
Re: [RFC] Design and Implementation for Path Splitting for Loop with Conditional IF-THEN-ELSE
On 06/02/2015 10:43 PM, Ajit Kumar Agarwal wrote: -Original Message- From: Jeff Law [mailto:l...@redhat.com] Sent: Tuesday, June 02, 2015 9:19 PM To: Ajit Kumar Agarwal; Richard Biener; gcc@gcc.gnu.org Cc: Vinod Kathail; Shail Aditya Gupta; Vidhumouli Hunsigida; Nagaraju Mekala Subject: Re: [RFC] Design and Implementation for Path Splitting for Loop with Conditional IF-THEN-ELSE On 06/02/2015 12:35 AM, Ajit Kumar Agarwal wrote: I don't offhand know if any of the benchmarks you cite above are free-enough to derive a testcase from. But one trick many of us use is to instrument the pass and compile some known free software (often gcc itself) to find triggering code and use that to generate tests for the new transformation. I will add tests in the suite. I could see many existing tests in the suite also get triggered with this optimization. Thanks. For cases in the existing testsuite where you need to change the expected output, it's useful to note why the expected output was changed. Sometimes a test is compromised by a new optimization, sometimes the expected output is changed and is papering over a problem, etc., so it's something we look at reasonably closely. Thanks. I will modify accordingly.

  diff --git a/gcc/cfghooks.c b/gcc/cfghooks.c
  index 9faa339..559ca96 100644
  --- a/gcc/cfghooks.c
  +++ b/gcc/cfghooks.c
  @@ -581,7 +581,7 @@ delete_basic_block (basic_block bb)
         /* If we remove the header or the latch of a loop, mark the loop
            for removal.  */
  -      if (loop->latch == bb
  +      if (loop && loop->latch == bb
             || loop->header == bb)
           mark_loop_for_removal (loop);

So what caused you to add this additional test? In general loop structures are supposed to always be available. The change here implies that the loop structures were gone at some point. That seems at first glance a mistake. I was using gimple_duplicate_bb, which will not add the duplicate basic block inside current_loops. That's why the above condition is required. I am now using duplicate_block instead of gimple_duplicate_bb. With this change the above check on loop is not required, as it adds the duplicate basic block inside the loops. OK. Good to hear it's not required anymore.

  diff --git a/gcc/tree-cfg.c b/gcc/tree-cfg.c
  index aed5254..b25e409 100644
  --- a/gcc/tree-cfg.c
  +++ b/gcc/tree-cfg.c
  @@ -1838,6 +1838,64 @@ replace_uses_by (tree name, tree val)
       }
   }

  +void
  +gimple_threaded_merge_blocks (basic_block a, basic_block b)

If we keep this function it will need a block comment. I say "if" for a couple reasons. First, we already have support routines that know how to merge blocks. If you really need to merge blocks you should try to use them. Second, I'm not sure that you really need to worry about block merging in this pass. Just create the duplicates, wire them into the CFG and let the existing block merging support handle this problem. The above routine is not merging but duplicates the join nodes into its predecessors. If I change the name of the above function to gimple_threaded_duplicating_join_node it should be fine. But you don't need to duplicate into the predecessors. If you create the duplicates and wire them into the CFG properly, the existing code in cfgcleanup should take care of this for you. Certainly I will do it.
  diff --git a/gcc/tree-ssa-threadedge.c b/gcc/tree-ssa-threadedge.c
  index 4303a18..2c7d36d 100644
  --- a/gcc/tree-ssa-threadedge.c
  +++ b/gcc/tree-ssa-threadedge.c
  @@ -1359,6 +1359,322 @@ thread_through_normal_block (edge e,
     return 0;
   }

  +static void
  +replace_threaded_uses (basic_block a, basic_block b)

If you keep this function, then it'll need a function comment. It looks like this is just doing const/copy propagation. I think a better structure is to implement your optimization as a distinct pass, then rely on existing passes such as update_ssa, DOM, CCP to handle updating the SSA graph and propagation opportunities exposed by your transformation. Similarly for the other replace_ functions. I think these replace_ functions are required, as the existing DOM, CCP and propagation opportunities don't perform the propagation shown below:

  <bb ...>:
  xk_124 = MIN_EXPR <...>;
  xc_126 = xc_121 - xk_6;
  xm_127 = xm_122 - xk_6;
  xy_128 = xy_123 - xk_6;
  *EritePtr_14 = xc_126;
  MEM[(Byte *)EritePtr_14 + 1B] = xm_127;
  MEM[(Byte *)EritePtr_14 + 2B] = xy_128;
  EritePtr_135 = &MEM[(void *)EritePtr_14 + 4B];
  MEM[(Byte *)EritePtr_14 + 3B] = xk_6;
  i_137 = i_4 + 1;
  goto <bb ...>;

  <bb ...>:
  xk_125 = MIN_EXPR <...>;
  xc_165 = xc_121 - xk_6;
  xm_166 = xm_122 - xk_6;
  xy_167 = xy_123 - xk_6;
  *EritePtr_14 = xc_126;
  MEM[(Byte *)EritePtr_14 + 1B] = xm_127;
  MEM[(Byte *)EritePtr_14 + 2B] = xy_128;
  EritePtr_171 = &MEM[(void *)EritePtr_14 + 4B];
  MEM[(Byte *)EritePtr_14 + 3B] = xk_6;
  i_173 = i_4 + 1;

These two blocks are the predecessors of the join node.
Re: GCC/JIT and precise garbage collection support?
On 07/10/2015 09:04 AM, Armin Rigo wrote: Hi David, On 10 July 2015 at 16:11, David Malcolm wrote: AIUI, we have CALL_INSN instructions all the way through the RTL phase of the backend, so we can identify which locations in the generated code are calls; presumably we'd need at each CALL_INSN to determine somehow which RTL expressions tagged as being GC-aware are live (perhaps a mixture of registers and fp-offset expressions?) So presumably we could use that information (maybe in the final pass) to write out some metadata describing for each %pc callsite the relevant GC roots. Armin: does this sound like what you need? Not quite. I can understand that you're trying to find some solution with automatic discovery of the live variables of a "GC pointer" type and so on. This is more than we need, and if we had that, then we'd need to work harder to remove the extra stuff. We only want the end result: attach to each CALL_INSN a list of variables which should be stored in the stack map for that call, and be ready to see these locations be modified from outside across the call if a GC occurs. I wonder how much overlap there is between this need and what we're going to need to do for resumable functions which are being discussed in the ISO C++ standards meetings. jeff
Re: Spurious parallel make failures in libgcc.
On 07/15/2015 08:33 AM, Andrew MacLeod wrote: Oh, wait, this isn't a scratch build this time, this is an incremental rebuild I just noticed (I was doing stuff on multiple machines; turns out the other one was a scratch build). make -j16 from the root build directory. And it may have happened because I was putzing around with a .awk file in gcc which required rebuilding... so maybe configure did require re-running somehow? Still seems to be a timing issue though. Maybe if gthr-default.h already existed (as well as config.status), the makefile would spawn the libgcov-interface.c object builds... meanwhile a reconfigure is going on which ends up overwriting gthr-default.h at what turns out to be a poor time? That sort of makes sense, I guess. I'm not sure how we synchronize the parallel bits. I always blow away the target directory (and stage2-* stage3-*) when I do an incremental bootstrap. If I had to guess, something is missing a dependency which allows compilation of libgcov-interface to start prior to configure re-running (since I believe it's configure that sets up gthr-default.h). I bet if you made libgcov-interface.o (and anything else which depends on gthr.h or gthr-default.h) depend on config.status, this problem would go away. I'm not sure what the real fix is, but it's got to be a missing dependency that allows libgcov-interface.c to build prior to configure being completed. jeff
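In Makefile terms, the missing edge Jeff describes would be something like this (the exact rule names in libgcc/Makefile.in are assumptions):

  # Sketch: nothing that includes gthr.h/gthr-default.h may start
  # compiling until configure has finished writing gthr-default.h.
  gthr-default.h: config.status
  libgcov-interface.o: gthr-default.h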
Re: ira.c update_equiv_regs patch causes gcc/testsuite/gcc.target/arm/pr43920-2.c regression
On 07/28/2015 12:18 PM, Alex Velenko wrote: On 21/04/15 06:27, Jeff Law wrote: On 04/20/2015 01:09 AM, Shiva Chen wrote: Hi, Jeff Thanks for your advice. can_replace_by.patch is the new patch to handle both cases. pr43920-2.c.244r.jump2.ori is the original jump2 rtl dump pr43920-2.c.244r.jump2.patch_can_replace_by is the jump2 rtl dump after patch can_replace_by.patch Could you help me to review the patch? Thanks. This looks pretty good. I expanded the comment for the new function a bit and renamed the function in an effort to clarify its purpose. From reviewing can_replace_by, it seems it should have been handling this case, but clearly wasn't due to implementation details. I then bootstrapped and regression tested the patch on x86_64-linux-gnu where it passed. I also instrumented that compiler to see how often this code triggers. During a bootstrap it triggers a couple hundred times (which is obviously a proxy for cross jumping improvements). So it's triggering regularly on x86_64, which is good. I also verified that this fixes BZ64916 for an arm-none-eabi toolchain configured with --with-arch=armv7. Installed on the trunk. No new testcase as it's covered by existing tests. Thanks, jeff Hi, I see this patch has been committed in r56 on trunk. Is it okay to port this to fsf-5? It's not a regression, so backporting it would be generally frowned upon. If you feel strongly about it, you should ask Jakub, Joseph or Richi (the release managers) for an exception to the general policy. jeff
Bin Cheng as Loop Induction Variable Optimizations maintainer
I am pleased to announce that the GCC Steering Committee has appointed Bin Cheng as the IVopts maintainer. Please join me in congratulating Bin on his new role. Bin, please update your entry in the MAINTAINERS file. I also believe you have some patches to self-approve :-) Thanks, Jeff
Re: Finding insns to reorder using dataflow
On 08/13/2015 05:06 AM, Kyrill Tkachov wrote: Hi all, I'm implementing a target-specific reorg pass, and one thing that I want to do is, for a given insn in the stream, to find an instruction in the stream that I can swap it with, without violating any dataflow dependencies. The candidate instruction could be earlier or later in the stream. I'm stuck on finding an approach to do this. It seems that using some of the dataflow infrastructure is the right way to go, but I can't figure out the details. can_move_insns_across looks relevant, but it looks too heavyweight with quite a lot of arguments. I suppose somehow constructing regions of interchangeable instructions would be the way to go, but I'm not sure how clean/cheap that would be outside the scheduler. Any ideas would be appreciated. I think you want all the dependency analysis done by the scheduler. Which leads to the question, can you model what you're trying to do in the various scheduler hooks -- in particular, walking through the ready list seems appropriate. jeff
Re: Finding insns to reorder using dataflow
On 08/14/2015 03:05 AM, Kyrill Tkachov wrote: The problem I'm trying to solve can be expressed in this way: "An insn that satisfies predicate pred_p (insn) cannot appear exactly N insns apart from another insn 'insn2' that satisfies pred_p (insn2). N is a constant". So, the problem here is that this restriction is not something expressed in terms of cycles or DFA states, but rather distance in the instruction stream. I wasn't really suggesting to model it in DFA states, but instead use the dependency analysis + hooks. The dependency analysis in particular tells you when it's safe to interchange two insns. Given the additional information, I think you'd want to note when an insn fires and satisfies pred_p, and associate a counter with each firing. The active counters are bumped (decremented?) at each firing (so you can track how many insns appear after the one that satisfied pred_p). Note that for insns which generate multiple assembly instructions, you need to decrement the counter by the number of assembly instructions they emit. Then when sorting the ready list, if you have an insn that satisfies pred_p and an active counter has just reached zero, make sure some other insn fires (what if there aren't any other ready insns? Is this a correctness or performance issue?) I don't think I can do this reliably during sched2 because there is still splitting that can be done that will create more insns that will invalidate any bookkeeping that I do there. Right. You need everything split and you need accurate insn length information for every insn in the backend that isn't split. If this is a correctness issue, then you also have to deal with final deleting insns behind your back as well. Many years ago I did something which required 100% accurate length information from the backend. It was painful, very very painful. Ultimately it didn't work out and the code was scrapped. However, during TARGET_MACHINE_DEPENDENT_REORG I can first split all insns and then call schedule_insns () to do another round of scheduling. However, I'm a bit confused by all the different scheduler hooks and when each one is called in relation to the other. You'll have to work through them -- I haven't kept close tabs on the various hooks we have, I just know we have them. I'd need to keep some kind of bitfield recording for the previous N instructions in the stream whether they satisfy pred_p. Where would I record that? Can I just do everything in TARGET_SCHED_REORDER? i.e. given a ready list, check that no pred_p insns in it appear N insns apart from another such insn (using my bitfield as a lookup helper), reorder insns as appropriate and then record the order of the pred_p insns in the bitfield. Would the scheduler respect the order of the insns that was set by TARGET_SCHED_REORDER and not do any further reordering? The problem I see is that once one of these insns fires, other new insns will be added to the ready list. So you have to keep some kind of state about how many instructions back one of these insns fired and consult that data when making a decision about the next instruction to fire. All this will fall apart if this is a correctness issue, since you'd have to issue a nop or somesuch. Though I guess you might be able to arrange to get a nop into the scheduled stream. If this is a correctness issue, tackling it in the assembler may make more sense. Jeff
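If the bookkeeping can live entirely in the ready-list hook, the shape might be something like the following sketch (pred_p, N and insns_since_pred_p are placeholders for the real target logic; the distance counter itself would have to be maintained from TARGET_SCHED_VARIABLE_ISSUE, with all the length caveats Jeff gives above):

  /* Sketch of a TARGET_SCHED_REORDER implementation.  ready[*n_readyp - 1]
     is the insn that issues next; if issuing it now would place two pred_p
     insns exactly N insns apart, swap a safe candidate into that slot.  */
  static int
  example_sched_reorder (FILE *file ATTRIBUTE_UNUSED,
                         int verbose ATTRIBUTE_UNUSED,
                         rtx_insn **ready, int *n_readyp,
                         int clock ATTRIBUTE_UNUSED)
  {
    int n = *n_readyp;

    if (n > 1
        && insns_since_pred_p () == N - 1
        && pred_p (ready[n - 1]))
      for (int i = n - 2; i >= 0; i--)
        if (!pred_p (ready[i]))
          {
            rtx_insn *tmp = ready[i];
            ready[i] = ready[n - 1];
            ready[n - 1] = tmp;
            break;
          }

    /* Issue one insn this cycle; a real port would return its issue rate.  */
    return 1;
  }

As Jeff says, this only helps for the performance interpretation; if no safe insn is ready, a correctness-driven port still has to get a nop into the stream from somewhere.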
Re: Deprecate SH5/SH64
On 08/18/2015 11:11 AM, David Edelsohn wrote: On Tue, Aug 18, 2015 at 1:00 PM, Oleg Endo wrote: Hi all, Kaz and I have been discussing the SH5/SH64 status, which is part of the SH port, every now and then. To our knowledge, there is no real hardware available as of today and we don't think there are any real users for a SH5/SH64 toolchain out there. Moreover, the SH5/SH64 parts of the SH port haven't been touched by anybody for a long time. The only exception is occasional ad-hoc fixes for bug reports from people who build GCC for every architecture that is listed in the Linux kernel. However, we don't actually know whether code compiled for SH5/SH64 still runs at an acceptable level since nobody has been doing any testing for that architecture for a while now. If there are no objections, we would like to deprecate SH5/SH64 support as of GCC 6. Initially this would include an announcement on the changes page and the removal of any documentation related to SH5/SH64. After GCC 6 we might start removing configure options and the respective code paths in the target. +1 Works for me based on what I've heard independently about sh5 hardware situation. Frankly, I think we should be more aggressive about this kind of port/variant pruning across the board. Jeff
Re: Question about "instruction merge" pass when optimizing for size
On 08/19/2015 02:38 PM, DJ Delorie wrote: I've seen this on other targets too, sometimes so bad I write a quick target-specific "stupid move optimizer" pass to clean it up. A generic pass would be much harder, but very useful. More important is to determine *why* we're getting these patterns. In the IRA/LRA world, they should be a lot less common. Jeff
Re: Question about "instruction merge" pass when optimizing for size
On 08/20/2015 01:07 AM, sa...@hederstierna.com wrote: From: Jeff Law More important is to determine *why* we're getting these patterns. In the IRA/LRA world, they should be a lot less common. Yes, I agree this phenomenon seems more common after introducing LRA. Though I was thinking that such a pass could still be relevant. Think hypothetically of an architecture, let's call it cortex-X, and assume this specific target has an opcode for ADD with 5 operands. Optimal code for a = a + b + c + d would be addx Ra,Ra,Rb,Rc,Rd. Where in the optimization process would we introduce the merging into this target-specific instruction? Can the more generic IRA/LRA handle this? Lots of passes could be involved. It's better to work with a real example on a real target for this kind of discussion. Assuming sensible three-address code comes out of the gimple with non-overlapping lifetimes, then I'd expect this to be primarily a combiner issue. And maybe patterns can appear across different BBs, or somewhere that the normal optimizers find hard to see or figure out? Sorry if I'm ignorant, I don't know the internals of the different optimizers, but I'm trying to learn and understand how to move forward on this code-size issue we have. (I tried to file some bugs on it as well: Bug 61578 and Bug 67213.) Unfortunately, 61578 has multiple testcases. Each should be its own bug that can be addressed and tracked individually. Peeking at the last testcase in c#19 is interesting. Presumably the typecasting is necessary to avoid doing multiple comparisons and the assumption is that the casts will be NOPs at the RTL level. That assumption seems to be fine through IRA. The allocation seems sane, except there's a reload needed for thumb1_addsi3_addgeu to ensure operand 1 and operand 0 match due to the matching constraint. That points to two issues. 1. Is IRA correctly tracking the need for those two operands to be the same and accounting for that in its cost model? 2. In the case where IRA still generates code that needs a reload, why was the old reload code able to eliminate the copy while LRA can't? 67213 is probably a costing issue somewhere. Since Richi is already involved, I'll let the two of you dig into the details. jeff
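For the hypothetical cortex-X case, the combiner would need a pattern to collapse into, along these lines (entirely invented, just to show where the merging would be described):

  ;; Hypothetical 5-operand add for the imaginary cortex-X target.
  ;; Given this pattern, combine can merge the chain of three-address
  ;; adds produced for a = a + b + c + d into a single addx insn.
  (define_insn "*addxsi4"
    [(set (match_operand:SI 0 "register_operand" "=r")
          (plus:SI
            (plus:SI
              (plus:SI (match_operand:SI 1 "register_operand" "r")
                       (match_operand:SI 2 "register_operand" "r"))
              (match_operand:SI 3 "register_operand" "r"))
            (match_operand:SI 4 "register_operand" "r")))]
    ""
    "addx\t%0, %1, %2, %3, %4")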
Re: Possible issue with using LAST_INSN_CODE
On 08/20/2015 02:54 AM, Claudiu Zissulescu wrote: Hi, LAST_INSN_CODE is used to mark the last instruction code valid for a particular architecture (e.g., for ARM the value of LAST_INSN_CODE is 3799). Also this code (i.e., 3799) is used by a predicated instruction (e.g., for ARM this code is used by the predicated version of arm_usatsihi => {*p arm_usatsihi}). However, the LAST_INSN_CODE macro is used by lra, recog and tree-vect-stmts to dimension various arrays, which may lead to various errors. For example, when calling preprocess_insn_constraints (recog.c:2444), the compilation may go berserk when evaluating the "if (this_target_recog->x_op_alt[icode])" line when icode is exactly LAST_INSN_CODE, as "this_target_recog->x_op_alt" is dimensioned up to LAST_INSN_CODE (recog.h:397). A possible solution is for the LAST_INSN_CODE value to be exactly the value returned by get_num_insn_codes() (gencodes.c:89). Alternatively, use LAST_INSN_CODE+1 when dimensioning an array. Please can someone confirm my observation, and what would be the best solution for this? It seems to me like something has been broken then. LAST_INSN_CODE is supposed to be higher than any insn defined by the backend. Jeff
Re: Possible issue with using LAST_INSN_CODE
On 08/20/2015 11:28 AM, Claudiu Zissulescu wrote: Hi Jeff, In gencodes.c:89, it explicitly decrements by one the return value of get_num_insn_codes(). While for get_num_insn_codes the following is stated: /* Return the number of possible INSN_CODEs. Only meaningful once the whole file has been processed. */ I can provide an example for the ARC port where it crashes due to the LAST_INSN_CODE issue. Probably it can be reproduced with another, more popular port like ARM. Passing along a test, even for the ARC, is useful. This is something Richard recently changed; it's probably just an oversight on his part. I believe he's in the UK and may be offline for the day. jeff
Re: Moving to git
On 08/20/2015 11:57 AM, Jason Merrill wrote: I hear that at Cauldron people were generally supportive of switching over to git as the primary GCC repository, and talked about me being involved in that transition. Does anyone have more information about this discussion? Our current workflow translates over to a git master pretty easily: basically, in the current git-svn workflow, replace "git svn rebase" and "git svn dcommit" with "git pull --rebase" and "git push". Right. It should be pretty straightforward to use the existing git mirror as the master repository; the main adjustment I'd want to make is rewriting the various subdirectory branches to be properly represented in git. This is straightforward, but we'll want to stop SVN commits to subdirectory branches shortly before the changeover. Seems reasonable. I think we also need to convert our SVN hooks into git hooks, but presumably that'll be easy. I suspect Jakub will strongly want to see some kind of commit hook to associate something similar to an SVN id with each git commit, to support his workflow where the SVN ids are associated with the compiler binaries he keeps around for very fast bisection. I think when we talked about it last year, he just needs an increasing # for each commit, presumably starting with whatever the last SVN ID is when we make the change. It would be good to have a more explicit policy on branch/tag creation, rebasing, and deletion in the git world, where branches are lighter weight and so more transient. Presumably for branch/tag creation the primary concern is the namespace? I think if we define a namespace folks can safely use without getting in the way of the release managers, we get most of what we need. ISTM that within that namespace, folks ought to have the freedom to use whatever works for them. If folks want to create a transient branch, push-rebase-push on that branch, then later remove it, I tend to think, why not let them. Do we want a namespace for branches which are perhaps not as transient in nature, ie longer term projects, projects on-ice or works-in-progress that we don't want to lose? As far as the trunk and release branches, are there any best practices out there that we can draw from? Obviously doing things like push-rebase-push is bad. Presumably there's others. jeff
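As one sketch of how such an increasing id could be derived for that workflow (an illustrative approach, not a decided design):

  # "Revision number" of the current commit: the count of commits
  # reachable from it.  Monotonically increasing on a non-rewritten trunk.
  git rev-list --count HEAD

  # Map such a number back to a commit hash:
  git rev-list --reverse HEAD | sed -n '123456p'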
Re: Moving to git
On 08/20/2015 02:09 PM, Jason Merrill wrote: On 08/20/2015 02:23 PM, Jeff Law wrote: I suspect Jakub will strongly want to see some kind of commit hook to associate something similar to an SVN id with each git commit, to support his workflow where the SVN ids are associated with the compiler binaries he keeps around for very fast bisection. I think when we talked about it last year, he just needs an increasing # for each commit, presumably starting with whatever the last SVN ID is when we make the change. Jakub: How about using git bisect instead, and identify the compiler binaries with the git commit sha1? That would seem to make reasonable sense to me. Jakub is on PTO, so we should re-engage on this tweak to his workflow when he returns. Jeff
Re: Moving to git
On 08/24/2015 02:17 AM, Jakub Jelinek wrote: On Thu, Aug 20, 2015 at 04:09:39PM -0400, Jason Merrill wrote: On 08/20/2015 02:23 PM, Jeff Law wrote: I suspect Jakub will strongly want to see some kind of commit hook to associate something similar to an SVN id with each git commit, to support his workflow where the SVN ids are associated with the compiler binaries he keeps around for very fast bisection. I think when we talked about it last year, he just needs an increasing # for each commit, presumably starting with whatever the last SVN ID is when we make the change. Jakub: How about using git bisect instead, and identify the compiler binaries with the git commit sha1? That is really not useful. While you speed up bisection somewhat by avoiding network traffic and communication with a server, there is still significant time spent on actually building the compiler. I thought the suggestion was to use the git hash to identify the builds you save. So you'd use git bisect merely to get the hash id. Once you've got the git hash, you can then use that to find the right cc1/cc1plus/f95 that you'd previously built. It's not perfect (since you can't just look at git hashes and know which one is newer). Jeff
Re: Moving to git
On 08/24/2015 09:43 AM, Jakub Jelinek wrote: On Mon, Aug 24, 2015 at 09:34:41AM -0600, Jeff Law wrote: On 08/24/2015 02:17 AM, Jakub Jelinek wrote: On Thu, Aug 20, 2015 at 04:09:39PM -0400, Jason Merrill wrote: On 08/20/2015 02:23 PM, Jeff Law wrote: I suspect Jakub will strongly want to see some kind of commit hook to associate something similar to an SVN id with each git commit, to support his workflow where the SVN ids are associated with the compiler binaries he keeps around for very fast bisection. I think when we talked about it last year, he just needs an increasing # for each commit, presumably starting with whatever the last SVN ID is when we make the change. Jakub: How about using git bisect instead, and identify the compiler binaries with the git commit sha1? That is really not useful. While you speed up bisection somewhat by avoiding network traffic and communication with a server, there is still significant time spent on actually building the compiler. I thought the suggestion was to use the git hash to identify the builds you save. So you'd use git bisect merely to get the hash id. Once you've got the git hash, you can then use that to find the right cc1/cc1plus/f95 that you'd previously built. It's not perfect (since you can't just look at git hashes and know which one is newer). But then you are forced to use git bisect all the time, because the hashes don't tell you anything. True. Most often even before writing a script I try a couple of compiler versions by hand if I have some extra info (this used to work 3 years ago, broke in the last couple of days, etc.). A map of key hashes would probably be helpful with this kind of thing. Major releases, key branch->trunk merge points and the like. It'd still be somewhat worse usability-wise for you, but it ought to be manageable. And like I said before, I'd support a git hook which bumped some kind of index at each commit for your workflow. Perhaps I could touch the cc1.sha1hash files with timestamps corresponding to the date/time of the commit, and keep them sorted in some file manager by timestamps; still it would be worse usability-wise. Not to mention we should keep the existing r123456 comments in bugzilla working, and I'm not convinced keeping an SVN version of the repository (frozen) for that purpose is the best idea. I'd like to keep the old ones working, but new references should probably be using the hash id and commit name. As for how to best keep the old r123456 links working, I don't know. Presumably those could be mapped behind the scenes to a git id. Jeff
Re: Offer of help with move to git
On 08/24/2015 01:46 PM, Frank Ch. Eigler wrote: Joseph Myers writes: [...] FWIW, Jason's own trial conversion with reposurgeon got up to at least 45GB memory consumption on a 32GB repository. (The host sourceware.org box has 72GB.) And if Jason really needs it, we've got considerably larger systems in our test farm that he could provision for this task. Jeff
Re: fake/abnormal/eh edge question
On 08/25/2015 12:39 PM, Steve Ellcey wrote: I have a question about FAKE, EH, and ABNORMAL edges. I am not sure I understand all the implications of each type of edge from the description in cfg-flags.def. I am trying to implement dynamic stack alignment for MIPS and I have code that does the following:

  prologue:
    copy incoming $sp to $12 (temp reg)
    align $sp
    copy $sp to $fp (after alignment so that $fp is also aligned)
  entry block:
    copy $12 to virtual reg (DRAP) for accessing args and for restoring $sp
  exit block:
    copy virtual reg (DRAP) back to $12
  epilogue:
    copy $12 to $sp to restore stack pointer

This works fine as long as there is a path from the entry block to the exit block, but in some cases (like gcc.dg/cleanup-8.c) we have a function that always calls abort (a non-returning function) and so there is no path from entry to exit; the exit block and epilogue get removed, and the copy of $sp to $12 also gets removed because GCC sees no uses of $12. I want to preserve the copy of $sp to $12 and I also want to preserve the .cfi pseudo-ops (and code) in the exit block and epilogue in order for exception handling to work correctly. One way I thought of doing this is to create an edge from the entry block to the exit block, but I am unsure of all the implications of creating a fake/eh/abnormal edge to do this and which I would want to use. Presumably it's the RTL DCE pass that's eliminating this stuff? Do you have the FRAME_RELATED bit set on those insns? But what I don't understand is why preserving the code is useful if it can't be reached. Maybe there's something about the dwarf2 unwinding that I simply don't understand -- I've managed to avoid learning about it for years. jeff
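For reference, the bit Jeff asks about gets set on the prologue insns roughly like this (a sketch; hard register 12 and Pmode stand in for the real MIPS details):

  /* In the prologue expander: mark the $sp -> $12 copy as frame-related
     so that a CFI note is generated for it and later passes treat it as
     part of the prologue.  */
  rtx_insn *insn = emit_move_insn (gen_rtx_REG (Pmode, 12),
                                   stack_pointer_rtx);
  RTX_FRAME_RELATED_P (insn) = 1;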
Re: fake/abnormal/eh edge question
On 08/25/2015 03:54 PM, Steve Ellcey wrote: On Tue, 2015-08-25 at 14:44 -0600, Jeff Law wrote: I want to preserve the copy of $sp to $12 and I also want to preserve the .cfi pseudo-ops (and code) in the exit block and epilogue in order for exception handling to work correctly. One way I thought of doing this is to create an edge from the entry block to the exit block, but I am unsure of all the implications of creating a fake/eh/abnormal edge to do this and which I would want to use. Presumably it's the RTL DCE pass that's eliminating this stuff? Actually, it looks like it is peephole2 that is eliminating the instructions (and .cfi pseudo-ops). Strange. I'm not sure why peep2 would be deleting those instructions, except perhaps as a side effect of a cfgcleanup or somesuch. that I simply don't understand -- I've managed to avoid learning about it for years. I am not entirely sure I need the code or if I just need the .cfi pseudo-ops and that I need the code to generate the .cfi stuff. I wish I could avoid the dwarf unwinder, but that seems to be the main problem I am having with stack realignment. Getting the cfi stuff right so that the unwinder works properly is proving very hard. Yea, unfortunately I can't help much there. I see dwarf-anything and my eyes just glaze over and I thank the powers that be that Jakub, Jason and others are around to handle that stuff. jeff
Re: 33 unknowns left
On 08/26/2015 01:31 PM, Eric S. Raymond wrote:

  mib = mib        Michael Bushnell. Again, not active in forever. m...@geech.gnu.ai.mit.edu probably doesn't work anymore.
  miles = miles    Miles Bader. mi...@gnu.ai.mit.edu
  mkoch = mkoch    Michael Koch? konque...@gmx.de/
  moore = moore    Catherine, Tim?
  mycroft = mycroft    Charles Hannum. Hasn't been active in forever. mycr...@gnu.ai.mit.edu probably doesn't work anymore.

Might help if we had a reference to one or more changes from the folks. Just knowing timeframes, for example, would likely resolve them. Jeff
Re: 33 unknowns left
On 08/26/2015 02:09 PM, Eric S. Raymond wrote: Jeff Law : On 08/26/2015 01:31 PM, Eric S. Raymond wrote: mib = mib Michael Bushnell. Aagain, not active in forever. m...@geech.gnu.ai.mit.edu probably doesn't work anymore. miles = miles Miles Bader. mi...@gnu.ai.mit.edu mycroft = mycroft Charles Hannum. Hasn't been active in forever. mycr...@gnu.ai.mit.edu probably doesn't work anymore. Right, I recognize these people as long-time hard-core GNU contributors. It would be a bit surprising if they *weren't* in the history anywhere. Adding them now... That's why those 3 popped out at me. moore = moore Catherine, Tim? The more I think about it, it's more likely Tim. Catherine typically used clm@ and Tim used moore@. Certainly if it was a change to the PA port, then it was Tim. Jeff
Re: 33 unknowns left
On 08/26/2015 02:35 PM, Eric S. Raymond wrote: Joseph Myers : On Wed, 26 Aug 2015, Eric S. Raymond wrote: After comparing with the Subversion history and passwd file, there are 30 unknowns left. Can anyone identify any of these?

  aluchko = aluchko    Aaron Luchko

Aha. I thought that was him. I found his SourceForge account.

  bo = bo    Bo Thorsen

Oh, thank you. I had *no* idea how I was going to pin down that one. It's an unpromising string for web searches.

  ira = ira      Ira Ruben
  irar = irar    Ira Rosen

I pretty much knew these two guys went with these two names, but couldn't figure out which was which. Thanks. [others omitted] All of the above are emails from the time of some commits, not necessarily current. That's OK. Addresses will go stale. The important thing is to preserve as good odds as possible that future data mining will be able to recognize when different name/address pairs with the same name-among-humans refer to the same person. The remaining list is pretty short:

  bson = bson
  fx = fx

fx is active... Francois-Xavier Coudert fxcoud...@gcc.gnu.org. Not sure how I missed that the first time around.
Re: 33 unknowns left
On 08/26/2015 02:44 PM, Ian Lance Taylor wrote:

  friedman = friedman    Noah Friedman (was ).

Yea.

  fx = fx    Dave Love (was ).

Hmm, not Francois-Xavier Coudert? I guess if it's an old commit, then Dave Love is more likely. Given most of the names we're pulling out are from old contributors, Dave is probably the most likely.

  hassey = hassey    John Hassey (was ).

Yes.

  jrv = jrv    James Van Artsdalen (was ).

Yes.

  karl = karl    Karl Berry (was k...@cs.umb.edu).

Yes.

  moore = moore    Timothy Moore (was ). (Catherine Moore is clm).

I think the consensus is Tim. He'll also be moore@*.cs.utah.edu

  wood = wood    Tom Wood (was ).

Yes. jeff
Re: 33 unknowns left
On 08/26/2015 02:50 PM, Eric S. Raymond wrote: Jeff Law : moore = moore Catherine, Tim? The more I think about it, it's more likely Tim. Catherine typically used clm@ and Tim used moore@. Certainly if it was a change to the PA port, then it was Tim. What was his address? Most were @cs.utah.edu, but he left Utah eons ago. He was @redhat.com for a while, not sure after that. About Catherine Moore: One of the things that is quite noticeable about this list is the number of apparently female contributors, which while regrettably small in absolute terms is still rather more than I'm used to seeing in a sample this size. This makes me curious. Was any special effort made to attract female hackers to the project? Is there a prevailing theory about why they showed up in comparatively large numbers? It would be interesting to know what, if any specific thing, was done right here... I can't recall any special effort. Just always trying to encourage anyone to contribute in whatever way they could. jeff
Re: 33 unknowns left
On 08/26/2015 02:54 PM, Joseph Myers wrote: click = click Nick Clifton Wow, never knew 'click' would be Nick. ni...@redhat.com is probably better than ni...@cygnus.com
Re: 33 unknowns left
On 08/26/2015 06:02 PM, Peter Bergner wrote: On Wed, 2015-08-26 at 13:44 -0700, Ian Lance Taylor wrote: On Wed, Aug 26, 2015 at 12:31 PM, Eric S. Raymond wrote: click = click You've got me on that one. Any hints? Just purely looking at the name, did Cliff Click ever contribute to gcc in the past? I don't think so. It was my first thought when I saw click@. jeff
Re: 33 unknowns left
On 08/26/2015 07:37 PM, Joel Sherrill wrote: On August 26, 2015 8:28:40 PM CDT, Jeff Law wrote: On 08/26/2015 06:02 PM, Peter Bergner wrote: On Wed, 2015-08-26 at 13:44 -0700, Ian Lance Taylor wrote: On Wed, Aug 26, 2015 at 12:31 PM, Eric S. Raymond wrote: click = click You've got me on that one. Any hints? Just purely looking at the name, did Cliff Click ever contribute to gcc in the past? I don't think so. It was my first thought when I saw click@. Didn't Amazon get a patent on the one click@? I recall something like that. Seriously the email review has been a walk down memory lane. :) Very much so. Many of those folks pre-date my involvement in GCC. Jeff
Re: Repository for the conversion machinery
On 08/27/2015 10:16 AM, Eric S. Raymond wrote: Paulo Matos : On 27/08/15 16:56, Paulo Matos wrote: I noticed I am not on the list (check commit r225509, user pmatos) either. And thanks for your help on this transition. r188804 | mkuvyrkov Maxim Kuvyrkov jeff
Re: Repository for the conversion machinery
On 08/27/2015 10:04 AM, FX wrote: If the former, then I don't know why they're not in the map. In fact, I can look at the output of “svn log” for the MAINTAINERS file, which probably almost everyone with commit rights has modified. This contains 442 usernames, compared to the map’s 290. And there are probably more, which we’ll miss if we have to rely on manual modifications of that list… How was the map generated? FX PS: I found one username that first escaped my scripts because it contained a period, so I am raising a flag here, so the same doesn’t happen to you: m.hayes (commit 34779). Michael Hayes? Jeff
Re: Ambiguous usernames
On 08/27/2015 11:27 AM, Eric S. Raymond wrote: I'm pretty sure I know who bothner, brendan, drepper, eggert, ian, jimb, meissner, and roland are; they've all had stable handles longer than GCC has existed. Yup. (Raise a glass to Brendan Kehoe; he was a fine hacker and a good man and it's a damn shame we lost him.) Absolutely. Scrutiny should therefore fall particularly on amylaar, bje, bkoz, dje, gavin, kenner, krab, law, meyering, mrs, raeburn, shebs, and wilson. What do you need here? I can confirm that each of those handles corresponds to one specific individual person, with the exception of dje (which we know is Doug Evans and David Edelsohn) and krab, which I don't know the history behind. Jeff
Re: Repository for the conversion machinery
On 08/28/2015 09:26 AM, Joseph Myers wrote: All the cygnus.com addresses are out of date. More current replacements for a few: echristo = Eric Christopher merrill = Jason Merrill (if someone appears with multiple usernames, probably make their address consistent for all of them unless specifically requested otherwise) rsavoye = Rob Savoye Given that I worked for Cygnus and still work with Red Hat, I can make a pass over all the @cygnus.com addresses and probably give something more up-to-date for most of them if that's useful. jeff
Re: Predictive commoning leads to register to register moves through memory.
On 08/28/2015 09:43 AM, Simon Dardis wrote: Following Jeff's advice[1] to extract more information from GCC, I've narrowed the cause down to the predictive commoning pass inserting the load in a loop header style basic block. However, the next pass in GCC, tree-cunroll, promptly removes the loop and joins the loop header to the body of the (non)loop. More oddly, disabling the conditional store elimination pass, or the dominator optimizations pass, or disabling jump-threading with --param max-jump-thread-duplication-stmts=0, nets the above assembly code. Any ideas on an approach for this issue? I'd probably start by looking at the .optimized tree dump in both cases to understand the difference, then (most likely) tracing that through the RTL optimizers into the register allocator. jeff
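Concretely, the comparison Jeff suggests can be driven from dumps like these (all real GCC options; the file name is illustrative):

  # GIMPLE as it leaves the tree optimizers, with and without the pass:
  gcc -O3 -fdump-tree-optimized test.c
  gcc -O3 -fno-predictive-commoning -fdump-tree-optimized test.c

  # Then follow the surviving difference through the RTL passes:
  gcc -O3 -fdump-rtl-all test.c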
Re: Repository for the conversion machinery
On 08/28/2015 09:57 AM, Eric S. Raymond wrote: Jeff Law : Given that I worked for Cygnus and still work with Red Hat, I can make a pass over all the @cygnus.com addresses and probably give something more up-to-date for most of them if that's useful. That would be *very* useful. Here's my stab at all the @cygnus.com and @redhat.com addresses. There's several I lost track of through the years.

  bill = Bill Cox.  Retired.  Not sure of his email address.
  billm = Bill Moyer.  Now at Sonic.net.  bi...@ciar.org
  noer = Geoffrey Noer.  Now at Panasas.  I don't have an email address though.
  raeburn = Ken Raeburn.  Now at Permabit.  raeb...@permabit.com
  abalkiss = Anthony Balkissoon.  No clue on this one, not with Red Hat anymore.
  aluchko = Aaron Luchko.  Grad student at University of Alberta.
  bbooth = Brian Booth.  Student at Simon Fraser University.  b...@sfu.ca
  djee = David Jee.  No clue.  Not with Red Hat anymore.
  fitzsim = Thomas Fitzsimmons.  No clue.  Not with Red Hat anymore.
  hiller = Matthew Hiller.  No clue.  Not with Red Hat anymore.
  jknaggs = Jeff Knaggs.  Not with Red Hat anymore.
  spolk = Syd Polk.  Now at Mozilla.  sydp...@gmail.com
  apbianco = Alexandre Petit-Bianco.  Now at Google.  apbia...@serialhacker.org
  bkoz = Benjamin Kosnik.  b...@gnu.org
  cchavva = Chandra Chavva.  Cavium Networks.  ccha...@caviumnetworks.com
  clm = Catherine Moore.  CodeSourcery.  c...@codesourcery.com
  graydon = Graydon Hoare.  Now at Stellar Development.  I don't know his email address.
  jimb = Jim Blandy.  Now at Mozilla I believe.  j...@red-bean.com
  kgallowa = Kyle Galloway.  He's at Twitter now, but I don't have an email address.
  rsandifo = Richard Sandiford.  ARM.  rdsandif...@googlemail.com
  tromey = Tom Tromey.  Now at Mozilla I believe.  t...@tromey.com
  cagney = Andrew Cagney.  No longer with Red Hat.  Not sure where he is now.
  chastain = Michael Chastain.  Not with Red Hat anymore.  Spent some time at Google, but I don't think he's there anymore either.
  dlindsay = Don Lindsay.  Not with Red Hat anymore.  Was at Cisco for a period of time (linds...@cisco.com).  Not sure if he's still there.
  trix = Tom Rix.  Not with Red Hat anymore.  No idea where he is now.

All these are still current:

  aldyh = Aldy Hernandez
  amacleod = Andrew Macleod
  aoliva = Alexandre Oliva
  aph = Andrew Haley
  brolley = Dave Brolley
  carlos = Carlos O'Donell
  click = Nick Clifton
  davem = David S. Miller
  dj = DJ Delorie
  dmalcolm = David Malcolm
  fche = Frank Ch. Eigler
  fnasser = Fernando Nasser
  gary = Gary Benson
  gavin = Gavin Romig-Koch
  green = Anthony Green
  jakub = Jakub Jelinek
  jason = Jason Merrill
  jkratoch = Jan Kratochvil
  kevinb = Kevin Buettner
  kseitz = Keith Seitz
  ktietz = Kai Tietz
  law = Jeff Law
  merrill = Jason Merrill
  mpolacek = Marek Polacek
  nickc = Nick Clifton
  oliva = Alexandre Oliva
  palves = Pedro Alves
  pmuldoon = Phil Muldoon
  rth = Richard Henderson
  scox = Stan Cox
  tiemann = Michael Tiemann
  torvald = Torvald Riegel
  vmakarov = Vladimir Makarov
  wcohen = William Cohen
Re: Offer of help with move to git
On 08/27/2015 10:13 PM, Eric S. Raymond wrote: I'd like to use the --legacy flag so that old references to SVN commits are easier to look up. Your call, but ... I don't recommend it. It's very cluttery, and I've found the demand for that kind of lookup tends to drop off after conversion faster than people expect it will. I suspect we do this with more regularity than most projects. Hell, I regularly wish we had all the emacs backup files from the old MIT machines (they were unfortunately purged regularly to make space). Jeff
Re: Repository for the conversion machinery
On 08/28/2015 12:29 PM, Eric S. Raymond wrote: Jeff Law : Here's my stab at all the @cygnus.com and @redhat.com addresses. There's several I lost track of through the years. Would you please resend this as a contrib map with the updated addresses in it? I find that when I hand-edit these in I make too many cut'n'paste errors. Given a contrib map, repomapper -u can do it all in one go. If you don't know a current address for the person, we'll just leave the redhat one in place - best we can do. Will do, but won't get to it today. Monday most likely. jeff
Re: reload question about unmet constraints
On 09/01/2015 01:44 AM, DJ Delorie wrote: Given this test case for rl78-elf:

  extern __far int a, b;
  void ffr (int x)
  {
    a = b + x;
  }

I'm trying to use this patch:

  Index: gcc/config/rl78/rl78-virt.md
  ===
  --- gcc/config/rl78/rl78-virt.md  (revision 227360)
  +++ gcc/config/rl78/rl78-virt.md  (working copy)
  @@ -92,15 +92,15 @@
     ]
     "rl78_virt_insns_ok ()"
     "v.inc\t%0, %1, %2"
   )

   (define_insn "*add<mode>3_virt"
  -  [(set (match_operand:QHI 0 "rl78_nonfar_nonimm_operand" "=vY,S")
  -        (plus:QHI (match_operand:QHI 1 "rl78_nonfar_operand" "viY,0")
  -                  (match_operand:QHI 2 "rl78_general_operand" "vim,i")))
  +  [(set (match_operand:QHI 0 "rl78_nonimmediate_operand" "=vY,S,Wfr")
  +        (plus:QHI (match_operand:QHI 1 "rl78_general_operand" "viY,0,0")
  +                  (match_operand:QHI 2 "rl78_general_operand" "vim,i,vi")))
     ]
     "rl78_virt_insns_ok ()"
     "v.add\t%0, %1, %2"
   )

   (define_insn "*sub<mode>3_virt"

to allow the rl78 port to generate the "Wfr/0/r" case (alternative 3). (Wfr = far MEM, v = virtual regs). I expected gcc to see that the operation doesn't meet the constraints, and move operands into registers to make it work (alternative 1, "v/v/v"). That'd be my expectation as well. Note that addXX patterns may be special. I can recall a fair amount of pain with them on oddball ports. Instead, it just complains and dies:

  dj.c:42:1: error: insn does not satisfy its constraints:
   }
   ^
  (insn 10 15 13 2 (set (mem/c:HI (reg:SI 8 r8) [1 a+0 S2 A16 AS2])
          (plus:HI (mem/c:HI (plus:HI (reg/f:HI 32 sp)
                      (const_int 4 [0x4])) [1 x+0 S2 A16])
              (mem/c:HI (symbol_ref:SI ("b") ) [1 b+0 S2 A16 AS2]))) dj.c:41 13 {*addhi3_virt}
       (nil))
  dj.c:42:1: internal compiler error: in extract_constrain_insn, at recog.c:2200

  Reloads for insn # 10
  Reload 0: reload_in (SI) = (symbol_ref:SI ("a") )
      V_REGS, RELOAD_FOR_INPUT (opnum = 0), inc by 2
      reload_in_reg: (symbol_ref:SI ("a") )
      reload_reg_rtx: (reg:SI 8 r8)
  Reload 1: reload_in (HI) = (mem/c:HI (plus:HI (reg/f:HI 32 sp)
                                 (const_int 4 [0x4])) [2 x+0 S2 A16])
      reload_out (HI) = (mem/c:HI (plus:HI (reg/f:HI 32 sp)
                                 (const_int 4 [0x4])) [2 x+0 S2 A16])
      V_REGS, RELOAD_OTHER (opnum = 1), optional
      reload_in_reg: (mem/c:HI (plus:HI (reg/f:HI 32 sp)
                                 (const_int 4 [0x4])) [2 x+0 S2 A16])
      reload_out_reg: (mem/c:HI (plus:HI (reg/f:HI 32 sp)
                                 (const_int 4 [0x4])) [2 x+0 S2 A16])
  Reload 2: reload_in (HI) = (mem/c:HI (symbol_ref:SI ("b") ) [2 b+0 S2 A16 AS2])
      V_REGS, RELOAD_FOR_INPUT (opnum = 2), optional
      reload_in_reg: (mem/c:HI (symbol_ref:SI ("b") ) [2 b+0 S2 A16 AS2])

Note that reload 1 and reload 2 do not have a reload_reg_rtx. My memories of reload are fading fast (thank goodness), but I believe that's an indication that it's not reloading into a hard register. So I'd start with looking at find_reloads/push_reload and figure out why it's not getting a suitable register. It might be good to know what alternative is being targeted by reload, ie, you'll be looking at goal_alternative* in find_reloads. Again, my memories are getting stale here, so double-check the meaning of reload_reg_rtx ;-) jeff
Re: incremental compiler project
On 09/03/2015 10:36 AM, Manuel López-Ibáñez wrote: On 02/09/15 22:44, David Kunsman wrote: Hello, I just read over the incremental compiler project on the GCC wiki, and I am excited to try to finish it. I am just wondering whether it is even wanted anymore, because it is 7-8 years old. Does anybody know if this project is still wanted? The overall goal of the project is worthwhile; however, it is unclear whether the approach envisioned in the wiki page will lead to the desired benefits. See http://tromey.com/blog/?p=420 which is the last status report that I am aware of. In addition, the implementation itself would be incredibly challenging even for someone with many years of experience in GCC. Agreed. I think the Google project went further, but with Lawrence retiring, I think it's been abandoned. Jeff
Re: incremental compiler project
On 09/04/2015 09:40 AM, David Kunsman wrote: What do you think about the sub-project in the wiki: Parallel Compilation: One approach is to make the front end multi-threaded. (I've pretty much abandoned this idea. There are too many mutable tree fields, making this a difficult project. Also, threads do not interact well with fork, which is currently needed by the code generation approach.) You should get in contact with David Malcolm, as these issues are directly related to his JIT work. This will entail removing most global variables, marking some with __thread, and wrapping a few with locks. Yes, but that's work that is already in progress. Right now David's got a big lock and context switch in place, but we really want to drive down the amount of stuff in that context switch. Jeff
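[Editor's sketch] To make the "__thread some globals, lock others" idea concrete, here is a minimal sketch; the global and its type are invented for illustration and are not actual GCC variables:

    #include <pthread.h>

    struct symtab;   /* hypothetical compiler-wide state, for illustration */

    /* Before: one mutable global shared by everything.
       static struct symtab *the_symtab;  */

    /* Option 1: give each compilation thread its own copy.  */
    static __thread struct symtab *per_thread_symtab;

    /* Option 2: keep one shared copy, but serialize access with a lock.  */
    static struct symtab *shared_symtab;
    static pthread_mutex_t symtab_lock = PTHREAD_MUTEX_INITIALIZER;

    static void
    with_shared_symtab (void (*fn) (struct symtab *))
    {
      pthread_mutex_lock (&symtab_lock);
      fn (shared_symtab);
      pthread_mutex_unlock (&symtab_lock);
    }

The "big lock" mentioned above corresponds to option 2 applied coarsely to everything at once; driving down the amount of state behind that lock means migrating globals toward option 1 or eliminating them.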
Re: incremental compiler project
On 09/04/2015 10:14 AM, Jonathan Wakely wrote: On 4 September 2015 at 16:57, Manuel López-Ibáñez wrote: Clang++ is much faster, yet it is doing more and tracking more data than cc1plus. How much faster these days? In my experience, for optimized builds of large files the difference is not so impressive (for unoptimized builds clang is definitely much faster). That would generally indicate that the front end and mandatory parts of the middle/back end are slow for GCC (relative to clang/llvm), but the optimizers in GCC are faster. That wouldn't be a huge surprise given how much time has been spent trying to keep the optimizers fast. jeff
Re: How to allocate memory safely in RTL, preferably on the stack? (relating to the RTL-level if-converter)
On 09/08/2015 12:05 PM, Abe wrote: Dear all, In order to be able to implement this idea for stores, I think I need to make some changes to the RTL if-converter such that it will sometimes add -- to the code being compiled -- a new slot/variable in the stack frame. This memory needs to be addressable via a pointer in the code being generated, so AFAIK just allocating a new pseudo-register won't work, and AFAIK using an RTL "scratch" register also won't work. I also want to do my best to ensure that this memory is thread-local. For those reasons, I'm asking about the stack. Look at assign_stack_local. Jeff
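[Editor's sketch] For what it's worth, a minimal sketch of using it from inside an RTL pass, assuming a GCC 5-era tree: assign_stack_local lives in function.c, its third argument is the requested alignment in bits, and 0 means "use the mode's natural alignment" (double-check the function's block comment in your tree). SImode is purely illustrative:

    /* Grab a fresh slot in the current function's frame.  The result is
       a MEM whose address is frame-relative, hence per-invocation -- and
       so thread-local in the usual stack sense.  */
    rtx slot = assign_stack_local (SImode, GET_MODE_SIZE (SImode), 0);

    /* Materialize a pointer to the slot so the generated code can
       address it.  force_reg emits insns, so this must run in a context
       that is allowed to emit them.  */
    rtx addr = force_reg (Pmode, XEXP (slot, 0));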
Re: Combined top-down and bottom-up instruction scheduler
On 09/08/2015 12:39 PM, Aditya K wrote: IIUC, in haifa-sched.c the default scheduling algorithm seems to be top-down (before reload). Is there a way to schedule the other way (bottom-up), or both ways? Not that I'm aware of. Note that region scheduling allows insns to move between basic blocks to help fill the bubbles that can occur at the end of a block. As a use case for bottom-up or some other heuristic: currently, the first priority in the selection is given to the longest path, which in some cases may produce code with stalls at the end of the basic block, whereas with combined top-down + bottom-up scheduling we would end up having stalls in the middle of the basic block. GCC's original scheduler worked bottom-up until ~1997, when IBM Haifa's work turned it into a top-down model; that was a small but clear improvement. There are certainly better things that can be done than strictly top-down or bottom-up, but revamping the scheduler again hasn't been seen as a major win for the most common processors GCC targets these days, so it hasn't been a significant area of focus. Jeff
Re: Why scheduler do not re-emit REG_DEAD notes?
On 09/07/2015 10:05 AM, Konstantin Vladimirov wrote: Hi, In a private backend for GCC 5.2.0, we have a target-specific scheduler (running in the TARGET_SCHED_FINISH hook) that does some instruction packing/pairing on sched2 and relies on REG_DEAD notes, which should be correct. But they aren't, because inside haifa-sched.c, which is run first in the sched2 pass, the reemit_notes function processes only the REG_SAVE_NOTE case; after this scheduler runs, some insns with REG_DEAD on a register, say r1, might be moved before a previous r1 usage (the input-dependency case) and things become totally wrong. I have applied a minimal patch locally to fix it (just added a separate REG_DEAD case). But maybe it is part of the design, and maybe it is generally true that we can't rely on correct REG_DEAD notes in a platform-specific scheduler? You cannot rely on death notes within the scheduler. That's been part of its design as long as I can remember (circa 1992). jeff
Re: Combined top-down and bottom-up instruction scheduler
On 09/08/2015 01:40 PM, Vladimir Makarov wrote: As I remember, it was written by Mike Tiemann. Correct. A bottom-up scheduler as a rule generates worse code than a top-down one. Indeed, that was one of the key things we were looking to get from the Haifa scheduler, along with improved superscalar support and some support for region scheduling & speculation. Yes, that is true for OOO-execution processors, which can rearrange insns and execute them speculatively, looking through several branches. For such processors, software pipelining is more important, as the processors can look only through a few branches while software pipelining can look through any number of branches. That is why the Intel compiler did not have any insn scheduler (but did have software pipelining) until the introduction of the Intel Atom, which was originally an in-order processor. Correct. Latency scheduling just isn't that important for OOO; instead you look at scheduling to mitigate the costs of large-latency operations (i.e., cache misses and transcendental functions). You might also attack secondary issues, like throughput at the retirement stage, for example. Actually, I believe dealing with the variable/unknown latency of load insns (depending on where data are placed in a cache or memory) would be more important than a bottom-up or hybrid scheduler. Agreed. This is in line with what the HP guys were seeing as they transitioned to the PA8000. Balanced scheduling dealing with this problem was implemented by Alexander Monakov about 7-8 years ago as a Google internship project, but it was not included, as at that time its advantages were not confirmed on SPEC2000. It would be interesting to reconsider and re-evaluate it on modern processors and scientific benchmarks with big data. Agreed. For in-order processors, we also have another scheduler (the selective one) which does additional transformations (like register renaming and non-modulo software pipelining) that could be more important than top-down/bottom-up scheduling. It gave a 1-2% improvement on Itanium SPEC2000 in comparison with the Haifa scheduler. Right. Jeff
Re: Combined top-down and bottom-up instruction scheduler
On 09/08/2015 01:24 PM, Aditya K wrote: [mail headers and the exchange quoted verbatim from the previous message snipped] Do you have pointers on places to look at if I want to explore bottom-up, or maybe a combination of the two? Not immediately handy. I'd comb through PLDI proceedings from the 1990s and early 2000s, and possibly Morgan's compiler book. jeff
Re: Combined top-down and bottom-up instruction scheduler
On 09/08/2015 03:12 PM, Evandro Menezes wrote: cache misses and transcendental functions). You might also attack secondary issues like throughput at the retirement stage, for example. Our motivation stems from the fact that even modern, aggressively OOO processors don't have orthogonal resources. Some insns depend on expensive circuitry (area- or power-wise) that is added only once, making such insns effectively scalar, though most other insns enjoy multiple resources capable of executing them superscalar-fashion. That's why we believe that a hybrid approach might yield good results. We don't have data, as it possibly requires implementing it first. I'd also argue that looking at an OOO pipeline in a steady state is not the only approach. It's also important to consider how quickly the pipeline can be replenished or warmed up to reach a steady state. Which is why I mentioned optimizing for throughput at the retirement stage rather than traditional latency scheduling. That's from a real-world case -- the PA8000, where retirement bandwidth was at a premium (relative to functional-unit bandwidth). jeff
Re: Advertisement in the GCC mirrors list
On 09/09/2015 10:41 AM, Jonathan Wakely wrote: Gerald, I think we've had similar issues with these mirrors in the past as well; shall we just remove them from the list? Please do. jeff
Re: Ubsan build of GCC 6.0 fails with: cp/search.c:1011:41: error: 'otype' may be used uninitialized in this function
On 09/09/2015 01:17 PM, Martin Sebor wrote: On 09/09/2015 12:36 PM, Toon Moene wrote: See: https://gcc.gnu.org/ml/gcc-testresults/2015-09/msg00699.html Full error message:

    /home/toon/compilers/trunk/gcc/cp/search.c: In function 'int accessible_p(tree, tree, bool)':
    /home/toon/compilers/trunk/gcc/cp/search.c:1011:41: error: 'otype' may be used uninitialized in this function [-Werror=maybe-uninitialized]
       dfs_accessible_data d = { decl, otype };
                                             ^

Any ideas? It looks as though GCC assumes that TYPE can be null even though it can't (if it were, TYPE_P (type) would then dereference a null pointer). As a workaround until this is fixed, initializing OTYPE with type instead of in the else block should get rid of the error. Here's a small test case that reproduces the bogus warning:

    cat t.c && /build/gcc-trunk/gcc/xg++ -B /build/gcc-trunk/gcc -Wmaybe-uninitialized -O2 -c -fsanitize=undefined t.c

    struct S { struct S *next; int i; };

    int foo (struct S *s)
    {
      int i;
      if (s->i)
        {
          struct S *p;
          for (p = s; p; p = p->next)
            i = p->i;
        }
      else
        i = 0;
      return i;
    }

    t.c: In function ‘int foo(S*)’:
    t.c:14:12: warning: ‘i’ may be used uninitialized in this function [-Wmaybe-uninitialized]
       return i;

More likely than not, the sanitization bits get in the way of VRP + jump threading rotating the loop. jeff
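[Editor's sketch] For the record, the shape of the workaround Martin describes (initialize at the declaration rather than in the else block), transplanted onto the reduced test case; a sketch only, since the real fix belongs in accessible_p in cp/search.c:

    /* Reuses struct S from the test case above.  With i initialized at
       its declaration, the else arm becomes redundant and the
       -Wmaybe-uninitialized warning should no longer trigger.  */
    int foo (struct S *s)
    {
      int i = 0;
      if (s->i)
        {
          struct S *p;
          for (p = s; p; p = p->next)
            i = p->i;
        }
      return i;
    }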
Re: Using the asm suffix
On 09/07/2015 06:56 PM, David Wohlferd wrote: In order for the doc maintainers to approve this patch, I need to have someone sign off on the technical accuracy. Now that I have included the points we have discussed (attached), hopefully we are there. Original text: https://gcc.gnu.org/onlinedocs/gcc/Asm-Labels.html Proposed text: http://limegreensocks.com/gcc/Asm-Labels.html Still pending is the line I removed about 'static variables in registers' that belongs in the Reg Vars section. I have additional changes I want to make to the Reg Vars section, so once this patch is accepted, I'll post that work. dw

    AsmLabels4.patch
    Index: extend.texi
    ===================================================================
    --- extend.texi    (revision 226751)
    +++ extend.texi    (working copy)

OK. Please install. jeff
Re: How to allocate memory safely in RTL, preferably on the stack? (relating to the RTL-level if-converter)
On 09/10/2015 12:28 PM, Abe wrote: On 9/8/15 1:12 PM, Jeff Law wrote: Look at assign_stack_local. Thanks very much! The above was very helpful, and I have started to make some progress on this work. I'll report back when I have much more progress. Would you like me to CC further emails about this work to "l...@redhat.com"? The list is probably more appropriate. I suspect Bernd will probably want to get involved as well now that he's getting back into the swing of things. Jeff
Re: Replacing malloc with alloca.
On 09/13/2015 12:28 PM, Florian Weimer wrote: * Ajit Kumar Agarwal: The replacement of malloc with alloca can be done based on the following analysis: if the lifetime of an object does not stretch beyond its immediate scope, the malloc can be replaced with alloca. This increases performance to a great extent. You also need to make sure that the object is small (less than a page) and that there is no deep recursion going on. Otherwise, the program may no longer work after the transformation with real-world restricted stack sizes. It may even end up with additional security issues. You also have to make sure you're not inside a loop; even a small allocation inside a loop is problematic from a security standpoint. You also need to look at what other objects might be on the stack, and you have to look at the function scope, not the immediate scope, as alloca space isn't returned until the end of a function. jeff
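[Editor's sketch] To illustrate the loop hazard (invented code, not from the thread): every alloca in the loop body stays live until the function returns, so the allocations pile up:

    #include <alloca.h>
    #include <string.h>

    void
    copy_all (const char **strs, int n)
    {
      for (int i = 0; i < n; i++)
        {
          /* alloca space is reclaimed at function exit, not at the end
             of the loop body, so n iterations leave n live allocations
             on the stack.  With attacker-influenced n or string lengths,
             this is an easy way to blow the stack.  */
          char *copy = alloca (strlen (strs[i]) + 1);
          strcpy (copy, strs[i]);
          /* ... use copy ... */
        }
    }

A malloc/free pair with the free at the bottom of the loop body has no such accumulation, which is one reason the transformation is unsafe inside loops.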
Re: Replacing malloc with alloca.
On 09/14/2015 02:14 AM, Richard Earnshaw wrote: On 13/09/15 20:19, Florian Weimer wrote: * Jeff Law: [the exchange quoted verbatim from the previous message snipped] Ah, right, alloca is unscoped (except when there are variable-length arrays). Using a VLA might be the better approach (but the size concerns remain). Introducing VLAs could alter program behavior in the presence of a pre-existing alloca call, leading to premature deallocation. You also have to consider that code generated for functions containing alloca calls can be less efficient than for functions that do not call it (frame pointers cannot be eliminated, for example). So I'm not convinced this would necessarily be a performance win either. Yes, but I suspect that eliminating a single malloc/free pair dwarfs the cost of needing a frame pointer. The problem is proving when it's safe to turn a malloc/free into an alloca; as folks have shown, it's non-trivial once the security aspects are considered. I've speculated that, from a security standpoint, projects ought to just ban alloca, particularly glibc. It's been shown over and over again that folks just don't get it right, and it's ripe for exploitation. It'd be a whole lot easier to convince folks to go in this direction if GCC were good about that kind of optimization. Jeff
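[Editor's sketch] A hypothetical illustration of the premature-deallocation interaction Florian mentions, under the assumption (per his comment) that exiting a VLA's block restores the stack pointer and may reclaim alloca'd storage made inside that block:

    #include <alloca.h>

    void
    mix (int n)             /* assume n > 0 */
    {
      char *p = 0;
      {
        char vla[n];        /* deallocated when this block exits */
        p = alloca (64);    /* allocated after the VLA, same frame */
        vla[0] = 0;
        /* On block exit the compiler may restore the stack pointer to
           its pre-VLA value, reclaiming p's 64 bytes as well -- the
           premature deallocation described above.  */
      }
      /* Dereferencing p here would be unsafe.  */
      (void) p;
    }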
Re: dejagnu version update?
On 09/15/2015 01:23 PM, Bernhard Reutner-Fischer wrote: On September 15, 2015 7:39:39 PM GMT+02:00, Mike Stump wrote: On Sep 14, 2015, at 3:37 PM, Jeff Law wrote: Maybe GCC 6 can bump the required dejagnu version to allow for getting rid of all these superfluous load_gcc_lib? *blink* :) I'd support that as a direction. Certainly dropping the 2001 version from our website in favor of 1.5 (which is what I'm using anyway) would be a step forward. So, even Ubuntu LTS is at 1.5 now. No harm in upgrading the website to 1.5. I don't know of any reason not to update and just require 1.5 at this point. I'm not a fan of feature-chasing dejagnu, but an update every 2-4 years isn't unreasonable. So, let's do it this way: is there any serious and compelling reason not to update to 1.5? If none, let's update to 1.5 in another week or two. My general plan is slow-cycle updates on dejagnu, maybe every 2 years; LTS-style releases should include the version before the requirement is updated. I take this approach as I think this should be the maximal change rate of things like make, gcc, g++, and ld, if possible. Yea, although this means that 1.5.3 (a version with the libdirs tweak), being just 5 months old, will have to wait for another bump, I fear. For my part, going to plain 1.5 is useless WRT the load_lib situation. I see no value in conditionalizing simplified libdir handling on a lucky user with recentish stuff, so I'm just waiting another 2 or 4 years for this very minor cleanup. Given we haven't updated the dejagnu requirements since ~2001, I think stepping forward would be appropriate, and I'd support moving all the way to 1.5.3 with the expectation that we'll be on a cadence of no faster than every 2 years going forward. jeff