Configuration question

2006-11-10 Thread Steve Ellcey
I have run into a libstdc++ configuration issue and was wondering if it
is a known issue or not.

My build failed because the compiler I am using to build GCC and
libstdc++ does not have wchar support and does not define mbstate_t.
The compiler (and library) that I am creating, however, do support wchar
and do define mbstate_t.  Both compilers are GCC; the old one does not
include a -D that the new one does, and mbstate_t (defined in the system
header files) is only visible when that define is set.

The problem is that the libstdc++ configure script is using the original
GCC to check for the existence of mbstate_t (doesn't find it) and using
that information to say that it needs to define mbstate_t when compiling
libstdc++, but libstdc++ is compiled with the newly built GCC which
does have an mbstate_t from the system header files.  Shouldn't the
libstdc++ configure script use the new GCC when checking things with
AC_TRY_COMPILE?  Or is this just not possible?  Is this why some tests
don't use AC_TRY_COMPILE but say "Fake what AC_TRY_COMPILE does"?  See
acinclude.m4 for these comments; there is no explanation of why it is
faking what AC_TRY_COMPILE does.

Steve Ellcey
[EMAIL PROTECTED]


Re: ICE while bootstrapping trunk on hppa2.0w-hp-hpux11.00

2006-11-15 Thread Steve Ellcey
| /raid/tecosim/it/devel/projects/develtools/src/gcc-4.3.0/gcc/libgcc2.c:1970:
| internal compiler error: Segmentation fault
| Please submit a full bug report,
| with preprocessed source if appropriate.
| See <http://gcc.gnu.org/bugs.html> for instructions.

I am seeing this too.  I tracked it back to line 5613 of
tree-ssa-loop-ivopts.c (rewrite_use_compare).  There is
a line:

bound = cp->value;

and cp is null.  cp is set with a call to get_use_iv_cost and that
routine does return NULL in some cases so I think we need to check
for a NULL cp before dereferencing it.  I changed

if (bound)
to
if (cp && cp->value)

and set bound inside the if, but now it dies when compiling
decNumber.c, so I don't have a working bootstrap yet.

Steve Ellcey
[EMAIL PROTECTED]


Re: Configuration question

2006-11-15 Thread Steve Ellcey
> >  Shouldn't the
> > libstdc++ configure script use the new GCC when checking things with
> > AC_TRY_COMPILE?
> 
> Yes.
> 
> -benjamin

It looks like this has something to do with using autoconf 2.59 at the
top-level of GCC.  I am experimenting with updating the top-level GCC to
2.59 now that all of the GCC and src sub-trees have been updated to
2.59.  When I tried this on Linux I had no problems but on HP-UX (with
multilibs) it is not working correctly and the failure I get is that
AC_TRY_COMPILE is not using the right GCC when run.

When I undid my top-level change (went back to autoconf 2.14) the
libstdc++ configure worked correctly and the right GCC was used by
AC_TRY_COMPILE.  Most perplexing.

Steve Ellcey
[EMAIL PROTECTED]


Re: failed to compile gcc-4.3-20061209/gcc/varasm.c on OSX 10.3

2006-12-11 Thread Steve Ellcey
Andreas Tobler wrote:

> Dominique Dhumieres wrote:
>>...
>>cc1: warnings being treated as errors
>>../../gcc-4.3-20061209/gcc/varasm.c: In function 'elf_record_gcc_switches':
>>../../gcc-4.3-20061209/gcc/varasm.c:6268: warning: format '%llu' expects type 
>>'long long unsigned int', but argument 3 has type 'long int'
>>../../gcc-4.3-20061209/gcc/varasm.c:6275: warning: format '%llu' expects type 
>>'long long unsigned int', but argument 3 has type 'long int'
>>../../gcc-4.3-20061209/gcc/varasm.c:6283: warning: format '%llu' expects type 
>>'long long unsigned int', but argument 3 has type 'long int'
>>../../gcc-4.3-20061209/gcc/varasm.c:6302: warning: format '%llu' expects type 
>>'long long unsigned int', but argument 3 has type 'long int'
>>make[3]: *** [varasm.o] Error 1
>>
>> Any idea around about the cause and/or the way to fix it?
> 
> This is known to break on all 32-bit targets (afaik).  On 64-bit targets
> it works.
> 
> You can either wait until the patch is reverted or the correct fix is done.

Do you know if there is a GCC bug report for this defect?  I couldn't find
one in bugzilla.  I am seeing this problem with IA64 HP-UX on ToT.  I
tried the workaround you gave, and it makes IA64 HP-UX work but causes
other platforms to fail, so I am wondering when there will be a real fix
for this bootstrap problem.

Steve Ellcey
[EMAIL PROTECTED]


Running GCC tests on installed compiler

2007-01-12 Thread Steve Ellcey
Can someone with some deja-knowledge help me figure out how to run
the GCC tests on an installed compiler, without having to do a GCC
build?

I started with 
  runtest -tool gcc --srcdir /proj/opensrc/nightly/src/trunk/gcc/testsuite

and that ran the tests, but it ran them with whatever gcc command it
found in PATH.  I tried setting and exporting CC before running runtest,
and putting "CC=" on the runtest command line, but neither of those
methods seemed to affect which gcc runtest used.

So then I tried to create a site.exp file and use that on the command
line, in site.exp I put:

  set CC "/proj/opensrc/be/ia64-hp-hpux11.23/bin/gcc"
  set srcdir "/proj/opensrc/nightly/src/trunk/gcc/testsuite"

I also tried using the site.exp file that I got from building GCC, and
various combinations of the two, but all these attempts ended with no
tests run and the following lines in my log file:

  Running target unix
  Using 
/proj/opensrc/be/ia64-debian-linux-gnu/share/dejagnu/baseboards/unix.exp as 
board description file for target.
  Using /proj/opensrc/be/ia64-debian-linux-gnu/share/dejagnu/config/unix.exp as 
generic interface file for target.
  WARNING: Couldn't find tool config file for unix, using default.

When testing my just built GCC I was seeing:

  Running target unix
  Using 
/proj/opensrc/be/ia64-debian-linux-gnu/share/dejagnu/baseboards/unix.exp as 
board description file for target.
  Using /proj/opensrc/be/ia64-debian-linux-gnu/share/dejagnu/config/unix.exp as 
generic interface file for target.
  Using /proj/opensrc/nightly/src/trunk/gcc/testsuite/config/default.exp as 
tool-and-target-specific interface file.
  Running 
/proj/opensrc/nightly/src/trunk/gcc/testsuite/gcc.c-torture/compile/compile.exp 
...

The built GCC seems to be picking up an extra .exp file (default.exp) but
I am not sure why or how to fix it so that my non-built compiler runs the
same way.  Can someone help me out here?

Steve Ellcey
[EMAIL PROTECTED]


store_expr, expr_size, and C++

2007-02-26 Thread Steve Ellcey

I am looking at PR target/30826 (an IA64 ABI bug) and have come up with
a patch that basically involves turning off the
CALL_EXPR_RETURN_SLOT_OPT optimization in some instances and forcing GCC
to create a temporary for the (large aggregate) return value of a
function and then copying that temporary value to the desired target.

The problem I am running into is with C++ code in store_expr.  I get to
this if statement:

  if ((! rtx_equal_p (temp, target)
       || (temp != target && (side_effects_p (temp)
                              || side_effects_p (target))))
      && TREE_CODE (exp) != ERROR_MARK
      /* If store_expr stores a DECL whose DECL_RTL(exp) == TARGET,
         but TARGET is not valid memory reference, TEMP will differ
         from TARGET although it is really the same location.  */
      && !(alt_rtl && rtx_equal_p (alt_rtl, target))
      /* If there's nothing to copy, don't bother.  Don't call
         expr_size unless necessary, because some front-ends (C++)
         expr_size-hook must not be given objects that are not
         supposed to be bit-copied or bit-initialized.  */
      && expr_size (exp) != const0_rtx)

and I hit a gcc_assert when calling expr_size().  Even if I avoided this
somehow I would hit it later when calling:

emit_block_move (target, temp, expr_size (exp),
 (call_param_p
  ? BLOCK_OP_CALL_PARM : BLOCK_OP_NORMAL));

So my question is:  is there a way to handle this copy/assignment in C++
without depending on expr_size?  I noticed PR middle-end/30017 (turning
memcpys into assignments), which seems to have some of the same issues of
getting expr_size for C++ expressions, but that defect is still open so
it doesn't look like there is an answer yet.

Anyone have some ideas on this problem?

Steve Ellcey
[EMAIL PROTECTED]


bootstrap failure on real-install-headers-cpio

2007-03-02 Thread Steve Ellcey

Has anyone seen this bootstrap failure?  I only get it on my hppa*-hp-hpux*
builds, not on ia64-hp-hpux* or on Linux builds.  I assume it is related
to the include-fixed changes, but I don't know why I only get it for some
platforms.  I get it with both parallel and non-parallel builds.

Steve Ellcey
[EMAIL PROTECTED]

.
.
.
/bin/sh /proj/opensrc/nightly/src/trunk/gcc/../move-if-change tmp-macro_list 
macro_list
echo timestamp > s-macro_list
rm -rf include-fixed; mkdir include-fixed
chmod a+rx include-fixed
if [ -d ../prev-gcc ]; then \
  cd ../prev-gcc && \
  make real-install-headers-cpio DESTDIR=`pwd`/../gcc/ \
libsubdir=. ; \
else \
  (TARGET_MACHINE='hppa1.1-hp-hpux11.11'; srcdir=`cd 
/proj/opensrc/nightly/src/trunk/gcc; ${PWDCMD-pwd}`; \
SHELL='/bin/sh'; MACRO_LIST=`${PWDCMD-pwd}`/macro_list ; \
export TARGET_MACHINE srcdir SHELL MACRO_LIST && \
cd ../build-hppa1.1-hp-hpux11.11/fixincludes && \
/bin/sh ./fixinc.sh ../../gcc/include-fixed \
  `echo /usr/include | sed -e :a -e 's,[^/]*/\.\.\/,,' -e ta`  ); \
  rm -f include-fixed/syslimits.h; \
  if [ -f include-fixed/limits.h ]; then \
mv include-fixed/limits.h include-fixed/syslimits.h; \
  else \
cp /proj/opensrc/nightly/src/trunk/gcc/gsyslimits.h 
include-fixed/syslimits.h; \
  fi; \
fi
make[4]: Entering directory 
`/proj/opensrc/nightly/build-hppa1.1-hp-hpux11.11-trunk/obj_gcc/prev-gcc'
cd `${PWDCMD-pwd}`/include ; \
find . -print | cpio -pdum 
/proj/opensrc/nightly/build-hppa1.1-hp-hpux11.11-trunk/obj_gcc/prev-gcc/../gcc/./include
cannot write in 

make[4]: *** [real-install-headers-cpio] Error 2
make[4]: Leaving directory 
`/proj/opensrc/nightly/build-hppa1.1-hp-hpux11.11-trunk/obj_gcc/prev-gcc'
make[3]: *** [stmp-fixinc] Error 2
make[3]: Leaving directory 
`/proj/opensrc/nightly/build-hppa1.1-hp-hpux11.11-trunk/obj_gcc/gcc'
make[2]: *** [all-stage2-gcc] Error 2
make[2]: Leaving directory 
`/proj/opensrc/nightly/build-hppa1.1-hp-hpux11.11-trunk/obj_gcc'
make[1]: *** [stage2-bubble] Error 2
make[1]: Leaving directory 
`/proj/opensrc/nightly/build-hppa1.1-hp-hpux11.11-trunk/obj_gcc'
make: *** [bootstrap] Error 2


Updating libtool in GCC and srctree

2007-03-08 Thread Steve Ellcey

Now that autoconf has been updated to 2.59, I would like to update the
libtool that GCC and the binutils/gdb/etc use.  Unfortunately, I am not
having much luck coming up with a patch and figuring out what all needs
to be reconfigured.

Here is what I have tried so far.  In the libtool documentation it says
that to include libtool in your package you need to add config.guess,
config.sub, install-sh, and ltmain.sh to your package.  We already have
the install-sh that is in the latest libtool and our config.guess and
config.sub look to be newer than the ones in libtool so that just leaves
ltmain.sh.  I downloaded the 2.1a snapshot of libtool and found
ltmain.sh in libltdl/config/ltmain.sh; I copied that to the top level of
the src tree and then removed libtool.m4, ltconfig, ltcf-c.sh,
ltcf-cxx.sh, and ltcf-gcj.sh.

I was able to run autoconf on the top-level of the source tree with no
errors but when I did a configure/make I got the following error while
make was in the bfd subdirectory:

make[3]: Entering directory `/proj/opensrc/sje/svn.libtool/build-ia64-hp-hpux11.
23-trunk/obj_src/bfd'
make[3]: LIBTOOL@: Command not found
make[3]: *** [archive.lo] Error 127

So I went into bfd and tried to run autoconf there but I get the errors:

$ /proj/opensrc/be/ia64-hp-hpux11.23/bin/autoconf
configure.in:13: error: possibly undefined macro: AM_PROG_LIBTOOL
  If this token and others are legitimate, please use m4_pattern_allow.
  See the Autoconf documentation.
configure.in:20: error: possibly undefined macro: AM_DISABLE_SHARED

I tried changing the macros to AC_* but that didn't help.  Should I just
use m4_pattern_allow, or am I missing a bigger picture here?

Steve Ellcey
[EMAIL PROTECTED]


Re: Updating libtool in GCC and srctree

2007-03-08 Thread Steve Ellcey
> > I downloaded the 2.1a snapshot of libtool and found
> 
> Are you sure you want to use the (rather oldish) 2.1a snapshot?  I think
> you'll be better off using the latest stable release, which is 1.5.22.

I thought that 2.1a was a snapshot of ToT.  I have some recollection of
someone saying we would want to use ToT libtool as opposed to the latest
released one.

> > ltmain.sh in libltdl/config/ltmain.sh, I copied that to the top level of
> > the src tree and then removed libtool.m4,
> 
> You'll still need libtool.m4.

Are you sure?  According to
<http://www.gnu.org/software/libtool/manual.html#Distributing> we
shouldn't need libtool.m4 in our package.

Steve Ellcey
[EMAIL PROTECTED]


Re: Updating libtool in GCC and srctree

2007-03-09 Thread Steve Ellcey
I have made some progress in updating libtool in the src (binutils) tree
and I have attached the various changes (but not the actual new libtool
files) to this email in case anyone wants to see what I am doing.

I am having more trouble with the GCC tree.  I put the new libtool in
the toplevel directory, just like I did in the binutils src tree and
then I went to the boehm-gc (and libffi) directories to try and rerun
autoconf.  If I just run autoconf I get errors because I am not
including the new ltoptions.m4, ltsugar.m4, and ltversion.m4 files.  Now
in the binutils tree the acinclude.m4 files had explicit includes of
libtool.m4 and I added includes of ltoptions.m4, ltsugar.m4, and
ltversion.m4.  But boehm-gc has no acinclude.m4 file, and while libffi
has an acinclude.m4 file, it doesn't have an include of libtool.m4.  So
my question is: how is the include of libtool.m4 getting into
aclocal.m4?  Is it by running aclocal?  I tried to run aclocal, but I get
errors when I run it:

$ aclocal
autom4te: unknown language: Autoconf-without-aclocal-m4
aclocal: autom4te failed with exit status: 1

This is aclocal 1.9.6.  Any idea what I need to do here to fix this
error?  Why do some acinclude.m4 files have explicit includes for the
libtool files (libgfortran, libgomp, etc.) but others don't (libffi,
gcc)?

Steve Ellcey
[EMAIL PROTECTED]


Here is what I have done so far in the src/binutils tree:

Top level src tree ChangeLog:
2007-03-09  Steve Ellcey  <[EMAIL PROTECTED]>
* ltmain.sh: Update from libtool ToT.
* libtool.m4: Update from libtool ToT.
* ltsugar.m4: New. Update from libtool ToT.
* ltversion.m4: New. Update from libtool ToT.
* ltoptions.m4: New. Update from libtool ToT.
* ltconfig: Remove.
* ltcf-c.sh: Remove.
* ltcf-cxx.sh: Remove.
* ltcf-gcj.sh: Remove.
* src-release: Update with new libtool file list.

Index: src-release
===
RCS file: /cvs/src/src/src-release,v
retrieving revision 1.22
diff -u -r1.22 src-release
--- src-release 9 Feb 2007 15:15:38 -   1.22
+++ src-release 9 Mar 2007 23:37:34 -
@@ -49,8 +49,8 @@
 DEVO_SUPPORT= README Makefile.in configure configure.ac \
config.guess config.sub config move-if-change \
COPYING COPYING.LIB install-sh config-ml.in symlink-tree \
-   mkinstalldirs ltconfig ltmain.sh missing ylwrap \
-   libtool.m4 ltcf-c.sh ltcf-cxx.sh ltcf-gcj.sh \
+   mkinstalldirs ltmain.sh missing ylwrap \
+   libtool.m4 ltsugar.m4 ltversion.m4 ltoptions.m4 \
Makefile.def Makefile.tpl src-release config.rpath
 
 # Files in devo/etc used in any net release.



bfd/ChangeLog
2007-03-09  Steve Ellcey  <[EMAIL PROTECTED]>
* acinclude.m4: Add new includes.
* configure.in: Change macro call order.
* configure: Regenerate.

Index: acinclude.m4
===
RCS file: /cvs/src/src/bfd/acinclude.m4,v
retrieving revision 1.16
diff -u -r1.16 acinclude.m4
--- acinclude.m4    31 May 2006 15:14:35 -  1.16
+++ acinclude.m4    9 Mar 2007 23:36:49 -
@@ -49,6 +49,9 @@
 fi
 AC_SUBST(EXEEXT_FOR_BUILD)])dnl
 
+sinclude(../ltsugar.m4)
+sinclude(../ltversion.m4)
+sinclude(../ltoptions.m4)
 sinclude(../libtool.m4)
 dnl The lines below arrange for aclocal not to bring libtool.m4
 dnl AM_PROG_LIBTOOL into aclocal.m4, while still arranging for automake


Index: configure.in
===
RCS file: /cvs/src/src/bfd/configure.in,v
retrieving revision 1.222
diff -u -r1.222 configure.in
--- configure.in    1 Mar 2007 15:48:36 -   1.222
+++ configure.in    9 Mar 2007 23:37:07 -
@@ -19,7 +19,10 @@
 dnl configure option --enable-shared.
 AM_DISABLE_SHARED
 
-AM_PROG_LIBTOOL
+AC_PROG_CC
+AC_GNU_SOURCE
+
+AC_PROG_LIBTOOL
 
 AC_ARG_ENABLE(64-bit-bfd,
 [  --enable-64-bit-bfd 64-bit support (on hosts with narrower word sizes)],
@@ -95,9 +98,6 @@
 
 # host stuff:
 
-AC_PROG_CC
-AC_GNU_SOURCE
-
 ALL_LINGUAS="fr tr ja es sv da zh_CN ro rw vi"
 ZW_GNU_GETTEXT_SISTER_DIR
 AM_PO_SUBDIRS



binutils/ChangeLog
2007-03-09  Steve Ellcey  <[EMAIL PROTECTED]>
* configure.in: Change macro call order.
* configure: Regenerate.

Index: configure.in
===
RCS file: /cvs/src/src/binutils/configure.in,v
retrieving revision 1.75
diff -u -r1.75 configure.in
--- configure.in    28 Feb 2007 01:29:32 -  1.75
+++ configure.in    9 Mar 2007 23:36:12 -
@@ -11,7 +11,9 @@
 changequote([,])dnl
 AM_INIT_AUTOMAKE(binutils, ${BFD_VERSION})
 
-AM_PROG_LIBTOOL
+AC_PROG_CC
+AC_GNU_SOURCE
+AC_PROG_LIBTOOL
 
 AC_ARG_ENABLE(targets,
 [  --enable-targetsalternative target configurations],
@@ -53,9 +55,6 @@
 AC_MSG_ERROR(Unrecognized host

Re: Updating libtool in GCC and srctree

2007-03-09 Thread Steve Ellcey
> Steve Ellcey <[EMAIL PROTECTED]> writes:
> 
> > $ aclocal
> > autom4te: unknown language: Autoconf-without-aclocal-m4
> > aclocal: autom4te failed with exit status: 1
> 
> Looks like you have an out-of-date autom4te.cache.
> 
> Andreas.

I removed autom4te.cache and reran aclocal.  Same results.

Steve Ellcey
[EMAIL PROTECTED]


Re: Updating libtool in GCC and srctree

2007-03-12 Thread Steve Ellcey
> So, you need to run aclocal with:
>   $ aclocal -I ../config -I ..
> 
> -- 
> albert chin ([EMAIL PROTECTED])

Thanks, that helps a lot.  For libstdc++-v3 I actually needed "-I ." as
well in order to find linkage.m4 so maybe "-I . -I .. -I ../config" is
the best option list to use on aclocal calls in the GCC tree.

libjava is the only subdir I can't seem to get configured with the new
libtool:

$ aclocal -I . -I .. -I ../config
$ autoconf
configure:15448: error: possibly undefined macro: AM_PROG_GCJdnl
  If this token and others are legitimate, please use m4_pattern_allow.
  See the Autoconf documentation.

I am not sure why I get this; nothing else seems to require
m4_pattern_allow.  If I don't use any -I options on aclocal it works, and
then I get a different error from autoconf (about TL_AC_GXX_INCLUDE_DIR
being possibly undefined).  I think I want the -I options, though.

Steve Ellcey
[EMAIL PROTECTED]


Re: Updating libtool in GCC and srctree

2007-03-13 Thread Steve Ellcey
> On Mon, Mar 12, 2007 at 04:03:52PM -0700, Steve Ellcey wrote:
> > configure:15448: error: possibly undefined macro: AM_PROG_GCJdnl
> 
> Where'd that come from?  Wherever it is, it's a bug.  Maybe someone
> checked in a typo to the configure file.  "dnl" is a comment start
> token in autoconf (that's a very rough approximation of the situation).

It looks like it is coming from the new libtool.m4, I just sent email to
bug-libtool@gnu.org about it.  In the new libtool.m4 there is:

# LT_PROG_GCJ
# -----------
AC_DEFUN([LT_PROG_GCJ],
[m4_ifdef([AC_PROG_GCJ], [AC_PROG_GCJ],
  [m4_ifdef([A][M_PROG_GCJ], [A][M_PROG_GCJ],
[AC_CHECK_TOOL(GCJ, gcj,)
  test "x${GCJFLAGS+set}" = xset || GCJFLAGS="-g -O2"
  AC_SUBST(GCJFLAGS)])])dnl
])

And I think the dnl at the end of the AC_SUBST line is the problem.
Removing it seems to fix the configure of libjava anyway.

> Yes, you always want to match ACLOCAL_AMFLAGS from Makefile.am.

Now that is a very useful thing to know.

I am trying to build now and am currently running into a problem building
libgfortran.

When doing the libtool link of the library I get:

ld: Can't find library or mismatched ABI for -lgfortranbegin
Fatal error.
collect2: ld returned 1 exit status
make[3]: *** [libgfortran.la] Error 1

I was able to build libstdc++-v3 and other libraries with no problem,
but I haven't figured out what is going on here yet.

Steve Ellcey
[EMAIL PROTECTED]


RFC: obsolete __builtin_apply?

2007-03-16 Thread Steve Ellcey

I have long been annoyed by the failure of the test builtin-apply4.c on
IA64 HP-UX and I know there are failures of tests using __builtin_apply
on other platforms as well.

My question is:  Is it time to obsolete __builtin_apply,
__builtin_apply_args, and __builtin_return?

It looks like the main sticking point is that libobjc uses
__builtin_apply, __builtin_apply_args, and __builtin_return.  There is a
FIXME comment about changing this to use libffi.  Do any of the objc
folks have this on their 'todo' plate?  I am not sure how big this task
would be.

My thinking is that if libobjc were changed, then we could add a
deprecation warning for these builtins in 4.3 and maybe remove them
in 4.4.

Comments?

Steve Ellcey
[EMAIL PROTECTED]


libgfortran Makefile question (using latest libtool)

2007-03-21 Thread Steve Ellcey

While attempting to build libgfortran with the latest libtool I got the
following error:

if /bin/sh ./libtool --mode=compile 
/proj/opensrc/sje/svn.libtool/build-ia64-hp-hpux11.23-trunk/obj_gcc/./gcc/xgcc 
-B/proj/opensrc/sje/svn.libtool/build-ia64-hp-hpux11.23-trunk/obj_gcc/./gcc/ 
-B/proj/opensrc/sje/svn.libtool/gcc-ia64-hp-hpux11.23-trunk/ia64-hp-hpux11.23/bin/
 
-B/proj/opensrc/sje/svn.libtool/gcc-ia64-hp-hpux11.23-trunk/ia64-hp-hpux11.23/lib/
 -isystem 
/proj/opensrc/sje/svn.libtool/gcc-ia64-hp-hpux11.23-trunk/ia64-hp-hpux11.23/include
 -isystem 
/proj/opensrc/sje/svn.libtool/gcc-ia64-hp-hpux11.23-trunk/ia64-hp-hpux11.23/sys-include
 -DHAVE_CONFIG_H -I. -I/proj/opensrc/sje/svn.libtool/src/trunk/libgfortran -I.  
-iquote/proj/opensrc/sje/svn.libtool/src/trunk/libgfortran/io 
-I/proj/opensrc/sje/svn.libtool/src/trunk/libgfortran/../gcc 
-I/proj/opensrc/sje/svn.libtool/src/trunk/libgfortran/../gcc/config 
-I../../.././gcc -D_GNU_SOURCE  -std=gnu99 -Wall -Wstrict-prototypes 
-Wmissing-prototypes -Wold-style-definition -Wextra -Wwrite-strings -O2 -g   
-mlp64 -MT backtrace.lo -MD -MP -MF ".deps/backtrace.Tpo" -c -o backtrace.lo `test -f 
'runtime/backtrace.c' || echo 
'/proj/opensrc/sje/svn.libtool/src/trunk/libgfortran/'`runtime/backtrace.c; \
then mv -f ".deps/backtrace.Tpo" ".deps/backtrace.Plo"; else rm -f ".deps/backtrace.Tpo"; exit 1; fi
libtool: compile: unable to infer tagged configuration
libtool: compile: specify a tag with `--tag'
make[6]: *** [fmain.lo] Error 1


Now, obviously, what I want to do is add --tag=CC to the libtool call,
but I can't figure out where to do this.  If I look at Makefile.in I can
see where this is coming from, but Makefile.in is generated from
Makefile.am, so I shouldn't be editing Makefile.in.  When I look at
Makefile.am I don't see how we got this compile line.  What do I change
to get --tag=CC added to the libtool call?

The libstdc++-v3/src/Makefile has:

LTCXXCOMPILE = $(LIBTOOL) --tag CXX --mode=compile $(CXX) $(INCLUDES) \
   $(AM_CPPFLAGS) $(CPPFLAGS) $(AM_CXXFLAGS) $(CXXFLAGS)

But libgfortran doesn't have a line like this so how is it coming up
with this compile line?

Steve Ellcey
[EMAIL PROTECTED]


Re: libgfortran Makefile question (using latest libtool)

2007-03-21 Thread Steve Ellcey
> I think that should already be the default.  Try running ./libtool
> --config and look for the value of CC.  That value should match (modulo
> whitespace) the command line that is actually used.
> 
> Andreas.

It does not look like this is the default.  I don't see any use of --tag
in the libtool --config output (nor do I see where the -MD, -MP, and -MF
flags are coming from).

Steve Ellcey
[EMAIL PROTECTED]


% ./libtool --config | grep -e LT -e CC -e tag

LTCC="/proj/opensrc/sje/svn.libtool/build-ia64-hp-hpux11.23-trunk/obj_gcc/./gcc/xgcc
 -B/proj/opensrc/sje/svn.libtool/build-ia64-hp-hpux11.23-trunk/obj_gcc/./gcc/ 
-B/proj/opensrc/sje/svn.libtool/gcc-ia64-hp-hpux11.23-trunk/ia64-hp-hpux11.23/bin/
 
-B/proj/opensrc/sje/svn.libtool/gcc-ia64-hp-hpux11.23-trunk/ia64-hp-hpux11.23/lib/
 -isystem 
/proj/opensrc/sje/svn.libtool/gcc-ia64-hp-hpux11.23-trunk/ia64-hp-hpux11.23/include
 -isystem 
/proj/opensrc/sje/svn.libtool/gcc-ia64-hp-hpux11.23-trunk/ia64-hp-hpux11.23/sys-include"
# LTCC compiler flags.
LTCFLAGS="-std=gnu99 -O2 -g  -Wunknown-pragmas"
variables_saved_for_relink="PATH LD_LIBRARY_PATH  GCC_EXEC_PREFIX COMPILER_PATH 
LIBRARY_PATH"
CC="/proj/opensrc/sje/svn.libtool/build-ia64-hp-hpux11.23-trunk/obj_gcc/./gcc/xgcc
 -B/proj/opensrc/sje/svn.libtool/build-ia64-hp-hpux11.23-trunk/obj_gcc/./gcc/ 
-B/proj/opensrc/sje/svn.libtool/gcc-ia64-hp-hpux11.23-trunk/ia64-hp-hpux11.23/bin/
 
-B/proj/opensrc/sje/svn.libtool/gcc-ia64-hp-hpux11.23-trunk/ia64-hp-hpux11.23/lib/
 -isystem 
/proj/opensrc/sje/svn.libtool/gcc-ia64-hp-hpux11.23-trunk/ia64-hp-hpux11.23/include
 -isystem 
/proj/opensrc/sje/svn.libtool/gcc-ia64-hp-hpux11.23-trunk/ia64-hp-hpux11.23/sys-include"
archive_cmds="\$CC -shared \${wl}+h \${wl}\$soname \${wl}+nodefaultrpath -o 
\$lib \$libobjs \$deplibs \$compiler_flags"


Re: libgfortran Makefile question (using latest libtool)

2007-03-21 Thread Steve Ellcey
> From: Charles Wilson <[EMAIL PROTECTED]>
> 
> The --tag option is added by automake-1.9 or automake-1.10, but not 1.8:

Interesting, the Makefile.in in libgfortran claims to be from automake
1.9.6.  If I run this automake in a tree with the old (1.4 based
libtool) I don't get any --tag options in Makefile.in, but if I run
automake in the tree where I have the latest libtool then I see the
--tag option used.  So I guess just rerunning automake is sufficient to
fix this problem.

Steve Ellcey
[EMAIL PROTECTED]


Re: GCC mini-summit - compiling for a particular architecture

2007-04-20 Thread Steve Ellcey
> It came up in a few side conversations.  As I understand it, RMS has
> decreed that the -On optimizations shall be architecture independent.
> That said, there are "generic" optimizations which really only apply
> to a single architecture, so there is some precedent for bending this
> rule.
> 
> There were also suggestions of making the order of optimizations
> command line configurable and allowing dynamically loaded libraries to
> register new passes.
> 
> Ollie

This seems unfortunate.  I was hoping I might be able to turn on loop
unrolling for IA64 at -O2 to improve performance.  I have only started
looking into this idea, but it seems to help performance quite a bit;
it also increases size quite a bit, so it may need some tuning of the
unrolling parameters to make it practical.

I notice the OPTIMIZATION_OPTIONS documentation does say:

| You should not use this macro to change options that are not
| machine-specific.  These should be uniformly selected by the same
| optimization level on all supported machines.  Use this macro to enable
| machine-specific optimizations.

What is the rationale for this?  Is it a question of making it easier to
reproduce a -O2 bug that happens on one machine on a different one, so
that it is easier to find and fix?

Steve Ellcey
[EMAIL PROTECTED]


Re: GCC mini-summit - benchmarks

2007-04-23 Thread Steve Ellcey
Jim Wilson wrote:

> Kenneth Hoste wrote:
> > I'm not sure what 'tests' mean here... Are test cases being extracted
> > from the SPEC CPU2006 sources? Or are you refering to the validity tests
> > of the SPEC framework itself (to check whether the output generated by
> > some binary conforms with their reference output)?
> 
> The claim is that SPEC CPU2006 has source code bugs that cause it to
> fail when compiled by gcc.  We weren't given a specific list of problems.

HJ, can you give us the specifics on the SPEC 2006 failures you were
seeing?

I remember the perlbench failure: it was IA64-specific, and was due to
the SPEC config file spec_config.h defining the attribute keyword to be
empty, thus eliminating all attributes.  On IA64 Linux, in the
/usr/include/bits/setjmp.h header file, the __jmp_buf buffer is defined
with an aligned attribute.  If the buffer isn't aligned, the perlbench
program fails.

I believe another problem was an uninitialized local variable in a 
Fortran program, but I don't recall which program or which variable
that was.

Steve Ellcey
[EMAIL PROTECTED]


Problem with patch for PR tree-optimization/29789

2007-04-25 Thread Steve Ellcey
Richard,

Has anyone reported any problems with your tree-ssa-loop-im.c patch that
fixes PR tree-optimization/29789?  I have been looking at a failure with
the SPECfp2000 173.applu test.  I found that if I compile it with
version r124041 of the GCC gfortran compiler it works but if I compile
it with version r124042 it fails.  The difference between the two is
your checkin:

2007-04-22  Richard Guenther  <[EMAIL PROTECTED]>

PR tree-optimization/29789
* tree-ssa-loop-im.c (stmt_cost): Adjust cost of shifts.
(rewrite_reciprocal): New helper split out from
determine_invariantness_stmt.
(rewrite_bittest): Likewise.
(determine_invariantness_stmt): Rewrite (A >> B) & 1 to
A & (1 << B) if (1 << B) is loop invariant but (A >> B)
is not.

To make things harder, the problem only seems to happen when I
bootstrap.  If I build a non-bootstrap compiler, the applu test
compiles and runs fine.  If I build a bootstrap compiler, I can compile
applu but the program core dumps when run.  Do you have any ideas about
what might be happening, or what I might try in order to understand what
is going wrong?

Steve Ellcey
[EMAIL PROTECTED]


How to handle g++.dg/warn/multiple-overflow-warn-3.C failure

2007-04-26 Thread Steve Ellcey

I was wondering if anyone had some advice on how to handle the testcase
g++.dg/warn/multiple-overflow-warn-3.C.  The test case fails on my HP-UX
platforms because the underlying type of wchar_t on HP-UX is 'unsigned
int' rather than 'int' as on Linux.  This means that the expression
does not overflow, we don't get a warning, and the test fails.

I could just xfail/xskip it for HP-UX but other platforms use unsigned
types for wchar_t and must be failing too.  I was hoping for something a
little more elegant.

I thought of changing all the wchar_t's to int's, but I think that might
negate what the test is trying to check, since there would be no implicit
conversions in the code any more and the test would probably never have
given multiple overflow warnings in the first place.

Steve Ellcey
[EMAIL PROTECTED]


Test g++.dg/warn/multiple-overflow-warn-3.C:


/* PR 30465 : Test for duplicated warnings in a conversion.  */
/* { dg-do compile } */
/* { dg-options "-Woverflow" } */

wchar_t
g (void)
{
  wchar_t wc = ((wchar_t)1 << 31) - 1; /* { dg-bogus "overflow .* overflow" } */
  /* { dg-warning "overflow" "" { target *-*-* } 8 } */
  return wc;
}


Re: RFC: obsolete __builtin_apply?

2007-04-27 Thread Steve Ellcey
Andrew,  are you still planning on applying the libobjc patch that
removes the use of __builtin_apply?

Steve Ellcey
[EMAIL PROTECTED]


Re: IA64 record alignment rules, and modes?

2005-02-28 Thread Steve Ellcey
> Question: If we assume that a TImode would've been a more efficient mode
> to represent the record type above, would it not have been acceptable for
> the compiler to promote the alignment of this type to 128, given there
> are no apparent restrictions otherwise, or are there other C conventions
> at work that dictate otherwise?  Is there a configuration tweak that
> would've led to using TImode rather than BLKmode?

I think using TImode might work in this specific example, but there are
other cases where it would definitely not work.  This is especially true
on HP-UX, which is big-endian and where the alignment of records and
integers is different; i.e., passing an integer argument vs. passing a
record containing a single integer field is different.  And then there
is the whole issue of HFAs (homogeneous floating-point aggregates) to
consider.  In general, coming up with a specific set of criteria for when
an aggregate doesn't have to be treated as such is difficult on IA64.
For more details about the IA64 ABI see:


http://h21007.www2.hp.com/dspp/tech/tech_TechDocumentDetailPage_IDX/1,1701,3309,00.html

Steve Ellcey
[EMAIL PROTECTED]


PR 19893 & array_ref bug

2005-03-09 Thread Steve Ellcey
I was looking at PR 19893 (gcc.dg/vect/vect-76 fails on ia64-hpux) and I
think it is caused by a non-platform specific bug, though it may not
cause vect-76 to fail on other platforms.  I was hoping someone might be
able to help me understand what is going on.

Here is a cut down test case (with no vector stuff in it):

typedef int aint __attribute__ ((__aligned__(16)));
aint ib[12];
int ic[12], *x, *y;
int main (void)
{
  x = &ib[4];
  y = &ic[4];
}

If you look at the assembly language generated on IA64 (HP-UX or Linux)
or probably on any platform, you will see that 'y' gets correctly set to
the address of ic[4].  But 'x' gets set to the address of ib[0] instead of ib[4].
Things look good in all the tree dumps but the first rtl dump looks bad
so I believe things are going wrong during expansion.  Looking in
tree.def I see:

/* Array indexing.
   Operand 0 is the array; operand 1 is a (single) array index.
   Operand 2, if present, is a copy of TYPE_MIN_VALUE of the index.
   Operand 3, if present, is the element size, measured in units of
   the alignment of the element type.  */
DEFTREECODE (ARRAY_REF, "array_ref", tcc_reference, 4)

Now I think the problem is with operand 3.  What value should it
have if the alignment is greater than the element size?  That is what I
have in the test case above and when I dump the array_ref for ib[4] I
see that I have an operand 3 and it is zero and I think this is causing
the test failure.  What value should operand 3 have in this situation?
Or should it have been left out?

Steve Ellcey
[EMAIL PROTECTED]


Re: PR 19893 & array_ref bug

2005-03-15 Thread Steve Ellcey
> This program should generate an error; it's illogical.  If the alignment 
> of the element is greater than the element size, then arrays of such a 
> type should be disallowed.  Otherwise, stuff in either the compiler or 
> the program itself could make the justified assumption that things of 
> that type are aligned more strictly than they actually are.
> 
> -- 
> Mark Mitchell

Interesting, I have created a patch (attached) that gives an error
whenever we try to create an array of elements and the alignment of the
elements is greater than the size of the elements.

The problem I have, and the reason I haven't sent it to gcc-patches, is
that it generates a bunch of regressions.  The regressions are all due
to bad tests but I am not sure how to fix the tests so that I can check
in the patch.

The regressions are in two places, gcc.dg/compat/struct-layout* and
gcc.dg/vect/*

Most of the gcc.dg/vect/* tests contain something like:

typedef float afloat __attribute__ ((__aligned__(16)));
afloat a[N];

The question is, since this is illegal, what should we use instead?
I don't know if the alignment is an integral part of what is being
tested or not since the tests have no comments in them. So I am not
sure if we should just delete the alignment attribute or make it
smaller.  If we make it smaller we need to know the size of float in
order to know if a particular alignment is legal or not.

The gcc.dg/compat/struct-layout problems seem to stem from
struct-layout-1_generate.c.  In generate_fields() it generates random
types, some of these are arrays of some base type.  Then based on
another random number we might add an attribute like alignment.  There
is no check to ensure that the alignment of the base type is less than or
equal to the size of the base type in those instances where we are
creating an array.

I would be interested in any advice on the best way to fix these tests
so that I can add my patch without causing regressions.

Steve Ellcey
[EMAIL PROTECTED]




Here is the patch that checks for the alignment of array elements and that
causes the regressions:


2005-03-15  Steve Ellcey  <[EMAIL PROTECTED]>

PR 19893
* stor-layout.c (layout_type): Add alignment check.


*** gcc.orig/gcc/stor-layout.c  Fri Mar 11 14:40:03 2005
--- gcc/gcc/stor-layout.c   Tue Mar 15 15:46:02 2005
*** layout_type (tree type)
*** 1632,1637 
--- 1632,1643 
  
build_pointer_type (element);
  
+   if (host_integerp (TYPE_SIZE_UNIT (element), 1)
+ && tree_low_cst (TYPE_SIZE_UNIT (element), 1) > 0
+ && (HOST_WIDE_INT) TYPE_ALIGN_UNIT (element)
+  > tree_low_cst (TYPE_SIZE_UNIT (element), 1))
+ error ("alignment of array elements is greater than element size");
+ 
/* We need to know both bounds in order to compute the size.  */
if (index && TYPE_MAX_VALUE (index) && TYPE_MIN_VALUE (index)
&& TYPE_SIZE (element))


Re: PR 19893 & array_ref bug

2005-03-15 Thread Steve Ellcey
> > The gcc.dg/compat/struct-layout problems seem to stem from
> > struct-layout-1_generate.c.  In generate_fields() it generates random
> > types, some of these are arrays of some base type.  Then based on
> > another random number we might add an attribute like alignment.  There
> > is no check to ensure that the alignment of the base type is less than or
> > equal to the size of the base type in those instances where we are
> > creating an array.
> 
> That could be fixed by adding the check you suggest, and then just 
> discarding the attribute.

I don't know if I have enough information to implement a test that
ignores the attribute only when the alignment is greater than the size.
Some of the attributes use __aligned__ with no value and that defaults
to whatever the maximum alignment is for the platform you are running on
and I don't know if I can determine that while running
struct-layout-1_generate.

The simplest solution would probably be to ignore __aligned__ attributes
completely when we have an array.  Or to do the change you suggested for
the vector tests and have the attribute attached to the array and not
the element type.

Steve Ellcey
[EMAIL PROTECTED]


Re: PR 19893 & array_ref bug

2005-03-16 Thread Steve Ellcey
What do people think about this idea for changing the vect tests using
gcc.dg/vect/vect-56.c as an example.  The arguments (pa, pb, pc) would
remain afloat type (vs. float) but the arrays would be changed from
'array of aligned floats' to an array of floats where the actual array
itself is aligned.

It seems like we are lying about the alignment of the pa, pb, pc
arguments but I don't see a way around this.  If we changed GCC to pad
the array elements (in order to obey the alignment request) wouldn't we
actually break our ability to vectorize things?

Steve Ellcey
[EMAIL PROTECTED]

*** vect-56.c.orig  Wed Mar 16 11:38:49 2005
--- vect-56.c   Wed Mar 16 11:39:46 2005
*** main1 (afloat * __restrict__ pa, afloat 
*** 40,48 
  int main (void)
  {
int i;
!   afloat a[N];
!   afloat b[N] = {0,3,6,9,12,15,18,21,24,27,30,33,36,39,42,45,48,51,54,57};
!   afloat c[N] = {0,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19};
  
check_vect ();
  
--- 40,50 
  int main (void)
  {
int i;
!   float a[N] __attribute__ ((__aligned__(16)));
!   float b[N]  __attribute__ ((__aligned__(16))) =
! {0,3,6,9,12,15,18,21,24,27,30,33,36,39,42,45,48,51,54,57};
!   float c[N]  __attribute__ ((__aligned__(16))) =
! {0,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19};
  
check_vect ();
  


Re: PR 19893 & array_ref bug

2005-03-16 Thread Steve Ellcey
> From: Gabriel Dos Reis <[EMAIL PROTECTED]>
> | 
> | Make them array arguments, instead of pointer arguments.  I'm not sure
> | if GCC is smart enough to still vectorize them in that case, but
> | that's the right way to express it.  An aligned array-of-floats decays
> | to an aligned pointer-to-float, i.e., the pointer is known to be
> | aligned, but the object pointed to is just a float not an aligned
> | float.
> 
> Agreed.
> 
> -- Gaby

But as Joseph pointed out we don't implement attributes on array
arguments so I get a warning when I try to use the __restrict__
attribute on the array arguments.  Without the __restrict__ attribute I
am sure we would not do any vectorization and then what is the point of
the test?

Steve Ellcey
[EMAIL PROTECTED]


GCC3 to GCC4 performance regression. Bug?

2005-03-17 Thread Steve Ellcey

I have been looking at a significant performance regression in the hmmer
application between GCC 3.4 and GCC 4.0.  I have a small cutdown test
case (attached) that demonstrates the problem and which runs more than
10% slower on IA64 (HP-UX or Linux) when compiled with GCC 4.0 than when
compiled with GCC 3.4.  At first I thought this was just due to 'better'
alias analysis in the P7Viterbi routine and that it was the right thing
to do even if it was slower.  It looked like GCC 3.4 does not believe
that hmm->tsc could alias mmx but GCC 4.0 thinks they could and thus GCC
4.0 does more loads inside the inner loop of P7Viterbi.  But then I
noticed something weird: if I remove the field M (which is unused in my
example) from the plan7_s structure, GCC 4.0 runs as fast as GCC 3.4.  I
don't understand why this would affect things.

Any optimization experts care to take a look at this test case and help
me understand what is going on and if this change from 3.4 to 4.0 is
intentional or not?

Steve Ellcey
[EMAIL PROTECTED]


 Test Case ---

#define L_CONST 500

void *malloc(long size);

struct plan7_s {
  int M;
  int **tsc;   /* transition scores [0.6][1.M-1]*/
};

struct dpmatrix_s {
  int **mmx;
};
struct dpmatrix_s *mx;



void
AllocPlan7Body(struct plan7_s *hmm, int M) 
{
  int i;

  hmm->tsc= malloc (7 * sizeof(int *));
  hmm->tsc[0] = malloc ((M+16) * sizeof(int));
  mx->mmx = (int **) malloc(sizeof(int *) * (L_CONST+1));
  for (i = 0; i <= L_CONST; i++) {
mx->mmx[i] = malloc (M+2+16);
  }
  return;
}  

void
P7Viterbi(int L, int M, struct plan7_s *hmm, int **mmx)
{
  int   i,k;
  
  for (i = 1; i <= L; i++) {
for (k = 1; k <= M; k++) {
  mmx[i][k] = mmx[i-1][k-1] + hmm->tsc[0][k-1];
}
  }
}

int main (void)
{
struct plan7_s *hmm;
char dsq[L_CONST];
int i;

hmm = (struct plan7_s *) malloc (sizeof (struct plan7_s));
mx = (struct dpmatrix_s *) malloc (sizeof (struct dpmatrix_s));
AllocPlan7Body(hmm, 10);
for (i = 0; i < 60; i++) {
P7Viterbi(500, 10, hmm, mx->mmx);
}
}


IA64 Pointer conversion question / convert code already wrong?

2005-04-21 Thread Steve Ellcey

I am looking at a bug/oddity in the HP-UX IA64 GCC compiler in ILP32
mode.  Here is some code (cut out from libffi):

typedef void *PTR64 __attribute__((mode(DI)));
extern void bar(PTR64);
void foo(void * x) { bar(x); }

Now the issue is whether or not this is legal and how x should get
extended.  I am assuming that it is legal and that, on IA64, we would
like the pointer extended via the addp4 instruction.

When I do not optimize this program I do not get any addp4 instructions,
when I do optimize the program I do get the desired addp4 instructions.

I believe the problem in the unoptimized case is in expand_expr_real_1,
where we have:

case NON_LVALUE_EXPR:
case NOP_EXPR:
case CONVERT_EXPR:
.
.
.
  else if (modifier == EXPAND_INITIALIZER)
op0 = gen_rtx_fmt_e (unsignedp ? ZERO_EXTEND : SIGN_EXTEND, mode, op0);

  else if (target == 0)
op0 = convert_to_mode (mode, op0,
   TYPE_UNSIGNED (TREE_TYPE
  (TREE_OPERAND (exp, 0))));
  else
{
  convert_move (target, op0,
TYPE_UNSIGNED (TREE_TYPE (TREE_OPERAND (exp, 0))));
  op0 = target;
}

The EXPAND_INITIALIZER if looks wrong (for IA64) because it assumes that
ZERO_EXTEND and SIGN_EXTEND are the only possibilities, and if op0 is
a pointer then we have a third possibility for ia64.  Is the use of
gen_rtx_fmt_e an optimization that could be replaced by convert_to_mode
or convert_move or is there some underlying reason why that has to be a
gen_rtx_fmt_e call for an initializer?

The existing convert_to_mode and convert_move calls look suspicious to
me too because they use the TYPE_UNSIGNED macro to determine whether to
do signed or unsigned extensions and I am not sure if that would be set
correctly for pointer types based on a platforms setting of
POINTERS_EXTEND_UNSIGNED.

Anyone have any insights?

Steve Ellcey
[EMAIL PROTECTED]


Re: IA64 Pointer conversion question / convert code already wrong?

2005-04-21 Thread Steve Ellcey
> This is a conversion between what, two pointer types?

Yes.  From 'void *' to 'void * __attribute__((mode(DI)))' where the
first is 32 bits (HP-UX ILP32 mode) and the second is 64 bits.

> If so, I think there should be a special case here to check for converting
> between two pointer types and call convert_memory_address if so.

I don't know why I didn't think of using convert_memory_address.  I just
tried it and it seems to work in my test case.  I will do a bootstrap
and test overnight to see how that goes.

> Also, I think convert_memory_address ought to have a
>   gcc_assert (GET_MODE (x) == to_mode);
> in the #ifndef case.

OK, I'll toss that in too.  It won't be seen on the HP-UX side but I'll
do a Linux build as well.

Steve Ellcey
[EMAIL PROTECTED]


Re: IA64 Pointer conversion question / convert code already wrong?

2005-04-25 Thread Steve Ellcey
> Also, I think convert_memory_address ought to have a
>   gcc_assert (GET_MODE (x) == to_mode);
> in the #ifndef case.

Interesting, I put this assertion in my code and I now cannot bootstrap
on HPPA.  Looking at the HPPA builds (where POINTERS_EXTEND_UNSIGNED is
not defined) I see the assertion fail because I enter
convert_memory_address with to_mode set to SImode and x set to
'(const_int 0 [0x0])'.

The call to convert_memory_address is being made from memory_address
(explow.c:404).

I am not sure if this is a bug, or if convert_memory_address should
allow this by doing nothing (current behaviour) or if
convert_memory_address should be changed so that it does the same
conversion on const_int values when POINTERS_EXTEND_UNSIGNED is
undefined as it does when POINTERS_EXTEND_UNSIGNED is defined.

Steve Ellcey
[EMAIL PROTECTED]


How can I write an empty conversion instruction

2005-05-06 Thread Steve Ellcey

I was wondering if anyone could tell me how to write an (empty)
instruction pattern that does a truncate/extend conversion on a register
'in place'.

All the conversions I see are like this one in ia64/ia64.md:

(define_insn "extendsfdf2"
  [(set (match_operand:DF 0 "fr_register_operand" "=f")
(float_extend:DF (match_operand:SF 1 "fr_register_operand" "f")))]
  ""
  "fnorm.d %0 = %1"
  [(set_attr "itanium_class" "fmac")])

Where the source and the destination may or may not be the same
register.

I am trying to create an empty extend operation I can use to 'convert' a
SFmode register into a DFmode register without actually generating any
code.  Since I don't want this extend called in place of the normal one
I defined it as an UNSPEC operation instead of a float_extend operation
and since it doesn't generate any code and it cannot move the result
from one register to another I need to define it with only one operand.

But my attempt to do this doesn't seem to work and I was wondering if
anyone could tell me why or perhaps point me to an example of an
instruction that does a conversion in place that might help me
understand how to write such an instruction.

My attempt:

(define_insn "nop_extendsfdf"
  [(set (match_operand:DF 0 "fr_register_operand" "+f")
(unspec:DF [(match_dup:SF 0)] UNSPEC_NOP_EXTEND))]
  ""
  ""
  [(set_attr "itanium_class" "ignore")
   (set_attr "predicable" "no")
   (set_attr "empty" "yes")])

I think the match_dup may be wrong since I am using it with SF but the
original match_operand has DF.  Do I need to make this modeless?  Or is
there some other way to create an empty conversion instruction.

Steve Ellcey
[EMAIL PROTECTED]


Re: How can I write an empty conversion instruction

2005-05-06 Thread Steve Ellcey
> You might want to try this instead:
> 
>   [(set (match_operand:DF 0 "fr_register_operand" "=f")
> (unspec:DF [(match_operand:SF 0 "fr_register_operand" "0")] 
> UNSPEC_NOP_EXTEND))]
> 
> -- 
> Daniel Jacobowitz
> CodeSourcery, LLC

Nope.  GCC doesn't like seeing two match_operand's for op 0.

Steve Ellcey
[EMAIL PROTECTED]


vector alignment question

2005-06-08 Thread Steve Ellcey

I noticed that vectors are always aligned based on their size, i.e.  an
8 byte vector has an alignment of 8 bytes, 16 byte vectors an alignment
of 16, a 256 byte vector an alignment of 256, etc.

Is this really intended?

I looked in stor-layout.c and found:

  /* Always naturally align vectors.  This prevents ABI changes
 depending on whether or not native vector modes are supported.  */
  TYPE_ALIGN (type) = tree_low_cst (TYPE_SIZE (type), 0);

so it seems to be intentional, but it still seems odd to me, especially
for very large vectors.

Steve Ellcey
[EMAIL PROTECTED]


Re: vector alignment question

2005-06-08 Thread Steve Ellcey
> On Wed, Jun 08, 2005 at 12:50:32PM -0700, Steve Ellcey wrote:
> > I noticed that vectors are always aligned based on their size, i.e.  an
> > 8 byte vector has an aligment of 8 bytes, 16 byte vectors an alignment
> > of 16, a 256 byte vector an alignment of 256, etc.
> > 
> > Is this really intended?
> 
> Yes.
> 
> > so it seems to be intentional, but it still seems odd to me, especially
> > for very large vectors.
> 
> Hardware usually requires such alignment.  Most folk don't use vectors
> larger than some bit of hardware supports.  One wouldn't want the ABI
> to depend on whether that bit of hardware were actually present, IMO.
> 
> r~

I guess that makes sense but I wonder if the default alignment should be
set to "MIN (size of vector, BIGGEST_ALIGNMENT)" instead so that we
don't default to an alignment larger than we know we can support.  Or
perhaps there should be a way to override the default alignment for
vectors on systems that don't require natural alignment.

Steve Ellcey
[EMAIL PROTECTED]


Re: MEMBER_TYPE_FORCES_BLK on IA-64/HP-UX

2005-07-03 Thread Steve Ellcey
> Steve Ellcey defined MEMBER_TYPE_FORCES_BLK when he first implemented 
> the ia64-hpux port.  At the time, I mentioned using PARALLELs was a 
> better solution, but this was a simpler way for him to get the initial 
> port working.  Since then, there have been a lot of bug fixes to the 
> ia64-hpux support by various people: Steve, Zack, Joseph, etc.  Looking 
> at the current code, it does appear that all cases are now handled by 
> PARALLELs, and that the definition of MEMBER_TYPE_FORCES_BLK no longer 
> appears to be necessary.
> 
> I don't have an ia64-hpux machine, so there is no easy way for me to 
> test this change.
> -- 
> Jim Wilson, GNU Tools Support, http://www.SpecifixInc.com

I am concerned about the use of MEMBER_TYPE_FORCES_BLK in stor-layout.c.
I believe that, if MEMBER_TYPE_FORCES_BLK is not defined, this code will
change the mode of a structure containing a single field from BLKmode
into the mode of the field.  I think this might mess up the parameter
passing of structures that contain a single field, particularly when
that field is smaller than 64 bits, like a single char, an int, or a
float.  I would definitely want to check the parameter passing of small
single field structures before removing MEMBER_TYPE_FORCES_BLK on
ia64-hpux.

Steve Ellcey
[EMAIL PROTECTED]


GCC testsuite timeout question (gcc.c-torture/compile/20001226-1.c)

2005-08-30 Thread Steve Ellcey

I was looking at a failure of the test
gcc.c-torture/compile/20001226-1.c on one of my machines and I see that
it is timing out on a slow machine that I have.  I tried to look around
to find out how and where the timeout limit was set and could not find
it.  Can someone explain to me how much time a compile is given and
where this limit is set?  By hand, I can compile the test in about 3 1/2
minutes on the machine in question (the machine may have been busier
when the failure occurred and thus taken longer).

Steve Ellcey
[EMAIL PROTECTED]


Re: GCC testsuite timeout question (gcc.c-torture/compile/20001226-1.c)

2005-08-31 Thread Steve Ellcey
> > By hand, I can compile the test in about 3 1/2 minutes on the machine in
> > question (the machine may have been busier when the failure occured and thus
> > taken longer). 
> 
> I think it's a real regression (memory consumption/speed) of the compiler, it 
> is timing out on all the slow SPARC machines I have (it is OK with 4.0.x).
> IIRC I observed the same regression between 3.2.x and 3.3.x on even slower 
> machines, but 3.4.x fixed it.
> 
> -- 
> Eric Botcazou

Yes, I think you are right.  I can see a substantial slowdown in
compilation times on IA64 HP-UX at -O2 (though it doesn't time out
there).

gcc 4.0.0 - 81 seconds
gcc 3.4.1 - 38 seconds
gcc 3.4.0 - 37 seconds
gcc 3.3.5 - 89 seconds
gcc 3.3.1 - 91 seconds

3.3 is slow, 3.4 is faster, and 4.0.0 seems slow again; I don't have 4.0.*
hanging around to test.

Looking at a timing report based on the 4.0.0 compiler it looks like
half the compile time is spent in the phase "dominance frontiers".

I will investigate some more.

Steve Ellcey
[EMAIL PROTECTED]


Request for testsuite help (gcc.dg/compat)

2005-09-21 Thread Steve Ellcey
I was wondering if I could get some help/advice from a testsuite expert.
I have a patch that I want to submit that makes sure elements of an
array are not given an alignment greater than their size.

See http://gcc.gnu.org/ml/gcc/2005-03/msg00729.html

This test was causing a bunch of regressions, most of which have been
fixed now by Jakub and Dorit.  But the patch still causes a couple of
regressions in the gcc.dg/compat tests that I have been unable to fix.
The failures I get are:

FAIL: tmpdir-gcc.dg-struct-layout-1/t002 c_compat_x_tst.o compile
FAIL: tmpdir-gcc.dg-struct-layout-1/t002 c_compat_y_tst.o compile
FAIL: tmpdir-gcc.dg-struct-layout-1/t027 c_compat_x_tst.o compile
FAIL: tmpdir-gcc.dg-struct-layout-1/t027 c_compat_y_tst.o compile

There used to be more layout failures but Jakub submitted a patch
earlier (May 2005) that fixed all but these.  I know that the
gcc.dg-struct-layout-1_generate program creates a t002_test.h header
file and that that file contains:

T(582,void * atal8 a[2];double b;unsigned short int c;,F(582,a[0],(void *)&intarray[78],(void *)&intarray[187])F(582,b,198407.656250,218547.203125)F(582,c,55499U,5980U))

and that atal8 is a define for "__attribute__((aligned (8)))" which
means that we get "void * __attribute__((aligned (8))) a[2];" and that
is what is causing the problem (8-byte alignment of the elements of an
array where the elements are only 4 bytes long).

But what I have not been able to do is to figure out how to get
gcc.dg-struct-layout-1_generate to stop generating this type.

Even after looking at Jakub's patch that fixed the other layout failures,
I haven't been able to come up with a fix.

Can anyone help me with this?

Steve Ellcey
[EMAIL PROTECTED]


RFC: IPO optimization framework for GCC

2005-10-07 Thread Steve Ellcey
I have been given some time by my management to work on creating a
framework for IPO optimizations in GCC by creating an intermediate file
reader and writer for GCC.

I would like to start by getting any input and advice the members of the
GCC community might have for me.  I would also like to see if I can get
some names of folks who might be interested in helping or advising me on
this project.

My current thought is that if I can get a start made I would create a
branch for this work in CVS and a project page on the GCC Wiki.

In the meantime I would be interested in any opinions people have on
what level we should be writing things out at.  Generic?  Gimple?  RTL?
(Just kidding on that last one.)  Also any opinions on what format to
write things out in: binary vs. an ASCII file?  XML?  ANDF?  If you
know of any good papers I should read I would like to hear about those
too.

Steve Ellcey
[EMAIL PROTECTED]


Re: RFC: IPO optimization framework for GCC

2005-10-10 Thread Steve Ellcey
Thanks to everyone who replied to my mail, I am currently waiting for
some follow-ups to replies I got off-list.  In the mean time I wonder if
we could talk about Devang's questions on what this might look like to
a user.

> From: Devang Patel <[EMAIL PROTECTED]>
>
> It is useful to get clear understanding of few simpler things before
> tackling IL issue.
> 
> First question is - What is the user interface ? Few alternatives :
> 
> 1)  gcc  -fenable-ipo input1.c input2.c input3.c -o output
> 
>  Here, writing IL on the disk, and reading it back, and optimizing it,
> etc.. are all hidden from users.

But at the cost of having to put all the source compiles on one GCC
command line.  We could probably do this today without reading or
writing anything to disk (as long as we didn't run out of memory).

> 2)  gcc  -fwrite-ipo input1.c -o input1.data
>  gcc  -fwrite-ipo input2.c -o input2.data
>  gcc  -fwrite-ipo input3.c -o input3.data
> 
>  gcc  -fread-ipo input1.data input2.data input3.data -o output
> 
> 3)  gcc  -fwrite-ipo input1.c -o input1.data
>  gcc  -fuse-ipo input1.data input2.c -o input2.data
>  gcc  -fuse-ipo input2.data input3.c -o output
> 
> 4)  gcc  -fwrite-ipo input1.c -o input1.data
>  gcc  -fwrite-ipo input2.c -o input2.data
>  gcc  -fwrite-ipo input3.c -o input3.data
> 
>  glo  -fread-ipo input1.data input2.data input3.data -o output

Could we just have -fwrite-ipo create a '.o' file that contains the
intermediate representation (instead of being a real object file)?

Then when the linker is called it would call the compiler with all the
files that have intermediate code instead of object code and finish up
the compilation.  Actually, maybe we could add the restriction that
you have to use GCC to call the linker when doing IPO and that way 
GCC could finish up the compilations before it calls the linker.

> Second question is - When to put info on the disk? Few alternatives,
> 1) Before gimplfication
> 2) Before optimizing tree-ssa
> 3) After tree-ssa optimization is complete
> 4) Immediately after generating RTL
> 5) Halfway throuh RTL passes
> etc.. And answer to this question largely depend on the optimization
> passes that work on whole program info.

I would think one would want to put the info out before optimizing
tree-ssa since you would hope that the IPO data from other modules would
let you do better tree-ssa optimizations.

> I do not know whether these two questions are already answered or not.

I don't think anything has been answered yet.

Steve Ellcey
[EMAIL PROTECTED]


Subversion and firewalls question

2005-10-18 Thread Steve Ellcey

Anyone have advice on how to get subversion working through a corporate
firewall?

Currently I get:

| /usr/local/bin/svn co svn+ssh://gcc.gnu.org:/svn/gcc/trunk
| ssh: gcc.gnu.org:: no address associated with hostname.
| svn: Connection closed unexpectedly

I have cvs working, I ran socksify on cvs and ssh and that seemed to
work fine for those commands and I can do checkout/checkins with cvs.
When I try to socksify svn, I get an error:

[hpsje - sje_gcc_cmo] (root) $ /opt/socks/bin/socksify  /usr/local/bin/svn
/usr/local/bin/svn->/opt/socks/bin/svn ... Found nothing to change.

I think this might be because the library calls that need to be
intercepted by socks are not in svn but in a dynamic library that is
linked in by svn.

It looks like the neon subdirectory in svn understands --with-socks=
but I don't have a socks.h header file as part of my socks installation.
Is there an GNU Socks package I can build?  I see Dante, is that
what I want?

Is using --with-socks on my subversion build the right way to be
attacking this problem?

I am trying to get this to work from my HP-UX box, if that makes a
difference.

Steve Ellcey
[EMAIL PROTECTED]


Re: Subversion and firewalls question

2005-10-18 Thread Steve Ellcey
> > Currently I get:
> > 
> > | /usr/local/bin/svn co svn+ssh://gcc.gnu.org:/svn/gcc/trunk
> > | ssh: gcc.gnu.org:: no address associated with hostname.
> > | svn: Connection closed unexpectedly
> 
> This one might easy.
> 
> You added a : at the end of gcc.gnu.org :)

Blush

It worked.

Steve Ellcey
[EMAIL PROTECTED]


Re: Excess precision problem on IA-64

2005-10-27 Thread Steve Ellcey
> > This seems like any other target which has a fused multiply and add 
> > instruction like PPC.  Maybe a target option to turn on and off the fma
> > instruction like there is for PPC.
> 
> I'm under the impression that it's worse on IA-64 because of the "infinite 
> precision", but I might be wrong.
> 
> -- 
> Eric Botcazou

The HP compiler generates fused multiply and add by default and has several
settings for the +Ofltacc option to control this (and other optimizations
that affect floating point accuracy).

+Ofltacc=default

Allows contractions, such as fused multiply-add (FMA), but
disallows any other floating point optimization that can result
in numerical differences.

+Ofltacc=limited

Like default, but also allows floating point optimizations which
may affect the generation and propagation of infinities, NaNs,
and the sign of zero.

+Ofltacc=relaxed

In addition to the optimizations allowed by limited, permits
optimizations, such as reordering of expressions, even if
parenthesized, that may affect rounding error.  This is the same
as +Onofltacc.

+Ofltacc=strict

Disallows any floating point optimization that can result in
numerical differences.  This is the same as +Ofltacc.

It would be easy enough to add an option that turned off the use of the
fused multiply and add in GCC but I would hate to see its use turned off
by default.

Steve Ellcey
[EMAIL PROTECTED]


Re: Re: Does gcc-3.4.3 for HP-UX 11.23/IA-64 work?

2005-11-08 Thread Steve Ellcey
> > As mentioned before, there is a brace missing after the gcc_s_hpux64. 
> > This brace is needed to close off the shared-libgcc rule before the 
> > static-libgcc rule starts.  You then must delete a brace from the end of 
> > the !static rule which has one too many.
> 
> Yes, doing so gives the correct 'gcc -shared' output.

I am not convinced there is a bug here.  I think there may have been a
deliberate change between 3.4.* and 4.* about whether or not '-shared'
implied '-shared-libgcc', particularly for C code.  I notice that if I
compile using 3.4.4 and use '-shared -shared-libgcc' instead of just
'-shared' then it works as you want.

Steve Ellcey
[EMAIL PROTECTED]


Re: GMP on IA64-HPUX

2005-12-05 Thread Steve Ellcey
> > >   So, in short, my questions are: is gmp-4.1.4 supposed to work on
> > >   ia64-hpux?
> > >
> > > No, it is not.  It might be possible to get either the LP64 or
> > > the ILP32 ABI to work, but even that requires the workaround you
> > > mention.  Don't expect any HP compiler to compile GMP correctly
> > > though, unless you switch off optimization.
> > >
> 
> If it's really compiler problems, this is one more reason for pulling
> gmp to the toplevel gcc, so it can be built with a sane compiler.
> 
> Richard.

FYI:  What I do to compile gmp on IA64 HP-UX is to configure gmp with
'--host=none --target=none --build=none'.  This avoids all the target
specific code.  I am sure the performance stinks this way but since it
is used by the compiler and not in the run-time I haven't found it to be
a problem.  Of course I don't compile any big fortran programs either.

Steve Ellcey
[EMAIL PROTECTED]


GCC 3.4.5 status?

2005-12-05 Thread Steve Ellcey

Has GCC 3.4.5 been officially released?  I don't recall seeing an
announcement in gcc@gcc.gnu.org or [EMAIL PROTECTED] and when I
looked on the main GCC page I saw references to GCC 3.4.4 but not
3.4.5.  But I do see a 3.4.5 download on the GCC mirror site that I
checked and I see a gcc_3_4_5_release tag in the SVN tags directory.

I also notice we have a "Releases" link under "About GCC" in the top
left corner of the main GCC page that doesn't look like it has been
updated in quite a while for any releases.  Should this be updated or
removed?

Steve Ellcey
[EMAIL PROTECTED]


Re: GCC can't stop using GNU libiconv if it's in /usr/local

2006-01-17 Thread Steve Ellcey
> IMHO, the fact that GCC includes /usr/local/include by default in its
> system header search path is brain damaged, but it's probably way too
> entrenched to revisit that. :-(
> 
>   --Kaveh
> --
> Kaveh R. Ghazi[EMAIL PROTECTED]

You can stop this by specifying --with-local-prefix=/not-usr-local when
configuring GCC.

I have built a GCC into a location like /be by specifying both
--prefix=/be and --with-local-prefix=/be

This GCC does not look in /usr/local/include (but does search
/be/include).

Steve Ellcey
[EMAIL PROTECTED]


Question about DRAP register and reserving hard registers

2015-06-16 Thread Steve Ellcey

I have a question about the DRAP register (used for dynamic stack alignment)
and about reserving/using hard registers in general.  I am trying to understand
where, if a drap register is allocated, GCC is told not to use it during
general register allocation.  There must be some code somewhere for this
but I cannot find it.

I am trying to implement dynamic stack alignment on MIPS and because there
is so much code for the x86 dynamic stack alignment I am trying to incorporate
bits of it as I understand what I need instead of just turning it all on
at once and getting completely lost.

Right now I am using register 16 on MIPS to access incoming arguments
in a function that needs dynamic alignment, so it is my drap register if
my understanding of the x86 code and its use of a DRAP register is correct.
I copy the stack pointer into reg 16 before I align the stack pointer
(during expand_prologue).  So far the only way I have found to stop the
register allocator from also using reg 16 and thus messing up its value is to
set fixed_regs[16].  But I don't see the x86 doing this for its DRAP register
and I was wondering how it is handled there.

I think setting fixed_regs[16] is why C++ tests with exception handling are
not working for me because this register is not getting set and restored
(since it is thought to be fixed) during code that uses throw and catch.

Steve Ellcey
sell...@imgtec.com


Re: Question about DRAP register and reserving hard registers

2015-06-19 Thread Steve Ellcey
On Fri, 2015-06-19 at 09:09 -0400, Richard Henderson wrote:
> On 06/16/2015 07:05 PM, Steve Ellcey  wrote:
> >
> > I have a question about the DRAP register (used for dynamic stack alignment)
> > and about reserving/using hard registers in general.  I am trying to 
> > understand
> > where, if a drap register is allocated, GCC is told not to use it during
> > general register allocation.  There must be some code somewhere for this
> > but I cannot find it.
> 
> There isn't.  Because the vDRAP register is a pseudo.  The DRAP register is 
> only live from somewhere in the middle of the prologue to the end of the 
> prologue.
> 
> See ix86_get_drap_rtx, wherein we coordinate with the to-be-generated 
> prologue 
> (crtl->drap_reg), allocate the pseudo, and emit the hard-reg-to-pseudo copy 
> at 
> entry_of_function.
> 
> 
> r~

OK, that makes more sense now.  In my work on MIPS I was trying to cut
out some of the complexity of the x86 implementation and just use a hard
register as my DRAP register.  One of the issues I ran into, and perhaps
the one that caused x86 to use a virtual register, was saving and
restoring the register during setjmp/longjmp and C++ exception handling
usage.  I will trying switching to a virtual register and see if that
works better.

Other than exceptions, the main complexity in dynamic stack alignment
seems to involve the debug information.  I am still trying to understand
the handling of the drap register and dynamic stack alignment in
dwarf2out.c and dwarf2cfi.c.

Steve Ellcey
sell...@imgtec.com



Re: Question about DRAP register and reserving hard registers

2015-06-22 Thread Steve Ellcey
On Fri, 2015-06-19 at 09:09 -0400, Richard Henderson wrote:
> On 06/16/2015 07:05 PM, Steve Ellcey  wrote:
> >
> > I have a question about the DRAP register (used for dynamic stack alignment)
> > and about reserving/using hard registers in general.  I am trying to 
> > understand
> > where, if a drap register is allocated, GCC is told not to use it during
> > general register allocation.  There must be some code somewhere for this
> > but I cannot find it.
> 
> There isn't.  Because the vDRAP register is a pseudo.  The DRAP register is 
> only live from somewhere in the middle of the prologue to the end of the 
> prologue.
> 
> See ix86_get_drap_rtx, wherein we coordinate with the to-be-generated 
> prologue 
> (crtl->drap_reg), allocate the pseudo, and emit the hard-reg-to-pseudo copy 
> at 
> entry_of_function.
> 
> 
> r~

OK, I think I have this part of the code working on MIPS but
crtl->drap_reg is used in the epilogue as well as the prologue even if
it is not 'live' in between.  If I understand the code correctly the x86
prologue pushes the drap register on to the stack so that the epilogue
can pop it off and use it to restore the stack pointer.  Is my
understanding correct?

I also need the drap pointer in the MIPS epilogue but I would like to
avoid having to get it from memory.  Ideally I would like to restore it
from the virtual register that the prologue code / get_drap_rtx code put
it into.  I tried just doing a move from the virtual drap register to
the real one in expand_epilogue but that didn't work because it looks
like you can't access virtual registers from expand_prologue or
expand_epilogue.  I guess that is why the code to copy the hard drap reg
to the virtual drap_reg is done in get_drap_reg and not in
expand_prologue.  I thought about putting code in get_drap_reg to do
this copying but I don't see how to access the end of a function.  The
hard drap reg to virtual drap reg copy is inserted into the beginning of
a function with:

insn = emit_insn_before (seq, NEXT_INSN (entry_of_function ()));

Is there an equivalent method to insert code to the end of a function?
I don't see an 'end_of_function ()' routine anywhere.

Steve Ellcey
sell...@imgtec.com







Re: Question about DRAP register and reserving hard registers

2015-07-07 Thread Steve Ellcey
On Mon, 2015-06-29 at 11:10 +0100, Richard Henderson wrote:

> > I also need the drap pointer in the MIPS epilogue but I would like to
> > avoid having to get it from memory.  Ideally I would like to restore it
> > from the virtual register that the prologue code / get_drap_rtx code put
> > it into.  I tried just doing a move from the virtual drap register to
> > the real one in expand_epilogue but that didn't work because it looks
> > like you can't access virtual registers from expand_prologue or
> > expand_epilogue.  I guess that is why the code to copy the hard drap reg
> > to the virtual drap_reg is done in get_drap_reg and not in
> > expand_prologue.  I thought about putting code in get_drap_reg to do
> > this copying but I don't see how to access the end of a function.  The
> > hard drap reg to virtual drap reg copy is inserted into the beginning of
> > a function with:
> >
> > insn = emit_insn_before (seq, NEXT_INSN (entry_of_function ()));
> >
> > Is there an equivalent method to insert code to the end of a function?
> > I don't see an 'end_of_function ()' routine anywhere.
> 
> Because, while generating initial rtl for a function, the beginning of a 
> function has already been emitted, while the end of the function hasn't.
> 
> You'd need to hook into expand_function_end, right at the bottom, before the 
> call to use_return_register.
> 
> 
> r~

I ran into an interesting issue while doing this.  Right now the expand
pass calls construct_exit_block (which calls expand_function_end) before
it calls expand_stack_alignment.  That means that crtl->drap_reg, etc
are not yet set up when in expand_function_end.  I moved the
expand_stack_alignment call up before construct_exit_block to fix that.
I hope moving it up doesn't break anything.

Steve Ellcey
sell...@imgtec.com



Re: Question about DRAP register and reserving hard registers

2015-07-09 Thread Steve Ellcey
On Mon, 2015-06-29 at 11:10 +0100, Richard Henderson wrote:

> > OK, I think I have this part of the code working on MIPS but
> > crtl->drap_reg is used in the epilogue as well as the prologue even if
> > it is not 'live' in between.  If I understand the code correctly the x86
> > prologue pushes the drap register on to the stack so that the epilogue
> > can pop it off and use it to restore the stack pointer.  Is my
> > understanding correct?
> 
> Yes.  Although that saved copy is also used by unwind info.

Do you know how and where this saved copy is used by the unwind info?
I don't see any indication that the unwind library knows if a stack has
been dynamically realigned and I don't see where unwind makes use of
this value.

Steve Ellcey
sell...@imgtec.com



Basic GCC testing question

2015-07-10 Thread Steve Ellcey

I have a basic GCC testing question.  I built a native GCC and ran:

make RUNTESTFLAGS='dg.exp' check

Everything passed and according to the log file it used the unix.exp
as the target-board.  But if I try running:

make RUNTESTFLAGS='dg.exp --target-board=unix' check

Then I get failures.  They both say they are running target unix.
If I diff the two log files I see:

1,2c1,3
< Test Run By sellcey on Fri Jul 10 10:13:21 2015
< Native configuration is x86_64-unknown-linux-gnu
---
> Test Run By sellcey on Fri Jul 10 09:52:41 2015
> Target is unix
> Host   is x86_64-unknown-linux-gnu
12a14,15
> WARNING: Assuming target board is the local machine (which is probably wrong).
> You may need to set your DEJAGNU environment variable.

The reason I want to specify a target-board is so I can then modify it with
something like '--target-board=unix/-m32' but I think I need to specify a
board before I add any options don't I?

Steve Ellcey
sell...@imgtec.com


Re: Basic GCC testing question

2015-07-10 Thread Steve Ellcey
On Fri, 2015-07-10 at 14:27 -0500, Segher Boessenkool wrote:
> On Fri, Jul 10, 2015 at 10:43:43AM -0700, Steve Ellcey  wrote:
> > 
> > I have a basic GCC testing question.  I built a native GCC and ran:
> > 
> > make RUNTESTFLAGS='dg.exp' check
> > 
> > Everything passed and according to the log file it used the unix.exp
> > as the target-board.  But if I try running:
> > 
> > make RUNTESTFLAGS='dg.exp --target-board=unix' check
> 
> Does it work better if you spell --target_board ?
> 
> 
> Segher


Argh, I hate it when I do something stupid like that.  It would be nice
if runtest gave an error message when it had a bad/unknown argument, but
if it does I didn't see it anywhere.
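For reference, the spelling DejaGnu actually accepts is --target_board
(underscore, not hyphen), and extra options are appended to the board name
after a slash, so the invocation from the original question would be:

```shell
# Correct spelling: --target_board with an underscore.
# Compiler options are appended to the board name after a slash.
make RUNTESTFLAGS='dg.exp --target_board=unix/-m32' check
```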

Steve Ellcey



CFI directives and dynamic stack alignment

2015-08-03 Thread Steve Ellcey

I don't know if there are any CFI experts out there but I am working on
dynamic stack alignment for MIPS.  I think I have it working in the 'normal'
case but when I try to do stack unwinding through a routine with an aligned
stack, then I have problems.  I was wondering if someone can help me understand
what CFI directives to generate to allow stack unwinding.  Using
gcc.dg/cleanup-8.c as an example (because it fails with my stack alignment
code), if I generate code with no dynamic stack alignment (but forcing the
use of the frame pointer), the routine fn2 looks like this on MIPS:

fn2:
.frame  $fp,32,$31  # vars= 0, regs= 2/0, args= 16, gp= 8
.mask   0xc0000000,-4
.fmask  0x00000000,0
.set    noreorder
.set    nomacro
lui $2,%hi(null)
addiu   $sp,$sp,-32
.cfi_def_cfa_offset 32
lw  $2,%lo(null)($2)
sw  $fp,24($sp)
.cfi_offset 30, -8
move    $fp,$sp
.cfi_def_cfa_register 30
sw  $31,28($sp)
.cfi_offset 31, -4
jal abort
sb  $0,0($2)

There are .cfi directives when incrementing the stack pointer, saving the
frame pointer, and copying the stack pointer to the frame pointer.

When I generate code to dynamically align the stack my code looks like
this:

fn2:
.frame  $fp,32,$31  # vars= 0, regs= 2/0, args= 16, gp= 8
.mask   0xc0000000,-4
.fmask  0x00000000,0
.set    noreorder
.set    nomacro
lui $2,%hi(null)
li  $3,-16  # 0xfffffff0
lw  $2,%lo(null)($2)
and $sp,$sp,$3
addiu   $sp,$sp,-32
.cfi_def_cfa_offset 32
sw  $fp,24($sp)
.cfi_offset 30, -8
move    $fp,$sp
.cfi_def_cfa_register 30
sw  $31,28($sp)
.cfi_offset 31, -4
jal abort
sb  $0,0($2)

The 'and' instruction is where the stack gets aligned and if I remove that
one instruction, everything works.  I think I need to put out some new CFI
pseudo-ops to handle this but I am not sure what they should be.  I am just
not very familiar with the CFI directives.

I looked at ix86_emit_save_reg_using_mov where there is some special
code for handling the drap register and for saving registers on a 
realigned stack but I don't really understand what they are trying 
to do.

Any help?

Steve Ellcey
sell...@imgtec.com

P.S. For completeness sake I have attached my current dynamic
 alignment changes in case anyone wants to see them.

diff --git a/gcc/cfgexpand.c b/gcc/cfgexpand.c
index 4f9a31d..386c2ce 100644
--- a/gcc/cfgexpand.c
+++ b/gcc/cfgexpand.c
@@ -5737,6 +5737,29 @@ expand_stack_alignment (void)
   gcc_assert (targetm.calls.get_drap_rtx != NULL);
   drap_rtx = targetm.calls.get_drap_rtx ();
 
+  /* I am not doing this in get_drap_rtx because we are also calling
+ that from expand_function_end in order to get/set the drap_reg
+ and vdrap_reg variables and doing these instructions at that
+ point is not working.   */
+
+  if (drap_rtx != NULL_RTX)
+{
+  rtx_insn *insn, *seq;
+
+  start_sequence ();
+  emit_move_insn (crtl->vdrap_reg, crtl->drap_reg);
+  seq = get_insns ();
+  insn = get_last_insn ();
+  end_sequence ();
+  emit_insn_at_entry (seq);
+  if (!optimize)
+{
+  add_reg_note (insn, REG_CFA_SET_VDRAP, crtl->vdrap_reg);
+  RTX_FRAME_RELATED_P (insn) = 1;
+}
+}
+
+
   /* stack_realign_drap and drap_rtx must match.  */
   gcc_assert ((stack_realign_drap != 0) == (drap_rtx != NULL));
 
diff --git a/gcc/config/mips/mips.c b/gcc/config/mips/mips.c
index ce21a0f..b6ab30a 100644
--- a/gcc/config/mips/mips.c
+++ b/gcc/config/mips/mips.c
@@ -746,6 +746,8 @@ static const struct attribute_spec mips_attribute_table[] = {
   { "use_shadow_register_set",	0, 0, false, true,  true, NULL, false },
   { "keep_interrupts_masked",	0, 0, false, true,  true, NULL, false },
   { "use_debug_exception_return", 0, 0, false, true,  true, NULL, false },
+  { "align_stack", 0, 0, true, false, false, NULL, false },
+  { "no_align_stack", 0, 0, true, false, false, NULL, false },
   { NULL,	   0, 0, false, false, false, NULL, false }
 };
 
@@ -1528,6 +1530,61 @@ mips_merge_decl_attributes (tree olddecl, tree newdecl)
 			   DECL_ATTRIBUTES (newdecl));
 }
 
+static bool
+mips_cfun_has_msa_p (void)
+{
+  /* For now, for testing, assume all functions use MSA
+ (and thus need alignment).  */
+#if 0
+  if (!cfun || !TARGET_MSA)
+return FALSE;
+
+  for (insn = get_insns (); insn; insn = NEXT_INSN (insn))
+{
+  if (MSA_SUPPORTED_MODE_P (GET_MODE (insn)))
+	return TRUE;
+}
+
+  return FALSE;
+#else
+  return TRUE;
+#endif
+}
+
+bool
+mips_align_stack_p (void)
+{
+  bool want_alignment = TARGET_ALIGN_STACK && 

Re: CFI directives and dynamic stack alignment

2015-08-17 Thread Steve Ellcey
On Tue, 2015-08-11 at 10:05 +0930, Alan Modra wrote:

> > The 'and' instruction is where the stack gets aligned and if I remove that
> > one instruction, everything works.  I think I need to put out some new CFI
> > pseudo-ops to handle this but I am not sure what they should be.  I am just
> > not very familiar with the CFI directives.
> 
> I don't speak mips assembly very well, but it looks to me that you
> have more than just CFI problems.  How do you restore sp on return
> from the function, assuming sp wasn't 16-byte aligned to begin with?
> Past that "and $sp,$sp,$3" you don't have any means of calculating
> the original value of sp!  (Which of course is why you also can't find
> a way of representing the frame address.)

I have code in expand_prologue that copies the incoming stack pointer to
a temporary hard register and then I have code to the entry_block to
copy that register into a virtual register.  In the exit block that
virtual register is copied back to a temporary hard register and
expand_epilogue copies it back to $sp to restore the stack pointer.

This function (fn2) ends with a call to abort, which is noreturn, so the
optimizer sees that the epilogue is dead code and GCC determines that
there is no need to save the old stack pointer since it will never get
restored.  I guess I need to tell GCC to save the stack pointer in
expand_prologue even if it never sees a use for it, perhaps by making
the temporary register where I save $sp volatile or doing something
else so that the assignment (and its associated .cfi) is not deleted by
the optimizer.

Steve Ellcey
sell...@imgtec.com



Adding an IPA pass question (pass names)

2015-08-19 Thread Steve Ellcey

I am trying to create a new IPA pass to scan the routines being compiled
by GCC and I thought I would put it in after the last IPA pass (comdats)
so I tried to register it with:

  opt_pass *p = make_pass_ipa_frame_header_opt (g);
  static struct register_pass_info f = 
{p, "comdats", 1, PASS_POS_INSERT_AFTER };
  register_pass (&f);

But when I build GCC I get:

/scratch/sellcey/repos/header2/src/gcc/libgcc/libgcc2.c:1:0: fatal error: pass 
'comdats' not found but is referenced by new pass 'frame-header-opt'

Does anyone know why this is the case?  "comdats" is what is used for
the name of pass_ipa_comdats in ipa-comdats.c.

Steve Ellcey
sell...@imgtec.com


Re: Adding an IPA pass question (pass names)

2015-08-19 Thread Steve Ellcey
On Wed, 2015-08-19 at 13:40 -0400, David Malcolm wrote:

> Is your pass of the correct type?  (presumably IPA_PASS).  I've run into
> this a few times with custom passes (which seems to be a "gotcha");
> position_pass can fail here:
> 
>   /* Check if the current pass is of the same type as the new pass and
>  matches the name and the instance number of the reference pass.  */
>   if (pass->type == new_pass_info->pass->type
> 
> 
> Hope this is helpful
> Dave

That seems to have been the problem.   I made my pass SIMPLE_IPA_PASS
and the comdats pass is just IPA_PASS.  I changed mine to IPA_PASS and
it now registers the pass.

Steve Ellcey
sell...@imgtec.com



Re: CFI directives and dynamic stack alignment

2015-08-24 Thread Steve Ellcey
On Tue, 2015-08-18 at 09:23 +0930, Alan Modra wrote:
> On Mon, Aug 17, 2015 at 10:38:22AM -0700, Steve Ellcey wrote:

> OK, then you need to emit a .cfi directive to say the frame top is
> given by the temp hard reg sometime after that assignment and before
> sp is aligned in the prologue, and another .cfi directive when copying
> to the pseudo.  It's a while since I looked at the CFI code in gcc,
> but arranging this might be as simple as setting RTX_FRAME_RELATED_P
> on the insns involved.
> 
> If -fasynchronous-unwind-tables, then you'll also need to track the
> frame in the epilogue.
> 
> > This function (fn2) ends with a call to abort, which is noreturn, so the
> > optimizer sees that the epilogue is dead code and GCC determines that
> > there is no need to save the old stack pointer since it will never get
> > restored.   I guess I need to tell GCC to save the stack pointer in
> > expand_prologue even if it never sees a use for it.  I guess I need to
> > make the temporary register where I save $sp volatile or do something
> > else so that the assignment (and its associated .cfi) is not deleted by
> > the optimizer.
> 
> Ah, I see.  Yes, the temp and pseudo are not really dead if they are
> needed for unwinding.

Yes, I was originally thinking I just had to make the temp and pseudo
regs volatile so that the assignments would not get removed but it
appears that I need the epilogue code too (even if I never get there
because of a call to abort which GCC knows is non-returning) so that I
have the needed .cfi directives there.  I am thinking I should add an
edge from the entry_block to the exit_block so that the exit block is
never removed by the optimizer.  I assume this edge would need to be
abnormal and/or fake but I am not sure which (if either) of these edges
would be appropriate for this.

Steve Ellcey
sell...@imgtec.com



fake/abnormal/eh edge question

2015-08-25 Thread Steve Ellcey
I have a question about FAKE, EH, and ABNORMAL edges.  I am not sure I 
understand all the implications of each type of edge from the description
in cfg-flags.def.

I am trying to implement dynamic stack alignment for MIPS and I have code
that does the following:

prologue
copy incoming $sp to $12 (temp reg)
align $sp
copy $sp to $fp (after alignment so that $fp is also aligned)
entry block
copy $12 to virtual reg (DRAP) for accessing args and for
restoring $sp

exit block
copy virtual reg (DRAP) back to $12
epilogue
copy $12 to $sp to restore stack pointer


This works fine as long as there as a path from the entry block to the
exit block but in some cases (like gcc.dg/cleanup-8.c) we have a function
that always calls abort (a non-returning function) and so there is no 
path from entry to exit and the exit block and epilogue get removed and
the copy of $sp to $12 also gets removed because GCC sees no uses of $12.

I want to preserve the copy of $sp to $12 and I also want to preserve the
.cfi psuedo-ops (and code) in the exit block and epilogue in order for
exception handling to work correctly.  One way I thought of doing this
is to create an edge from the entry block to the exit block but I am
unsure of all the implications of creating a fake/eh/abnormal edge to
do this and which I would want to use.

Steve Ellcey
sell...@imgtec.com


Re: fake/abnormal/eh edge question

2015-08-25 Thread Steve Ellcey
On Tue, 2015-08-25 at 14:44 -0600, Jeff Law wrote:

> > I want to preserve the copy of $sp to $12 and I also want to preserve the
> > .cfi pseudo-ops (and code) in the exit block and epilogue in order for
> > exception handling to work correctly.  One way I thought of doing this
> > is to create an edge from the entry block to the exit block but I am
> > unsure of all the implications of creating a fake/eh/abnormal edge to
> > do this and which I would want to use.
> Presumably it's the RTL DCE pass that's eliminating this stuff?

Actually, it looks like it is peephole2 that is eliminating the
instructions (and .cfi pseudo-ops).

> 
> Do you have the FRAME_RELATED bit set of those insns?
> 
> But what I don't understand is why preserving the code is useful if it 
> can't be reached.  Maybe there's something about the dwarf2 unwinding 
> that I simply don't understand -- I've managed to avoid learning about 
> it for years.

I am not entirely sure whether I need the code itself or just the .cfi
pseudo-ops, though I do need the code in order to generate the .cfi stuff.

I wish I could avoid the dwarf unwinder but that seems to be the main
problem I am having with stack realignment.  Getting the cfi stuff right
so that the unwinder works properly is proving very hard.

Steve Ellcey
sell...@imgtec.com




GTY / gengtype question - adding a new header file

2015-08-31 Thread Steve Ellcey

I have a question about gengtype and GTY.  I was looking at adding some
code to mips.c and it occurred to me that that file was getting very
large (19873 lines).  So I wanted to add a new .c file instead but that
file needed some types that were defined in mips.c and not in a header file.
Specifically it needed the MIPS specific machine_function structure that
is defined in mips.c with:

struct GTY(())  machine_function {

I think I could just move this to mips.h and things would be fine but
I didn't want to do that because mips.h is included in tm.h and is visible
to the generic GCC code.  Currently machine_function is not visible to the
generic GCC code and so I wanted to put machine_function in a header file
that could only be seen/used by mips specific code.  So I created
mips-private.h and added it to extra_headers in config.gcc.

The problem is that if I include mips-private.h in mips.c instead of
having the actual definition of machine_function in mips.c then my
build fails and I think it is due to how and where gengtype scans for GTY
uses.

I couldn't find an example of a platform that has a machine specific header
file that was not visible to the generic GCC code and that has GTY types
in it so I am not sure what I need to do to get gengtype to scan
mips-private.h or if this is even possible (or wise).

Steve Ellcey
sell...@imgtec.com


Re: GTY / gengtype question - adding a new header file

2015-09-01 Thread Steve Ellcey
On Tue, 2015-09-01 at 08:11 +0100, Richard Sandiford wrote:

> config.gcc would need to add mips-private.h to target_gtfiles.

OK, that was what I missed.
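For the record, the fragment would look something like this (a hedged
sketch; the exact variable syntax is an assumption patterned on how other
stanzas in config.gcc extend such lists):

```shell
# Hypothetical config.gcc addition for the mips*-*-* stanza: files listed
# in target_gtfiles are scanned by gengtype for GTY(()) markers.
target_gtfiles="$target_gtfiles \$(srcdir)/config/mips/mips-private.h"
```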

> I'm not sure splitting the file is a good idea though.  At the moment
> the definitions of all target hooks must be visible to a single TU.
> Either you'd need to keep all the hooks in one .c file (leading
> to an artificial split IMO) or you'd need declare some of them
> in the private header.  Declaring them in the header file would only be
> consistent if the targetm definition was in its own file (so that _every_
> hook had a prototype in the private header).  That seems like unnecessary
> work though.

The code I want to add is actually a separate GCC pass so it breaks out
fairly cleanly.  It just needs access to the machine_function structure
and the types and structures included in that structure
(mips_frame_info, mips_int_mask, and mips_shadow_set).  It sets a couple
of new boolean variables in the machine_function structure which are
then used during mips_compute_frame_info.

I see what you mean about much of mips.c probably not being splittable
due to the target hook structure but machine specific passes may be the
exception to that rule.  We already have one pass in mips.c
(pass_mips_machine_reorg2), that might be something else that could be
broken out, though I haven't looked in detail to see what types or
structures it would need access to.

Steve Ellcey
sell...@imgtec.com



Re: GTY / gengtype question - adding a new header file

2015-09-01 Thread Steve Ellcey
On Tue, 2015-09-01 at 10:13 +0200, Georg-Johann Lay wrote:

> 
> I'd have a look at what BEs are using non-default target_gtfiles.
> 
> Johann

There are a few BEs that add a .c file to target_gtfiles, but no
platforms that add a .h file to target_gtfiles.  I do see a number
of platforms that define the machine_function structure in their header
file (aarch64.h, pa.h, i386.h) instead of their .c file though.

Maybe that is a better way to go for MIPS instead of doing something
completely new.  If I move machine_function, mips_frame_info,
mips_int_mask, and mips_shadow_set from mips.c to mips.h then I could
put my new machine specific pass in a separate .c file from mips.c and
not need to do anything with target_gtfiles.  The only reason I didn't
want to do this was so that machine_function wasn't visible to the rest
of GCC but that doesn't seem to have been an issue for other targets.

Steve Ellcey
sell...@imgtec.com




Build problem with libgomp on ToT?

2015-09-10 Thread Steve Ellcey
I just ran into this build failure last night:

/usr/bin/install: cannot create regular file 
`/scratch/sellcey/repos/nightly/install-mips-mti-linux-gnu/lib/gcc/mips-mti-linux-gnu/6.0.0/finclude/omp_lib_kinds.mod':
 File exists

This is on a parallel make install (-j 7) with multilibs.  I don't see an
obvious patch that could have caused this new failure, has anyone else run
into this?  I couldn't find anything in the bug database or in the mailing
lists.

Steve Ellcey
sell...@imgtec.com


TARGET_PROMOTE_PROTOTYPES question

2015-10-20 Thread Steve Ellcey
I have a question about the TARGET_PROMOTE_PROTOTYPES macro.  This macro
says that types like short or char should be promoted to ints when
passed as arguments, even if there is a prototype for the argument.

Now when I look at the code generated on MIPS or x86 it looks like there
is conversion code in both the caller and the callee.  For example:

int foo(char a, short b) { return a+b; }
int bar (int a) { return foo(a,a); }


In the rtl expand dump (on MIPS) I see this in bar:

(insn 6 3 7 2 (set (reg:SI 200)
(sign_extend:SI (subreg:HI (reg/v:SI 199 [ a ]) 2))) x.c:2 -1
 (nil))
(insn 7 6 8 2 (set (reg:SI 201)
(sign_extend:SI (subreg:QI (reg/v:SI 199 [ a ]) 3))) x.c:2 -1
 (nil))

Which ensures that we pass the arguments as ints.
And in foo we have:

(insn 8 9 10 2 (set (reg/v:SI 197 [ a+-3 ])
(sign_extend:SI (subreg:QI (reg:SI 198) 3))) x.c:1 -1
 (nil))
(insn 10 8 11 2 (set (reg/v:SI 199 [ b+-2 ])
(sign_extend:SI (subreg:HI (reg:SI 200) 2))) x.c:1 -1
 (nil))

Which makes sure we do a truncate/extend before using the values.

Now I know that we can't get rid of these truncation/extensions 
entirely, but do we need both?  It seems like foo could say that
if the original registers (198 and 200) are argument registers
that were extended to SImode due to TARGET_PROMOTE_PROTOTYPES
then we don't need to do the truncation/extension in the callee
and could just use the SImode values directly.  Am I missing
something?  Or are we doing both just to have belts and suspenders
and want to keep it that way?

Steve Ellcey
sell...@imgtec.com


_Fract types and conversion routines

2015-10-27 Thread Steve Ellcey

I have a question about the _Fract types and their conversion routines.
If I compile this program:

extern void abort (void);
int main ()
{
  signed char a = -1;
  _Sat unsigned _Fract b = a;
  if (b != 0.0ur)
abort();
  return 0;
}

with -O0 and on a MIPS32 system where char is 1 byte and unsigned (int)
is 4 bytes I see a call to '__satfractqiuhq' for the conversion.

Now I think the 'qi' part of the name is for the 'from type' of the
conversion, a 1 byte signed type (signed char), and the 'uhq' part is
for the 'to' part of the conversion.  But 'uhq' would be a 2 byte
unsigned fract, and the unsigned fract type on MIPS should be 4 bytes
(unsigned int is 4 bytes).  So shouldn't GCC have generated a call to
__satfractqiusq instead?  Or am I confused?

Steve Ellcey
sell...@imgtec.com


Re: _Fract types and conversion routines

2015-10-28 Thread Steve Ellcey
On Wed, 2015-10-28 at 13:42 +0100, Richard Biener wrote:
> On Wed, Oct 28, 2015 at 12:23 AM, Steve Ellcey  wrote:
> >
> > I have a question about the _Fract types and their conversion routines.
> > If I compile this program:
> >
> > extern void abort (void);
> > int main ()
> > {
> >   signed char a = -1;
> >   _Sat unsigned _Fract b = a;
> >   if (b != 0.0ur)
> > abort();
> >   return 0;
> > }
> >
> > with -O0 and on a MIPS32 system where char is 1 byte and unsigned (int)
> > is 4 bytes I see a call to '__satfractqiuhq' for the conversion.
> >
> > Now I think the 'qi' part of the name is for the 'from type' of the
> > conversion, a 1 byte signed type (signed char), and the 'uhq' part is
> > for the 'to' part of the conversion.  But 'uhq' would be a 2 byte
> > unsigned fract, and the unsigned fract type on MIPS should be 4 bytes
> > (unsigned int is 4 bytes).  So shouldn't GCC have generated a call to
> > __satfractqiusq instead?  Or am I confused?
> 
> did it eventually narrow the comparison?  Just check some of the tree/RTL 
> dumps.
> 
> > Steve Ellcey
> > sell...@imgtec.com

Hm, it looks like it optimized this in expand.  In the last tree dump it
still looks like:

b_2 = (_Sat unsigned _Fract) a_1;

But in the expand phase it becomes:

(call_insn/u 13 12 14 2 (parallel [
(set (reg:UHQ 2 $2)
(call (mem:SI (symbol_ref:SI ("__satfractqiuhq") [flags 0x41]) 
[0  S4 A32])
(const_int 16 [0x10])))
(clobber (reg:SI 31 $31))
])

I think this is a legitimate optimization (though I am compiling at -O0
so I wonder if it should really be doing this).  The problem I am
looking at is that I want to remove 'TARGET_PROMOTE_PROTOTYPES' because
it is causing us to promote/sign-extend types in both the caller and the
callee.  The MIPS ABI requires this to be done in the caller, so it
should not need to be done in the callee as well.

See https://gcc.gnu.org/ml/gcc/2015-10/msg00149.html

When I ran the testsuite, I got one regression: 
gcc.dg/fixed-point/convert-sat.c.

When looking at that failure I thought the problem might be that I was calling
__satfractqiuhq instead of __satfractqiusq, but that does not seem to be the
issue.  The call to __satfractqiuhq is correct, and the difference that I see
when I don't define TARGET_PROMOTE_PROTOTYPES is that the result of 
__satfractqiuhq
is not truncated/sign-extended to UHQ mode inside of __satfractqiuhq.
I am looking to see if I need to do something with TARGET_PROMOTE_FUNCTION_MODE
to handle _Fract types differently than what 
default_promote_function_mode_always_promote
does.

I tried updating PROMOTE_MODE to handle _Fract modes (by promoting UHQ to USQ 
or SQ) but
that caused more failures than before.  It seems to be only the return of 
partial word
_Fract types that is causing me a problem.

Steve Ellcey
sell...@imgtec.com



Re: _Fract types and conversion routines

2015-10-28 Thread Steve Ellcey

You can ignore that last email.  I think I finally found where the
problem is.  In the main program:

extern void abort (void);
int main ()
{
  signed char a = -1;
  _Sat unsigned _Fract b = a;
  if (b != 0.0ur)
abort();
  return 0;
}

If I compile with -O0, I see:

li  $2,-1   # 0x
sb  $2,24($fp)
lbu $4,24($fp)
jal __satfractqiuhq

We put -1 in register $2, store the byte, then load the byte as an
unsigned char instead of a signed char.  When TARGET_PROMOTE_PROTOTYPES
was defined it didn't matter because __satfractqiuhq did another sign
extend before using the value.  When I got rid of
TARGET_PROMOTE_PROTOTYPES, that extra sign extend went away and the fact
that we are doing a 'lbu' unsigned load instead of a 'lb' signed byte
load triggered the bug.  Now I just need to find out why we are doing an
lbu instead of an lb.

Steve Ellcey
sell...@imgtec.com




Re: _Fract types and conversion routines

2015-10-29 Thread Steve Ellcey

OK, I think I understand what is happening with the MIPS failure when
converting 'signed char' to '_Sat unsigned _Fract' after I removed
the TARGET_PROMOTE_PROTOTYPES macro.

This bug is a combination of two factors: one is that calls to library
functions (like __satfractqiuhq) don't necessarily get the right type
promotion (specifically with regard to signedness) of their arguments,
and the other is that __satfractqiuhq doesn't deal with that problem
correctly, though I think it is supposed to.

Reading emit_library_call_value_1 I see comments like:

  /* Todo, choose the correct decl type of orgfun. Sadly this information
 isn't present here, so we default to native calling abi here.  */

So I think that when calling a library function like '__satfractqiuhq'
which takes a signed char argument or calling a library function like
__satfractunsqiuhq which takes an unsigned char argument
emit_library_call_value_1 cannot ensure that the right type of extension
(signed vs unsigned) is done on the argument when it is put in the
argument register.  Does this sound like a correct understanding of the
limitation in emit_library_call_value_1?

I don't see this issue on regular non-library calls, presumably because
the compiler has all the information needed to do correct explicit
conversions.

When I look at the preprocessed __satfractqiuhq code I see:

unsigned short _Fract
__satfractqiuhq (signed char a) {

signed char x = a;
low = (short) x;

When TARGET_PROMOTE_PROTOTYPES was defined, this triggered explicit
truncate/sign-extend code that took care of the problem I am seeing,
but when I removed it, GCC assumed the caller had taken care of the
truncation/sign extension.  Because this is a library function, that
wasn't done correctly, and I don't think it can be done correctly,
because emit_library_call_value_1 doesn't have the necessary
information.

So should __satfractqiuhq be dealing with the fact that the argument
'a' may not have been sign-extended in the correct way?

I have tried a few code changes in fixed-bit.c (to no avail), but this
code is so heavily macro-ized it is tough to figure out what it should
be doing.

Steve Ellcey
sell...@imgtec.com




Question about PR 48814 and ivopts and post-increment

2015-12-01 Thread Steve Ellcey

I have a question involving ivopts and PR 48814, which was a fix for
the post increment operation.  Prior to the fix for PR 48814, MIPS
would generate this loop for strcmp (C code from glibc):

$L4:
lbu $3,0($4)
lbu $2,0($5)
addiu   $4,$4,1
beq $3,$0,$L7
addiu   $5,$5,1# This is a branch delay slot
beq $3,$2,$L4
subu$2,$3,$2   # This is a branch delay slot (only used after loop)


With the current top-of-tree we now generate:

addiu   $4,$4,1
$L8:
lbu $3,-1($4)
addiu   $5,$5,1
beq $3,$0,$L7
lbu $2,-1($5)  # This is a branch delay slot
beq $3,$2,$L8
addiu   $4,$4,1# This is a branch delay slot

subu$2,$3,$2   # Done only once now after exiting loop.

The main problem with the new loop is that the beq comparing $2 and $3
is right before the load of $2 so there can be a delay due to the time
that the load takes.  The ideal code would probably be:

addiu   $4,$4,1
$L8:
lbu $3,-1($4)
lbu $2,0($5)  # This is a branch delay slot
beq $3,$0,$L7
addiu   $5,$5,1
beq $3,$2,$L8
addiu   $4,$4,1# This is a branch delay slot

subu$2,$3,$2   # Done only once now after exiting loop.

Where we load $2 earlier (using a 0 offset instead of a -1 offset) and
then do the increment of $5 after using it in the load.  The problem
is that this isn't something that can just be done in the instruction
scheduler because we are changing one of the instructions (to modify the
offset) in addition to rearranging them and I don't think the instruction
scheduler supports that.

It looks like it is the ivopts code that decided to increment the
registers first and use the -1 offsets in the loads instead of using 0
offsets and incrementing after the loads, but I can't figure out how
or why ivopts made that decision.

Does anyone have any ideas on how I could 'fix' GCC to make it generate
the ideal code?  Is there some way to do it in the instruction scheduler?
Is there some way to modify ivopts to fix this by modifying the cost
analysis somehow?  Could I (partially) undo the fix for PR 48814?
According to the final comment in that bugzilla report, the change is
really only needed for C11, and it does degrade the optimizer, so
could we go back to the old behaviour for C89/C99?  The code in ivopts
has changed enough since the patch was applied that I couldn't
immediately see how to do that in the ToT sources.

Steve Ellcey
sell...@imgtec.com


Instruction scheduler rewriting instructions?

2015-12-03 Thread Steve Ellcey
Can the instruction scheduler actually rewrite instructions?  I didn't
think so but when I compile some code on MIPS with:

-O2 -fno-ivopts -fno-peephole2 -fno-schedule-insns2

I get:

$L4:
lbu $3,0($4)
addiu   $4,$4,1
lbu $2,0($5)
beq $3,$0,$L7
addiu   $5,$5,1

beq $3,$2,$L4
subu$2,$3,$2

When I changed -fno-schedule-insns2 to -fschedule-insns2, I get:

$L4:
lbu $3,0($4)
addiu   $5,$5,1
lbu $2,-1($5)
beq $3,$0,$L7
addiu   $4,$4,1

beq $3,$2,$L4
subu$2,$3,$2

I.e. the addiu of $5 and the load using $5 have been swapped around,
and the load uses a different offset to compensate.  I can't see where
in the instruction scheduler this would happen.  Any help?  This is on
MIPS, if that matters, though I didn't see any MIPS-specific code for
this.  This issue is related to my earlier question about PR 48814 and
ivopts (thus the -fno-ivopts option).

The C code I am looking at is the strcmp function from glibc:

int
strcmp (const char *p1, const char *p2)
{
  const unsigned char *s1 = (const unsigned char *) p1;
  const unsigned char *s2 = (const unsigned char *) p2;
  unsigned char c1, c2;

  do
{
  c1 = (unsigned char) *s1++;
  c2 = (unsigned char) *s2++;
  if (c1 == '\0')
return c1 - c2;
}
  while (c1 == c2);

  return c1 - c2;
}


Steve Ellcey
sell...@imgtec.com


Re: Instruction scheduler rewriting instructions?

2015-12-03 Thread Steve Ellcey
On Thu, 2015-12-03 at 19:56 +, Ramana Radhakrishnan wrote:

> IIRC it's because the scheduler *thinks* it can get a tighter schedule
> - probably because it thinks it can dual issue the lbu from $4 and the
> addiu to $5. Can it think so ? This may be related -
> https://gcc.gnu.org/ml/gcc-patches/2012-08/msg00155.html
> 
> regards
> Ramana

No, the system I am tuning for (MIPS 24k) is single issue according to
its description.  At least I do see now where the instruction is getting
rewritten in the instruction scheduler, so that is helpful.  I am no
longer sure the scheduler is where the problem lies though.  If I
compile with -O2 -mtune=24kc I get this loop:

addiu   $4,$4,1
$L8:
addiu   $5,$5,1
lbu $3,-1($4)
beq $3,$0,$L7
lbu $2,-1($5)

beq $3,$2,$L8
addiu   $4,$4,1

If I use -O2 -fno-ivopts -mtune=24kc I get:

lbu $3,0($4)
$L8:
lbu $2,0($5)
addiu   $4,$4,1
beq $3,$0,$L7
addiu   $5,$5,1

beql$3,$2,$L8
lbu $3,0($4)

This second loop is better because there is more time between the
loads and where the loaded values are used in the beq instructions.
So I think there is something missing or wrong in the cost analysis
that ivopts is doing, such that it decides to do the adds before the
loads instead of vice versa.

I have tried tweaking the cost of loads in mips_rtx_costs and in the
instruction descriptions in 24k.md, but that didn't seem to have any
effect on the ivopts code.

Steve Ellcey
sell...@imgtec.com




Re: Question about PR 48814 and ivopts and post-increment

2015-12-04 Thread Steve Ellcey
On Fri, 2015-12-04 at 16:22 +0800, Bin.Cheng wrote:

> Dump before IVO is as below:
> 
>   :
>   # s1_1 = PHI 
>   # s2_2 = PHI 
>   s1_6 = s1_1 + 1;
>   c1_8 = *s1_1;
>   s2_9 = s2_2 + 1;
>   c2_10 = *s2_2;
>   if (c1_8 == 0)
> goto ;
>   else
> goto ;
> 
> And the iv candidates are as:
> candidate 1 (important)
>   var_before ivtmp.6
>   var_after ivtmp.6
>   incremented before exit test
>   type unsigned int
>   base (unsigned int) p1_4(D)
>   step 1
>   base object (void *) p1_4(D)
> candidate 2 (important)
>   original biv
>   type const unsigned char *
>   base (const unsigned char *) p1_4(D)
>   step 1
>   base object (void *) p1_4(D)
> candidate 3 (important)
>   var_before ivtmp.7
>   var_after ivtmp.7
>   incremented before exit test
>   type unsigned int
>   base (unsigned int) p2_5(D)
>   step 1
>   base object (void *) p2_5(D)
> candidate 4 (important)
>   original biv
>   type const unsigned char *
>   base (const unsigned char *) p2_5(D)
>   step 1
>   base object (void *) p2_5(D)
> 
> Generally GCC would choose normal candidates {1, 3} and insert
> increment before exit condition.  This is expected in this case.  But
> when there is applicable original candidates {2, 4}, GCC would prefer
> these in order to achieve better debugging.  Also as I suspected,
> [reg] and [reg-1] have same address cost on mips, that's why GCC makes
> current decision.
> 
> Thanks,
> bin

Yes, I agree that [reg] and [reg-1] have the same address cost, but
using [reg-1] means the increment of reg happens before the access,
which puts the load of [reg-1] closer to the use of the loaded value
and causes a stall.  If we used [reg] and incremented it after the
load, we would have at least one instruction between the load and the
use, and either no stall or a shorter stall.

I don't know if ivopts has anyway to do this type of analysis when
picking the IV.

Steve Ellcey
sell...@imgtec.com



libstdc++ / uclibc question

2015-12-21 Thread Steve Ellcey
Is anyone building GCC (and libstdc++ specifically) with uclibc?  I haven't
done this in a while and when I do it now I get this build failure:

/scratch/sellcey/repos/uclibc-ng/src/gcc/libstdc++-v3/include/ext/random.tcc: In member function '__gnu_cxx::{anonymous}::uniform_on_sphere_helper<_Dimen, _RealType>::result_type __gnu_cxx::{anonymous}::uniform_on_sphere_helper<_Dimen, _RealType>::operator()(_NormalDistribution&, _UniformRandomNumberGenerator&)':
/scratch/sellcey/repos/uclibc-ng/src/gcc/libstdc++-v3/include/ext/random.tcc:1573:44: error: expected unqualified-id before '(' token
while (__norm == _RealType(0) || ! std::isfinite(__norm));

I am thinking the issue may be isfinite, but I am not sure.  I notice there
are some tests like 26_numerics/headers/cmath/c99_classification_macros_c++.cc
that are xfailed for uclibc and I wonder if this is a related problem.

I could not find any uses of isfinite in other C++ files (except cmath)
and the tests that use it are the same ones that are xfailed for uclibc.

Steve Ellcey
sell...@imgtec.com


Re: __builtin_memcpy and alignment assumptions

2016-01-08 Thread Steve Ellcey
On Fri, 2016-01-08 at 12:56 +0100, Richard Biener wrote:
> On Fri, Jan 8, 2016 at 12:40 PM, Eric Botcazou  
> wrote:
> >> I think we only assume it if the pointer is actually dereferenced, 
> >> otherwise
> >> it just breaks too much code in the wild.  And while memcpy dereferences,
> >> it dereferences it through a char * cast, and thus only the minimum
> >> alignment is assumed.
> >
> > Yet the compiler was generating the expected code for Steve's testcase on
> > strict-alignment architectures until very recently (GCC 4.5 IIUC) and this
> > worked perfectly.

Yes, I just checked and I did get the better code in GCC 4.5 and I get
the current slower code in GCC 4.6.

> Consider
> 
> int a[256];
> int
> main()
> {
>   void *p = (char *)a + 1;
>   void *q = (char *)a + 5;
>   __builtin_memcpy (p, q, 4);
>   return 0;
> }
> 
> where the ME would be entitled to "drop" the char */void * conversions
> and use &a typed temps.

I am not sure how this works, but I tweaked get_pointer_alignment_1 so
that if there was no alignment info, or if get_ptr_info_alignment
returned false, the routine would return type-based alignment
information instead of the default 'void *' alignment.  In that case,
using your example, GCC still accessed p and q as pointers to
unaligned data.

In fact if I used int pointers:

int a[256];
int main()
{
  int *p = (int *)((char *)a + 1);
  int *q = (int *)((char *)a + 5);
  __builtin_memcpy (p, q, 4);
  return 0;
}

GCC did unaligned accesses when optimizing, but when unoptimized (and
with my change) GCC did aligned accesses, which would not work on a
strict-alignment machine like MIPS.  This seems to match what happens
with:

int a[256];
int main()
{
  int *p = (int *)((char *)a + 1);
  int *q = (int *)((char *)a + 5);
  *p = *q;
  return 0;
}

When I optimize it, GCC does unaligned accesses and when unoptimized
GCC does aligned accesses which will not work on MIPS.

Steve Ellcey
sell...@imgtec.com






GCC compat testing and simulator question

2016-02-01 Thread Steve Ellcey

I have a question about the compatibility tests (gcc.dg/compat and
g++.dg/compat).  Do they work with remote/simulator testing?  I was
trying to run them with qemu and even though I am setting ALT_CC_UNDER_TEST
and ALT_CXX_UNDER_TEST it doesn't look like my alternative compiler
is ever getting run.

The README.compat file contains a line about 'make sure they work for
testing with a simulator'; does that mean they are known not to work
with cross-testing and using a simulator?

I don't get any errors or warnings, and tests are being compiled with
GCC and run under qemu but it doesn't look like the second compiler is
ever run to compile anything.  I am using the multi-sim dejagnu board.

Steve Ellcey
sell...@imgtec.com


Re: glibc test tst-thread_local1.cc fails to compile with latest GCC

2016-10-21 Thread Steve Ellcey
On Fri, 2016-10-21 at 17:03 +0100, Jonathan Wakely wrote:
> 
> > Is there some C++ standard change that I am not aware of or some
> > other header file I need to include?
> No, what probably happened is GCC didn't detect a usable Pthreads
> implementation and so doesn't define std::thread. The  header
> uses this condition around the definition of std::thread:
> 
> #if defined(_GLIBCXX_HAS_GTHREADS) &&
> defined(_GLIBCXX_USE_C99_STDINT_TR1)

Yes, I finally realized I had built a GCC with '--enable-threads=no'
and was using that GCC to build GLIBC.  Once I rebuilt GCC with threads
I could build GLIBC and not get this error.

Steve Ellcey


Question about PR preprocessor/60723

2016-11-30 Thread Steve Ellcey
I am trying to understand the status of this bug and the patch
that fixes it.  It looks like a patch was submitted and checked
in for 5.0 to fix the problem reported and I see the new 
behavior caused by the patch in GCC 5.X compilers.  This behavior
caused a number of issues with configures and scripts that examined
preprocessed output as is mentioned in the bug report for PR 60723.
There was a later bug, 64864, complaining about the behavior and
that was closed as invalid.

But when I look at GCC 6.X or ToT compilers I do not see the same
behavior as 5.X.  Was this patch reverted, or was a new patch
submitted that undid some of this patch's behavior?  I couldn't find
any revert or replacement for the original patch, so I am not sure
when or why the code changed back after the 5.X releases.

Here is a test case that I am preprocessing with g++ -E:

#include 
class foo {
void operator= ( bool bit);
operator bool() const;
};

GCC 5.4 breaks up the operator declarations with line markers and GCC
6.2 does not.

Steve Ellcey
sell...@caviumnetworks.com


Multilib build question involving MULTILIB_OSDIRNAMES

2014-07-14 Thread Steve Ellcey
I have a multilib question that I hope someone can help me with.

If I have this multilib setup while building a cross compiler:

MULTILIB_DEFAULTS { "mips32r2" }
MULTILIB_OPTIONS = mips32r2/mips64r2
MULTILIB_OSDIRNAMES = ../lib ../lib64

Everything works the way I want it to.  I have mips32r2 system libraries
in /lib under my sysroot and mips64r2 system libraries in /lib64 and
everything seems fine.

Now I want to make mips64r2 the default compilation mode for GCC but
I want to keep my sysroot setup (/lib for mips32r2 and /lib64 for mips64r2)
the same.  So I change MULTILIB_DEFAULTS to specify "mips64r2" and rebuild.

When I do this, a default build (targeting mips64r2) searches for system
libraries in /lib instead of /lib64.  Is there a way to fix this without
having to put mips64r2 system libraries in /lib?  Is this the expected
behaviour or is this a bug in handling MULTILIB_OSDIRNAMES?

Steve Ellcey
sell...@mips.com


Where does GCC pick passes for different opt. levels

2014-08-11 Thread Steve Ellcey
I have a basic question about optimization selection in GCC.  There
used to be some code in GCC (passes.c?) that would set various
optimization pass flags depending on whether the 'optimize' flag
was > 0, > 1, or > 2; later I think there may have been a table.
This code seems to be gone now, and I can't figure out how GCC
selects which optimization passes to run at which optimization
levels (-O1 vs. -O2 vs. -O3).  How is this handled in the
top-of-tree GCC code?

I see passes.def but there doesn't seem to be anything in there to tie 
specific passes to specific optimization levels.  Likewise in common.opt
I see flags for various optimization passes but nothing to tie them to
-O1 or -O2, etc.

I'm probably missing something obvious, but a pointer would be much
appreciated.

Steve Ellcey


Re: Where does GCC pick passes for different opt. levels

2014-08-11 Thread Steve Ellcey

> default_options_table in opts.c.

Thanks Andrew and Marc, I knew it would be obvious once I saw it.

Steve



ICE in bitmap routines with LRA and inline assembly language

2014-09-04 Thread Steve Ellcey

I was wondering if anyone has seen this bug involving LRA and inline
assembly code.  On MIPS, I am getting the attached ICE.  Somehow
the 'first' pointer in the live_reload_and_inheritance_pseudos bitmap
structure is either getting clobbered or is not being correctly
initialized to begin with.  I am not sure which yet.

Steve Ellcey
sell...@mips.com


% cat x.c
int NoBarrier_AtomicIncrement(volatile int* ptr, int increment) {
  int temp, temp2;
  __asm__ __volatile__(".set push\n"
                       ".set noreorder\n"
                       "1:\n"
                       "ll %0, 0(%3)\n"
                       "addu %1, %0, %2\n"
                       "sc %1, 0(%3)\n"
                       "beqz %1, 1b\n"
                       "addu %1, %0, %2\n"
                       ".set pop\n"
                       : "=&r" (temp), "=&r" (temp2)
                       : "Ir" (increment), "r" (ptr)
                       : "memory");
  return temp2;
}

% mips-mti-linux-gnu-gcc -O1 -c x.c
x.c: In function 'NoBarrier_AtomicIncrement':
x.c:16:1: internal compiler error: Segmentation fault
 }
 ^
0x9b199f crash_signal
/scratch/sellcey/nightly/src/gcc/gcc/toplev.c:339
0x5d3950 bitmap_element_link
/scratch/sellcey/nightly/src/gcc/gcc/bitmap.c:456
0x5d3950 bitmap_set_bit(bitmap_head*, int)
/scratch/sellcey/nightly/src/gcc/gcc/bitmap.c:673
0x87c370 init_live_reload_and_inheritance_pseudos
/scratch/sellcey/nightly/src/gcc/gcc/lra-assigns.c:413
0x87c370 lra_assign()
/scratch/sellcey/nightly/src/gcc/gcc/lra-assigns.c:1499
0x877966 lra(_IO_FILE*)
/scratch/sellcey/nightly/src/gcc/gcc/lra.c:2236
0x8337de do_reload
/scratch/sellcey/nightly/src/gcc/gcc/ira.c:5311
0x8337de execute
/scratch/sellcey/nightly/src/gcc/gcc/ira.c:5470
Please submit a full bug report,
with preprocessed source if appropriate.
Please include the complete backtrace with any bug report.
See <http://gcc.gnu.org/bugs.html> for instructions.


dejagnu testsuite bug?

2014-09-05 Thread Steve Ellcey
I was looking through my 'make check' output (from a MIPS cross compiler)
and saw this error.  Has anyone else run into something like this?  I am
not entirely sure where to start looking for this problem and I am also
not sure if this is a new problem or not.  Normally I just grep for FAIL
and don't examine the testing output that closely.  I see the 'usual' C
and C++ failures after this error and the rest of the testsuite seems to
run fine.

Steve Ellcey
sell...@mips.com



Test Run By sellcey on Fri Sep  5 03:08:58 2014
Native configuration is x86_64-unknown-linux-gnu

===  tests ===

Schedule of variations:
multi-sim

Running target multi-sim
Using 
/scratch/sellcey/nightly/src/gcc/dejagnu/testsuite/../config/base-config.exp as 
tool-and-target-specific interface file.
Using /scratch/sellcey/nightly/src/gcc/./dejagnu/baseboards/multi-sim.exp as 
board description file for target.
Using /scratch/sellcey/nightly/src/gcc/./dejagnu/config/sim.exp as generic 
interface file for target.
Using /scratch/sellcey/nightly/src/gcc/./dejagnu/baseboards/basic-sim.exp as 
board description file for target.
Using /scratch/sellcey/nightly/src/gcc/gcc/testsuite/config/default.exp as 
tool-and-target-specific interface file.
Using /scratch/sellcey/nightly/src/gcc/dejagnu/testsuite/config/default.exp as 
tool-and-target-specific interface file.
Running /scratch/sellcey/nightly/src/gcc/dejagnu/testsuite/libdejagnu/tunit.exp 
...
send: spawn id exp0 not open
while executing
"send_user -- "$message\n""
("default" arm line 2)
invoked from within
"switch -glob "$firstword" {
"PASS:" -
"XFAIL:" -
"KFAIL:" -
"UNRESOLVED:" -
"UNSUPPORTED:" -
"UNTESTED:" {
if {$all_flag} {
send_user -- ..."
(procedure "clone_output" line 10)
invoked from within
"clone_output "Running $test_file_name ...""
(procedure "runtest" line 7)
invoked from within
"runtest $test_name"
("foreach" body line 42)
invoked from within
"foreach test_name [lsort [find ${dir} *.exp]] {
if { ${test_name} == "" } {
continue
}
# Ignore this one if asked to.
if { ${ignore..."
("foreach" body line 54)
invoked from within
"foreach dir "${test_top_dirs}" {
if { ${dir} != ${srcdir} } {
# Ignore this directory if is a directory to be
# ignored.
if {[info..."
("foreach" body line 121)
invoked from within
"foreach pass $multipass {

# multipass_name is set for `record_test' to use (see framework.exp).
if { [lindex $pass 0] != "" } {
set multipass_..."
("foreach" body line 51)
invoked from within
"foreach current_target $target_list {
verbose "target is $current_target"
set current_target_name $current_target
set tlist [split $curren..."
(file "/scratch/sellcey/nightly/src/gcc/./dejagnu/runtest.exp" line 1627)
make[3]: *** [check-DEJAGNU] Error 1
make[3]: Leaving directory 
`/scratch/sellcey/nightly/obj-mips-mti-linux-gnu/gcc/dejagnu'
make[2]: *** [check-am] Error 2
make[2]: Target `check' not remade because of errors.
make[2]: Leaving directory 
`/scratch/sellcey/nightly/obj-mips-mti-linux-gnu/gcc/dejagnu'
make[1]: *** [check-dejagnu] Error 2



MULTILIB_OSDIRNAMES mapping question

2014-10-01 Thread Steve Ellcey

I have a question about MULTILIB_OSDIRNAMES and about specifying a
mapping in this variable.

According to fragments.texi:

  When it is a set of mappings of the form @var{gccdir}=@var{osdir},
  the left side gives the GCC convention and the right gives the
  equivalent OS defined location.

But when I try this it doesn't seem to work for me, and if I am reading
the config/i386/t-linux64 file correctly, it looks like instead of
mapping from one of the MULTILIB_DIRNAMES entries, which is what I
expected, it maps from the MULTILIB_OPTIONS instead.  I.e. the
@var{gccdir} entries in config/i386/t-linux64 start with an 'm' like
the options do, which is not part of the GCC directory names.

So is the documentation wrong, or am I misreading it, or is the code
wrong?  I would actually like the code to match the existing documentation
because on mips the ABI options contain equal signs (-mabi=32, -mabi=64) so
it would be hard/confusing to map an option to a directory when the
option itself contains an equal sign.

Steve Ellcey
sell...@mips.com


fast-math optimization question

2014-10-09 Thread Steve Ellcey
I have a -ffast-math (missing?) optimization question.  I noticed on MIPS
that if I compiled:

#include 
extern x;
void foo() { x = sin(log(x)); }

GCC will extend 'x' to double precision, call the double precision log and sin
functions and then truncate the result to single precision.

If instead, I have:

#include 
extern x;
void foo() { x = log(x); x = sin(x); }

Then GCC will call the single precision log and sin functions and not
do any extensions or truncations.  In addition to avoiding the
extend/trunc instructions, the single precision log and sin functions
are presumably faster than the double precision ones, making the
entire code much faster.

Is there a reason why GCC couldn't (under -ffast-math) call the single
precision routines for the first case?

Steve Ellcey
sell...@mips.com


Re: fast-math optimization question

2014-10-09 Thread Steve Ellcey
On Thu, 2014-10-09 at 11:27 -0700, Andrew Pinski wrote:

> > Is there a reason why GCC couldn't (under -ffast-math) call the single
> > precision routines for the first case?
> 
> There is no reason why it could not.  The reason why it does not
> currently is because there is no pass which does the demotion and the
> only case of demotion that happens is with a simple
> (float)function((double)float_val);
> 
> Thanks,
> Andrew

Do you know which pass does the simple
'(float)function((double)float_val)' demotion?  Maybe that would be a
good place to extend things.

Steve Ellcey



Re: fast-math optimization question

2014-10-09 Thread Steve Ellcey
On Thu, 2014-10-09 at 19:50 +, Joseph S. Myers wrote:
> On Thu, 9 Oct 2014, Steve Ellcey wrote:
> 
> > Do you know which pass does the simple
> > '(float)function((double)float_val)' demotion?  Maybe that would be a
> > good place to extend things.
> 
> convert.c does such transformations.  Maybe the transformations in there 
> could move to the match-and-simplify infrastructure - convert.c is not a 
> particularly good place for optimization, and having similar 
> transformations scattered around (fold-const, convert.c, front ends, SSA 
> optimizers) isn't helpful; hopefully match-and-simplify will allow some 
> unification of this sort of optimization.

I did a quick and dirty experiment with the match-and-simplify branch
just to get an idea of what it might be like.  The branch built for MIPS
right out of the box so that was great and I added a couple of rules
(see below) just to see if it would trigger the optimization I wanted
and it did.  I was impressed with the match-and-simplify infrastructure,
it seemed to work quite well.  Will this branch be included in GCC 5.0?

Steve Ellcey
sell...@mips.com


Code added to match-builtin.pd:

 
(if (flag_unsafe_math_optimizations)
 /* Optimize "(float) expN(x)" [where x is type double] to
    "expNf((float) x)", i.e. call the 'f' single precision func.  */
 (simplify
  (convert (BUILT_IN_LOG @0))
  (if ((TYPE_MODE (type) == SFmode) && (TYPE_MODE (TREE_TYPE (@0)) == DFmode))
   (BUILT_IN_LOGF (convert @0)))))

(if (flag_unsafe_math_optimizations)
 /* Optimize "(float) expN(x)" [where x is type double] to
    "expNf((float) x)", i.e. call the 'f' single precision func.  */
 (simplify
  (convert (BUILT_IN_SIN @0))
  (if ((TYPE_MODE (type) == SFmode) && (TYPE_MODE (TREE_TYPE (@0)) == DFmode))
   (BUILT_IN_SINF (convert @0)))))





Cross compiling and multiple sysroot question

2015-01-08 Thread Steve Ellcey
(Reposting from gcc-help since I didn't get any replies there.)

I have a question about SYSROOT_SUFFIX_SPEC, MULTILIB_OSDIRNAMES, and
multilib cross compilers.  I was experimenting with a multilib cross compiler
and was using SYSROOT_SUFFIX_SPEC to specify different sysroots for different
multilibs, including big-endian and little-endian with 32 and 64 bits.

Now lets say I create two sysroots:
sysroot/be with a bin, lib, lib64, etc. directories
sysroot/le with the same set of directories

These would represent the sysroot of either a 64 bit big-endian or a 64 bit
little-endian linux system that could also run 32 bit executables.
I want my cross compiler to be able to generate code for either system.

So I set these macros and SPECs:
# m32 and be are defaults
MULTILIB_OPTIONS = m64 mel # In makefile fragment
MULTILIB_DIRNAMES = 64 el  # In makefile fragment
MULTILIB_OSDIRNAMES = m64=../lib64 # In makefile fragment
SYSROOT_SUFFIX_SPEC = %{mel:/el;:/eb}  # in header file

What seems to be happening is that the search for system libraries
like libc.so work fine.  It looks in sysroot/be/lib or sysroot/be/lib64
or in the equivalent little-endian directories.  I.e. it searches:

/lib  # 32 bits
/lib/../lib64 # 64 bits

But when it looks for libgcc_s.so or libstdc++.so it is searching:

//lib# 32 bits
//lib/../lib64   # 64 bits

It does not take into account SYSROOT_SUFFIX_SPEC.  In fact when I
do my build with this setup the little-endian libgcc_s.so files wind
up overwriting the big-endian libgcc_s.so files so two of my
libgcc_s.so files are completely missing from the install area.

Shouldn't SYSROOT_SUFFIX_SPEC be used for the gcc shared libraries 
as well as the sysroot areas?  I.e. install and search for libgcc_s.so.1 in:

/lib  # 32 bits
/lib/../lib64 # 64 bits

Steve Ellcey
sell...@imgtec.com


Re: Cross compiling and multiple sysroot question

2015-01-12 Thread Steve Ellcey
On Thu, 2015-01-08 at 22:12 +, Joseph Myers wrote:
> On Thu, 8 Jan 2015, Steve Ellcey  wrote:
> 
> > So I set these macros and SPECs:
> > # m32 and be are defaults
> > MULTILIB_OPTIONS = m64 mel # In makefile fragment
> > MULTILIB_DIRNAMES = 64 el  # In makefile fragment
> > MULTILIB_OSDIRNAMES = m64=../lib64 # In makefile fragment
> 
> In my experience, for such cases it's best to list all multilibs 
> explicitly in MULTILIB_OSDIRNAMES, and then to specify 
> STARTFILE_PREFIX_SPEC as well along the lines of:
> 
> #define STARTFILE_PREFIX_SPEC   \
>   "%{mabi=32: /usr/local/lib/ /lib/ /usr/lib/}  \
>%{mabi=n32: /usr/local/lib32/ /lib32/ /usr/lib32/}   \
>%{mabi=64: /usr/local/lib64/ /lib64/ /usr/lib64/}"

Thanks for the help Joseph, this combination worked and I was able to
build a working GCC using this setup.

> GCC never installs anything inside the sysroot (it could be a read-only 
> mount of the target's root filesystem, for example).  Listing all 
> multilibs explicitly (multilib=dir or multilib=!dir) in 
> MULTILIB_OSDIRNAMES allows you to ensure they don't overwrite each other.

GCC never installs anything inside sysroots, but some tools that people
have developed to build cross-compiler toolchains copy the shared GCC
libraries (libgcc_s, libstdc++, etc.) from the GCC install area into
the sysroot as part of building a cross-compiler toolchain.

I was wondering if I could use the explicit list of MULTILIB_OSDIRNAMES
entries to lay out those libraries in a way that would make it easy to
copy them into a sysroot if I wanted to.  The only thing I am not sure
about is whether there is a way to specify where I want the default (no
option) libraries to go.

I.e. I can use:

MULTILIB_OSDIRNAMES += mips64r2=mipsr2/lib32
MULTILIB_OSDIRNAMES += mips64r2/mabi.64=mipsr2/lib64

To create a mipsr2/lib32 and mipsr2/lib64 directory under /lib
for libgcc_s but I would like the default libraries in
/lib/mipsr2/lib instead of directly in /lib.  That way I
could use a single copy to put all of /lib/mipsr2 into my
sysroot.  Do you know if either of these would work:

MULTILIB_OSDIRNAMES += mips32r2=mipsr2/lib
MULTILIB_OSDIRNAMES += .=mipsr2/lib

I don't think the first one would work because -mips32r2 is the default
architecture and is not explicitly listed in MULTILIB_OPTIONS, and I
don't think the second form is supported at all.  But maybe there is
some other way to specify the location of the default libraries?

Steve Ellcey
sell...@imgtec.com



RE: Cross compiling and multiple sysroot question

2015-01-12 Thread Steve Ellcey
On Mon, 2015-01-12 at 20:58 +, Joseph Myers wrote:
> On Mon, 12 Jan 2015, Matthew Fortune wrote:
> 
> > MIPS does this too for mips64-linux-gnu as it has n32 for the default
> > multilib which gets placed in lib32. I don't honestly know how the multilib
> > spec doesn't end up building 4 multilibs though. I'm assuming the fact
> > that the default ABI is added to the DRIVER_SELF_SPECS may be the reason.
> 
> I suspect MULTILIB_DEFAULTS is relevant.

The problem I ran into with MULTILIB_DEFAULTS is that if you have:

MULTILIB_DEFAULTS = { mips32r2 }
MULTILIB_OPTIONS = mips32r2/mips64r2 mabi=64 EL

and you try to use:

MULTILIB_EXCEPTIONS = mips32r2/mabi=64*

It doesn't work.  The mips32r2 option seems to be stripped off before
MULTILIB_EXCEPTIONS is applied.  You need to use this instead:

MULTILIB_EXCEPTIONS = mabi=64*

Which is the same as you would use if you didn't specify mips32r2 in
MULTILIB_OPTIONS at all.  I expect MULTILIB_OSDIRNAMES to work the same
way and ignore any mapping entries with the mips32r2 option but maybe I
am wrong (I'm still testing it out).

Steve Ellcey
sell...@imgtec.com



libcc1.so bug/install location question

2015-01-20 Thread Steve Ellcey
I have a question about libcc1.so and where it is put in the install
directory.  My understanding is that GCC install files are either put in
a directory that contains the target name or have the target name as
part of the filename (e.g. mips-linux-gnu-gcc), so that two GCCs with
different targets could be installed into the same installation
directory and not stomp on each other.

I tried this, building cross compilers for mips-mti-linux-gnu and
mips-img-linux-gnu and checked to see if any files overlapped between
the two.  The only overlap I found was with libcc1.  Both cross compilers
had a lib directory directly under the install directory that contained
a libcc1.so, libcc1.so.0, libcc1.so.0.0.0, and libcc1.la file in them.
The files in each install directory were different which makes sense since
I was building for two different targets.

Is this overlap of names intended or is it a bug?

Steve Ellcey


Re: Slow gcc.gnu.org/sourceware.org?

2015-01-27 Thread Steve Ellcey
On Tue, 2015-01-27 at 08:02 -0800, H.J. Lu wrote:
> For the past couple days, gcc.gnu.org/sourceware.org is
> quite slow for me when accessing git and bugzilla.  Am
> I the only one who has  experienced it?

I got some timeouts while updating my glibc git repo yesterday.
I had never run into that before.

Steve Ellcey
sell...@imgtec.com



Re: Slow gcc.gnu.org/sourceware.org?

2015-01-27 Thread Steve Ellcey
On Tue, 2015-01-27 at 09:36 -0700, Jeff Law wrote:
> On 01/27/15 09:20, Steve Ellcey wrote:
> > On Tue, 2015-01-27 at 08:02 -0800, H.J. Lu wrote:
> >> For the past couple days, gcc.gnu.org/sourceware.org is
> >> quite slow for me when accessing git and bugzilla.  Am
> >> I the only one who has  experienced it?
> >
> > I got some timeouts while updating my glibc git repo yesterday.
> > I had never run into that before.
> Are you using anonymous mode, or ssh-authenticated?  The former is 
> usually throttled as the load rises, the latter is not.
> 
> jeff

I was using anonymous mode.

Steve Ellcey



unfused fma question

2015-02-20 Thread Steve Ellcey
I have a question about *unfused* fma instructions.  MIPS has processors
with both fused and unfused multiply and add instructions, and for fused
madd's it is clear what to do: define 'fma' instructions in the md file
and let convert_mult_to_fma decide whether or not to use them.

But for non-fused multiply and adds, it is less clear.  One could
define '*madd' instructions with the plus and mult operator and
let the peephole optimizer convert normal expressions that have 
these operators into (unfused) instructions.  This is what MIPS
currently does.

Or one could change convert_mult_to_fma to add a check if fma is fused
vs. non-fused in addition to the check for the flag_fp_contract_mode in
order to decide whether to convert expressions into an fma and then
define fma instructions in the md file.

I was wondering if anyone had an opinion about the advantages or
disadvantages of these two approaches.

Steve Ellcey
sell...@imgtec.com


RE: unfused fma question

2015-02-23 Thread Steve Ellcey
On Sun, 2015-02-22 at 10:30 -0800, Matthew Fortune wrote:
> Steve Ellcey  writes:
> > Or one could change convert_mult_to_fma to add a check if fma is fused
> > vs. non-fused in addition to the check for the flag_fp_contract_mode
> > in order to decide whether to convert expressions into an fma and then
> > define fma instructions in the md file.
> 
> I was about to say that I see no reason to change how non-fused multiply
> adds work i.e. leave them to pattern matching but I think your point was
> that when both fused and non-fused patterns are available then what
> should we do.

No, I am thinking about the case where there are only non-fused multiply
add instructions available.  To make sure I am using the right
terminology, I am using a non-fused multiply-add to mean a single fma
instruction that does '(a + (b * c))' but which rounds the result of '(b
* c)' before adding it to 'a' so that there is no difference in the
results between using this instruction and using individual add and mult
instructions.  My understanding is that this is how the mips32r2 madd
instruction works.

In this case there seems to be two ways to have GCC generate the fma
instruction.  One is the current method using combine_instructions with
an instruction defined as:


(define_insn "*madd<mode>"
  [(set (match_operand:ANYF 0 "register_operand" "=f")
        (plus:ANYF (mult:ANYF (match_operand:ANYF 1 "register_operand" "f")
                              (match_operand:ANYF 2 "register_operand" "f"))
                   (match_operand:ANYF 3 "register_operand" "f")))]
  ""
  "madd.<fmt>\t%0,%3,%1,%2")


The other way would be to extend the convert_mult_to_fma so that instead
of:

  if (FLOAT_TYPE_P (type)
      && flag_fp_contract_mode == FP_CONTRACT_OFF)
    return false;

it has something like:

  if (FLOAT_TYPE_P (type)
      && flag_fp_contract_mode == FP_CONTRACT_OFF
      && !targetm.fma_does_rounding)
    return false;

And then define an instruction like:

(define_insn "fma<mode>4"
  [(set (match_operand:ANYF 0 "register_operand" "=f")
        (fma:ANYF (match_operand:ANYF 1 "register_operand" "f")
                  (match_operand:ANYF 2 "register_operand" "f")
                  (match_operand:ANYF 3 "register_operand" "f")))]
  ""
  "madd.<fmt>\t%0,%3,%1,%2")


The question I have is whether one or the other of these two approaches
would be better at creating fma instructions (vs. leaving mult/add
combinations) or might be preferable for some other reason.

Steve Ellcey
sell...@imgtec.com




LRA spill/fill memory alignment question

2015-03-04 Thread Steve Ellcey
I have a question about spilling variables and alignment requirements.
There is currently code that allows one to declare local variables with
an alignment that is greater than MAX_STACK_ALIGNMENT.  In that case
expand_stack_vars calls allocate_dynamic_stack_space to create a
pointer to properly aligned stack space.  (There is actually a bug
in this code, PR 65315, but I have submitted a patch.)

But there does not seem to be any way to do spills and fills into
memory that has an alignment requirement greater than MAX_STACK_ALIGNMENT.
Is that correct?  I am looking at MIPS using the LRA allocator.  I was
hoping there was some way to spill 16 byte registers into a 16 byte
aligned spill slot even if the MAX_STACK_ALIGNMENT is 8 bytes.

I know x86 has some platform specific code to dynamically increase the
stack alignment and I think that is how they handle this situation but
I don't see any other platforms using that technique and I was wondering
if there is any more generalized method for spilling registers to memory
with an alignment requirement greater than MAX_STACK_ALIGNMENT.

Steve Ellcey
sell...@imgtec.com


Questions about dynamic stack realignment

2015-03-10 Thread Steve Ellcey
This email is a follow-up to an earlier email I sent about the
alignment of spills and fills, which did not get any replies:

https://gcc.gnu.org/ml/gcc/2015-03/msg00028.html

After looking into that, I have decided to look more into dynamically
realigning the stack so that my spills and fills would be aligned.  I
have done some experiments with stack realignment and am trying to
understand what hooks already exist and how to use them.

Currently mips just has:

#define STACK_BOUNDARY (TARGET_NEWABI ? 128 : 64)

I added:

#define MAX_STACK_ALIGNMENT 128
#define PREFERRED_STACK_BOUNDARY (TARGET_MSA ? 128 : STACK_BOUNDARY)
#define INCOMING_STACK_BOUNDARY STACK_BOUNDARY

To try to get GCC to realign the stack to 128 bits when compiling with
the -mmsa option.  After doing this I found I needed to create a
TARGET_GET_DRAP_RTX hook that would return a register rtx when a drap
was needed.  I did that and got things to compile, but I don't see any
code that actually realigns the stack.  It is not clear to me from the
documentation whether there is shared code somewhere that should be
realigning the stack by changing the stack pointer given these
definitions, or whether I also need to add my own code to
expand_prologue to do the stack realignment myself.

I am also not sure if I understand the drap (Dynamic Realign Argument Pointer)
register functionality correctly.  My guess/understanding was that the drap
was used to access arguments in cases where the regular stack pointer may have
been changed in order to be aligned.  Is that correct?

Any help/advice on how the hooks for dynamically realigned stack are supposed
to all work together would be appreciated.

Steve Ellcey

