Re: Delay slot filling - what still matters, and what doesn't matter so much anymore?

2013-04-18 Thread Joern Rennecke

Quoting Steven Bosscher:


> Hello delay-slot target maintainers :-)


N.B., that also includes me as ARC maintainer.


> First of all: What is still important to handle?
>
> It's clear that the expectations in reorg.c are "anything goes" but
> modern RISCs (everything since the PA-8000, say) probably have some
> limitations on what is helpful to have, or not have, in a delay slot.


Actually, for several targets, annulled-true delay slots are really
'anything goes' including other instructions with delay slots.
These targets just have branch variants with and without delay slots,
and fill_eager_delay_slots in fact pessimizes the code.


> According to the comments in pa.h about MASK_JUMP_IN_DELAY, having
> jumps in delay slots of other jumps is one such thing: They don't
> bring benefit to the PA-8000 and they don't work with DWARF2 CFI. As
> far as I know, SPARC and MIPS don't allow jumps in delay slots, SH
> looks like it doesn't allow it either


That's only the GCC port, because you get ICEs when you try it.  It would
really be better if we could just tell reorg that an unfilled delay slot is
OK if the branch is not likely.


> Another thing I completely fail to grasp, is how the pipeline
> scheduler and delay slots interact. Doesn't dbr_schedule destroy all
> the good work schedule_insns has tried to do? If so, how much does
> that hurt on modern RISCs?


Indeed, that is a problem.
The SH actually makes some scheduler adjustments to have an extra insn
available to put in the delay slot.


Re: Delay slot filling - what still matters, and what doesn't matter so much anymore?

2013-04-18 Thread Bernd Schmidt
On 04/17/2013 11:52 PM, Steven Bosscher wrote:
> According to the comments in pa.h about MASK_JUMP_IN_DELAY, having
> jumps in delay slots of other jumps is one such thing: They don't
> bring benefit to the PA-8000 and they don't work with DWARF2 CFI. As
> far as I know, SPARC and MIPS don't allow jumps in delay slots, SH
> looks like it doesn't allow it either, and CRIS can do it for short
> branches but doesn't do so because the trade-off between benefit and
> machine description complexity comes out negative. On the scheduler
> implementation side: Branches as delayed insns in delay slots of other
> branches is impossible to express in the CFG (at least in GCC, but I
> think in general it can't be done cleanly). Therefore I want to drop
> support for branches in delay slots. What do you think about this?

I thought I'd mention C6X allows branches in delay slots (of course
reorg.c isn't involved in this). This is useful if one can prove that
the first branch isn't taken if the predicate of the second branch is
true. Otherwise, it has the same semantics as the PA (as described by
Jeff): you get to execute a few instructions at the first branch target
and then another jump happens away from there. This can be useful, for
example to implement short loops (without using the hardware loop
mechanisms) by scheduling a decrement/branch every cycle for 6 cycles
before the actual loop, but gcc does not use this functionality.

> What about multiple delay slots? It looks like reorg.c has code to
> handle insns with multiple delay slots, but there currently are no GCC
> targets in the FSF tree that have insns with multiple delay slots and
> that use define_delay. The C6X has many more delay slots than just 1
> (it can have up to 5 delay slots IIRC)

5 cycles with up to 8 insns each :) Didn't want to try that with reorg.c.

> but it is much more flexible
> than traditional RISCs when it comes to putting insns in delay slots
> (it uses predication so it can annul delayed insns on various
> conditions) and it uses a very clever (and effective??) delay slot
> filling mechanism via the normal scheduler, using back-tracking and
> "jump shadows" (see UNSPEC_JUMP_SHADOW in the c6x back end). But C6X
> doesn't use reorg.c delay slot scheduling. I'm not aware of any
> non-VLIW, non-DSP targets with more than one delay slot per insn, and
> new VLIW/DSP ports with delay slots probably should look at c6x rather
> than using define_delay. Supporting only a single delay slot per
> delay_insn would make my scheduler a bit less complex. Would that be
> enough for everyone, or is it necessary to continue to support
> multiple delay slots per insn?

The mechanism used for C6X has the advantage of using the pipeline
description for accurate schedules and allowing more than one delay
slot. It can also add predication to fallthrough insns to make them
suitable for use in a delay slot. The downside is that it doesn't know
quite as many tricks as reorg.c. It's based on sched-ebb so it can only
take instructions from the fallthrough branch (something I've wanted to
fix but never had the time).

In general I think if a new target wants more than one delay slot, it
should try to use the C6X method instead of reorg.c. It would be nice
for someone to try it on a target like mips or PA as well; ISTR Richard
S was going to try at some point but I don't know if anything came of
that. I expect it to generate worse code than reorg.c at this stage but
improvements should be possible.


Bernd



register usage, fixed_reg_set, call_used_reg_set, call_fixed_reg_set

2013-04-18 Thread Hendrik Greving
Hi,

This question might not be entirely applicable to newer GCC versions
(4.x+); I am using GCC 3.2.x.

I have 2 questions regarding fixed_reg_set, call_used_reg_set,
call_fixed_reg_set.

First, are fixed_reg_set, call_used_reg_set and call_fixed_reg_set
always (or at least supposed to be) the same as fixed_regs,
call_used_regs and call_fixed_regs, just in HARD_REG_SET representation?

And second, is call_used_reg_set always a subset of fixed_reg_set,
and call_fixed_reg_set always a subset of call_used_reg_set as well as
a subset of fixed_reg_set?

Thanks for help,
Regards,
Hendrik Greving


GNU Make's -n option and $(MAKE) in makefiles (was: Cannot stat gcc/include-fixed/limits.h when installing GCC 4.7.2)

2013-04-18 Thread Patrick 'P. J.' McDermott
On 2013-04-16 15:00, Patrick 'P. J.' McDermott wrote:
[...]
> 
> I'm trying to build and install GCC 4.7.2, and I'm getting the following
> error from the "install-mkheaders" target of gcc/Makefile:
[...]
> 
> The deletion of syslimits.h, movement of limits.h to syslimits.h, and
> change to the metadata of syslimits.h all look like the behavior of the
> "stmp-fixinc" target.  But as far as I can tell that target isn't being
> updated with the top-level "install" target.  (Nor should it be, as far
> as I know.)
> 
> Does anyone have any idea what's happening here?  Why is
> gcc/include-fixed/limits.h being moved when updating the "install"
> target?

I've found the issue.

I'm using build helper utilities (similar to Debian's debhelper) to
build a distribution package of GCC.  The utility that runs `make
install` first checks for an "install" target by running `make -n
install` [1], which is supposed to perform a dry run and print commands
without executing them.  (debhelper's dh_auto_install does this check as
well [2].)

Running `make -n install` (or `MAKEFLAGS=n make install`) in GCC (4.7 at
least – I haven't tested 4.8 yet) instead actually executes commands.

Additionally, it updates the "stmp-fixinc" target of gcc/Makefile, which
as far as I can tell should not be updated with the top-level "install"
target.  As a result, gcc/include-fixed/limits.h is moved to
gcc/include-fixed/syslimits.h, which causes the installation of a fixed
limits.h in the "install-mkheaders" target of gcc/Makefile to fail.

Example:

$ ../src/configure [...]
[...]
$ make bootstrap-lean
[...]
$ cp -Rp . ../gcc-build.build  # Backup the build dir for comparison
$ make -n install
[...]
rm -rf include-fixed; mkdir include-fixed
chmod a+rx include-fixed
if [ -d ../prev-gcc ]; then \
  cd ../prev-gcc && \
  make real-install-headers-tar DESTDIR=`pwd`/../gcc/ \
libsubdir=. ; \
else \
  set -e; for ml in `cat fixinc_list`; do \
sysroot_headers_suffix=`echo ${ml} | sed -e 's/;.*$//'`; \
multi_dir=`echo ${ml} | sed -e 's/^[^;]*;//'`; \
fix_dir=include-fixed${multi_dir}; \
[...]
rm -f ${fix_dir}/syslimits.h; \
if [ -f ${fix_dir}/limits.h ]; then \
  mv ${fix_dir}/limits.h ${fix_dir}/syslimits.h; \
else \
  cp ../../src/gcc/gsyslimits.h ${fix_dir}/syslimits.h; \
fi; \
chmod a+r ${fix_dir}/syslimits.h; \
  done; \
fi
[...]
$ diff -Nur ../gcc-build.build . | diffstat -b
diff: ../gcc-build.build/x86_64-unknown-linux-gnu/libstdc++-v3/include/bits/stamp-bits: Too many levels of symbolic links
diff: ./x86_64-unknown-linux-gnu/libstdc++-v3/include/bits/stamp-bits: Too many levels of symbolic links
 include-fixed/limits.h    |  172 ---
 include-fixed/syslimits.h |  180 +++---
 2 files changed, 172 insertions(+), 180 deletions(-)

The execution of commands appears to be caused by the use of the "MAKE"
macro [3], e.g. this line in the long command in the "stmp-fixinc"
target:

  $(MAKE) real-$(INSTALL_HEADERS_DIR) DESTDIR=`pwd`/../gcc/ \

So this is partly an issue in the way GCC's makefiles use "$(MAKE)" in
long commands and mostly an issue in the (arguably non-standard and
surprising) way GNU Make (and System V make) treats commands that
contain "$(MAKE)".  (Relevant: threads [4][5][6] on GNU Make mailing
lists and the discussion of proposals for -n and $(MAKE) behavior in
POSIX.1 [7].)

It could be avoided by defining a new macro, e.g. `_MAKE = $(MAKE)` and
replacing all expansions of MAKE in commands with expansions of the new
macro, e.g.:

  $(_MAKE) real-$(INSTALL_HEADERS_DIR) DESTDIR=`pwd`/../gcc/ \
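
The effect can be reproduced outside of GCC with a throwaway makefile. The following is only a sketch: it assumes GNU Make is installed as `make`, and the target names ("direct", "aliased", "sub") and the marker files are made up for the demonstration. Each recipe touches a marker file on the same command line as the sub-make invocation, which mirrors the long command in "stmp-fixinc".

```shell
# Throwaway makefile; every target and file name here is hypothetical.
demo=$(mktemp -d)
cat > "$demo/Makefile" <<'EOF'
_MAKE = $(MAKE)
# The "target: ; recipe" form avoids depending on literal tab characters.
direct:  ; touch direct-ran;  $(MAKE) sub
aliased: ; touch aliased-ran; $(_MAKE) sub
sub:     ; @:
EOF

# Two dry runs. Neither is supposed to touch the file system.
make -C "$demo" -n direct  >/dev/null
make -C "$demo" -n aliased >/dev/null

# List which marker files were created despite -n.
ls "$demo"
```

With GNU Make I would expect the listing to contain direct-ran but not aliased-ran, since only the command line that literally contains `$(MAKE)` is executed under -n.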

Thoughts?

[1]: http://git.proteanos.com/opkhelper/opkhelper.git/tree/lib/buildsystem/make.sh?id=bf055e8#n99
[2]: http://anonscm.debian.org/gitweb/?p=debhelper/debhelper.git;a=blob;f=Debian/Debhelper/Buildsystem/makefile.pm;h=c63b58e#l13
[3]: https://www.gnu.org/software/make/manual/html_node/MAKE-Variable.html
[4]: https://lists.gnu.org/archive/html/bug-make/2010-01/msg00014.html
[5]: https://lists.gnu.org/archive/html/help-make/2003-06/msg00048.html
[6]: https://lists.gnu.org/archive/html/help-make/2008-07/msg00017.html
[7]: http://pubs.opengroup.org/onlinepubs/9699919799/utilities/make.html#tag_20_76_18

-- 
Patrick "P. J." McDermott
http://www.pehjota.net/
http://www.pehjota.net/contact.html


Frontend question

2013-04-18 Thread Hendrik Greving
Hi,

This is about an older GCC version, but I took a quick look and it
seems to be roughly the same in recent GCCs.

In function c-decl.c:grokdeclarator: I am debugging something and am
wondering, what does an IDENTIFIER_POINTER (id->identifier.id.str)
contain? I see long strings in there, e.g. something like

   const unsigned char *) 0x345db66 "#cs476000M", '0' , "30007fe0_", '0' , "_0", 'f'
, "00__30007fe0_", '0'
, "_0", 'f' ,
"00__00"...

where do they come from? It is not source code is it?

I hope this is not too general of a question.

Thanks, Regards,
Hendrik Greving


Re: LRA assign same hard register with live range overlapped pseduos

2013-04-18 Thread Vladimir Makarov

On 04/17/2013 11:18 PM, Shiva Chen wrote:

> Full test2.c.209r.reload is about 296kb and i can't send successfully.
> Is there another way to send the dump file?

Did you try to compress it?  Another possibility would be to send the
dump only for the particular function.
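
Both suggestions can be combined roughly as follows. This is a sketch under assumptions: it relies on each function's section in the dump starting with a line of the form ";; Function NAME (...", as RTL dumps for C code normally do, and extract_function_dump is a made-up helper name.

```shell
# Print only the dump section that belongs to one function.
# $1: dump file, $2: function name as it appears after ";; Function".
extract_function_dump() {
    awk -v fn="$2" '
        # A new ";; Function" header starts or ends the wanted section.
        /^;; Function / { keep = ($3 == fn) }
        keep { print }
    ' "$1"
}
```

For example, `extract_function_dump test2.c.209r.reload foo | gzip -9 > foo.reload.gz`, where "foo" stands in for the function in question, should produce something small enough to attach.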




Re: Frontend question

2013-04-18 Thread Diego Novillo

On 2013-04-18 16:58, Hendrik Greving wrote:

> Hi,
>
> this is w.r.t. an older GCC version, I took a quick look and it looks
> like it's still roughly the same in recent GCC's.
>
> In function c-decl.c:grokdeclarator: I am debugging something and am
> wondering, what does an IDENTIFIER_POINTER (id->identifier.id.str)
> contain? I see long strings in there, e.g. something like
>
> const unsigned char *) 0x345db66 "#cs476000M", '0' , "30007fe0_", '0' , "_0", 'f'
> , "00__30007fe0_", '0'
> , "_0", 'f' ,
> "00__00"...


Identifier strings are not NUL-terminated; what you are seeing is random
memory contents at the end of the identifier.



Diego.


gcc-4.8-20130418 is now available

2013-04-18 Thread gccadmin
Snapshot gcc-4.8-20130418 is now available on
  ftp://gcc.gnu.org/pub/gcc/snapshots/4.8-20130418/
and on various mirrors, see http://gcc.gnu.org/mirrors.html for details.

This snapshot has been generated from the GCC 4.8 SVN branch
with the following options: svn://gcc.gnu.org/svn/gcc/branches/gcc-4_8-branch revision 198075

You'll find:

 gcc-4.8-20130418.tar.bz2 Complete GCC

  MD5=972e5a9356f39ebb4cfe16b735b00936
  SHA1=325325b9c72e2a4673d57eed0dca82159c48e3c6
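
A downloaded tarball can be checked against the SHA1 value above along these lines. This is only a sketch: it assumes the `sha1sum` tool from GNU coreutils is available, and verify_sha1 is a hypothetical helper name.

```shell
# Succeed only if the file's SHA-1 digest matches the expected value.
# $1: file to check, $2: expected digest in lowercase hex.
verify_sha1() {
    actual=$(sha1sum "$1" | cut -d' ' -f1)
    [ "$actual" = "$2" ]
}
```

Usage would be something like `verify_sha1 gcc-4.8-20130418.tar.bz2 325325b9c72e2a4673d57eed0dca82159c48e3c6 && echo OK`.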

Diffs from 4.8-20130411 are available in the diffs/ subdirectory.

When a particular snapshot is ready for public consumption the LATEST-4.8
link is updated and a message is sent to the gcc list.  Please do not use
a snapshot before it has been announced that way.


Re: LRA assign same hard register with live range overlapped pseduos

2013-04-18 Thread Dave Korn
On 18/04/2013 21:50, Vladimir Makarov wrote:
> On 04/17/2013 11:18 PM, Shiva Chen wrote:
>> Full test2.c.209r.reload is about 296kb and i can't send successfully.
>> Is there another way to send the dump file?
>>
> Did you try to compress it?  Another possibility would be to send the
> dump only for the particular function.

  And there's always pastebin.com

cheers,
  DaveK