Re: Delay slot filling - what still matters, and what doesn't matter so much anymore?
Quoting Steven Bosscher:

> Hello delay-slot target maintainers :-)

N.B., that also includes me as ARC maintainer.

> First of all: What is still important to handle? It's clear that the
> expectations in reorg.c are "anything goes" but modern RISCs
> (everything since the PA-8000, say) probably have some limitations on
> what is helpful to have, or not have, in a delay slot.

Actually, for several targets, annulled-true delay slots are really 'anything goes' including other instructions with delay slots. These targets just have branch variants with and without delay slots, and fill_eager_delay_slots in fact pessimizes the code.

> According to the comments in pa.h about MASK_JUMP_IN_DELAY, having
> jumps in delay slots of other jumps is one such thing: They don't
> bring benefit to the PA-8000 and they don't work with DWARF2 CFI. As
> far as I know, SPARC and MIPS don't allow jumps in delay slots, SH
> looks like it doesn't allow it either

That's only the GCC port, because you get ICEs when you try it. It would really be better if we could just tell reorg that an unfilled delay slot is OK if the branch is not likely.

> Another thing I completely fail to grasp, is how the pipeline
> scheduler and delay slots interact. Doesn't dbr_schedule destroy all
> the good work schedule_insns has tried to do? If so, how much does
> that hurt on modern RISCs?

Indeed, that is a problem. The SH actually makes some scheduler adjustments to have an extra insn available to put in the delay slot.
Re: Delay slot filling - what still matters, and what doesn't matter so much anymore?
On 04/17/2013 11:52 PM, Steven Bosscher wrote:
> According to the comments in pa.h about MASK_JUMP_IN_DELAY, having
> jumps in delay slots of other jumps is one such thing: They don't
> bring benefit to the PA-8000 and they don't work with DWARF2 CFI. As
> far as I know, SPARC and MIPS don't allow jumps in delay slots, SH
> looks like it doesn't allow it either, and CRIS can do it for short
> branches but doesn't do so because the trade-off between benefit and
> machine description complexity comes out negative. On the scheduler
> implementation side: Branches as delayed insns in delay slots of other
> branches is impossible to express in the CFG (at least in GCC, but I
> think in general it can't be done cleanly). Therefore I want to drop
> support for branches in delay slots. What do you think about this?

I thought I'd mention C6X allows branches in delay slots (of course reorg.c isn't involved in this). This is useful if one can prove that the first branch isn't taken if the predicate of the second branch is true. Otherwise, it has the same semantics as the PA (as described by Jeff): you get to execute a few instructions at the first branch target and then another jump happens away from there. This can be useful, for example to implement short loops (without using the hardware loop mechanisms) by scheduling a decrement/branch every cycle for 6 cycles before the actual loop, but gcc does not use this functionality.

> What about multiple delay slots? It looks like reorg.c has code to
> handle insns with multiple delay slots, but there currently are no GCC
> targets in the FSF tree that have insns with multiple delay slots and
> that use define_delay. The C6X has many more delay slots than just 1
> (it can have up to 5 delay slots IIRC)

5 cycles with up to 8 insns each :) Didn't want to try that with reorg.c.

> but it is much more flexible
> than traditional RISCs when it comes to putting insns in delay slots
> (it uses predication so it can annul delayed insns on various
> conditions) and it uses a very clever (and effective??) delay slot
> filling mechanism via the normal scheduler, using back-tracking and
> "jump shadows" (see UNSPEC_JUMP_SHADOW in the c6x back end). But C6X
> doesn't use reorg.c delay slot scheduling. I'm not aware of any
> non-VLIW, non-DSP targets with more than one delay slot per insn, and
> new VLIW/DSP ports with delay slots probably should look at c6x rather
> than using define_delay. Supporting only a single delay slot per
> delay_insn would make my scheduler a bit less complex. Would that be
> enough for everyone, or is it necessary to continue to support
> multiple delay slots per insn?

The mechanism used for C6X has the advantage of using the pipeline description for accurate schedules and allowing more than one delay slot. It can also add predication to fallthrough insns to make them suitable for use in a delay slot.

The downside is that it doesn't know quite as many tricks as reorg.c. It's based on sched-ebb so it can only take instructions from the fallthrough branch (something I've wanted to fix but never had the time).

In general I think if a new target wants more than one delay slot, it should try to use the C6X method instead of reorg.c. It would be nice for someone to try it on a target like mips or PA as well; ISTR Richard S was going to try at some point but I don't know if anything came of that. I expect it to generate worse code than reorg.c at this stage but improvements should be possible.

Bernd
register usage, fixed_reg_set, call_used_reg_set, call_fixed_reg_set
Hi,

this might be a question that is not entirely valid w.r.t. new GCC versions 4.x+. I am using GCC 3.2.x.

I have 2 questions regarding fixed_reg_set, call_used_reg_set, call_fixed_reg_set.

Firstly, are fixed_reg_set, call_used_reg_set, call_fixed_reg_set always (or supposed to be) the same as fixed_reg, call_used_reg and call_fixed_reg, just in HARD_REG_SET representation?

And secondly, is call_used_reg_set always a subset of fixed_reg_set, and call_fixed_reg_set always a subset of call_used_reg_set as well as a subset of fixed_reg_set?

Thanks for help,
Regards,
Hendrik Greving
GNU Make's -n option and $(MAKE) in makefiles (was: Cannot stat gcc/include-fixed/limits.h when installing GCC 4.7.2)
On 2013-04-16 15:00, Patrick 'P. J.' McDermott wrote:
[...]
> I'm trying to build and install GCC 4.7.2, and I'm getting the following
> error from the "install-mkheaders" target of gcc/Makefile:
[...]
> The deletion of syslimits.h, movement of limits.h to syslimits.h, and
> change to the metadata of syslimits.h all look like the behavior of the
> "stmp-fixinc" target. But as far as I can tell that target isn't being
> updated with the top-level "install" target. (Nor should it be, as far
> as I know.)
>
> Does anyone have any idea what's happening here? Why is
> gcc/include-fixed/limits.h being moved when updating the "install"
> target?

I've found the issue.

I'm using build helper utilities (similar to Debian's debhelper) to build a distribution package of GCC. The utility that runs `make install` first checks for an "install" target by running `make -n install` [1], which is supposed to perform a dry run and print commands without executing them. (debhelper's dh_auto_install does this check as well [2].)

Running `make -n install` (or `MAKEFLAGS=n make install`) in GCC (4.7 at least; I haven't tested 4.8 yet) instead actually executes commands. Additionally, it updates the "stmp-fixinc" target of gcc/Makefile, which as far as I can tell should not be updated with the top-level "install" target. As a result, gcc/include-fixed/limits.h is moved to gcc/include-fixed/syslimits.h, which causes the installation of a fixed limits.h in the "install-mkheaders" target of gcc/Makefile to fail.

Example:

    $ ../src/configure [...]
    [...]
    $ make bootstrap-lean
    [...]
    $ cp -Rp . ../gcc-build.build  # Backup the build dir for comparison
    $ make -n install
    [...]
    rm -rf include-fixed; mkdir include-fixed
    chmod a+rx include-fixed
    if [ -d ../prev-gcc ]; then \
      cd ../prev-gcc && \
      make real-install-headers-tar DESTDIR=`pwd`/../gcc/ \
        libsubdir=. ; \
    else \
      set -e; for ml in `cat fixinc_list`; do \
        sysroot_headers_suffix=`echo ${ml} | sed -e 's/;.*$//'`; \
        multi_dir=`echo ${ml} | sed -e 's/^[^;]*;//'`; \
        fix_dir=include-fixed${multi_dir}; \
    [...]
        rm -f ${fix_dir}/syslimits.h; \
        if [ -f ${fix_dir}/limits.h ]; then \
          mv ${fix_dir}/limits.h ${fix_dir}/syslimits.h; \
        else \
          cp ../../src/gcc/gsyslimits.h ${fix_dir}/syslimits.h; \
        fi; \
        chmod a+r ${fix_dir}/syslimits.h; \
      done; \
    fi
    [...]
    $ diff -Nur ../gcc-build.build . | diffstat -b
    diff: ../gcc-build.build/x86_64-unknown-linux-gnu/libstdc++-v3/include/bits/stamp-bits: Too many levels of symbolic links
    diff: ./x86_64-unknown-linux-gnu/libstdc++-v3/include/bits/stamp-bits: Too many levels of symbolic links
     include-fixed/limits.h    | 172 ---
     include-fixed/syslimits.h | 180 +++--
     2 files changed, 172 insertions(+), 180 deletions(-)

The execution of commands appears to be caused by the use of the "MAKE" macro [3], e.g. this line in the long command in the "stmp-fixinc" target:

    $(MAKE) real-$(INSTALL_HEADERS_DIR) DESTDIR=`pwd`/../gcc/ \

So this is partly an issue in the way GCC's makefiles use "$(MAKE)" in long commands and mostly an issue in the (arguably non-standard and surprising) way GNU Make (and System V make) treats commands that contain "$(MAKE)". (Relevant: threads [4][5][6] on GNU Make mailing lists and the discussion of proposals for -n and $(MAKE) behavior in POSIX.1 [7].)

It could be avoided by defining a new macro, e.g. `_MAKE = $(MAKE)`, and replacing all expansions of MAKE in commands with expansions of the new macro, e.g.:

    $(_MAKE) real-$(INSTALL_HEADERS_DIR) DESTDIR=`pwd`/../gcc/ \

Thoughts?

[1]: http://git.proteanos.com/opkhelper/opkhelper.git/tree/lib/buildsystem/make.sh?id=bf055e8#n99
[2]: http://anonscm.debian.org/gitweb/?p=debhelper/debhelper.git;a=blob;f=Debian/Debhelper/Buildsystem/makefile.pm;h=c63b58e#l13
[3]: https://www.gnu.org/software/make/manual/html_node/MAKE-Variable.html
[4]: https://lists.gnu.org/archive/html/bug-make/2010-01/msg00014.html
[5]: https://lists.gnu.org/archive/html/help-make/2003-06/msg00048.html
[6]: https://lists.gnu.org/archive/html/help-make/2008-07/msg00017.html
[7]: http://pubs.opengroup.org/onlinepubs/9699919799/utilities/make.html#tag_20_76_18

-- 
Patrick "P. J." McDermott
  http://www.pehjota.net/
  http://www.pehjota.net/contact.html
Frontend question
Hi,

this is w.r.t. an older GCC version; I took a quick look and it looks like it's still roughly the same in recent GCCs.

In function c-decl.c:grokdeclarator: I am debugging something and am wondering, what does an IDENTIFIER_POINTER (id->identifier.id.str) contain? I see long strings in there, e.g. something like

    (const unsigned char *) 0x345db66 "#cs476000M", '0' , "30007fe0_", '0' , "_0", 'f' , "00__30007fe0_", '0' , "_0", 'f' , "00__00"...

Where do they come from? It is not source code, is it? I hope this is not too general a question.

Thanks,
Regards,
Hendrik Greving
Re: LRA assign same hard register with live range overlapped pseudos
On 04/17/2013 11:18 PM, Shiva Chen wrote:
> Full test2.c.209r.reload is about 296kb and i can't send successfully.
> Is there another way to send the dump file?

Did you try to compress it? Another possibility would be send dump only for the particular function.
Re: Frontend question
On 2013-04-18 16:58 , Hendrik Greving wrote:
> Hi,
>
> this is w.r.t. an older GCC version, I took a quick look and it looks
> like it's still roughly the same in recent GCC's.
>
> In function c-decl.c:grokdeclarator: I am debugging something and am
> wondering, what does an IDENTIFIER_POINTER (id->identifier.id.str)
> contain? I see long strings in there, e.g. something like
>
> (const unsigned char *) 0x345db66 "#cs476000M", '0' , "30007fe0_", '0' , "_0", 'f' , "00__30007fe0_", '0' , "_0", 'f' , "00__00"...

Identifier strings are not NUL-terminated; what you are seeing is random memory contents at the end of the identifier.

Diego.
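If the bytes behind an identifier pointer cannot be relied on to end in a NUL, as Diego describes, the usual way to look at them is to pair the pointer with an explicit length and never read past it. Below is a minimal sketch of that idea in plain C. The `counted_str` struct and the `counted_to_cstr` helper are hypothetical names made up for this illustration, not GCC's own data structures. In a GCC tree the length would presumably come from IDENTIFIER_LENGTH, which is an assumption worth checking against the version being debugged.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for an identifier: a byte pointer plus an explicit
   length.  This is NOT GCC's real data structure, only an illustration.  */
struct counted_str {
    const unsigned char *str;
    size_t len;
};

/* Copy the identifier into OUT (capacity CAP bytes) and NUL-terminate it,
   reading no more than ID->len bytes from the source.  Returns the number
   of bytes copied, which is smaller than ID->len if OUT is too small.  */
size_t counted_to_cstr(const struct counted_str *id, char *out, size_t cap)
{
    size_t n;

    assert(id != NULL && out != NULL);
    if (cap == 0)
        return 0;
    /* Leave room for the terminating NUL in the destination.  */
    n = id->len < cap - 1 ? id->len : cap - 1;
    memcpy(out, id->str, n);
    out[n] = '\0';
    return n;
}
```

For a one-off print, `printf("%.*s", (int) id.len, (const char *) id.str)` bounds the read the same way, since the precision limits how many bytes `%s` consumes.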
gcc-4.8-20130418 is now available
Snapshot gcc-4.8-20130418 is now available on
  ftp://gcc.gnu.org/pub/gcc/snapshots/4.8-20130418/
and on various mirrors, see http://gcc.gnu.org/mirrors.html for details.

This snapshot has been generated from the GCC 4.8 SVN branch
with the following options: svn://gcc.gnu.org/svn/gcc/branches/gcc-4_8-branch revision 198075

You'll find:

 gcc-4.8-20130418.tar.bz2             Complete GCC

  MD5=972e5a9356f39ebb4cfe16b735b00936
  SHA1=325325b9c72e2a4673d57eed0dca82159c48e3c6

Diffs from 4.8-20130411 are available in the diffs/ subdirectory.

When a particular snapshot is ready for public consumption the LATEST-4.8
link is updated and a message is sent to the gcc list. Please do not use
a snapshot before it has been announced that way.
Re: LRA assign same hard register with live range overlapped pseudos
On 18/04/2013 21:50, Vladimir Makarov wrote:
> On 04/17/2013 11:18 PM, Shiva Chen wrote:
>> Full test2.c.209r.reload is about 296kb and i can't send successfully.
>> Is there another way to send the dump file?
>>
> Did you try to compress it? Another possibility would be send dump only
> for the particular function.

And there's always pastebin.com

    cheers,
      DaveK