gcc-4.5.0 built successfully from source
./config-guess: i686-pc-linux-gnu

gcc -v
Using built-in specs.
COLLECT_GCC=gcc
COLLECT_LTO_WRAPPER=/opt/gcc/libexec/gcc/i686-pc-linux-gnu/4.5.0/lto-wrapper
Target: i686-pc-linux-gnu
Configured with: ./configure --prefix=/opt/gcc
Thread model: posix
gcc version 4.5.0 (GCC)

Ubuntu 10.04 LTS
Linux CARBON 2.6.32-25-generic-pae #45-Ubuntu SMP Sat Oct 16 21:01:33 UTC 2010 i686 GNU/Linux

dpkg -l libc6
Desired=Unknown/Install/Remove/Purge/Hold
| Status=Not/Inst/Cfg-files/Unpacked/Failed-cfg/Half-inst/trig-aWait/Trig-pend
|/ Err?=(none)/Reinst-required (Status,Err: uppercase=bad)
||/ Name    Version         Description
+++-=======-===============-========================================
ii  libc6   2.11.1-0ubuntu  Embedded GNU C Library: Shared libraries
Re: named address spaces: addr_space_convert never called
David Brown schrieb:
> On 09/11/10 18:45, Georg Lay wrote:
>> David Brown schrieb:
>> ...
>>> May I say I think it's great that you are looking into this? Program
>>> space access on the AVR was the first thing I thought of when I heard
>>> of the concept of named address spaces in C.
>>
>> It's great work that this is part of gcc now! Just remember all the
>> hacks to get near/far support into a C16x target.
>>
>> Besides access and pointer size, maybe even things like
>>
>> int __far __atomic x;
>>
> "__far" as a memory space makes sense. It may even make sense to have
> three extra spaces so that (along with __pgm) you can use all the flash
> space in an Mega256, if that is easier or more efficient than using
> 24-bit pointers to flash.
>
> I don't think "__atomic" is appropriate for a memory space, however. I
> don't know if you've read Linus Torvalds' rant against "volatile", but
> his point applies here too. Variables cannot be volatile or atomic in
> themselves - it is /accesses/ that are volatile or atomic (declaring a
> variable to be "volatile" is simply a shorthand for saying that all
> accesses to it should be volatile).
>
> While volatile accesses are fairly cheap, and not /too/ hard to
> understand (though many people misunderstand them), atomic accesses
> are expensive (since they involve disabling interrupts) and hard to
> understand. Supposing "ax" and "ay" are declared as "int __atomic ax,
> ay". Should the statement "ax += 2" be atomic? Should "ax = ay"? I
> think it is better to stick to the macros in <util/atomic.h> - they
> are clear and explicit. An alternative would be to implement the gcc
> atomic builtins, since these are also explicit.

The issue with avr is that neither int nor void* is atomic. If there is a situation like a->b->c->d and one access has to be atomic, it helps to keep the source clear and reduce overhead, because just one (or maybe more than one) access needs to be atomic.
Things like __atomic should refer to native types that the back end supports. __atomic DI on avr might not be appropriate, and the BE could complain. The same applies to composites. If a composite has to be treated atomically as a whole, util/atomic.h is the tool of choice. __atomic just adds another tool to your toolbox, and as so often in C, you must know what you are doing or it will bite you sooner or later, i.e. same as with volatile, signed overflow, strict aliasing, etc. And of course, the overhead of atomic must be taken into account and be balanced against other approaches. Disabling interrupts is not a good thing in an OS, maybe in one with hard real time constraints (with hard as in *HARD*).

But don't let us stick at __atomic. Was just brainstorming about what named address spaces (NAS) are capable of doing or could be capable of doing. Imagine a __track that could help in getting finer grained profiling info in a context where you do /not/ have (potentially) unlimited resources, like embedded systems, or where you want to bring profiling info closer to the setup in which the software is finally intended to run. Again, hard realtime is a good example.

Contemplating NAS in a more general context as being some kind of (target specific) qualifier that helps to implement named address spaces: Looking at NAS from the "named address space" perspective, I can understand the warnings and the concretization (don't know the word, other direction like "abstraction") of what a "subset of" means. Looking at NAS from the qualifier's perspective, the warnings are quite confusing, because qualifiers are not mutually exclusive. So there could be a mask of what qualifiers are on, instead of an enum, passed in TARGET_ADDR_SPACE_SUBSET_P, and the target could decide what combination is legal, what warning is appropriate, or whether it feels comfortable with the standard setup.

>> However, at present the avr backend badly needs avr developers.
>> It's not a good idea to build such enhancements atop of all the PRs
>> there...
>
> I agree with that - prioritising is important. But it is also good to
> move forward with new features.
>
> Of course, if you are feeling enthusiastic the next step would be an
> __eeprom memory space that worked efficiently on all AVRs...

Hmmm, once thought about it. There is an erratum for some AVRs with that (indirect access versus direct). To work around it in gcc, gcc needs to have some knowledge. Moreover (I am not too familiar with that; the erratum occurred because avr-libc made some assumption on how gcc will inline functions, so that an indirect access finally collapses to a direct one. As I am still using avr-gcc-3.4.6, which runs pretty smoothly and has got the knowledge, I didn't think about it further), accessing eeprom is not as easy as accessing flash. Perhaps some new eeprom- or SFR-relocs could help to shift the information from avr-gcc to avr-binutils.

--
Georg Johann Lay
Re: named address spaces: addr_space_convert never called
On 11/11/2010 10:26, Georg Johann Lay wrote:
> But don't let us stick at __atomic. Was just brainstorming about what
> named address spaces (NAS) are capable to do or could be capable to
> do. Imagine a __track, that could help in getting finer grained
> profiling info in a context where you do /not/ have (potentially)
> unlimited resources like embedded systems, or where you want to bring
> profiling info closer to the setup, in which the software finally is
> intended to run. Again, hard realtime is a good example.

I'd imagine there is lots that could be done with named address spaces. For example, on some targets you have smaller or faster code when addressing particular areas of memory - perhaps because it is internal tightly-coupled ram, or because you have smaller addressing modes (on the 68k, for example, there is a 16-bit signed absolute addressing mode; on the AVR, there is Y+q). These would be natural fits for named address spaces.

Is there any way to add named address spaces at usage time (i.e., compile time for the target code rather than compile time for gcc)? For example, would it be possible to make artificial generic address spaces __space1, __space2, etc., which caused calls to hook functions when data was loaded or stored to the variable? The hook functions could be initially defined with weak linkage and overridden by user code. This would allow the user to implement __track, __bigendian, __eeprom, or whatever by writing their own hook functions and #define'ing __track as __space1. Obviously hook functions would mean some overhead, due to the function call. But if users found a particular memory space to be useful, then it could always be moved into the compiler later. It is also possible that LTO could be used to eliminate the overhead (does LTO work on the AVR port?).

> Contemplating NAS in a more general context as being some kind of
> (target specific) qualifier that helps to implement named address
> spaces. Looking at NAS from the "named address space" perspective, I
> can understand the warnings and the concretization (don't know the
> word, other direction like "abstraction") of what a "subset of" means.
> Looking at NAS from the qualifier's perspective, the warnings are
> quite confusing, because qualifiers are not mutually exclusive. So
> there could be a mask of what qualifiers are on instead of an enum
> passed in TARGET_ADDR_SPACE_SUBSET_P
PL/1 frontend
I would like to resurrect the PL/1 frontend, which appears to have stopped development about 3 years ago. I plan to start with the preprocessor, which is a PL/1 subset interpreter.

First of all, is this the correct list to ask questions about writing a frontend? I plan on using GCC 4.5.1, and most of the example frontends seem to be made for 4.4.3 or earlier and do not work against 4.5.1. Is there a good example that works on 4.5.1? I just need the minimal subset right now, as the preprocessor produces no output except for debugging and the input stream for the main PL/1 processor.

When I build a minimal frontend program, I get the following during linking:

/a0/staff/tmerrick/gcc-4.5.1/host-i686-pc-linux-gnu/prev-gcc/xgcc \
  -B/a0/staff/tmerrick/gcc-4.5.1/host-i686-pc-linux-gnu/prev-gcc/ \
  -B/usr/local/i686-pc-linux-gnu/bin/ -B/usr/local/i686-pc-linux-gnu/bin/ \
  -B/usr/local/i686-pc-linux-gnu/lib/ \
  -isystem /usr/local/i686-pc-linux-gnu/include \
  -isystem /usr/local/i686-pc-linux-gnu/sys-include \
  -g -O2 -fomit-frame-pointer -gtoggle -DIN_GCC -W -Wall -Wwrite-strings \
  -Wcast-qual -Wstrict-prototypes -Wmissing-prototypes \
  -Wmissing-format-attribute -pedantic -Wno-long-long -Wno-variadic-macros \
  -Wno-overlength-strings -Wold-style-definition -Wc++-compat \
  -DHAVE_CONFIG_H -o pl11 \
  pl1/pl11.o main.o libbackend.a ../libcpp/libcpp.a \
  ../libdecnumber/libdecnumber.a ../libcpp/libcpp.a ../libiberty/libiberty.a \
  ../libdecnumber/libdecnumber.a attribs.o \
  -L/a0/staff/tmerrick/gcc-4.5.1/host-i686-pc-linux-gnu/gmp/.libs \
  -L/a0/staff/tmerrick/gcc-4.5.1/host-i686-pc-linux-gnu/mpfr/.libs \
  -L/a0/staff/tmerrick/gcc-4.5.1/host-i686-pc-linux-gnu/mpc/src/.libs \
  -lmpc -lmpfr -lgmp -rdynamic -ldl -L../zlib -lz -lelf

libbackend.a(coverage.o):(.rodata+0x20): undefined reference to `gt_ggc_mx_lang_tree_node'
libbackend.a(coverage.o):(.rodata+0x24): undefined reference to `gt_pch_nx_lang_tree_node'
libbackend.a(dbxout.o):(.rodata+0x180): undefined reference to `gt_ggc_mx_lang_tree_node'
libbackend.a(dbxout.o):(.rodata+0x184): undefined reference to `gt_pch_nx_lang_tree_node'

With the last 4 lines repeated about 100 times for various parts of the backend. I am not sure if this is for debugging or inline code, but there is obviously something that I am missing. Any ideas?

Tom Merrick
IT Manager
College of Science & Technology
Texas A&M University
CI 352
6300 Ocean Drive
Corpus Christi, TX 78412-5806
Phone: 361-825-2435
Fax: 361-825-5789
Re: PL/1 frontend
On Thu, Nov 11, 2010 at 3:47 PM, Merrick, Thomas wrote:
> I would like to resurrect the PL/1 frontend which appears to have
> stopped development about 3 years ago. I plan to start with the
> preprocessor, which is a PL/1 subset interpreter.
>
> First of all, is this the correct list to ask questions about writing
> a frontend? I plan on using GCC 4.5.1 and most of the example
> frontends seem to be made for 4.4.3 or earlier and do not work against
> 4.5.1. Is there a good example that works on 4.5.1? I just need the
> minimal subset right now as the preprocessor does not output, except
> for debugging and the input stream for the main PL/1 processor.
>
> When I wrote a minimal frontend program, I get the following during
> linking:
>
> /a0/staff/tmerrick/gcc-4.5.1/host-i686-pc-linux-gnu/prev-gcc/xgcc
> -B/a0/staff/tmerrick/gcc-4.5.1/host-i686-pc-linux-gnu/prev-gcc/
> -B/usr/local/i686-pc-linux-gnu/bin/ -B/usr/local/i686-pc-linux-gnu/bin/
> -B/usr/local/i686-pc-linux-gnu/lib/ -isystem
> /usr/local/i686-pc-linux-gnu/include -isystem
> /usr/local/i686-pc-linux-gnu/sys-include -g -O2 -fomit-frame-pointer
> -gtoggle -DIN_GCC -W -Wall -Wwrite-strings -Wcast-qual
> -Wstrict-prototypes -Wmissing-prototypes -Wmissing-format-attribute
> -pedantic -Wno-long-long -Wno-variadic-macros -Wno-overlength-strings
> -Wold-style-definition -Wc++-compat -DHAVE_CONFIG_H -o pl11 \
> pl1/pl11.o main.o libbackend.a ../libcpp/libcpp.a
> ../libdecnumber/libdecnumber.a ../libcpp/libcpp.a ../libiberty/libiberty.a
> ../libdecnumber/libdecnumber.a attribs.o
> -L/a0/staff/tmerrick/gcc-4.5.1/host-i686-pc-linux-gnu/gmp/.libs
> -L/a0/staff/tmerrick/gcc-4.5.1/host-i686-pc-linux-gnu/mpfr/.libs
> -L/a0/staff/tmerrick/gcc-4.5.1/host-i686-pc-linux-gnu/mpc/src/.libs -lmpc
> -lmpfr -lgmp -rdynamic -ldl -L../zlib -lz -lelf
>
> libbackend.a(coverage.o):(.rodata+0x20): undefined reference to
> `gt_ggc_mx_lang_tree_node'
> libbackend.a(coverage.o):(.rodata+0x24): undefined reference to
> `gt_pch_nx_lang_tree_node'
> libbackend.a(dbxout.o):(.rodata+0x180): undefined reference to
> `gt_ggc_mx_lang_tree_node'
> libbackend.a(dbxout.o):(.rodata+0x184): undefined reference to
> `gt_pch_nx_lang_tree_node'
>
> With the last 4 lines repeated about 100 times for various parts of
> the backend. I am not sure if this is for debugging or inline code,
> but there is obviously something that I am missing. Any ideas?

You need to define a lang_tree_node type, even if it's empty. See the lto frontend example.

> Tom Merrick
> IT Manager
> College of Science & Technology
> Texas A&M University
> CI 352
> 6300 Ocean Drive
> Corpus Christi, TX 78412-5806
> Phone: 361-825-2435
> Fax: 361-825-5789
Re: PL/1 frontend
On Thu, Nov 11, 2010 at 09:47, Merrick, Thomas wrote:
> libbackend.a(coverage.o):(.rodata+0x20): undefined reference to
> `gt_ggc_mx_lang_tree_node'
> libbackend.a(coverage.o):(.rodata+0x24): undefined reference to
> `gt_pch_nx_lang_tree_node'
> libbackend.a(dbxout.o):(.rodata+0x180): undefined reference to
> `gt_ggc_mx_lang_tree_node'
> libbackend.a(dbxout.o):(.rodata+0x184): undefined reference to
> `gt_pch_nx_lang_tree_node'

You are likely missing a function like this:

/* Tree walking support.  */

static enum gimple_tree_node_structure_enum
gimple_tree_node_structure (union lang_tree_node *t ATTRIBUTE_UNUSED)
{
  return TS_GIMPLE_GENERIC;
}

And a counterpart structure:

union GTY((desc ("gimple_tree_node_structure (&%h)"),
           chain_next ("(union lang_tree_node *)TREE_CHAIN (&%h.generic)")))
  lang_tree_node
{
  union tree_node GTY ((tag ("TS_GIMPLE_GENERIC"),
                        desc ("tree_node_structure (&%h)"))) generic;
};

I believe that Ian Taylor had some tutorial notes on writing front ends. Ian, am I dreaming again?

Diego.
Re: PL/1 frontend
Hey Tom,

> You need to define a lang_tree_node type, even if it's empty. See the
> lto frontend example.

You could also look at this site (http://code.redbrain.co.uk/cgit.cgi/gcc-dev/tree/gcc/gcalc?h=documentation). It contains the front-end code for a very simple front-end showing the general/expected structure of a front-end. The code isn't yet optimal, but it should help you figure out what you really need to build up a front-end. If you have any specific questions about front-ends and how to pass data to the middle-end, feel free to ask on this list :-)

Hope that helps,
Andi
Re: pipeline description
If I have two insns:

r2 = r3
r3 = r4

It seems to me that the dependency analysis creates a dependency between the two and prevents parallelization. Although there is a dependency (because of r3), I want GCC to parallelize them together, since if the insns are processed together the old value of r3 is used for the first insn before it is written by the second insn. How do I let GCC know about these things (when exactly each operand is read and when it is written)? Is it in these hooks? In which port can I see a good example of that?

Thanks, Roy.

2010/11/4 Ian Lance Taylor:
> roy rosen writes:
>
>> I am writing now the pipeline description in order to get a parallel
>> code. My machine has many restrictions regarding which instruction
>> can be parallelized with another. I am under the assumption that for
>> each insn only one define_insn_reservation is matched. Is that
>> correct?
>
> Yes.
>
>> If so then the number of define_insn_reservation is very high since
>> I have to put here all possible attribute permutations.
>
> That may be true for your machine.
>
> Most processors, in fact all that I have ever worked with myself,
> have a relatively small number of resources and instructions use them
> in predictable ways. There are generally not that many combinations.
> That is the case that gcc's scheduling code is written for.
>
> It sounds like your machine might be quite different: each
> instruction has different resource requirements and uses them in
> different ways.
>
> The scheduler has a lot of hooks which you can define in your cpu.c
> file. Their names all start with TARGET_SCHED. Those hooks make it
> easier to, for example, use the scheduler with a VLIW machine. You
> should read over the docs and see if you can use them to simplify
> your scenario.
>
> Ian
Re: PL/1 frontend
On Thu, Nov 11, 2010 at 10:26, Andi Hellmund wrote:
> Hey Tom,
>
>> You need to define a lang_tree_node type, even if it's empty. See
>> the lto frontend example.
>
> You could also look at this site
> (http://code.redbrain.co.uk/cgit.cgi/gcc-dev/tree/gcc/gcalc?h=documentation).

Nice. Thanks for doing this. If it's not already there, could you add a link to this in http://gcc.gnu.org/wiki/GettingStarted ?

Thanks. Diego.
Re: PL/1 frontend
> Nice. Thanks for doing this. If it's not already there, could you add
> a link to this in http://gcc.gnu.org/wiki/GettingStarted ?

Sure, Phil (Herron) and I could post it there. Ideally, we should also have a more-or-less thorough documentation in the gcc internals document about how to implement front-ends, etc. We already have an initial version (I think it is in the documentation branch), but it is still missing quite a lot. So, if Ian possibly has further recommendations for front-ends, we could include them adequately.

Thanks,
Andi
Re: PL/1 frontend
Diego Novillo writes:
> I believe that Ian Taylor had some tutorial notes on writing front
> ends. Ian, am I dreaming again?

The notes are in my summit paper, which can be found at http://gcc.gnu.org/wiki/summit2010 . The paper is "The Go frontend for GCC."

Ian
Re: pipeline description
roy rosen writes:
> If I have two insns:
>
> r2 = r3
> r3 = r4
>
> It seems to me that the dependency analysis creates a dependency
> between the two and prevents parallelization. Although there is a
> dependency (because of r3) I want GCC to parallelize them together,
> since if the insns are processed together the old value of r3 is used
> for the first insn before it is written by the second insn.
> How do I let GCC know about these things (when exactly each operand
> is used and when it is written)? Is it in these hooks? In which port
> can I see a good example for that?

I was under the impression that an anti-dependence in the same cycle was permitted to execute. But perhaps I am mistaken. In the one significant VLIW port I did, I pulled the instructions together in the final pass.

Ian
Re: Idea - big and little endian data areas using named address spaces
On Wed, Nov 10, 2010 at 01:00:49PM +0100, David Brown wrote:
> Would it be possible to use the named address space syntax to
> implement reverse-endian data? Conversion between little-endian and
> big-endian data structures is something that turns up regularly in
> embedded systems, where you might well be using two different
> architectures with different endianness. Some compilers offer direct
> support for endian swapping, but gcc has no neat solution. You can
> use the __builtin_bswap32 (but no __builtin_bswap16?) function in
> recent versions of gcc, but you still need to handle the swapping
> explicitly.
>
> Named address spaces would give a very neat syntax for using such
> byte-swapped areas. Ideally you'd be able to write something like:
>
> __swapendian struct { int a; short b; } data;
>
> and every access to data.a and data.b would be endian-swapped. You
> could also have __bigendian and __littleendian defined to
> __swapendian or blank depending on the native ordering of the target.
>
> I've started reading a little about how named address spaces work,
> but I don't know enough to see whether this is feasible or not.
>
> Another addition in a similar vein would be __nonaligned, for targets
> which cannot directly access non-aligned data. The loads and stores
> would be done byte-wise for slower but correct functionality.

Yes. In fact, when I gave the talk on named address spaces I mentioned this, and during the summit last year I made a toy demonstration set of patches in the PowerPC to add cross-endian support.

--
Michael Meissner, IBM
5 Technology Place Drive, M/S 2757, Westford, MA 01886-3141, USA
meiss...@linux.vnet.ibm.com
gcc-4.5-20101111 is now available
Snapshot gcc-4.5-20101111 is now available on
ftp://gcc.gnu.org/pub/gcc/snapshots/4.5-20101111/
and on various mirrors, see http://gcc.gnu.org/mirrors.html for details.

This snapshot has been generated from the GCC 4.5 SVN branch with the following options: svn://gcc.gnu.org/svn/gcc/branches/gcc-4_5-branch revision 166624

You'll find:

gcc-4.5-20101111.tar.bz2           Complete GCC (includes all of below)
  MD5=76de4b448fd525d0ed9c456192369b8b
  SHA1=a18c6abff2334f7401af3521edcce4b3841be299

gcc-core-4.5-20101111.tar.bz2      C front end and core compiler
  MD5=492fabcf97981fdcd07fe8baead64591
  SHA1=6ac290c36668089aab8ebf14fddfab3e9e115398

gcc-ada-4.5-20101111.tar.bz2       Ada front end and runtime
  MD5=578e998e5e8d86c2b16547fae2a24c28
  SHA1=553f983cf7459e28c3ca85d99dafeeec1ffcbad6

gcc-fortran-4.5-20101111.tar.bz2   Fortran front end and runtime
  MD5=bc095a0b3c83ac6b801af8c8168a15ab
  SHA1=b68e8320c51482075570a99354514bd9bdb51fae

gcc-g++-4.5-20101111.tar.bz2       C++ front end and runtime
  MD5=c657d041eb6174b294068997e18f568f
  SHA1=515a80ffc377a800f24aed2938c66b10af423b8e

gcc-java-4.5-20101111.tar.bz2      Java front end and runtime
  MD5=8b636a5b07931cf6da5cec43d010f15e
  SHA1=ed3d5c1ff5573377d549b4e95aa4158af4ade1d8

gcc-objc-4.5-20101111.tar.bz2      Objective-C front end and runtime
  MD5=3f6a40d79379c3fa1c778d2f7ca8c5d5
  SHA1=158068e4d965e86d8d045bcfe3176cc286f01383

gcc-testsuite-4.5-20101111.tar.bz2 The GCC testsuite
  MD5=8dc9e687f51db26772bb37c099f1aeec
  SHA1=ce4e4c99b3713e027248863ba7778d31ae7e8f35

Diffs from 4.5-20101104 are available in the diffs/ subdirectory.

When a particular snapshot is ready for public consumption the LATEST-4.5 link is updated and a message is sent to the gcc list. Please do not use a snapshot before it has been announced that way.
Re: PATCH RFA: Do not build java by default
On Sun, Oct 31, 2010 at 12:09 PM, Ian Lance Taylor wrote:
> Currently we build the Java frontend and libjava by default. At the
> GCC Summit we raised the question of whether we should turn this off,
> thus only building it when java is explicitly selected at configure
> time with --enable-languages. Among the people at the summit, there
> was general support for this, and nobody was opposed to it.
>
> Here is a patch which implements that. I'm sending this to the
> mailing lists gcc@ and java@, as well as the relevant -patches@
> lists, because it does deserve some broader discussion.

I count 33 messages on the topic and it is clear that there is no consensus. I am withdrawing this proposed patch.

Ian
jump after return
When compiling at -O2 and -O3 optimization, I've found several instances of routines ending with:

    retq
    jmp

with no branch pointing to the jmp instruction. Is there any specific reason for such code? Is there a way to avoid this?

This is gcc 4.4.5, as installed on Ubuntu 10.10, compiling the SPEC OMP and CPU-2006 benchmarks.

Thanks in advance.

-- Alain.
Vendor specific dwarf
Is there a policy/agreement on how to deal with vendor specific dwarf?
Re: jump after return
Alain Ketterlin writes:
> When compiling at -O2 and -O3 optimization, I've found several
> instances of routines ending with:
>
>     retq
>     jmp
>
> with no branch pointing to the jmp instruction. Is there any specific
> reason for such code? Is there a way to avoid this?
>
> This is gcc 4.4.5, as installed on Ubuntu 10.10, compiling the SPEC
> OMP and CPU-2006 benchmarks.

This question borders on being more appropriate for the mailing list gcc-h...@gcc.gnu.org than for the gcc@gcc.gnu.org mailing list. Please consider whether to use gcc-help for future questions. Thanks.

It always helps to give a specific example, and to mention the specific target--I infer x86, but 32-bit or 64-bit? Which specific processor, or in other words which -mtune option?

In any case I would guess that it is due to -falign-functions. The x86 GNU assembler does alignment by inserting nops, and if it needs to align more than some number of bytes it does it using a jump instruction. For some x86 tuning targets the default for -falign-functions is 32, and gas will certainly use a jmp if it has to skip more than 15 bytes.

Ian
Re: Vendor specific dwarf
Georg Johann Lay writes:
> Is there a policy/agreement on how to deal with vendor specific
> dwarf?

I think we're more or less happy to generate GNU specific DWARF where necessary, although the general rule is to try to get it into the standard. As far as I know we don't currently generate DWARF specific to other vendors, but I wouldn't be a bit surprised if there were some cases I don't know about. What case are you thinking about?

Ian
Discussion: What is unspec_volatile?
Suppose a backend implements some unspec_volatile (UV) and has a distinct understanding of what it should be. If other parts of the compiler don't know exactly what to do, it's a potential source of trouble:

- "May I schedule across the unspec_volatile (UV) or not?"
- "May I move the UV from one BB into another?"
- "May I duplicate UVs or reduce them?" (unrolling, ...)

That kind of questions. If there is no clear statement/specification, we have a problem and a déjà vu of the too well known unspecified/undefined/implementation-defined like

i = i++

but now within the compiler, for instance between backend and middle end.
Re: Discussion: What is unspec_volatile?
Georg Johann Lay writes:
> Suppose a backend implements some unspec_volatile (UV) and has a
> distinct understanding of what it should be.
>
> If other parts of the compiler don't know exactly what to do, it's a
> potential source of trouble.
>
> - "May I schedule across the unspec_volatile (UV) or not?"
> - "May I move the UV from one BB into another?"
> - "May I duplicate UVs or reduce them?" (unrolling, ...)
>
> That kind of questions.
>
> If there is no clear statement/specification, we have a problem and a
> déjà vu of the too well known
> unspecified/undefined/implementation-defined like
>
> i = i++
>
> but now within the compiler, for instance between backend and middle
> end.

The unspec_volatile RTL code is definitely under-documented.

In general, the rules about unspec_volatile are similar to the rules about the volatile qualifier. In other words, an unspec_volatile may be duplicated at compile time, e.g., if a loop is unrolled, but must be executed precisely the specified number of times at runtime. An unspec_volatile may move to a different basic block under the same conditions. If the compiler can prove that an unspec_volatile can never be executed, it can discard it.

That is clear enough. What is much less clear is the extent to which an unspec_volatile acts as a scheduling barrier. The scheduler itself never moves any instructions across an unspec_volatile, so in that sense an unspec_volatile is a scheduling barrier. However, I believe that there are cases where the combine pass will combine instructions across an unspec_volatile, so in that sense an unspec_volatile is not a scheduling barrier. (The combine pass will not attempt to combine the unspec_volatile instruction itself.) It may be that those cases in combine are bugs. However, neither the documentation nor the implementation are clear on that point.

Ian
Discussion: Should GNU C implement bit types?
Implementation as outlined in http://gcc.gnu.org/ml/gcc/2010-11/msg00222.html
Re: Discussion: What is unspec_volatile?
Ian Lance Taylor schrieb:
> What is much less clear is the extent to which an unspec_volatile
> acts as a scheduling barrier. The scheduler itself never moves any
> instructions across an unspec_volatile, so in that sense an
> unspec_volatile is a scheduling barrier. However, I believe that
> there are cases where the combine pass will combine instructions
> across an unspec_volatile, so in that sense an unspec_volatile is not
> a scheduling barrier. (The combine pass will not attempt to combine
> the unspec_volatile instruction itself.) It may be that those cases
> in combine are bugs. However, neither the documentation nor the
> implementation are clear on that point.

What about liveness? No hard reg, pseudo, or mem will live across the unspec_volatile. Right?

Or may the backend allow data to live across that point and make the unspec "transparent"? Sounds paradoxical. It's more about passing information through the compile process without it being discarded. Might debug info cross unspec_volatiles? Can the back end take that decision?

Georg
Target specific qualifiers in GNU C
Hook in and discuss http://gcc.gnu.org/ml/gcc/2010-11/msg00232.html