Re: RFC: GIMPLE tuples. Design and implementation proposal
I don't really like the idea for promoting subcodes to first-level codes, like you do for GS_COND NE and EQ. Looks complicated and confusing to me. What is the benefit of this? Fully agreed with Steven (also on the locators bit). Paolo
Re: RFC: GIMPLE tuples. Design and implementation proposal
On 4/10/07, Diego Novillo <[EMAIL PROTECTED]> wrote: Following up on the recent discussion about GIMPLE tuples (http://gcc.gnu.org/ml/gcc/2007-03/msg01126.html), we have summarized our main ideas and implementation proposal in the attached document. This should be enough to get the implementation going, but there will be many details that still need to be addressed. Thoughts/comments on the proposal?

It looks decent, but I would also go one step further with location information and PHI node "canonicalization". Further, for memory usage we may want to use available padding on gimple_statement_base as flags, or somehow trick gcc into using tail-padding for inheritance... For traversal speed I'd also put the chain first in the structure:

struct gimple_statement_base {
  struct gimple_statement_base *next;
  /* maybe code and subcode here - that would make it the same cache-line
     but does not allow for tail-padding re-use on 64bit archs. */
  struct basic_block_def *bb;   /* basic block of stmt is also frequently accessed */
  struct gimple_statement_base *prev;
  source_locus locus;
  tree block;
  unsigned code : 16;
  unsigned subcode : 16;
  /* if we have padding here, an unsigned flags : XX */
};

struct gimple_statement_with_ops {
  struct gimple_statement_base base __attribute__((packed));
  unsigned modified : 1;
  ...

(the packed attribute does not change the size of base though :()

Richard.
Re: RFC: GIMPLE tuples. Design and implementation proposal
There is a need to build several HTML tables of the codes (with subcodes). Each table would have an explanation. It's like a roadmap.
Re: RFC: GIMPLE tuples. Design and implementation proposal
Richard Guenther wrote on 04/10/07 08:01:

> It looks decent, but I also would go one step further with location information and PHI node "canonicalization".

What would be the "step further" for PHI nodes? We haven't really left anything to spare inside GS_PHI.

For insn locators, the idea is to simply move the RTL insn locator code into GIMPLE. This can even be done early in the implementation process, but for simplicity we left it for later. If someone wants to work on that, then great.

> Further for memory usage we may want to use available padding on gimple_statement_base as flags or somehow trick gcc to use tail-padding for inheritance...

There is only going to be padding on 64-bit hosts. Instructions with no subcode will use those bits as bit flags.

> For traversal speed I'd also put the chain first in the structure:

Sure. That's easy enough to experiment with while doing the implementation.
Re: RFC: GIMPLE tuples. Design and implementation proposal
Andrew Pinski wrote on 04/10/07 01:43:

> Yes, clobbers for asm are strings and don't really need to be full trees. This should help out more, though it does make it harder to implement.

Hmm, could be. The problem is that __asm__s are so infrequent that I'm not sure it's worth the additional headache.

> For "GS COND", you forgot about all the unordered conditionals, which don't happen that often but can.

Good point. Thanks.

> Most of the labels can go away with better handling of the CFG. Gotos really should point to the basic block rather than labels.

Remember that when we emit GIMPLE, we are not working with a CFG. One could argue that we would want to make the GOTO target a union of a label and block, but a label-to-block map is just as good, and easier to work with.

> I also think we can improve our current gimplification, which sometimes produces inefficient gimplification which will change your numbers of how many copies exist; see PRs 27798, 27800, 23401, 27810 (this one especially, as the combine.i numbers show).

True. But remember that the stats were gathered inside the SSA verification routines, which run during optimization. All those gimplification inefficiencies are quickly eliminated by the first few scalar cleanups. I very much doubt that you would see significantly different instruction distribution profiles if you made improvements in the gimplifier.

> Also I noticed in your pdf, you have "PHI NODE" as 12%; we can improve the memory usage for this statement by removing the usage of TREE_CHAIN/TREE_TYPE, so we can save 4/8 bytes for those 12% without doing much work. I can send a patch in the next week or so (I am busy at a conference the next two days but I can start writing a patch tomorrow).

Well, if you want to do this patch for 4.3 and it gives us sufficient benefit, go for it. But it will have no effect on the tuples branch.
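A minimal sketch of the label-to-block mapping idea mentioned above, using purely hypothetical names (label_block_map, lookup_label_block) rather than any existing GCC interface:

#include <stddef.h>

struct basic_block_d;                       /* opaque basic block type */

/* One slot per LABEL_DECL uid; NULL until the label's block is known.  */
static struct basic_block_d **label_block_map;
static size_t label_block_map_len;

static struct basic_block_d *
lookup_label_block (unsigned label_uid)
{
  if (label_uid >= label_block_map_len)
    return NULL;
  return label_block_map[label_uid];
}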
Re: RFC: GIMPLE tuples. Design and implementation proposal
J.C. Pizarro wrote on 04/10/07 01:24:

> 1. Are there fields for flags, annotations, .. for special situations?

Yes, some instructions have no sub-codes and will use the subcode field for flags. Annotations and such will be discouraged as much as possible. If an attribute is very frequently used and causes significant degraded behaviour in pointer-maps or hash tables, then we can see about adding it somewhere.

> 2. These structures are poorly specified. Have they advanced structures like lists, e.g., list of predecessor instructions of loops, predecessor instructions of forwarded jumps, etc. instead of poor "prev"?

I think you are missing the point. This structure defines the bits needed for representing GIMPLE. All we need is a double-chain of instructions. These chains are embedded inside basic blocks.

> 3. Are there fields for more debug information?

More debug information? What debug information are you looking for?
Re: RFC: GIMPLE tuples. Design and implementation proposal
J.C. Pizarro wrote on 04/10/07 02:08:

> However, "conditional moves" have appeared to avoid jumping and consequently to reduce the penalty of the conditional jump.

We already have conditional moves. Notice that subcodes for GS_ASSIGN have the same meaning as they do today. GS_ASSIGN:{EQ_EXPR,NE_EXPR...} will have three operands.
Re: Possible bug in preprocessor
Jim Wilson wrote: > JoseD wrote: > > @James > > What do you mean by 16.3.3/3? GCC's version ? > > This is a reference to the ISO C standard. No. It's a reference to the ISO C++ standard. 16.3.3/3 includes the sentence "If the result [of the ## operator] is not a valid preprocessing token, the behavior is undefined." The passage in the ISO C standard is 6.10.3.3/3, and includes the same text. -- Richard Smith
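A minimal C example of the rule being cited: the result of the ## operator must itself be a single valid preprocessing token, otherwise the behavior is undefined.

#define CAT(a, b) a ## b

int ab = 1;
int ok = CAT(a, b);         /* pastes to the identifier "ab": well defined */
/* int bad = CAT(+, /); */  /* would paste to "+/", which is not a single
                               preprocessing token: undefined behavior */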
Re: RFC: GIMPLE tuples. Design and implementation proposal
Steven Bosscher wrote on 04/10/07 02:43: > On 4/10/07, Diego Novillo <[EMAIL PROTECTED]> wrote: >> Thoughts/comments on the proposal? > > This looks a lot like the RTL insn! > > For locus, you can use just an "int" instead of a word if you use the > same representation for locations as we do for RTL (INSN_LOCATOR). You > mention this step as "straightforward to implement" after the > conversion to tuples is complete. Why did you decide not to just use > the scheme right from the start? One less thing to think about. That's the only reason, if anyone wants to work on this from day 1, then great. But since this is easily fixable afterwards, and we'll already have enough headaches with the basic conversion, it seemed simpler just to let this be for now. > this: You can use the same locator information in GIMPLE as in RTL, > which saves a conversion step; and you free up 32 bits on 64-bits > hosts, which is nice when you add the inevitable so-many-bits for > flags to the GIMPLE tuples ;-) Absolutely. The advantages are very clear. Location information at every insn is extremely redundant. > I don't really like the idea for promoting subcodes to first-level > codes, like you do for GS_COND NE and EQ. Looks complicated and > confusing to me. What is the benefit of this? Mostly, speed of recognition. I'm not totally against dropping this. As Andrew M. mentioned during our internal discussions, we will now have to implement predicates that recognize all the insns in the "GS_COND" family. This is something that we can do some experimentation. > Looks like a nice document overall. I hope we can keep it up to date, > it's a good start for new-GIMPLE documentation ;-) That's certainly a challenge. I could mumble something about doxygen and how much easier it would be if we could embed this in the source code. But the synchronization problem still remains. Many times we have comments totally out of sync with the code.
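For reference, the insn-locator idea discussed here amounts to storing one small integer per statement and recovering the source location and lexical block from shared side tables. A hedged sketch with made-up names (locator_t, locator_entry, locator_table); the real RTL INSN_LOCATOR code differs in detail:

typedef unsigned int locator_t;

struct locator_entry
{
  const char *file;   /* source file */
  int line;           /* source line */
  void *block;        /* enclosing lexical block */
};

/* One shared table; each statement carries only a locator_t index.  */
static struct locator_entry *locator_table;

#define LOCATOR_FILE(L)   (locator_table[(L)].file)
#define LOCATOR_LINE(L)   (locator_table[(L)].line)
#define LOCATOR_BLOCK(L)  (locator_table[(L)].block)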
Re: RFC: GIMPLE tuples. Design and implementation proposal
J.C. Pizarro wrote on 04/10/07 08:17:

> There is a need to build several HTML tables of the codes (with subcodes). Each table would have an explanation. It's like a roadmap.

Hmm, what?
Re: RFC: GIMPLE tuples. Design and implementation proposal
2007/4/10, Diego Novillo <[EMAIL PROTECTED]> wrote: More debug information? What debug information are you looking for?

For example: worth weights, use frequencies, statistical data, ... of GIMPLE. To debug the GIMPLE too. How do you debug failed GIMPLE?

J.C. Pizarro
Re: RFC: GIMPLE tuples. Design and implementation proposal
J.C. Pizarro wrote on 04/10/07 10:24:

> For example: worth weights, use frequencies, statistical data, ... of GIMPLE. To debug the GIMPLE too.

That's kept separately. Pointer maps, hash tables...

> How do you debug failed GIMPLE?

Lots of debug_*() functions available. You also use -fdump-tree-... a lot. In the future, I would like us to be able to inject GIMPLE directly at any point in the pipeline to give us the illusion of unit testing.
Re: RFC: GIMPLE tuples. Design and implementation proposal
2007/4/10, Diego Novillo <[EMAIL PROTECTED]>: J.C. Pizarro wrote on 04/10/07 08:17: > There is a need to build several HTML tables of the codes (with subcodes). > Each table would have an explanation. It's like a roadmap. Hmm, what?

Forget it, it's not so important.
Re: RFC: GIMPLE tuples. Design and implementation proposal
2007/4/10, Diego Novillo <[EMAIL PROTECTED]> wrote: J.C. Pizarro wrote on 04/10/07 10:24: > For example: worth weights, use frequencies, statistical data, ... of GIMPLE. > To debug the GIMPLE too. That's kept separately. Pointer maps, hash tables... > How do you debug failed GIMPLE? Lots of debug_*() functions available. You also use -fdump-tree-... a lot. In the future, I would like us to be able to inject GIMPLE directly at any point in the pipeline to give us the illusion of unit testing.

Of course.

J.C. Pizarro :)
Re: RFC: GIMPLE tuples. Design and implementation proposal
On 4/10/07, Diego Novillo <[EMAIL PROTECTED]> wrote: Steven Bosscher wrote on 04/10/07 02:43: > On 4/10/07, Diego Novillo <[EMAIL PROTECTED]> wrote: >> Thoughts/comments on the proposal? > > This looks a lot like the RTL insn! > > For locus, you can use just an "int" instead of a word if you use the > same representation for locations as we do for RTL (INSN_LOCATOR). You > mention this step as "straightforward to implement" after the > conversion to tuples is complete. Why did you decide not to just use > the scheme right from the start? One less thing to think about. That's the only reason, if anyone wants to work on this from day 1, then great. But since this is easily fixable afterwards, and we'll already have enough headaches with the basic conversion, it seemed simpler just to let this be for now. > this: You can use the same locator information in GIMPLE as in RTL, > which saves a conversion step; and you free up 32 bits on 64-bits > hosts, which is nice when you add the inevitable so-many-bits for > flags to the GIMPLE tuples ;-) Absolutely. The advantages are very clear. Location information at every insn is extremely redundant. > I don't really like the idea for promoting subcodes to first-level > codes, like you do for GS_COND NE and EQ. Looks complicated and > confusing to me. What is the benefit of this? Mostly, speed of recognition. I'm not totally against dropping this. As Andrew M. mentioned during our internal discussions, we will now have to implement predicates that recognize all the insns in the "GS_COND" family. This is something that we can do some experimentation. Will this replace the tree code class or is it merely sub-classing and the sub-code is the real code we have now? Richard.
Re: RFC: GIMPLE tuples. Design and implementation proposal
Richard Guenther wrote on 04/10/07 10:45: > On 4/10/07, Diego Novillo <[EMAIL PROTECTED]> wrote: >> Steven Bosscher wrote on 04/10/07 02:43: >>> I don't really like the idea for promoting subcodes to first-level >>> codes, like you do for GS_COND NE and EQ. Looks complicated and >>> confusing to me. What is the benefit of this? >> Mostly, speed of recognition. I'm not totally against dropping this. >> As Andrew M. mentioned during our internal discussions, we will now have >> to implement predicates that recognize all the insns in the "GS_COND" >> family. >> >> This is something that we can do some experimentation. > > Will this replace the tree code class or is it merely sub-classing and > the sub-code > is the real code we have now? Sorry, I don't understand what you are trying to say.
Re: Inclusion in an official release of a new throw-like qualifier
Sergio Giro wrote: I perceived that many people think that the throw qualifiers, as described by the standard, are not useful Yes. But that's not a reason to add a slightly different non-standard feature that would require people already using standard exception specifications to rewrite everything. That's just a non-starter. You asked how likely your proposed extension is to be included in an official GCC release. My answer is, very unlikely. However, a flag to perform static checking of standard exception specifications and warn about violations is very likely to be included. Jason
Re: RFC: GIMPLE tuples. Design and implementation proposal
On 4/10/07, Diego Novillo <[EMAIL PROTECTED]> wrote: Richard Guenther wrote on 04/10/07 10:45: > On 4/10/07, Diego Novillo <[EMAIL PROTECTED]> wrote: >> Steven Bosscher wrote on 04/10/07 02:43: >>> I don't really like the idea for promoting subcodes to first-level >>> codes, like you do for GS_COND NE and EQ. Looks complicated and >>> confusing to me. What is the benefit of this? >> Mostly, speed of recognition. I'm not totally against dropping this. >> As Andrew M. mentioned during our internal discussions, we will now have >> to implement predicates that recognize all the insns in the "GS_COND" >> family. >> >> This is something that we can do some experimentation. > > Will this replace the tree code class or is it merely sub-classing and > the sub-code > is the real code we have now? Sorry, I don't understand what you are trying to say. Well, we now have a tcc_comparison for example - is the gimple statement code something like that? Or will the grouping of gimple statement codes to code classes persist? If we were able to encode the class directly in the gimple statement we can avoid a memory reference to look up the operands class. Richard.
Re: RFC: GIMPLE tuples. Design and implementation proposal
Richard Guenther wrote on 04/10/07 11:02:

> Well, we now have a tcc_comparison for example - is the gimple statement code something like that? Or will the grouping of gimple statement codes to code classes persist? If we were able to encode the class directly in the gimple statement we can avoid a memory reference to look up the operands class.

For most situations, I would like to avoid the class lookup and be able to go off the statement code directly.

I have to admit that I am not totally convinced that this promotion of subcodes to first-level codes is a good idea. Richard suggested it to avoid having to look up the subcode when recognizing frequent codes like copy assignments. But that also means that we now have to add more cases to switch() and/or need || predicates to determine what kind of GS_ASSIGN we are dealing with. I'm ready to be convinced either way.

OTOH, either approach would not make the design drastically different. We could explore both options and get numbers.
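A hedged sketch of the two recognition styles being weighed here. The enum and all names below are made up for illustration (the enum mixes both alternatives purely so one self-contained example compiles) and are not taken from the proposal's actual code:

enum gs_code
{
  GS_ASSIGN,
  GS_COND,                      /* alternative (b): one code plus a subcode */
  GS_COND_EQ, GS_COND_NE,       /* alternative (a): promoted subcodes */
  GS_COND_LT, GS_COND_LE,
  GS_COND_GT, GS_COND_GE
};

/* (a) Promoted subcodes: recognition needs one case per conditional kind.  */
static int
is_cond_promoted (enum gs_code code)
{
  switch (code)
    {
    case GS_COND_EQ: case GS_COND_NE: case GS_COND_LT:
    case GS_COND_LE: case GS_COND_GT: case GS_COND_GE:
      return 1;
    default:
      return 0;
    }
}

/* (b) A single GS_COND code: one cheap comparison on the code; the subcode
   is only loaded when the specific comparison actually matters.  */
static int
is_cond_single (enum gs_code code)
{
  return code == GS_COND;
}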
Re: RFC: GIMPLE tuples. Design and implementation proposal
On Tue, 2007-04-10 at 00:49 -0400, Diego Novillo wrote:

> Following up on the recent discussion about GIMPLE tuples (http://gcc.gnu.org/ml/gcc/2007-03/msg01126.html), we have summarized our main ideas and implementation proposal in the attached document.
>
> This should be enough to get the implementation going, but there will be many details that still need to be addressed.
>
> Thoughts/comments on the proposal?

As I mentioned earlier, I suspect that promoting subcodes to primary codes carries enough of a penalty that it's unlikely to be a good idea most of the time. As long as the interface routines hide all these accesses, it's cheap enough to experiment with, though.

I also think 16 bits for a subcode is excessive, and if some of the primary opcodes are already planning to use this field as flags, perhaps we should make that part of the design right from the start. It won't look like a hack later :-)

Primary opcodes should start numbering their subcodes at 0, and I would think 8 bits would be more than enough for some time. You can then reserve/label the other 8 bits as flags. At least as a starting point, do something along those lines.

Andrew
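A minimal sketch of that layout, assuming a hypothetical gs_header structure: 16 bits of primary code, subcodes numbered from 0 in an 8-bit field, and the remaining 8 bits reserved as flags from day one.

struct gs_header
{
  unsigned code : 16;     /* primary opcode */
  unsigned subcode : 8;   /* per-opcode subcode, numbered from 0 */
  unsigned flags : 8;     /* reserved flag bits, labeled from the start */
};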
Re: Information regarding -fPIC support for Interix gcc
On Tue, Apr 10, 2007 at 05:05:36AM +0800, Mayank Kumar wrote: > That information was really very helpful. I have been able to localize the bug. The issue is in the assembler. When I create a object file using the assembler(as test.s -o test.o), the contents of .rdata which contains the jump table is all wrong. At this point you're probably better off asking for help from the binutils list, where you can find the experts on the assembler and linker.
A microoptimization of isnegative or greaterthan2millions.
/* Given X an unsigned of 32 bits, and Y a bool.  Try to translate, optimizing:
 *
 *   Y = X > 2147483647;    to   Y = ((signed)X) < 0;
 *   Y = X >= 2147483648;   to   Y = ((signed)X) < 0;
 *
 *   [ Another optimization is to Y = (X >> 31) ]
 *
 * The opposite (ELSE):
 *
 *   Y = X <= 2147483647;   to   Y = ((signed)X) >= 0;
 *   Y = X < 2147483648;    to   Y = ((signed)X) >= 0;
 *
 *   [ Another optimization is to Y = ((~X) >> 31) ]
 *
 * 2147483647 = 0x7FFFFFFF, 2147483648 = 0x80000000.
 *
 * The unsigned comparison becomes a signed comparison.
 *
 * by J.C. Pizarro */

#include <stdio.h>

/* isnegative means greaterthan2millions */

int isnegative_1(unsigned int X)
{
  int result; // Y is the conditional expression of if-else.
  if (X > 2147483647) result = 1; else result = 0;
  return result;
}

int isnegative_2(unsigned int X)
{
  int result; // Y is the conditional expression of if-else.
  if (X >= 2147483648U) result = 1; else result = 0;
  return result;
}

int isnegative_3(unsigned int X)
{
  int result; // Y is the conditional expression of if-else.
  if (X <= 2147483647) result = 0; else result = 1;
  return result;
}

int isnegative_4(unsigned int X)
{
  int result; // Y is the conditional expression of if-else.
  if (X < 2147483648U) result = 0; else result = 1;
  return result;
}

int isnegative_optimized_1(unsigned int X)
{
  int result; // Y is the conditional expression of if-else.
  if (((signed)X) < 0) result = 1; else result = 0;
  return result;
}

int isnegative_optimized_2(unsigned int X)
{
  int result; // Y is the conditional expression of if-else.
  if (((signed)X) >= 0) result = 0; else result = 1;
  return result;
}

int isnegative_optimized_3(unsigned int X)
{
  int result; // Y is the conditional expression of if-else.
  if (X >> 31) result = 1; else result = 0;
  return result;
}

int isnegative_optimized_4(unsigned int X)
{
  int result; // Y is the conditional expression of if-else.
  if ((~X) >> 31) result = 0; else result = 1;
  return result;
}

int are_equivalent_isnegative(unsigned int X)
{
  int equiv = 1, isneg;
  isneg = isnegative_1(X);
  equiv = equiv && (isnegative_2(X) == isneg);
  equiv = equiv && (isnegative_3(X) == isneg);
  equiv = equiv && (isnegative_4(X) == isneg);
  equiv = equiv && (isnegative_optimized_1(X) == isneg);
  equiv = equiv && (isnegative_optimized_2(X) == isneg);
  equiv = equiv && (isnegative_optimized_3(X) == isneg);
  equiv = equiv && (isnegative_optimized_4(X) == isneg);
  return equiv;
}

int main(int argc, char *argv[])
{
  long long X;
  int testOK = 1;
  for (X = 0LL; (X <= 0xFFFFFFFFLL) && testOK; X++)
    {
      testOK = are_equivalent_isnegative((unsigned int)(X & 0xFFFFFFFF));
    }
  if (testOK) printf("Full test of isnegative is PASSED.\n");
  else printf("Full test of isnegative is FAILED.\n");
  return 0;
}

--
# gcc version 4.1.3 20070326 (prerelease)
Full test of isnegative is PASSED.

    notl    %eax
    shrl    $31, %eax
    xorl    $1, %eax

  IS WORSE THAN

    shrl    $31, %eax

-

    xorl    %eax, %eax
    cmpl    $0, 4(%esp)
    sets    %al

  IS WORSE THAN

    movl    4(%esp), %eax
    shrl    $31, %eax

--
J.C. Pizarro

isnegative_20070410-1.tar.gz
Description: GNU Zip compressed data
Re: A microoptimization of isnegative or greaterthan2millions.
I'm sorry, I meant to say 2 billion, not 2 million. J.C. Pizarro
Re: RFC: GIMPLE tuples. Design and implementation proposal
Diego Novillo <[EMAIL PROTECTED]> writes: > J.C. Pizarro wrote on 04/10/07 02:08: > > > However, they've appeared the "conditional moves" to don't jump > > and consecuently to reduce the penalization of the conditional jump. > > We already have conditional moves. Notice that subcodes for GS_ASSIGN > have the same meaning as they do today. GS_ASSIGN:{EQ_EXPR,NE_EXPR...} > will have three operands. Don't you need four operands for a conditional move? Is that what you meant? Ian
Re: A microoptimization of isnegative or greaterthan2millions.
"J.C. Pizarro" <[EMAIL PROTECTED]> writes: > /* Given X an unsigned of 32 bits, and Y a bool. Try to translate optimizing > * > * Y = X > 2147483647; to Y = ((signed)X) < 0; > * Y = X >= 2147483648; to Y = ((signed)X) < 0; > * > * [ Another optimization is to Y = (X >> 31) ] As far as I can tell, you are recommending that gcc generate a different code sequence than it currently does. The most helpful approach you can use for such a suggestion is to open a bug report marked as an enhancement. See http://gcc.gnu.org/bugs.html. Postings to gcc@gcc.gnu.org are not wrong, but they will almost certainly get lost. An entry in the bug database will not get lost. Thanks. Ian
Re: RFC: GIMPLE tuples. Design and implementation proposal
Diego Novillo <[EMAIL PROTECTED]> writes:

> Following up on the recent discussion about GIMPLE tuples (http://gcc.gnu.org/ml/gcc/2007-03/msg01126.html), we have summarized our main ideas and implementation proposal in the attached document.
>
> This should be enough to get the implementation going, but there will be many details that still need to be addressed.
>
> Thoughts/comments on the proposal?

As you know, we have a lot of basic optimizations in fold-const.c that operate on trees. We also have a lot of basic optimizations in simplify-rtx.c that operate on RTL. These sets of optimizations are independent implementations. What can we do to avoid implementing a third set of basic optimizations that operate on GIMPLE?

I seem to recall that at one point somebody worked on a gensimplify program or something like that. Would it make sense to revive that approach, and use it to generate simplifiers for trees, GIMPLE, and RTL, to avoid triplification of these basic optimizations?

Or should we instead rewrite fold-const.c to work on GIMPLE rather than trees, thus essentially removing constant folding from the front-ends? If we follow that path somebody would need to think about the effect on warnings issued by the front-end, and on __builtin_constant_p.

Ian
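To make the idea concrete, here is a hedged sketch of one way a single simplification rule (x + 0 -> x and x * 1 -> x, for integer operands) could be written once and parameterized over the IR. Everything here, including the ir_ops hooks, is hypothetical; a generated gensimplify-style matcher would look different in detail.

enum simp_op { SIMP_PLUS, SIMP_MULT, SIMP_OTHER };

/* Per-IR hooks: trees, GIMPLE and RTL would each supply one of these.  */
struct ir_ops
{
  enum simp_op (*classify) (const void *expr);      /* map IR code to a neutral op */
  const void *(*operand) (const void *expr, int i); /* fetch operand i */
  int (*is_zero) (const void *expr);                /* integer constant 0?  */
  int (*is_one) (const void *expr);                 /* integer constant 1?  */
};

/* The rule itself, expressed once for every IR.  */
static const void *
simplify_identity (const struct ir_ops *ops, const void *expr)
{
  switch (ops->classify (expr))
    {
    case SIMP_PLUS:
      if (ops->is_zero (ops->operand (expr, 1)))
        return ops->operand (expr, 0);
      break;
    case SIMP_MULT:
      if (ops->is_one (ops->operand (expr, 1)))
        return ops->operand (expr, 0);
      break;
    default:
      break;
    }
  return expr;
}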
Re: RFC: GIMPLE tuples. Design and implementation proposal
Hi Diego Novillo,

Your "Tuple representation" of the GIMPLE instructions was:

HEADER:
  code      16 bits
  subcode   16 bits
  next      word
  prev      word
  bb        word
  locus     word
  block     word
BODY:
  OP0       word
  ..
  OPN       word

I have a little idea: can I remove the word "prev"? Thanks to "bb", I can traverse the short list of the small basic block, obtained from its hash table. If I do that, it's one word less :) Same with the word "block", reachable from "bb". Two words less :)

My proposed initial "Tuple representation" of the GIMPLE instructions would be:

HEADER:
  code      16 bits
  subcode   8 bits
  markbits  8 bits
  next      word
  # removed prev word
  bb        word
  locus     word
  # removed block word
BODY:
  OP0       word
  ..
  OPN       word

Thanks, friendly,
J.C. Pizarro
Re: RFC: GIMPLE tuples. Design and implementation proposal
Ian Lance Taylor wrote on 04/10/07 13:53: > Don't you need four operands for a conditional move? Is that what you > meant? Ah, yes. The two comparison operands and the two assignment values.
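In C terms, the four operands are the two values being compared and the two values selected between:

int
cond_move (int a, int b, int c, int d)
{
  /* A conditional-move assignment needs a, b, c and d (plus the result).  */
  return (a < b) ? c : d;
}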
Re: RFC: GIMPLE tuples. Design and implementation proposal
On Apr 10, 2007, at 10:53 AM, Ian Lance Taylor wrote: I seem to recall that at one point somebody worked on a gensimplify program or something like that. Would it make sense to revive that approach, and use it to generate simplifiers for trees, GIMPLE, and RTL, to avoid triplification of these basic optimizations? Hum... Almost sounds like a job for a template... [ takes step backwards ]
Re: Integer overflow in operator new
On 4/9/07, Dave Korn <[EMAIL PROTECTED]> wrote: On 09 April 2007 21:49, Lawrence Crowl wrote:

>>> The optimization above would be wrong for such machines because the allocation would be smaller than the requested size.
>>
>> To request a size of ~size_t(0) is to request a size of 0xFFFFFFFF or 0xFFFFFFFFFFFFFFFFULL, so the allocator will always say "sorry, there is no memory of 0xFFFFFFFF or 0xFFFFFFFFFFFFFFFFULL bytes".
>
> Except on a segmented machine, where it will say "here you go!" On segmented architectures sizeof(size_t) < sizeof(void*).

Well, we could simply use uintptr_t instead of size_t. The standards are pretty insistent on uintptr_t.

It appears, however, that gcc essentially requires a linear address space, so my objection is moot. But if anyone were to do a segmented gcc,

-- Lawrence Crowl
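A hedged sketch of the overflow this thread is about: computing n * sizeof(T) for an array allocation can wrap around in size_t, so a careful allocator checks before multiplying. The names below are illustrative, not any particular runtime's code.

#include <stddef.h>
#include <stdlib.h>

static void *
checked_array_alloc (size_t nelems, size_t elem_size)
{
  /* Refuse requests whose total size would not fit in size_t.  */
  if (elem_size != 0 && nelems > (size_t) -1 / elem_size)
    return NULL;
  return malloc (nelems * elem_size);
}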
Re: RFC: GIMPLE tuples. Design and implementation proposal
On Tue, 2007-04-10 at 19:54 +0200, J.C. Pizarro wrote:

> Hi Diego Novillo
>
> Your "Tuple representation" of the GIMPLE instructions was:
> HEADER:
>   code      16 bits
>   subcode   16 bits
>   next      word
>   prev      word
>   bb        word
>   locus     word
>   block     word
> BODY:
>   OP0       word
>   ..
>   OPN       word
>
> I've a little idea,
>
> Can I remove the word "prev"? Thanks to "bb", I can traverse the short list of the small basic block obtained from its hash table.

Do you mean implement this as a singly linked list and then, to find the previous instruction, start at the beginning of the block and traverse forward? Back in the early days of tree-ssa we did such a thing with the tree iterators. It was too slow. There are many LARGE basic blocks which make the computation too expensive for any pass that goes backwards through a block.

Andrew
Re: RFC: GIMPLE tuples. Design and implementation proposal
Diego Novillo <[EMAIL PROTECTED]> writes: > Following up on the recent discussion about GIMPLE tuples > (http://gcc.gnu.org/ml/gcc/2007-03/msg01126.html), we have summarized > our main ideas and implementation proposal in the attached document. > > This should be enough to get the implementation going, but there will be > many details that still need to be addressed. > > Thoughts/comments on the proposal? For purposes of LTO, it is essential that we be able to efficiently load the IR during the whole program optimization phase. Part of that is almost certainly going to mean having some sort of index which will tell us whether to load the IR at all--if the functions represented in some .o file are rarely called, then we should use the .o file as-is, and not try to further optimize it. This is not part of the current LTO plan, but I think it is inevitable if we are to avoid an hours-long compilation process. But there is another part: we need to have an IR which can be very quickly loaded into memory for further processing. When it comes to loading IR, there is nothing faster than mmap. That requires that the IR be stored in the .o file in a form which can be used directly when it is read in from memory. And ideally that means no pointer swizzling: the IR should be usable when loaded without modification. And because the link phase can see arbitrary .o files, we can not use the PCH hack of requiring a specific memory address. So that requires an IR which is position independent. The obvious way to make the proposed tuples position independent would be to use array offsets rather than pointers. This has the obvious disadvantage that every access through a pointer requires an additional memory reference. On the other hand, it has some other advantages: it may no longer be necessary to keep a previous pointer in each tuple; we can delete tuples by marking them with a deleted code, and we can periodically garbage collect deleted tuples and fix up the next pointers. On a 64-bit system, we do not need to burn 64 bits for each pointer; 32 bits will be sufficient for an array offset. I would like us to seriously think about this approach. Most of the details would be hidden by accessor macros when it comes to actual coding. The question is whether we can tolerate some slow down for normal processing in return for a benefit to LTO. If anybody can see how to get the best of both worlds, that would of course be even better. Ian
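A hedged sketch of the offset-based scheme described above, with hypothetical names (gs_tuple, gs_arena, GS_AT): statements live in one mmap-able arena and refer to each other by 32-bit indexes instead of pointers, so the image needs no pointer swizzling when it is loaded.

#include <stdint.h>

struct gs_tuple
{
  uint16_t code;
  uint16_t subcode;
  uint32_t next;                /* arena index of the next tuple, 0 = none */
  /* operands would follow, also as 32-bit indexes */
};

static struct gs_tuple *gs_arena;   /* base address of the mmap'd region */

#define GS_INDEX_NONE  0u
#define GS_AT(IDX)     (&gs_arena[(IDX)])
#define GS_NEXT(IDX)   (GS_AT (IDX)->next)

Accessors like GS_AT hide the extra indexing arithmetic, which is exactly where the "some slow down for normal processing" trade-off would show up.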
Re: RFC: GIMPLE tuples. Design and implementation proposal
Ian Lance Taylor wrote on 04/10/07 13:53: > I seem to recall that at one point somebody worked on a gensimplify > program or something like that. Would it make sense to revive that > approach, and use it to generate simplifiers for trees, GIMPLE, and > RTL, to avoid triplification of these basic optimizations? Perhaps. This would allow us to define folding/simplification using a pattern matching system. I think I like this better than the other two choices. Replicating fold-const.c for GIMPLE would involve a bit of code duplication, but since GIMPLE is a strict subset of ASTs, I think it would be a fraction of what we have today. Still, it's annoying and we should probably avoid it. > Or should we instead rewrite fold-const.c to work on GIMPLE rather > than trees, thus essentially removing constant folding from the > front-ends? If we follow that path somebody would need to think about > the effect on warnings issued by the front-end, and on > __builtin_constant_p. I don't think we want to do that. Folding and simplification needs to be done at just about every level.
Re: RFC: GIMPLE tuples. Design and implementation proposal
Mike Stump <[EMAIL PROTECTED]> writes: > On Apr 10, 2007, at 10:53 AM, Ian Lance Taylor wrote: > > I seem to recall that at one point somebody worked on a gensimplify > > program or something like that. Would it make sense to revive that > > approach, and use it to generate simplifiers for trees, GIMPLE, and > > RTL, to avoid triplification of these basic optimizations? > > Hum... Almost sounds like a job for a template... [ takes step > backwards ] I think making this work as a template would be kind of tough, actually, if you think about the details (e.g., PLUS_EXPR vs. PLUS vs. GS_PLUS_EXPR). Doable but rather awkward. I guess you could use dozens of template parameters, most of them integers providing the code values. Virtual functions and a table of operator codes would probably work just as well, and be slightly more comprehensible. Ian
Re: A microoptimization of isnegative or greaterthan2millions.
10 Apr 2007 10:53:08 -0700, Ian Lance Taylor <[EMAIL PROTECTED]> wrote: As far as I can tell, you are recommending that gcc generate a different code sequence than it currently does. The most helpful approach you can use for such a suggestion is to open a bug report marked as an enhancement. See http://gcc.gnu.org/bugs.html. Postings to gcc@gcc.gnu.org are not wrong, but they will almost certainly get lost. An entry in the bug database will not get lost. Thanks. Ian Thanks, bug reported as enhancement in http://gcc.gnu.org/bugzilla/show_bug.cgi?id=31531
Re: RFC: GIMPLE tuples. Design and implementation proposal
On 4/9/07, Andrew Pinski <[EMAIL PROTECTED]> wrote:

> Also I noticed in your pdf, you have "PHI NODE" as 12%; we can improve the memory usage for this statement by removing the usage of TREE_CHAIN/TREE_TYPE, so we can save 4/8 bytes for those 12% without doing much work. I can send a patch in the next week or so (I am busy at a conference the next two days but I can start writing a patch tomorrow).

Here is the quick patch (thanks to the work done for gimple tuple) which does this, removes the unneeded type from phi nodes. I have not tested it except for a quick test on some small testcases, so there might be more places which use TREE_CHAIN instead of PHI_CHAIN.

-- Pinski

Index: tree.h
===================================================================
--- tree.h	(revision 123691)
+++ tree.h	(working copy)
@@ -951,7 +951,7 @@
    (TREE_CODE_CLASS (TREE_CODE ((NODE))) == tcc_gimple_stmt)
 
 /* Nonzero if NODE is a GIMPLE tuple.  */
-#define GIMPLE_TUPLE_P(NODE) (GIMPLE_STMT_P (NODE))
+#define GIMPLE_TUPLE_P(NODE) (GIMPLE_STMT_P (NODE) || TREE_CODE (NODE) == PHI_NODE)
 
 /* A GIMPLE tuple that has a ``locus'' field.  */
 #define GIMPLE_TUPLE_HAS_LOCUS_P(NODE) GIMPLE_STMT_P ((NODE))
@@ -1866,7 +1866,7 @@
 /* PHI_NODEs for each basic block are chained together in a single
    linked list.  The head of the list is linked from the block annotation,
    and the link to the next PHI is in PHI_CHAIN.  */
-#define PHI_CHAIN(NODE)		TREE_CHAIN (PHI_NODE_CHECK (NODE))
+#define PHI_CHAIN(NODE)		PHI_NODE_CHECK (NODE)->phi.chain
 
 #define PHI_NUM_ARGS(NODE)	PHI_NODE_CHECK (NODE)->phi.num_args
 #define PHI_ARG_CAPACITY(NODE)	PHI_NODE_CHECK (NODE)->phi.capacity
@@ -1885,7 +1885,8 @@
 
 struct tree_phi_node GTY(())
 {
-  struct tree_common common;
+  struct tree_base common;
+  tree chain;
   tree result;
   int num_args;
   int capacity;
Index: tree-ssa-structalias.c
===================================================================
--- tree-ssa-structalias.c	(revision 123691)
+++ tree-ssa-structalias.c	(working copy)
@@ -4730,7 +4730,7 @@
       block_stmt_iterator bsi;
       tree phi;
 
-      for (phi = phi_nodes (bb); phi; phi = TREE_CHAIN (phi))
+      for (phi = phi_nodes (bb); phi; phi = PHI_CHAIN (phi))
	{
	  if (is_gimple_reg (PHI_RESULT (phi)))
	    {
@@ -4882,7 +4882,7 @@
       block_stmt_iterator bsi;
       tree phi;
 
-      for (phi = phi_nodes (bb); phi; phi = TREE_CHAIN (phi))
+      for (phi = phi_nodes (bb); phi; phi = PHI_CHAIN (phi))
	{
	  if (is_gimple_reg (PHI_RESULT (phi)))
	    {
Index: tree-phinodes.c
===================================================================
--- tree-phinodes.c	(revision 123691)
+++ tree-phinodes.c	(working copy)
@@ -218,7 +218,6 @@
   TREE_SET_CODE (phi, PHI_NODE);
   PHI_NUM_ARGS (phi) = len;
   PHI_ARG_CAPACITY (phi) = capacity;
-  TREE_TYPE (phi) = TREE_TYPE (var);
   if (TREE_CODE (var) == SSA_NAME)
     SET_PHI_RESULT (phi, var);
   else
Re: RFC: GIMPLE tuples. Design and implementation proposal
2007/4/10, Andrew MacLeod <[EMAIL PROTECTED]> wrote: On Tue, 2007-04-10 at 19:54 +0200, J.C. Pizarro wrote:

> Can I remove the word "prev"? Thanks to "bb", I can traverse the short list of the small basic block obtained from its hash table.

Do you mean implement this as a singly linked list and then, to find the previous instruction, start at the beginning of the block and traverse forward? Back in the early days of tree-ssa we did such a thing with the tree iterators. It was too slow. There are many LARGE basic blocks which make the computation too expensive for any pass that goes backwards through a block.

Andrew

Frequently, how big are the basic blocks in the 90% of non-multimedia code, which uses general-purpose instructions with many jumps/calls/ifs? I think that for the 10% of multimedia code (hard to find) the penalty would only be a few minutes, so it's not problematic, I suppose.

J.C. Pizarro
Re: RFC: GIMPLE tuples. Design and implementation proposal
Ian Lance Taylor wrote on 04/10/07 14:13:

> Diego Novillo <[EMAIL PROTECTED]> writes:
>> Following up on the recent discussion about GIMPLE tuples (http://gcc.gnu.org/ml/gcc/2007-03/msg01126.html), we have summarized our main ideas and implementation proposal in the attached document.
>>
>> This should be enough to get the implementation going, but there will be many details that still need to be addressed.
>>
>> Thoughts/comments on the proposal?
>
> For purposes of LTO, it is essential that we be able to efficiently load the IR during the whole program optimization phase.

Certainly. Now, is GIMPLE going to be our bytecode? Or do we want to invest the time in some other bytecode representation with an eye towards a future JIT component?

Regardless of that, streaming GIMPLE is certainly something worth pursuing. Even if it is just to give us the ability to load IL snippets and inject them into the optimizers without having to go through all the intermediate steps.

> Part of that is almost certainly going to mean having some sort of index which will tell us whether to load the IR at all--if the functions represented in some .o file are rarely called, then we should use the .o file as-is, and not try to further optimize it. This is not part of the current LTO plan, but I think it is inevitable if we are to avoid an hours-long compilation process.
>
> But there is another part: we need to have an IR which can be very quickly loaded into memory for further processing. When it comes to loading IR, there is nothing faster than mmap. That requires that the IR be stored in the .o file in a form which can be used directly when it is read in from memory. And ideally that means no pointer swizzling: the IR should be usable when loaded without modification. And because the link phase can see arbitrary .o files, we can not use the PCH hack of requiring a specific memory address. So that requires an IR which is position independent.
>
> The obvious way to make the proposed tuples position independent would be to use array offsets rather than pointers. This has the obvious disadvantage that every access through a pointer requires an additional memory reference. On the other hand, it has some other advantages: it may no longer be necessary to keep a previous pointer

I doubt this. We had started with single-linked chains but reverse traversals do occur, and they were very painful and slow.

> in each tuple; we can delete tuples by marking them with a deleted code, and we can periodically garbage collect deleted tuples and fix up the next pointers. On a 64-bit system, we do not need to burn 64 bits for each pointer; 32 bits will be sufficient for an array offset.
>
> I would like us to seriously think about this approach. Most of the details would be hidden by accessor macros when it comes to actual coding. The question is whether we can tolerate some slow down for normal processing in return for a benefit to LTO.
>
> If anybody can see how to get the best of both worlds, that would of course be even better.

I've thought about this a little bit and it may not be all that onerous. So, if you take the components of a tuple:

  next    Could be a UID to the next tuple
  prev    Likewise
  bb      Use bb->index here
  locus   Not needed.  INSN_LOCATOR.
  block   Likewise.

The operands may get tricky, but perhaps not so much. We have:

  a- _DECLs.  These are easily replaced with their UID and a symbol table.
  b- SSA_NAMEs.  Just the SSA name version is enough.
  c- *_CONST.  They're just a bit pattern, no swizzling required.  But we may need to define byte ordering.
  d- *_REF.  These may be tricky, but we could emit them using a REF table and just put the index here.

We then reserve the first few bits to distinguish the class of operand and the remaining bits as the index into the respective table.
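A hedged sketch of that operand encoding, with made-up names and an arbitrary bit split: a couple of tag bits select the operand class (and thus which table to index), and the rest of the 32-bit word is the index.

#include <stdint.h>

enum gs_operand_class { GS_OP_DECL, GS_OP_SSA_NAME, GS_OP_CONST, GS_OP_REF };

#define GS_OP_CLASS_BITS 2
#define GS_OP_CLASS(OP)  ((enum gs_operand_class) ((OP) & ((1u << GS_OP_CLASS_BITS) - 1)))
#define GS_OP_INDEX(OP)  ((OP) >> GS_OP_CLASS_BITS)

/* e.g. gs_encode_operand (GS_OP_SSA_NAME, ssa_version)  */
static inline uint32_t
gs_encode_operand (enum gs_operand_class klass, uint32_t index)
{
  return (index << GS_OP_CLASS_BITS) | (uint32_t) klass;
}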
Re: RFC: GIMPLE tuples. Design and implementation proposal
On 4/10/07, Diego Novillo <[EMAIL PROTECTED]> wrote: Following up on the recent discussion about GIMPLE tuples (http://gcc.gnu.org/ml/gcc/2007-03/msg01126.html), we have summarized our main ideas and implementation proposal in the attached document. This should be enough to get the implementation going, but there will be many details that still need to be addressed. Thoughts/comments on the proposal?

Hi Diego,

There is no need for the addresses_taken bitmap; it's a waste of space. It is, in fact, currently write-only except for 2 places:

1. Create_structure_vars, which uses it to shortcut determining which variables may need updating.
2. Escape analysis, which uses it as a shortcut to determine which variables had their addresses taken.

Neither of these really needs it, and I have a patch to remove it entirely.

Thanks.
Re: RFC: GIMPLE tuples. Design and implementation proposal
On 4/10/07, Diego Novillo <[EMAIL PROTECTED]> wrote: Ian Lance Taylor wrote on 04/10/07 13:53: > I seem to recall that at one point somebody worked on a gensimplify > program or something like that. Would it make sense to revive that > approach, and use it to generate simplifiers for trees, GIMPLE, and > RTL, to avoid triplification of these basic optimizations? Perhaps. This would allow us to define folding/simplification using a pattern matching system. I think I like this better than the other two choices. Yes, I worked on gensimplify, but i gave up when i realized it will only work in the most trivial of cases for the case of RTL and trees, because of how different modes and types are. (modes were very well defined and simple, types are complex). This may or may not be true of GIMPLE and tree. That said, you are going to need to come up with a pretty complex pattern matching language (that includes being able to, for example, compare precision of types), or write a lot of the predicates externally in order to represent a lot of the fold optimizations. It may make adding future folding and simplification easier, but it's going to be a pain in the ass to rework into a pattern matching style :)
RE: RFC: GIMPLE tuples. Design and implementation proposal
On 10 April 2007 20:02, Diego Novillo wrote: >> The obvious way to make the proposed tuples position independent would >> be to use array offsets rather than pointers. This has the obvious >> disadvantage that every access through a pointer requires an >> additional memory reference. On the other hand, it has some other >> advantages: it may no longer be necessary to keep a previous pointer > > I doubt this. We had started with single-linked chains but reverse > traversals do occur, and they were very painful and slow. Reverse-traversing an array really isn't all that painful or slow! >> in each tuple; we can delete tuples by marking them with a deleted >> code, and we can periodically garbage collect deleted tuples and fix >> up the next pointers. On a 64-bit system, we do not need to burn 64 >> bits for each pointer; 32 bits will be sufficient for an array offset. >> >> I would like us to seriously think about this approach. Most of the >> details would be hidden by accessor macros when it comes to actual >> coding. The question is whether we can tolerate some slow down for >> normal processing in return for a benefit to LTO. >> >> If anybody can see how to get the best of both worlds, that would of >> course be even better. > > I've thought about this a little bit and it may not be all that onerous. > So, if you take the components of a tuple: > > nextCould be a UID to the next tuple > prevLikewise How about delta-linked lists? Makes your iterators bigger, but makes every single node smaller. cheers, DaveK -- Can't think of a witty .sigline today
Re: RFC: GIMPLE tuples. Design and implementation proposal
Daniel Berlin wrote on 04/10/07 15:18: > There is no need for the addresses_taken bitmap, it's a waste of space. Awesome. I was going to check, but I forgot. I did check the stmt_makes_clobbering_call and that one is also write-only. > Neither of these really needs it, and i have a patch to remove it entirely. Excellent. Thanks.
RE: RFC: GIMPLE tuples. Design and implementation proposal
On Tue, 2007-04-10 at 20:39 +0100, Dave Korn wrote: > On 10 April 2007 20:02, Diego Novillo wrote: > > >> The obvious way to make the proposed tuples position independent would > >> be to use array offsets rather than pointers. This has the obvious > >> disadvantage that every access through a pointer requires an > >> additional memory reference. On the other hand, it has some other > >> advantages: it may no longer be necessary to keep a previous pointer > > > > I doubt this. We had started with single-linked chains but reverse > > traversals do occur, and they were very painful and slow. > > Reverse-traversing an array really isn't all that painful or slow! I don't think we're talking about traversing an array, but rather a linked list which uses an array index rather than a pointer. You don't want to go moving around array slices everytime you add instructions. You will still need a PREV index if you want to avoid finding prev's by starting at the beginning of a BB and scanning forward. We've already done this in tree-ssa and it is a big loss on passes which work on the instruction stream in reverse. Andrew
Re: RFC: GIMPLE tuples. Design and implementation proposal
Dave Korn wrote on 04/10/07 15:39: > Reverse-traversing an array really isn't all that painful or slow! Instructions are not laid out in an array. Insertion and removal is done constantly. It would be very expensive to use arrays to represent basic blocks. Insertions and deletions are all too common. > How about delta-linked lists? Makes your iterators bigger, but makes every > single node smaller. Worth a shot, I guess. Don't recall what other properties these things had.
Re: RFC: GIMPLE tuples. Design and implementation proposal
On 4/10/07, Andrew Pinski <[EMAIL PROTECTED]> wrote: Here is the quick patch (thanks to the work done for gimple tuple) which does this, removes the unneeded type from phi nodes. I have not tested it except for a quick test on some small testcases so there might be more places which use TREE_CHAIN instead of PHI_CHAIN. This patch has one problem, the GC issue with chain_next and lang_tree_node, this problem is not hard to fix. -- Pinski
Re: RFC: GIMPLE tuples. Design and implementation proposal
Andrew MacLeod <[EMAIL PROTECTED]> writes: > > Do you mean implement this as a single linked list and then to find the > previous instruction, start at the beginning of the block and traverse > forward? Back in the early days of tree-ssa we did such a thing with the > tree iterators. It was too slow. there are many LARGE basic blocks which > make the computation too expensive for any pass that goes backwards > through a block. In theory you could use xor lists (xor next and prev and if you know one element you can get the other). Problem is that garbage collectors require special casing for them and debuggers need special macros to walk those. -Andi
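A minimal sketch of the xor-linking trick, with hypothetical names (xnode, xnode_step); the GC and debugger caveats above are exactly why it is awkward in practice:

#include <stdint.h>
#include <stddef.h>

struct xnode
{
  uintptr_t link;   /* (uintptr_t) prev ^ (uintptr_t) next */
  int payload;
};

/* Given the node we came FROM and the current NODE, recover the neighbor on
   the far side; walking forward from the head starts with from == NULL.  */
static struct xnode *
xnode_step (struct xnode *from, struct xnode *node)
{
  return (struct xnode *) (node->link ^ (uintptr_t) from);
}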
Re: RFC: GIMPLE tuples. Design and implementation proposal
Andrew Pinski wrote on 04/10/07 16:04: > On 4/10/07, Andrew Pinski <[EMAIL PROTECTED]> wrote: >> Here is the quick patch (thanks to the work done for gimple tuple) >> which does this, removes the unneeded type from phi nodes. I have not >> tested it except for a quick test on some small testcases so there >> might be more places which use TREE_CHAIN instead of PHI_CHAIN. > > This patch has one problem, the GC issue with chain_next and > lang_tree_node, this problem is not hard to fix. I'm not sure what you want me to do with this patch. It's not related to this thread and would not be applicable to the tuples branch. I would suggest that you thoroughly test the patch, measure the benefit and if it's good enough propose it for 4.3.
Re: RFC: GIMPLE tuples. Design and implementation proposal
2007/4/10, Dave Korn <[EMAIL PROTECTED]> wrote: On 10 April 2007 20:02, Diego Novillo wrote: >> The obvious way to make the proposed tuples position independent would >> be to use array offsets rather than pointers. This has the obvious >> disadvantage that every access through a pointer requires an >> additional memory reference. On the other hand, it has some other >> advantages: it may no longer be necessary to keep a previous pointer > > I doubt this. We had started with single-linked chains but reverse > traversals do occur, and they were very painful and slow. Reverse-traversing an array really isn't all that painful or slow! >> in each tuple; we can delete tuples by marking them with a deleted >> code, and we can periodically garbage collect deleted tuples and fix >> up the next pointers. On a 64-bit system, we do not need to burn 64 >> bits for each pointer; 32 bits will be sufficient for an array offset. >> >> I would like us to seriously think about this approach. Most of the >> details would be hidden by accessor macros when it comes to actual >> coding. The question is whether we can tolerate some slow down for >> normal processing in return for a benefit to LTO. >> >> If anybody can see how to get the best of both worlds, that would of >> course be even better. > > I've thought about this a little bit and it may not be all that onerous. > So, if you take the components of a tuple: > > nextCould be a UID to the next tuple > prevLikewise How about delta-linked lists? Makes your iterators bigger, but makes every single node smaller. cheers, DaveK -- Can't think of a witty .sigline today delta-linked lists? Better one iterator than complicating this with several iterators. Function prev_of_this_instruction(word) -> word The singleton (global structure) of prev_of_this_instruction contains a small cache to accelerate the reverse-traversing. This cache is like a hashmap of { word this instruction => word prev instruction }. If this instruction isn't in the cache then go to the beginning of a BB of this instruction and start to traverse adding pointers to the cache. Use LRU/old-age-used/less-frequently-used to replace their entries. To have many smaller nodes in the heap is better. J.C. Pizarro :)
RE: RFC: GIMPLE tuples. Design and implementation proposal
On 10 April 2007 21:03, Diego Novillo wrote:

> Dave Korn wrote on 04/10/07 15:39:
>> Reverse-traversing an array really isn't all that painful or slow!
>
> Instructions are not laid out in an array. Insertion and removal is done constantly. It would be very expensive to use arrays to represent basic blocks. Insertions and deletions are all too common.

Agh, yes.

>> How about delta-linked lists? Makes your iterators bigger, but makes every single node smaller.
>
> Worth a shot, I guess. Don't recall what other properties these things had.

I only just learned about them, so I may be guilty of seeing every problem as a nail :-), but there's a quick summary at:

http://forums.worsethanfailure.com/forums/permalink/117726/117749/ShowThread.aspx#117749

One slight catch is we'd have to be sure all the nodes were allocated from the same 'object', perhaps an obstack or pool or heap, so that we're allowed to add/subtract/compare pointers to them without invoking undefined behaviour.

cheers, DaveK
-- Can't think of a witty .sigline today
Re: RFC: GIMPLE tuples. Design and implementation proposal
On Tue, 2007-04-10 at 16:03 -0400, Diego Novillo wrote: > Dave Korn wrote on 04/10/07 15:39: > > > Reverse-traversing an array really isn't all that painful or slow! > > Instructions are not laid out in an array. Insertion and removal is > done constantly. It would be very expensive to use arrays to represent > basic blocks. Insertions and deletions are all too common. > > > How about delta-linked lists? Makes your iterators bigger, but makes > > every > > single node smaller. > > Worth a shot, I guess. Don't recall what other properties these things had. Personally, just stick with the double linked lists, be it via pointers or array index. Any of these other suggestions either complicate the algorithms or slow down traversal or both to save that word of memory and slow down the initial implementation. These details are easily changed after the initial TUPLE implementation by replacing the meat of the next/prev iterator. There will be lots of time for everyone to try their favorite double linked list alternative before we write TUPLES out to a file in some production compiler and commit ourselves to specific memory footprint. As long as the entire thing has a clean interface, changing details like this is trivial. Whats important is whether the proposal meets what we expect our future needs to be, such as LTO and such. Have we missed anything critical... Andrew
Re: RFC: GIMPLE tuples. Design and implementation proposal
2007/4/10, Andrew MacLeod <[EMAIL PROTECTED]> wrote: Personally, just stick with the double linked lists, be it via pointers or array index. Any of these other suggestions either complicate the algorithms or slow down traversal or both to save that word of memory and slow down the initial implementation. These details are easily changed after the initial TUPLE implementation by replacing the meat of the next/prev iterator. There will be lots of time for everyone to try their favorite double linked list alternative before we write TUPLES out to a file in some production compiler and commit ourselves to specific memory footprint. As long as the entire thing has a clean interface, changing details like this is trivial. Whats important is whether the proposal meets what we expect our future needs to be, such as LTO and such. Have we missed anything critical... Andrew Of course, i just stick with the double linked lists too. The reason is to attain minus KLOCs of implementation and more performance in the accesses because i've not problem with the 2 GiB of RAM of my old PC. Sincerely, J.C. Pizarro :)
Re: Inclusion in an official release of a new throw-like qualifier
With respect to this:

Jason Merrill wrote: Yes. But that's not a reason to add a slightly different non-standard feature that would require people already using standard exception specifications to rewrite everything. That's just a non-starter.

Maybe I missed some point: why should everything be rewritten? As far as I understand, nothing needs to be rewritten. The programs using the standard throw qualifier will not be aware of the new feature, and they will compile and run as if the feature were not implemented. The standard throw qualifier must be kept, and a new qualifier (static throw, or _throw) would be added with the new meaning. If you like it, you can use it by specifying -fallow_static_throws. If you don't like it, don't use the -fallow... option and your code will be standard and will have the standard semantics... Moreover, if you use -fallow_static_throws, code not using "static throw" or "_throw" will compile and run as if the feature were not implemented...

The only bad thing here is that you have two qualifiers with similar meanings... But I think it must be that way in order to _avoid_ rewriting code.

> However, a flag to perform static checking of standard exception specifications and warn about violations is very likely to be included.

The point here is that, in order to do this, you need interprocedural analysis. If you have qualifiers like the one I describe, you can perform the check by merely using the prototypes...

I agree with Jason that an extension which implies rewriting everything is not very likely to be included, but that is not the case here.

So, what do you think now?

Cheers, Sergio
Re: Inclusion in an official release of a new throw-like qualifier
On Apr 10, 2007, at 2:06 PM, Sergio Giro wrote: Maybe I missed some point: why everything should be rewritten? Let me try again. The standard way to add a new qualifier in g++, is to add it in an attribute, please do that. The possible responses are, no, I want to be different, or ok. If you choose the former, you have to back your position. For throw specs, that attribute is the standard one that goes on the function/method. The name or the spec can be statically_check_eh_spec, or something less verbose. The only bad thing here is that you have two qualifiers having similar meanings... Ding. But I think it must be that way No, this is incorrect. in order to _avoid_ rewriting code. The point here is that, in order to do this, you need interprocedural analysis. You've not yet grasped they are isomorphic forms. If the above is true, then then below is wrong. If the below is not wrong, then the above must be wrong. Your pick. If you have qualifiers as the one I describe, you can perform the check by merely using the prototypes... So, what do you think now? Unchanged.
Re: RFC: GIMPLE tuples. Design and implementation proposal
On Tue, 2007-04-10 at 15:02 -0400, Diego Novillo wrote:

> Ian Lance Taylor wrote on 04/10/07 14:13:
>> I would like us to seriously think about this approach. Most of the details would be hidden by accessor macros when it comes to actual coding. The question is whether we can tolerate some slow down for normal processing in return for a benefit to LTO.
>>
>> If anybody can see how to get the best of both worlds, that would of course be even better.
>
> I've thought about this a little bit and it may not be all that onerous. So, if you take the components of a tuple:
>
>   next    Could be a UID to the next tuple
>   prev    Likewise
>   bb      Use bb->index here
>   locus   Not needed.  INSN_LOCATOR.
>   block   Likewise.
>
> The operands may get tricky, but perhaps not so much. We have:
>
>   a- _DECLs.  These are easily replaced with their UID and a symbol table.
>   b- SSA_NAMEs.  Just the SSA name version is enough.
>   c- *_CONST.  They're just a bit pattern, no swizzling required.  But we may need to define byte ordering.
>   d- *_REF.  These may be tricky, but we could emit them using a REF table and just put the index here.
>
> We then reserve the first few bits to distinguish the class of operand and the remaining bits as the index into the respective table.

I also think this is worth exploring. Probably as a side project once the initial implementation is done. (It would also serve as a proof of interface abstraction :-) And we'd be able to compare what the actual performance hit is and trade it off against the benefits.

We'd also be able to "linearize" things when we write it out... i.e., flatten out the linked lists of instructions so they are all contiguous indexes, tables, etc. This would make it compact, and when it's read in, we might even see some cache-friendly accesses for a change.

Andrew
PHI_NODE memory reduction, WAS: Re: RFC: GIMPLE tuples. Design and implementation proposal
On 4/10/07, Andrew Pinski <[EMAIL PROTECTED]> wrote:
On 4/10/07, Andrew Pinski <[EMAIL PROTECTED]> wrote:
> Here is the quick patch (thanks to the work done for gimple tuples)
> which does this, removes the unneeded type from phi nodes. I have not
> tested it except for a quick test on some small testcases, so there
> might be more places which use TREE_CHAIN instead of PHI_CHAIN.
This patch has one problem, the GC issue with chain_next and lang_tree_node; this problem is not hard to fix. And here is that patch, still not fully tested except this time compiling some simple C++ code and without ICEing in GC collection :).

Thanks, Andrew Pinski

Index: tree.h
===
--- tree.h	(revision 123697)
+++ tree.h	(working copy)
@@ -951,7 +951,7 @@
   (TREE_CODE_CLASS (TREE_CODE ((NODE))) == tcc_gimple_stmt)
 
 /* Nonzero if NODE is a GIMPLE tuple.  */
-#define GIMPLE_TUPLE_P(NODE) (GIMPLE_STMT_P (NODE))
+#define GIMPLE_TUPLE_P(NODE) (GIMPLE_STMT_P (NODE) || TREE_CODE (NODE) == PHI_NODE)
 
 /* A GIMPLE tuple that has a ``locus'' field.  */
 #define GIMPLE_TUPLE_HAS_LOCUS_P(NODE) GIMPLE_STMT_P ((NODE))
@@ -975,6 +975,11 @@
    used in hash tables which are saved to a PCH.  */
 #define TREE_HASH(NODE) ((size_t) (NODE) & 077)
 
+/* The TREE_CHAIN but it is able to handle tuples.  */
+#define GENERIC_NEXT(NODE) \
+  (TREE_CODE (NODE) == PHI_NODE ? PHI_CHAIN (NODE) : \
+   GIMPLE_STMT_P (NODE) ? NULL_TREE : TREE_CHAIN (NODE))
+
 /* Given an expression as a tree, strip any NON_LVALUE_EXPRs and NOP_EXPRs
    that don't change the machine mode.  */
 
@@ -1866,7 +1871,7 @@
 /* PHI_NODEs for each basic block are chained together in a
    single linked list.  The head of the list is linked from the block
    annotation, and the link to the next PHI is in PHI_CHAIN.  */
-#define PHI_CHAIN(NODE)		TREE_CHAIN (PHI_NODE_CHECK (NODE))
+#define PHI_CHAIN(NODE)		PHI_NODE_CHECK (NODE)->phi.chain
 
 #define PHI_NUM_ARGS(NODE)	PHI_NODE_CHECK (NODE)->phi.num_args
 #define PHI_ARG_CAPACITY(NODE)	PHI_NODE_CHECK (NODE)->phi.capacity
@@ -1885,7 +1890,8 @@
 
 struct tree_phi_node GTY(())
 {
-  struct tree_common common;
+  struct tree_base common;
+  tree chain;
   tree result;
   int num_args;
   int capacity;
Index: tree-ssa-structalias.c
===
--- tree-ssa-structalias.c	(revision 123697)
+++ tree-ssa-structalias.c	(working copy)
@@ -4730,7 +4730,7 @@
   block_stmt_iterator bsi;
   tree phi;
 
-  for (phi = phi_nodes (bb); phi; phi = TREE_CHAIN (phi))
+  for (phi = phi_nodes (bb); phi; phi = PHI_CHAIN (phi))
     {
       if (is_gimple_reg (PHI_RESULT (phi)))
 	{
@@ -4882,7 +4882,7 @@
   block_stmt_iterator bsi;
   tree phi;
 
-  for (phi = phi_nodes (bb); phi; phi = TREE_CHAIN (phi))
+  for (phi = phi_nodes (bb); phi; phi = PHI_CHAIN (phi))
     {
       if (is_gimple_reg (PHI_RESULT (phi)))
 	{
Index: tree-phinodes.c
===
--- tree-phinodes.c	(revision 123697)
+++ tree-phinodes.c	(working copy)
@@ -218,7 +218,6 @@
   TREE_SET_CODE (phi, PHI_NODE);
   PHI_NUM_ARGS (phi) = len;
   PHI_ARG_CAPACITY (phi) = capacity;
-  TREE_TYPE (phi) = TREE_TYPE (var);
   if (TREE_CODE (var) == SSA_NAME)
     SET_PHI_RESULT (phi, var);
   else
Index: ada/ada-tree.h
===
--- ada/ada-tree.h	(revision 123697)
+++ ada/ada-tree.h	(working copy)
@@ -36,7 +36,7 @@
 /* Ada uses the lang_decl and lang_type fields to hold a tree.  */
 union lang_tree_node GTY((desc ("0"),
-       chain_next ("(GIMPLE_STMT_P (&%h.t) ? (union lang_tree_node *) 0 : (union lang_tree_node *)TREE_CHAIN (&%h.t))")))
+       chain_next ("GENERIC_NEXT (&%h.generic)")))
 {
   union tree_node GTY((tag ("0"))) t;
Index: java/java-tree.h
===
--- java/java-tree.h	(revision 123697)
+++ java/java-tree.h	(working copy)
@@ -667,7 +667,7 @@
 /* The resulting tree type.  */
 union lang_tree_node GTY((desc ("TREE_CODE (&%h.generic) == IDENTIFIER_NODE"),
-       chain_next ("(GIMPLE_STMT_P (&%h.generic) ? (union lang_tree_node *) 0 : (union lang_tree_node *)TREE_CHAIN (&%h.generic))")))
+       chain_next ("GENERIC_NEXT (&%h.generic)")))
 {
   union tree_node GTY ((tag ("0"),
Index: cp/cp-tree.h
===
--- cp/cp-tree.h	(revision 123697)
+++ cp/cp-tree.h	(working copy)
@@ -539,7 +539,7 @@
 /* The resulting tree type.  */
 union lang_tree_node GTY((desc ("cp_tree_node_structure (&%h)"),
-       chain_next ("(GIMPLE_STMT_P (&%h.generic) ? (union lan
Re: Inclusion in an official release of a new throw-like qualifier
Mike Stump wrote:
> Let me try again. The standard way to add a new qualifier in g++ is to
> add it in an attribute, please do that.
OK, I agree. Let's say that a method will be declared as
  int method() throw(std::exception) __attribute__((static_exc_check));
(this is intended to have the same meaning as
  int method() _throw(std::exception);
as explained before). With respect to this:
>> The point here is that, in order to do this, you need
>> interprocedural analysis.
> You've not yet grasped that they are isomorphic forms. If the above is
> true, then the below is wrong. If the below is not wrong, then the
> above must be wrong. Your pick.
>> If you have qualifiers as the one I describe, you can perform the
>> check by merely using the prototypes...
This check using prototypes is not more interprocedural than type-checking is... If your definition of interprocedural analysis includes type-checking, OK, I agree... In order to type-check a method m, you need the prototypes of the methods m calls. In addition, when you check the code of a method m, you compare the return type of the code with the return type in the prototype. It works the same way for the static check of exception specifications...
Re: RFC: GIMPLE tuples. Design and implementation proposal
Ian Lance Taylor wrote:
> Diego Novillo <[EMAIL PROTECTED]> writes:
>
>> Following up on the recent discussion about GIMPLE tuples
>> (http://gcc.gnu.org/ml/gcc/2007-03/msg01126.html), we have summarized
>> our main ideas and implementation proposal in the attached document.
>>
>> This should be enough to get the implementation going, but there will be
>> many details that still need to be addressed.
>>
>> Thoughts/comments on the proposal?
>
> For purposes of LTO, it is essential that we be able to efficiently
> load the IR during the whole program optimization phase.
Right.
> Part of that is almost certainly going to mean having some sort of
> index which will tell us whether to load the IR at all--if the
> functions represented in some .o file are rarely called, then we
> should use the .o file as-is, and not try to further optimize it.
> This is not part of the current LTO plan, but I think it is inevitable
> if we are to avoid an hours-long compilation process.
I agree that this is probably going to be necessary. Whether that happens in the LTO "front end" proper, or we rely on the driver (or even, at first, the user) to prune out uninteresting .o files isn't clear to me. There may also be situations where we want to read just one function out of the .o file (which, perhaps, was compiled with -ffunction-sections to encourage this) and inline it into other things, but not read or recompile everything else in the .o file.
> But there is another part: we need to have an IR which can be very
> quickly loaded into memory for further processing. When it comes to
> loading IR, there is nothing faster than mmap. That requires that the
> IR be stored in the .o file in a form which can be used directly when
> it is read in from memory. And ideally that means no pointer
> swizzling: the IR should be usable when loaded without modification.
> And because the link phase can see arbitrary .o files, we can not use
> the PCH hack of requiring a specific memory address. So that requires
> an IR which is position independent.
>
> The obvious way to make the proposed tuples position independent would
> be to use array offsets rather than pointers.
And maybe self-relative pointers if you need pointers into things that aren't symbol-tablish? A not-so-space-efficient approach would be to allocate out of an ordinary block of memory, using ordinary pointers, and then swizzle into self-relative pointers at serialization time. Then, when you deserialize, you could just leave them self-relative, or swizzle them back. Man, playing with all these ideas would sure be easier if you could make a class and overload "*" and "->"...
-- Mark Mitchell CodeSourcery [EMAIL PROTECTED] (650) 331-3385 x713
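P.S. A minimal sketch of the swizzling idea, with entirely hypothetical names, assuming the whole IR lives in one contiguous buffer that is written out and mapped back in:

  #include <stddef.h>

  /* Hypothetical node with one outgoing pointer.  In the serialized
     buffer the same field is reinterpreted as a self-relative offset.  */
  struct node
  {
    union
    {
      struct node *ptr;   /* in-memory form             */
      ptrdiff_t offset;   /* serialized, self-relative  */
    } next;
  };

  /* Before writing the buffer: turn the pointer into an offset relative
     to the field's own address.  A zero offset stands for NULL.  */
  static void
  swizzle_out (struct node *n)
  {
    struct node *target = n->next.ptr;
    n->next.offset = target ? (char *) target - (char *) &n->next : 0;
  }

  /* After mmap'ing the buffer back in: either convert the offset back
     into a pointer, or leave it self-relative and always read it
     through an accessor like this one.  */
  static struct node *
  node_next (struct node *n)
  {
    return n->next.offset
	   ? (struct node *) ((char *) &n->next + n->next.offset)
	   : NULL;
  }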
Re: RFC: GIMPLE tuples. Design and implementation proposal
On Tue, Apr 10, 2007 at 05:23:55PM -0400, Andrew MacLeod wrote:
> We'd also be able to "linearize" things when we write it out... i.e.,
> flatten out the linked lists of instructions so they are all contiguous
> indexes, tables, etc. This would make it compact, and when it's read in
> we might even see some cache-friendly accesses for a change.
FYI, I did this with PCH once... I never followed it through well enough to get consistent results from it, but I did get some remarkable jumps during testing.
-- Daniel Jacobowitz CodeSourcery
Re: RFC: GIMPLE tuples. Design and implementation proposal
On Tue, Apr 10, 2007 at 10:53:19AM -0700, Ian Lance Taylor wrote:
> As you know, we have a lot of basic optimizations in fold-const.c that
> operate on trees. We also have a lot of basic optimizations in
> simplify-rtx.c that operate on RTL. These sets of optimizations are
> independent implementations. What can we do to avoid implementing a
> third set of basic optimizations that operate on GIMPLE?
Perhaps I misunderstood what Diego was proposing, but I would have thought the subcode would continue to be the tree PLUS_EXPR, and not a GS_PLUS something. With that, build_foldN does essentially what we want, without having to regenerate tree nodes on the input side. With a bit of effort, we could probably eliminate the trees generated on the output side, with some callbacks. At least, that's what I'd had in mind. r~
Re: RFC: GIMPLE tuples. Design and implementation proposal
Richard Henderson wrote on 04/10/07 20:30:
> Perhaps I misunderstood what Diego was proposing, but I
> would have thought the subcode would continue to be the
> tree PLUS_EXPR, and not a GS_PLUS something.
Yes.
> With that, build_foldN does essentially what we want,
> without having to regenerate tree nodes on the input side.
Sure, but things will be different if/when the operands stop being 'tree'.
Re: RFC: GIMPLE tuples. Design and implementation proposal
Richard Henderson <[EMAIL PROTECTED]> writes:
> On Tue, Apr 10, 2007 at 10:53:19AM -0700, Ian Lance Taylor wrote:
> > As you know, we have a lot of basic optimizations in fold-const.c that
> > operate on trees. We also have a lot of basic optimizations in
> > simplify-rtx.c that operate on RTL. These sets of optimizations are
> > independent implementations. What can we do to avoid implementing a
> > third set of basic optimizations that operate on GIMPLE?
>
> Perhaps I misunderstood what Diego was proposing, but I
> would have thought the subcode would continue to be the
> tree PLUS_EXPR, and not a GS_PLUS something.
>
> With that, build_foldN does essentially what we want,
> without having to regenerate tree nodes on the input side.
I'm having a hard time seeing it. fold_build2 calls fold_binary; I agree that if we can handle fold_binary, we can handle fold_build2. But fold_binary takes trees as parameters. How are you thinking of calling it? Admittedly this is going to be more of an issue for a hypothetical tree^H^H^H^Htuple combiner than it is for run-of-the-mill tuples. Ian
Re: RFC: GIMPLE tuples. Design and implementation proposal
Ian Lance Taylor wrote on 04/10/07 20:49:
> I'm having a hard time seeing it. fold_build2 calls fold_binary; I
> agree that if we can handle fold_binary, we can handle fold_build2.
> But fold_binary takes trees as parameters. How are you thinking of
> calling it?
The GIMPLE version of z = x + y is stmt -> GS_ASSIGN. We have a wrapper that calls:

  op0 = GIMPLE_OPERAND (stmt, 0);  /* z */
  op1 = GIMPLE_OPERAND (stmt, 1);  /* x */
  op2 = GIMPLE_OPERAND (stmt, 2);  /* y */
  t = fold_build2 (GIMPLE_SUBCODE (stmt), TREE_TYPE (op0), op1, op2);

and then stuffs t's operands back into 'stmt'.
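Roughly, that write-back step might look like the following sketch; SET_GIMPLE_OPERAND and gimple_assign_set_single are hypothetical names, not anything settled in the proposal:

  if (TREE_CODE (t) == GIMPLE_SUBCODE (stmt))
    {
      /* Same operation, possibly with simplified operands.  */
      SET_GIMPLE_OPERAND (stmt, 1, TREE_OPERAND (t, 0));
      SET_GIMPLE_OPERAND (stmt, 2, TREE_OPERAND (t, 1));
    }
  else
    /* Folded into something simpler; rewrite the statement as a plain
       copy of T (again, a hypothetical helper).  */
    gimple_assign_set_single (stmt, t);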
Re: RFC: GIMPLE tuples. Design and implementation proposal
On Tue, Apr 10, 2007 at 11:13:44AM -0700, Ian Lance Taylor wrote:
> The obvious way to make the proposed tuples position independent would
> be to use array offsets rather than pointers.
I suggest instead, if we want something like this, that we make the references be pc-relative. So something like

  // This would replace all occurrences of "tree" and "gimple" within
  // the actual structures, to force the macros to be used.
  typedef struct { ptrdiff_t d; } gs_ref;

  static inline void *
  gs_ref_get (gs_ref *x)
  {
    return (char *) x + x->d;
  }
  #define GS_REF(TYPE, X) ((TYPE) gs_ref_get (&(X)))

  static inline void
  gs_ref_set (gs_ref *loc, const void *val)
  {
    loc->d = (const char *) val - (const char *) loc;
  }
  #define GS_REF_SET(LOC, VAL) gs_ref_set (&(LOC), VAL)

  // Ignoring other checking code for the moment.
  #define TREE_OPERAND(NODE, X) GS_REF (tree, (NODE)->exp.operands[X])
  #define TREE_OPERAND_SET(NODE, X, VAL) \
    GS_REF_SET ((NODE)->exp.operands[X], VAL)

When loading a .o file, this will require some fixups to the file's symbol table for external references, but otherwise everything will be self-contained. With some careful planning, one might arrange for all of these fixups to be in the same pages of the .o file, minimizing the amount of COW space that would be used. This has an overhead of one addition insn per reference, which ought to be much cheaper than a memory reference via an index into an array. r~
Re: RFC: GIMPLE tuples. Design and implementation proposal
On Tue, Apr 10, 2007 at 05:49:03PM -0700, Ian Lance Taylor wrote: > But fold_binary takes trees as parameters. How are you thinking of > calling it? The operands of gimple statements are still trees. Very simple trees, mind, but trees. r~
Re: RFC: GIMPLE tuples. Design and implementation proposal
On Tue, Apr 10, 2007 at 08:48:27PM -0400, Diego Novillo wrote:
> Sure, but things will be different if/when the operands stop being 'tree'.
We'll burn that bridge when we come to it. It's possible to parameterize fold-const even further. One passes in void* instead of tree, and has a set of callbacks that query the things of interest, such as the type or sub-operands. In the front ends, these callbacks are just wrappers around the existing macros. When operating on gimple SSA_NAMEs (for instance), these callbacks dig the type out from SSA_NAME->DECL->TYPE. When we're interested in local simplifications, we arrange for all sub-operands to return NULL. When we're interested in global simplifications (tree-combine?), we can allow "sub-operands" to mean the SSA_NAME's definition. r~
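P.S. A minimal sketch of what such a callback table might look like; the names are hypothetical and nothing here is existing fold-const interface, and the front-end wrappers assume the usual tree.h environment:

  /* A hypothetical callback table a parameterized fold could use
     instead of touching trees directly.  */
  typedef struct fold_ops
  {
    /* Return the type of an operand handle.  */
    void *(*get_type) (void *op);
    /* Return the I'th sub-operand of OP, or NULL when the caller only
       wants local simplifications.  */
    void *(*get_suboperand) (void *op, int i);
  } fold_ops;

  /* Front-end instantiation: handles really are trees, so the
     callbacks are just wrappers around the existing accessors.  */
  static void *
  fe_get_type (void *op)
  {
    return TREE_TYPE ((tree) op);
  }

  static void *
  fe_get_suboperand (void *op, int i)
  {
    return TREE_OPERAND ((tree) op, i);
  }

  static const fold_ops fe_fold_ops = { fe_get_type, fe_get_suboperand };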
Re: RFC: GIMPLE tuples. Design and implementation proposal
Richard Henderson wrote on 04/10/07 21:19: > On Tue, Apr 10, 2007 at 08:48:27PM -0400, Diego Novillo wrote: >> Sure, but things will be different if/when the operands stop being 'tree'. > > We'll burn that bridge when we come to it. Works for me.
Re: RFC: GIMPLE tuples. Design and implementation proposal
Diego Novillo <[EMAIL PROTECTED]> writes: | Richard Henderson wrote on 04/10/07 21:19: | > On Tue, Apr 10, 2007 at 08:48:27PM -0400, Diego Novillo wrote: | >> Sure, but things will be different if/when the operands stop being 'tree'. | > | > We'll burn that bridge when we come to it. | | Works for me. I'll supply the fuel :-) -- Gaby
Re: RFC: GIMPLE tuples. Design and implementation proposal
Richard Henderson wrote: > On Tue, Apr 10, 2007 at 11:13:44AM -0700, Ian Lance Taylor wrote: >> The obvious way to make the proposed tuples position independent would >> be to use array offsets rather than pointers. > > I suggest instead, if we want something like this, that we make > the references be pc-relative. Self-relative, not PC-relative, right? (If so, that happens to be what I was suggesting earlier.) -- Mark Mitchell CodeSourcery [EMAIL PROTECTED] (650) 331-3385 x713
Re: Inclusion in an official release of a new throw-like qualifier
I prefer the method Jason mentioned of including this functionality as a form of more strict checking under -Wexception-specs (or maybe defining a new warning), as opposed to having an attribute that defines new semantics. In the end the two are practically identical. The semantics of the existing "OLD" throw() you are describing as the exceptions that can be caught, whereas the "NEW" semantics would be that the given exceptions are the only ones that can propagate from calling the given function. These are for all intents and purposes the same. In the "OLD" style the function still can only throw the given exceptions; it just so happens that the program terminates if it tries to do otherwise. The stricter checking just ensures that no other exceptions may arise, and so the program won't terminate as a result. The only difference is that at compile time you will either enforce that this throw() specifier is adhered to by doing static analysis, or you won't. It avoids having to define special attributes in the code; the only difference is the set of command-line flags you pass to the compiler. It does however mean that you can't provide function-level "enable/disable of static checking", i.e. it will check all functions whose implementation appears in the given translation unit. Brendon.

Mike Stump wrote:
> On Apr 10, 2007, at 2:06 PM, Sergio Giro wrote:
>> Maybe I missed some point: why should everything be rewritten?
>
> Let me try again. The standard way to add a new qualifier in g++ is to
> add it in an attribute, please do that. The possible responses are: no,
> I want to be different, or ok. If you choose the former, you have to
> back your position. For throw specs, that attribute is the standard one
> that goes on the function/method. The name of the spec can be
> statically_check_eh_spec, or something less verbose.
>
>> The only bad thing here is that you have two qualifiers with similar
>> meanings...
>
> Ding.
>
>> But I think it must be that way
>
> No, this is incorrect.
>
>> in order to _avoid_ rewriting code.
>
>> The point here is that, in order to do this, you need interprocedural
>> analysis.
>
> You've not yet grasped that they are isomorphic forms. If the above is
> true, then the below is wrong. If the below is not wrong, then the above
> must be wrong. Your pick.
>
>> If you have qualifiers as the one I describe, you can perform the
>> check by merely using the prototypes...
>
>> So, what do you think now?
>
> Unchanged.