Re: Summer of Code 2006
On Sun, 2006-04-16 21:30:08 -0700, Ian Lance Taylor wrote:
> http://gcc.gnu.org/projects/
> If anybody wants to pull together a single URL of projects suitable
> for students, probably on the Wiki page, that would be helpful. Or I
> might try to tackle that in the next few days myself.

Maybe not only new features, but also code cleanups, should be considered:

* Move _output_function_prologue() to RTL.
* CONST_INT_P() instead of GET_CODE() == CONST_INT, etc.
* Trailing whitespace patrol.
* ...

MfG, JBG

--
Jan-Benedict Glaw   [EMAIL PROTECTED]   +49-172-7608481
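The CONST_INT_P() item refers to replacing the open-coded GET_CODE comparison with a predicate macro. A minimal sketch of what such a macro could look like and how the cleanup would read (the definition shown is an assumption based on the idiom it would replace, not existing code):

/* Proposed predicate macro; this definition is an assumption, not yet in rtl.h.  */
#define CONST_INT_P(X) (GET_CODE (X) == CONST_INT)

/* Before the cleanup: */
  if (GET_CODE (op) == CONST_INT && INTVAL (op) == 0)
    ...
/* After: */
  if (CONST_INT_P (op) && INTVAL (op) == 0)
    ...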
Suboptimal code generated for ?: as a condition
There is a missed optimization opportunity.

int f(void);
void test(int x) {
   if (x & 1 ? x == 0 : x > 0) f();
}

This is gcc-4.1.0-0.20051206 with some patches by PLD Linux Distribution. gcc -S -O2 -fomit-frame-pointer generates the following code:

test:
        movl    4(%esp), %eax
        testb   $1, %al
        je      .L2
        testl   %eax, %eax
        sete    %al
        testb   %al, %al
        jne     .L9
.L7:
        rep ; ret
        .p2align 4,,7
.L2:
        testl   %eax, %eax
        setg    %al
        testb   %al, %al
        je      .L7
.L9:
        jmp     f

It makes little sense to materialize the boolean result of the condition in %al and then test %al for 0. I would expect direct conditional jumps to the branches of the outer if.

--
__("<  Marcin Kowalczyk
\__/   [EMAIL PROTECTED]
 ^^    http://qrnik.knm.org.pl/~qrczak/
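For comparison, a hand-written version of the testcase that makes the expected control flow explicit (this rewrite is an illustration derived from the report, not part of it):

int f(void);

void test_expected(int x)
{
  if (x & 1)
    {
      if (x == 0)   /* can never hold when x is odd, so this arm is dead */
        f();
    }
  else if (x > 0)
    f();
}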
Re: Summer of Code 2006
How about pulling together all the existing docs on how to create a new front end / back end: what's in the GCC internals document, Stallman's "Using and Porting GCC", and so on. Maybe update it all in a single place for version 4.x, and cover the undocumented / obsolete macros? It's not strictly coding, I know, but in many ways it might actually be a more important (and higher-level) project than one strictly involving coding, as it would clear up loose ends and facilitate the expansion of gcc into other languages / targets. The students who took the project on would gain a detailed knowledge of the structure of gcc itself, and would gain kudos for their efforts. It would be a serious help to people trying to develop new front ends or back ends for gcc (like me!).
Yara status on PPC (powerpc-darwin)
I decided to look into the Yara branch to see if it could even be bootstrapped on PPC (with Yara turned on by default). I ran into an ICE while compiling libgcc2.c for __muldi3. The ICE was in emit_secondary_memory_move. The preprocessed source is:

typedef int SItype __attribute__ ((mode (SI)));
typedef int DItype __attribute__ ((mode (DI)));
struct DWstruct {SItype high, low;};
typedef union { struct DWstruct s; DItype ll; } DWunion;
DItype __muldi3 (DItype u)
{
  DWunion w;
  w.ll = 0;
  w.s.high = u;
  return w.ll;
}
---
And then I decided just to look into code generation:

int f(void)
{
  return 0;
}
---
With the above code, I noticed that GCC saved and restored the link register, which is not needed because this is a leaf function. The reason it was being saved/restored is that current_function_is_leaf was not being set at all with Yara on. Before, it was being set in local-alloc.c.

The next code generation issue is related to the ICE above, as both are caused by always (or so it seems) spilling long long variables to the stack. It also looks like it might be producing wrong code, as one half is not zeroed out. Testcase:

typedef int SItype __attribute__ ((mode (SI)));
typedef int DItype __attribute__ ((mode (DI)));
DItype __muldi3 (DItype u)
{
  DItype ll;
  ll = 0;
  ll = (ll &~0x) | (u&0x);
  return ll;
}
---
Asm produced WITHOUT Yara turned on:

        .machine ppc
        .text
        .align 2
        .globl ___muldi3
___muldi3:
        li r3,0
        blr
        .subsections_via_symbols
---
Asm produced WITH Yara turned on:

        .machine ppc
        .text
        .align 2
        .globl ___muldi3
___muldi3:
        mflr r0
        stw r0,8(r1)
        stwu r1,-48(r1)
        lwz r0,24(r1)
        stw r4,20(r1)
        stw r0,12(r1)
        lfd f0,8(r1)
        stfd f0,16(r1)
        lwz r3,16(r1)
        lwz r4,20(r1)
        addi r1,r1,48
        lwz r0,8(r1)
        mtlr r0
        blr
        .subsections_via_symbols
---
Hopefully this helps the progress of Yara some more.

Thanks,
Andrew Pinski
Re: Yara status on PPC (powerpc-darwin)
[EMAIL PROTECTED] wrote:
> I decided to look into the Yara branch to see if it could even be
> bootstrapped on PPC (with Yara turned on by default).

Thanks for the information. It is even a surprise to me that some tests work correctly for ppc. Last time I had time to check ppc status (a month ago), Yara was completely broken on ppc. Therefore I wrote that Yara would not work on ppc for now.

My major focus platform is x86 right now, and the first-priority task is to speed Yara up, because it is really slow (right now the compiler is 3-4% slower with Yara on SPECint2000). If I approach the speed of the current gcc compiler (my goal is 1-1.5% compile-time degradation when Yara is used), I can say that the project will have a chance to be successful. Although I can say Yara is now much faster than the new register allocator (-fnew-ra) in previous versions of gcc.

As for x86, it is a good platform to work on most register allocator tasks. One big problem is still waiting for a solution: dealing with constraints on displacement size (x86 has no such problem). Another, smaller problem is good allocation when many hard registers are used in RTL (e.g. for x86_64).

My feeling is that Yara is a long project. Just removing reload is a big task by itself. So please don't expect it to work soon for platforms other than x86. Many things also need to be done to get decent code on other platforms (rematerialization, register pressure relief, better register preferencing, different tuning). Another big problem is generation of full debugging information (a pseudo in Yara can live in different locations - memory, different registers - which is quite different from the current allocator). This problem is also waiting for a solution.

Vlad
Re: Suboptimal code generated for ?: as a condition
On Mon, 2006-04-17 at 14:48 +0200, Marcin 'Qrczak' Kowalczyk wrote:
> There is a missed optimization opportunity.
>
> int f(void);
> void test(int x) {
>    if (x & 1 ? x == 0 : x > 0) f();
> }
>
> This is gcc-4.1.0-0.20051206 with some patches by PLD Linux
> Distribution. gcc -S -O2 -fomit-frame-pointer generates the
> following code:
>
> test:
>         movl    4(%esp), %eax
>         testb   $1, %al
>         je      .L2
>         testl   %eax, %eax
>         sete    %al
>         testb   %al, %al
>         jne     .L9
> .L7:
>         rep ; ret
>         .p2align 4,,7
> .L2:
>         testl   %eax, %eax
>         setg    %al
>         testb   %al, %al
>         je      .L7
> .L9:
>         jmp     f
>
> It makes little sense to materialize the boolean result of the
> condition in %al and then test %al for 0. I would expect direct
> conditional jumps to the branches of the outer if.

See bug 15911 in the bugzilla database. The proposed solution for that bug will result in something like this for your testcase:

test:
        movl    4(%esp), %eax
        testb   $1, %al
        jne     .L5
        testl   %eax, %eax
        jg      .L7
.L5:
        rep ; ret
        .p2align 4,,7
.L7:
        jmp     f

The patch referenced in the bugzilla database is not yet complete -- it's still a work in progress.

jeff
Re: Toolchain relocation
Dave Murphy wrote:
> install: e:/devkitPro/devkitARM/lib/gcc/arm-elf/4.1.0/

Don't use a --prefix with a drive letter. Just use --prefix=/devkitARM, and then use "make install DESTDIR=e:/devkitPro" to install it where you actually want it.

Ross Ridge
"Experimental" features in releases
Dan Berlin and I exchanged some email about PR 26435, which concerns a bug in -ftree-loop-linear, and we now think it would make sense to have a broader discussion.

The PR in question is about an ice-on-valid regression in 4.1, when using -O1 -ftree-loop-linear. Dan notes that this optimization option is "experimental", but I didn't see that reflected in the documentation, which says:

> @item -ftree-loop-linear
> Perform linear loop transformations on tree.  This flag can improve cache
> performance and allow further loop optimizations to take place.

In any case, the broader question is: to what extent should we have experimental options in releases, and how should we warn users of their experimental nature? On the one hand, it is of course useful to get additional testing and feedback for new features. On the other, users of GCC will twist the knobs we give them, and if there's no obvious way to know that they're doing something dangerous, they'll have a negative reaction when things go wrong.

In this particular case, Dan wrote:

> There is really no easy way to fix the code that exists there now, it
> was meant to be a temporary hack.  It's not quite algorithmically sound.

If it's really a temporary hack, then it's not clear that we're getting much useful testing by exposing it to users. And, if it's not algorithmically sound, then it seems users should be warned away from it. The obvious counter-argument is that there are certainly known bugs in -O2, and yet we don't warn people not to use -O2. So, it doesn't make sense to have a bright-line rule.

My suggestion is that features that are clearly experimental (like this one) should be (a) documented as such, and (b) should generate a warning, like:

  warning: -ftree-loop-linear is an experimental feature and is not
  recommended for production use

At least we're ensuring that even someone copying someone else's Makefile is aware that they're in dangerous territory.

Thoughts?

--
Mark Mitchell
CodeSourcery
[EMAIL PROTECTED]
(650) 331-3385 x713
Re: "Experimental" features in releases
-ftree-loop-linear enables a number of features and transformations. Which part, exactly, is experimental? You are quoting from the documentation for the option, but Dan may be referring to a particular transformation. I thought the failure and the algorithm-correctness issue were related to creating the perfect loop nests used by other transformations, not to all of tree-loop-linear.

I think this discussion is losing a lot in the summary.

David
Re: "Experimental" features in releases
Mark Mitchell wrote:
> Thoughts?

I don't know which of the loop-linear transformations you folks were debating (the loop-linear stuff defines a family of transformations), but I vote for having no experimental features in releases. If a feature is meant to be activated with -Ox or with -f/-m switches, it should only be present in a release if we are prepared to support it.

In this case, based on Dan's analysis, I vote for moving the non-working code into a branch and allowing it to mature, particularly if it has not been around for long.

Alternately, we could have a sanitization process during releases that enables a warning message inside the gate_*() functions for experimental passes. But not every feature can be enabled/disabled with a gating function, and it becomes a slippery slope.
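As an illustration of the gate_*() suggestion, a minimal sketch of how a pass gate could emit such a warning, using the tree-loop-linear pass as the example (the exact function name and the warning wiring are assumptions, not an existing patch):

/* Gate for the linear loop transformation pass; sketch only.
   Warn once that the pass is experimental before letting it run.  */
static bool
gate_tree_loop_linear (void)
{
  static bool warned;
  if (flag_tree_loop_linear && !warned)
    {
      warned = true;
      warning (0, "-ftree-loop-linear is an experimental feature "
               "and is not recommended for production use");
    }
  return flag_tree_loop_linear != 0;
}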
Re: "Experimental" features in releases
David Edelsohn wrote:
> -ftree-loop-linear enables a number of features and transformations.
> Which part, exactly, is experimental? You are quoting from the
> documentation for the option, but Dan may be referring to a particular
> transformation. I thought the failure and the algorithm-correctness
> issue were related to creating the perfect loop nests used by other
> transformations, not to all of tree-loop-linear.
>
> I think this discussion is losing a lot in the summary.

I did provide nearly all of the information that I had from Dan, but it's entirely possible that the original conversation failed to convey the full complexity of the situation to me. :-)

Dan wrote (and gave me permission to repost):

> It only happens with -ftree-loop-linear, a flag we have marked as
> experimental.  IE it is not something that users will hit without
> adding extra flags to their compiler.

I don't see that -ftree-loop-linear is marked as experimental, so something seems to be confused about this situation. That quote suggested to me that the entire option was experimental. However, Dan also wrote:

> I can fix almost all of these bugs by simply removing the perfect nest
> conversion code,

which confirms your suggestion that it's this one transformation that's experimental. However, we still have the same problems as in my original email:

* We have an option which is (at least partially) experimental and known-dangerous, but we're not conveying this to users, and:
* We need a general strategy for dealing with this situation in future, independent of -ftree-loop-linear.

Possibly, we could solve this specific issue by creating a separate option to enable the problematic transformations -- but I'd still like to have a general strategy for dealing with such options, including the one that would be created to enable the perfect nest conversion code.

--
Mark Mitchell
CodeSourcery
[EMAIL PROTECTED]
(650) 331-3385 x713
Re: "Experimental" features in releases
On Apr 17, 2006, at 11:52 AM, Mark Mitchell wrote:
> Dan Berlin and I exchanged some email about PR 26435, which concerns a
> bug in -ftree-loop-linear, and we now think it would make sense to have
> a broader discussion.
>
> The PR in question is about an ice-on-valid regression in 4.1, when
> using -O1 -ftree-loop-linear. Dan notes that this optimization option
> is "experimental", but I didn't see that reflected in the
> documentation, which says:
>
>> @item -ftree-loop-linear
>> Perform linear loop transformations on tree.  This flag can improve
>> cache performance and allow further loop optimizations to take place.

I wasn't aware that it was supposed to be experimental either, and it wasn't explained that way when it went in (Sep 2004). (Incomplete or buggy would not be surprising, but it sounds now like we're talking about a fatally flawed design, which is different.)

> In any case, the broader question is: to what extent should we have
> experimental options in releases, and how should we warn users of
> their experimental nature?

In general I would agree in principle with Diego that such features don't belong in releases, but this isn't the first time features have been found to be buggy after they've gone in. -frename-registers comes to mind; in that case, the bugginess was documented for several releases, and that warning has recently been removed as the bugs are believed to be fixed.

This optimization is worth about a 5x speedup on one of the SPECmarks (see discussion in the archives), so IMO we should consider carefully before removing it. It was in the 4.0 and 4.1 releases.

> My suggestion is that features that are clearly experimental (like this
> one) should be (a) documented as such, and (b) should generate a
> warning, like:
>
>   warning: -ftree-loop-linear is an experimental feature and is not
>   recommended for production use

Looks good to me.
RFC: ssa subvariables for complex types
Hi folks.

While investigating a regression from the V_MUST_DEF removal on mem-ssa, I've noticed that we were missing out on optimization of certain stores to complex types (on mainline).

For example, here:

_Complex int t = 0;
__real__ t = 2;
__imag__ t = 2;

we end up with:

#   t_2 = V_MUST_DEF ;
t = __complex__ (0, 0);
#   t_3 = V_MAY_DEF ;
REALPART_EXPR  = 2;
#   t_4 = V_MAY_DEF ;
IMAGPART_EXPR  = 2;

When we really should be decomposing the field stores into SFTs, like this:

#   SFT.0_3 = V_MUST_DEF ;
#   SFT.1_4 = V_MUST_DEF ;
t = __complex__ (0, 0);
#   SFT.1_5 = V_MUST_DEF ;
REALPART_EXPR  = 2;
#   SFT.0_6 = V_MUST_DEF ;
IMAGPART_EXPR  = 2;

The problem with not decomposing is that, since we can't account for the fields themselves, we end up using V_MAY_DEFs (instead of V_MUST_DEFs) for the entire complex type, and later on DCE cannot remove the original clearing of "t" because we have a V_MUST_DEF followed by a V_MAY_DEF.

I see the original rationale for inhibiting creation of subvariables on aggregates here:

http://gcc.gnu.org/ml/fortran/2006-01/msg00195.html

But I don't think, memory-wise, it should apply to complex types. This patch will cause the clearing of "t" to be redundant on mainline. On mem-ssa it doesn't matter, because we get the case wrong anyhow, but it's best to describe what's going on while I'm at it :).

How does this look?

	* tree-ssa-alias.c (create_overlap_variables_for): Do not inhibit
	creation of subvariables for complex types.

Index: tree-ssa-alias.c
===
--- tree-ssa-alias.c	(revision 112618)
+++ tree-ssa-alias.c	(working copy)
@@ -2878,7 +2878,8 @@ create_overlap_variables_for (tree var)
   up = up_lookup (uid);
   if (!up
-      || up->write_only)
+      || (up->write_only
+	  && TREE_CODE (TREE_TYPE (var)) != COMPLEX_TYPE))
     return;

   push_fields_onto_fieldstack (TREE_TYPE (var), &fieldstack, 0, NULL);
Re: "Experimental" features in releases
Dale Johannesen wrote:
> I wasn't aware that it was supposed to be experimental either, and it
> wasn't explained that way when it went in (Sep 2004).  (Incomplete or
> buggy would not be surprising, but it sounds now like we're talking
> about a fatally flawed design, which is different.)

My understanding from clarifications from others is that this loop nest bit is somehow not quite right, but that most of it is solid. So I think that I may have read more into Dan's mail than he intended. We shouldn't jump to drastic conclusions, but there certainly is something to be clarified here.

> This optimization is worth about a 5x speedup on one of the SPECmarks
> (see discussion in the archives), so IMO we should consider carefully
> before removing it.  It was in the 4.0 and 4.1 releases.

Indeed. My understanding is we might be able to remove just the problematic part (or segregate it into a separate option) -- but that problematic part is the 177.swim bit, so that's an issue.

You're completely right that all new features are more likely to have defects than older features. There's no clear line between experimental and non-experimental features, but sometimes it may be obvious into which category a feature falls.

>> My suggestion is that features that are clearly experimental (like this
>> one) should be (a) documented as such, and (b) should generate a
>> warning, like:
>>
>>   warning: -ftree-loop-linear is an experimental feature and is not
>>   recommended for production use
>
> Looks good to me.

Good. Independent of this issue, I'd certainly like to get consensus on the general question.

--
Mark Mitchell
CodeSourcery
[EMAIL PROTECTED]
(650) 331-3385 x713
Re: traverse the gimple tree
Thanks for the reply.

What I am trying to do is something like counting the number of times a particular function is called, i.e., whenever there is a CALL_EXPR in the tree, I want to look at the id to see if that is the function I want to count during runtime. If the id is the function name I want to count, I insert a counter instruction.

It seems that I can mimic the code of dump_generic_node() (which is called by dump_function()) to do my instrumentation. However, I have two questions:

(1) dump_generic_node() is definitely overkill for my problem because it takes care of every node type, even nodes I am not interested in, such as TYPEs and DECLs. Based on what I want to do (described above), is there an easier and cleaner way than the following pseudocode? I.e., is there a way that I don't need to go through every case of the GIMPLE grammar (because this is error-prone)?

pass_function_instrument ()
{
  instrument (&DECL_SAVED_TREE (current_function_decl));
}

instrument (tree *tp)
{
  switch (TREE_CODE (*tp))
    {
    case CALL_EXPR:
      insert_counter_instruction (tp);
      break;
    case COND_EXPR:
      /* COND_EXPR_COND (tp) can't contain stmts any more.  */
      instrument (COND_EXPR_THEN (tp));
      instrument (COND_EXPR_ELSE (tp));
      break;
    case BIND_EXPR:
      /* BIND_EXPR_VARS (tp) can't contain stmts any more.  */
      instrument (BIND_EXPR_BODY (tp));
      break;
    case STATEMENT_LIST:
      i = tsi_start (*tp);
      while (!tsi_end_p (i))
        {
          instrument (tsi_stmt (i));
          tsi_next (&i);
        }
      break;
    case ...:
    }
}

(2) It seems to me that TREE_LIST and TREE_VEC nodes are not reachable from the DECL_SAVED_TREE node according to the GIMPLE grammar, but I did see cases taking care of them in dump_generic_node(). Can someone explain TREE_LIST and TREE_VEC to me?

Thanks,
Sean

From: Zdenek Dvorak <[EMAIL PROTECTED]>
To: sean yang <[EMAIL PROTECTED]>
CC: gcc@gcc.gnu.org
Subject: Re: traverse the gimple tree
Date: Tue, 11 Apr 2006 13:56:48 +0200

Hello,

> I want to write a pass to walk the gimple tree and add some
> instrumentation code. I read chapter 9 of the "GCC Internals" document,
> and it seems not to describe the macros to do so.
>
> Can I get some information about this? Specifically, if someone can
> show me which .h file I should look at to find the macros, that would
> be great. Or, is there any other pass doing a similar thing (traversing
> the gimple tree) that I can read (--I did not find one)?

depending on what you need, you may use walk_tree and/or a combination of special handling for structured statements and tsi_ iterators at that point. See e.g. pass_lower_cf.

Zdenek

> // in gcc 4.0.2, tree-optimize.c
> void
> init_tree_optimization_passes (void)
> {
>   struct tree_opt_pass **p;
>
> #define NEXT_PASS(PASS)  (p = next_pass_1 (p, &PASS))
>
>   p = &all_passes;
>   NEXT_PASS (pass_gimple);
>   NEXT_PASS (MYPASS_code_instrument);  // this is what I want to do
>   // the reason I want to add the pass here is: both C/C++ (and any
>   // other front end, later) can use this
>   NEXT_PASS (pass_remove_useless_stmts);
>   NEXT_PASS (pass_mudflap_1);
>   NEXT_PASS (pass_lower_cf);
>   NEXT_PASS (pass_lower_eh);
>   NEXT_PASS (pass_build_cfg);
>   ...
> }
Re: RFC: ssa subvariables for complex types
> I see the original rationale for inhibiting creation of subvariables
> on aggregates here:
>
> http://gcc.gnu.org/ml/fortran/2006-01/msg00195.html
>
> But I don't think, memory-wise, it should apply to complex types.
> This patch will cause the clearing of "t" to be redundant on mainline.
> On mem-ssa it doesn't matter, because we get the case wrong anyhow, but
> it's best to describe what's going on while I'm at it :).
>
> How does this look?

Actually, the patch referenced above was referred to as a hack and really should be reverted if the memory usage in general is fixed. In fact, that patch makes GCC miss that a read is done if the address is taken, so I stand by the position that the above patch should instead be reverted.

-- Pinski
Re: RFC: ssa subvariables for complex types
On 4/17/06, Aldy Hernandez <[EMAIL PROTECTED]> wrote:
> Hi folks.
>
> While investigating a regression from the V_MUST_DEF removal on mem-ssa,
> I've noticed that we were missing out on optimization of certain
> stores to complex types (on mainline).
>
> The problem with not decomposing is that, since we can't account for the
> fields themselves, we end up using V_MAY_DEFs (instead of V_MUST_DEFs)
> for the entire complex type, and later on DCE cannot remove the original
> clearing of "t" because we have a V_MUST_DEF followed by a V_MAY_DEF.

Well, it's written to only in this testcase. Can you post a more complete one?

> I see the original rationale for inhibiting creation of subvariables
> on aggregates here:
>
> http://gcc.gnu.org/ml/fortran/2006-01/msg00195.html
>
> But I don't think, memory-wise, it should apply to complex types.
> This patch will cause the clearing of "t" to be redundant on mainline.
> On mem-ssa it doesn't matter, because we get the case wrong anyhow, but
> it's best to describe what's going on while I'm at it :).
>
> How does this look?

Certainly a hack on top of a hack. But as the Fortran front end is not going to be fixed, the original hack will stay there, so it looks reasonable.

Richard.
Re: traverse the gimple tree
Hello,

> What I am trying to do is something like counting the number of times a
> particular function is called, i.e., whenever there is a CALL_EXPR in
> the tree, I want to look at the id to see if that is the function I
> want to count during runtime. If the id is the function name I want to
> count, I insert a counter instruction.
>
> It seems that I can mimic the code of dump_generic_node() (which is
> called by dump_function()) to do my instrumentation. However, I have
> two questions:
> (1) dump_generic_node() is definitely overkill for my problem because
> it takes care of every node type, even nodes I am not interested in,
> such as TYPEs and DECLs. Based on what I want to do (described above),
> is there an easier and cleaner way than the following pseudocode?

pass_tree_profile seems to be a more appropriate place for this transformation. You already have some infrastructure in place there (see value-prof.c), so it might be easier to implement than starting from scratch. In that case, you do not even have to worry about the internal representation too much; just add what you need to tree_values_to_profile (and use get_call_expr_in to find function calls in statements).

Zdenek
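For the walk_tree approach Zdenek mentioned earlier in the thread, a minimal sketch of a callback that visits every CALL_EXPR (the callback name, the counter, and the matched function name "foo" are illustrative; walk_tree and get_callee_fndecl are the existing internal entry points):

/* walk_tree callback: count calls to a function named "foo".
   DATA points to the counter.  Sketch only.  */
static tree
count_calls_r (tree *tp, int *walk_subtrees ATTRIBUTE_UNUSED, void *data)
{
  if (TREE_CODE (*tp) == CALL_EXPR)
    {
      tree fndecl = get_callee_fndecl (*tp);
      if (fndecl && DECL_NAME (fndecl)
          && strcmp (IDENTIFIER_POINTER (DECL_NAME (fndecl)), "foo") == 0)
        (*(int *) data)++;
    }
  return NULL_TREE;  /* returning NULL means: keep walking */
}

/* Usage, e.g. from a pass's execute function: */
  int n_calls = 0;
  walk_tree (&DECL_SAVED_TREE (current_function_decl),
             count_calls_r, &n_calls, NULL);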
Re: "Experimental" features in releases
> Mark Mitchell writes:

Mark> My understanding is we might be able to remove just the
Mark> problematic part (or segregate that into a separate option) -- but
Mark> that problematic part is the 177.swim bit, so that's an issue.

Well, yes and no. The 177.swim bit is loop interchange, and that part should be solid. However, loop interchange requires a perfect loop nest. Other optimization improvements (included in GCC 4.1) allow enough statement motion to destroy the perfect loop nest. GCC has become too effective at other optimizations for loop interchange to trigger in 177.swim without additional loop transformations.

It is my understanding that the less robust piece of tree-loop-linear is a quick-and-dirty loop distribution transformation to create a perfect loop nest. The code was supposed to be conservative, but apparently it is not conservative enough.

David
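To make the distinction concrete (an illustration, not code from the thread): interchange applies only when every statement sits in the innermost loop body, as in the first nest below; the second nest is imperfect because of the statement between the two loop headers, and would first have to be distributed:

/* Perfect nest: the i and j loops can be interchanged, e.g. to fix the
   column-major access pattern.  */
void perfect (int n, double a[n][n], double b[n][n], double c[n][n])
{
  for (int i = 0; i < n; i++)
    for (int j = 0; j < n; j++)
      a[j][i] = b[j][i] + c[j][i];
}

/* Imperfect nest: the zeroing of s[i] sits between the two loop headers,
   so interchange is not directly legal.  */
void imperfect (int n, double a[n][n], double s[n])
{
  for (int i = 0; i < n; i++)
    {
      s[i] = 0;
      for (int j = 0; j < n; j++)
        s[i] += a[j][i];
    }
}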
Re: RFC: ssa subvariables for complex types
> Hi folks.
>
> While investigating a regression from the V_MUST_DEF removal on mem-ssa,
> I've noticed that we were missing out on optimization of certain
> stores to complex types (on mainline).
>
> For example, here:
>
> _Complex int t = 0;
> __real__ t = 2;
> __imag__ t = 2;
>
> we end up with:
>
> #   t_2 = V_MUST_DEF ;
> t = __complex__ (0, 0);
> #   t_3 = V_MAY_DEF ;
> REALPART_EXPR  = 2;
> #   t_4 = V_MAY_DEF ;
> IMAGPART_EXPR  = 2;
>
> When we really should be decomposing the field stores into SFTs, like
> this:
>
> #   SFT.0_3 = V_MUST_DEF ;
> #   SFT.1_4 = V_MUST_DEF ;
> t = __complex__ (0, 0);
> #   SFT.1_5 = V_MUST_DEF ;
> REALPART_EXPR  = 2;
> #   SFT.0_6 = V_MUST_DEF ;
> IMAGPART_EXPR  = 2;

Relooking at the original testcase, which actually has a read in it, this seems like the wrong approach. Can you figure out why write_only is not being set to false for the original testcase (and not the reduced one)?

-- Pinski
Re: RFC: ssa subvariables for complex types
> > Hi folks.
> >
> > While investigating a regression from the V_MUST_DEF removal on
> > mem-ssa, I've noticed that we were missing out on optimization of
> > certain stores to complex types (on mainline).
>
> Relooking at the original testcase, which actually has a read in it,
> this seems like the wrong approach. Can you figure out why write_only
> is not being set to false for the original testcase (and not the
> reduced one)?

I should also mention that on mainline we get the decomposing for the original testcase, which means this is a bug only on the MEM-SSA branch.

-- Pinski
Re: "Experimental" features in releases
On Mon, Apr 17, 2006 at 11:52:26AM -0700, Mark Mitchell wrote:
> My suggestion is that features that are clearly experimental (like this
> one) should be (a) documented as such, and (b) should generate a
> warning, like:
>
>   warning: -ftree-loop-linear is an experimental feature and is not
>   recommended for production use

Or, while new optimizations are considered experimental, they can use distinct option names to make that clear: -fexperimental-tree-loop-linear.

What does "experimental" imply? Comments in invoke.texi about the few existing options said to be experimental imply that they might not actually speed up code in all cases, but that's true of many optimizations that are not included in -O2. From this discussion it almost sounds as if an experimental optimization is more likely to result in an ICE or wrong code, and such bugs have a lower priority than other bugs.

Janis
Re: "Experimental" features in releases
I am a gcc user at a financial institution, and IMHO it would not be a good idea to have non-production-ready functionality in gcc. We are trying to use gcc for mission-critical functionality.

Hope this helps,
Ivan

Mark Mitchell wrote:
> Dan Berlin and I exchanged some email about PR 26435, which concerns a
> bug in -ftree-loop-linear, and we now think it would make sense to have
> a broader discussion.
>
> My suggestion is that features that are clearly experimental (like this
> one) should be (a) documented as such, and (b) should generate a
> warning, like:
>
>   warning: -ftree-loop-linear is an experimental feature and is not
>   recommended for production use
>
> At least we're ensuring that even someone copying someone else's
> Makefile is aware that they're in dangerous territory.
>
> Thoughts?
Re: "Experimental" features in releases
On 4/18/06, Ivan Novick <[EMAIL PROTECTED]> wrote:
> I am a gcc user at a financial institution, and IMHO it would not be a
> good idea to have non-production-ready functionality in gcc. We are
> trying to use gcc for mission-critical functionality.

It has always been the case that additional options not enabled at any regular -O level get less testing and more likely have bugs. So for mission-critical functionality I would strongly suggest staying with -O2 and not relying on not-thoroughly-tested combinations of optimization options.

So from my point of view, the situation with -ftree-loop-linear is fine - it's ICEing after all, not silently producing wrong code. For experimental options (where I would include all options not enabled by -O[123s]) known wrong-code bugs should be fixed.

That's my 2c,
Richard.
optimizing away parts of macros?
I am using

gcc (GCC) 4.0.2 20051125 (Red Hat 4.0.2-8)

under the Inline::C perl module, and having a very weird situation.

I have a multi-line macro that declares several variables and then does some work with them, for use in several functions that have similar invocations, interfacing to an external library. I was getting mysterious segfaults, so I went over everything with tweezers, eventually adding a printf line to the end of the macro so I could verify that the unpacking of my arguments was proceeding correctly. To my surprise, the addition of this line, which shouldn't have any side effects, has solved the problem. Adding -O0 to the CCFLAGS makes no difference.

GCC appears to be treating my long macro as some kind of block and throwing out variables that are not used within it, instead of simply pasting the code in at the macro invocation point.

Is this a known problem with 4.0.2? Is there a workaround? Should I upgrade?

--
David L Nicol
Can you remember when vending machines took pennies?
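For reference, the kind of macro being described (a hypothetical stand-in; the reporter's actual macro was not posted): a multi-line macro expands by pure textual pasting, and variables declared inside its braces are ordinary block-scoped locals, so nothing is "thrown out" by the compiler:

/* Hypothetical stand-in for the reporter's macro.  The do/while(0)
   wrapper makes it behave like a single statement; n_args exists only
   inside the block after textual expansion.  */
#define UNPACK_ARGS(sv)                                  \
  do {                                                   \
    int n_args = 0;     /* block-scoped temporary */     \
    /* ... unpack (sv) and work with the locals ... */   \
    (void) n_args;                                       \
  } while (0)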
Re: optimizing away parts of macros?
On Mon, Apr 17, 2006 at 04:42:18PM -0500, David Nicol wrote:
> GCC appears to be treating my long macro as some kind of block and
> throwing out variables that are not used within it, instead of simply
> pasting the code in at the macro invocation point.
>
> Is this a known problem with 4.0.2? Is there a workaround? Should I
> upgrade?

This is not a useful bug report, sorry. Please take a look at the bug reporting instructions on gcc.gnu.org.

--
Daniel Jacobowitz
CodeSourcery
Re: optimizing away parts of macros?
On Mon, Apr 17, 2006 at 04:42:18PM -0500, David Nicol wrote:
> I am using
> gcc (GCC) 4.0.2 20051125 (Red Hat 4.0.2-8)
>
> under the Inline::C perl module
>
> and having a very weird situation.

You vaguely describe your problem, speculate on its cause, and don't include a complete testcase. Not a good way to get help.

> I have a multi-line macro that declares several variables and then does
> some work with them, for use in several functions that have similar
> invocations, interfacing to an external library.

Use the -E option to see what's coming out of the preprocessor. If that doesn't make it obvious, use the gcc-help list.
Re: "Experimental" features in releases
On Apr 17, 2006, at 2:31 PM, Richard Guenther wrote:
> On 4/18/06, Ivan Novick <[EMAIL PROTECTED]> wrote:
>> I am a gcc user at a financial institution, and IMHO it would not be a
>> good idea to have non-production-ready functionality in gcc. We are
>> trying to use gcc for mission-critical functionality.
>
> It has always been the case that additional options not enabled at any
> regular -O level get less testing and more likely have bugs. So for
> mission-critical functionality I would strongly suggest staying with
> -O2 and not relying on not-thoroughly-tested combinations of
> optimization options.

I'd go further: you should not be trusting a compiler (gcc or any other) to be correct in "mission critical" situations. Finding a compiler without bugs is not a realistic expectation. Every compiler release I'm familiar with has had bugs.

> So from my point of view, the situation with -ftree-loop-linear is
> fine - it's ICEing after all, not silently producing wrong code. For
> experimental options (where I would include all options not enabled by
> -O[123s]) known wrong-code bugs should be fixed.

The case of this in 20256 did produce silent bad code when it was reported, but that seems to have changed.
Re: optimizing away parts of macros?
Thank you. Nobody is aware of such a problem.
Re: "Experimental" features in releases
On Apr 17, 2006, at 2:53 PM, Dale Johannesen wrote:
> I'd go further: you should not be trusting a compiler (gcc or any
> other) to be correct in "mission critical" situations.

Or, use the option that spits out a proof that the transformation the compiler performed was indeed valid, and then run that through a proof checker before trusting it. :-)
Re: "Experimental" features in releases
On Mon, Apr 17, 2006 at 02:53:37PM -0700, Dale Johannesen wrote:
> > So from my point of view, the situation with -ftree-loop-linear is
> > fine - it's ICEing after all, not silently producing wrong code. For
> > experimental options (where I would include all options not enabled
> > by -O[123s]) known wrong-code bugs should be fixed.
>
> The case of this in 20256 did produce silent bad code when it was
> reported, but that seems to have changed.

The 4.0 branch silently produces bad code; 4.1 and trunk now ICE.

Janis
Re: optimizing away parts of macros?
David Nicol wrote:
> Thank you. Nobody is aware of such a problem.

What problem? You have provided no evidence that there is indeed a problem here. Again, please visit our bug submission page at gcc.gnu.org.
Re: "Experimental" features in releases
> "Mark" == Mark Mitchell <[EMAIL PROTECTED]> writes: Mark> In any case, the broader question is: to what extent should we have Mark> experimental options in releases, and how should we warn users of their Mark> experimental nature? Why not put this into the option name? Something like '-Xoption' or '-fexperimental-option? Then people will know that it is experimental. Also, such options could be documented in a separate section to avoid people tripping over them by mistake. Tom
Re: RFC: ssa subvariables for complex types
> I should also mention that on mainline we get the decomposing for the
> original testcase, which means this is a bug only on the MEM-SSA branch.

No, we don't. Look at the actual testcase I posted. This is a bug on mainline.
Re: RFC: ssa subvariables for complex types
> Well, it's written to only in this testcase. Can you post a more
> complete one?

Here's the complete testcase:

int g(_Complex int*);
int f(void)
{
  _Complex int t = 0;
  __real__ t = 2;
  __imag__ t = 2;
  return g(&t);
}
Re: RFC: ssa subvariables for complex types
> > Well, it's written to only in this testcase. Can you post a more
> > complete one?
>
> Here's the complete testcase:
>
> int g(_Complex int*);
> int f(void)
> {
>   _Complex int t = 0;
>   __real__ t = 2;
>   __imag__ t = 2;
>   return g(&t);
> }

Yes, but that should not always matter, as this is going to be the same issue with a struct assignment like:

struct a { int t; int t1; };
int g(struct a*);
int f(void)
{
  struct a t = {};
  t.t = 1;
  t.t1 = 2;
  return g(&t);
}

You will notice that we get {} in there with a V_MUST_DEF (I hope). So really this testcase is not very useful in general, except for losing a slight missed optimization on the tree level. I can construct a testcase that exposes the compile-time issues with complex variables just as well as I could with structs, which is what Richard G.'s patch was trying to fix. So I think this is the wrong approach, to just disable this for one type only.

-- Pinski
Re: RFC: ssa subvariables for complex types
> losing a slight missed optimization on the tree level.

Yay, exactly what I'm trying to fix. Glad you agree.

Aldy
Re: Toolchain relocation
Ross Ridge wrote:
> Dave Murphy wrote:
>> install: e:/devkitPro/devkitARM/lib/gcc/arm-elf/4.1.0/
>
> Don't use a --prefix with a drive letter. Just use --prefix=/devkitARM,
> and then use "make install DESTDIR=e:/devkitPro" to install it where
> you actually want it.

Doesn't help; it's still checking that the old installation paths exist and producing the "insert disk" dialog.

Interestingly, configuring and installing in this way does appear to correct the output from -print-search-dirs, but gcc still looks for a specs file in the old location. The include paths in the old location are still checked when compiling.

Dave
Re: Toolchain relocation
Dave Murphy wrote:
> Ross Ridge wrote:
>> Don't use a --prefix with a drive letter. Just use --prefix=/devkitARM,
>> and then use "make install DESTDIR=e:/devkitPro" to install it where
>> you actually want it.
>
> Doesn't help; it's still checking that the old installation paths exist
> and producing the "insert disk" dialog.
>
> Interestingly, configuring and installing in this way does appear to
> correct the output from -print-search-dirs, but gcc still looks for a
> specs file in the old location. The include paths in the old location
> are still checked when compiling.

Actually, no, it doesn't. I messed up the DESTDIR and was testing with an old build where I had corrected the relocation of the paths for -print-search-dirs. Oops, sorry.

Dave
bootstrap failure on i686-pc-linux-gnu
Hi Geoff

I'm seeing a bootstrap failure on x86 Linux that looks to be due to your change (noted below):

/home/bje/build/gcc-clean/./gcc/xgcc -B/home/bje/build/gcc-clean/./gcc/
-B/usr/local/i686-pc-linux-gnu/bin/ -B/usr/local/i686-pc-linux-gnu/lib/
-isystem /usr/local/i686-pc-linux-gnu/include
-isystem /usr/local/i686-pc-linux-gnu/sys-include -O2 -O2 -g -O2 -DIN_GCC
-W -Wall -Wwrite-strings -Wstrict-prototypes -Wmissing-prototypes
-Wold-style-definition -isystem ./include -fPIC -g -DHAVE_GTHR_DEFAULT
-DIN_LIBGCC2 -D__GCC_FLOAT_NOT_NEEDED -msse -c \
    /home/bje/source/gcc-clean/gcc/config/i386/crtfastmath.c \
    -o crtfastmath.o
/home/bje/source/gcc-clean/gcc/config/i386/crtfastmath.c:110: internal
compiler error: in prune_unused_types_update_strings, at dwarf2out.c:14009
Please submit a full bug report, with preprocessed source if appropriate.
See <http://gcc.gnu.org/bugs.html> for instructions.

2006-04-17  Geoffrey Keating  <[EMAIL PROTECTED]>

	* dwarf2out.c (free_AT): Delete.
	(remove_AT): Update string ref counts.
	(remove_child_TAG): Don't call free_die.
	(free_die): Delete.
	(break_out_includes): Don't call free_die on DW_TAG_GNU_BINCL
	or DW_TAG_GNU_EINCL.
	(prune_unused_types_walk_attribs): Reset string refcounts.
	(prune_unused_types_update_strings): New.
	(prune_unused_types_prune): Don't make unnecessary stores.  Don't
	call free_die.  Do call prune_unused_types_update_strings.
	(prune_unused_types): Empty debug_str_hash.

Ben
Re: Toolchain relocation
Daniel Jacobowitz wrote:
> No, this patch is not correct. Take a wander through set_std_prefix
> and the call to update_path in add_prefix.

Expected as much :)

> You might want to play around with relocation on a non-MinGW-hosted
> system, for comparison. Does that work better? If so, it's likely
> something which does not handle drive letters. make_relative_prefix
> may need to be taught something about them.

make_relative_prefix seems to handle drive letters fine; I don't believe that's the problem. If a *blank* disk is inserted then the compiler behaves properly, so the relocation works. It's not dependent on finding anything in the configured location. What's happening here is that the toolchain is still attempting to access directories in the configured location (even on my Debian box), but you only notice a problem on Windows, where, if the configured location happens to be a removable media device with no media present, you get a dialog box asking you to insert a disk.

Looking at set_std_prefix and update_path, it appears that std_prefix is set to the relocated install directory quite early on (line 3278 of gcc.c). update_path only translates the path if it contains std_prefix, in this section of code in prefix.c:

  const int len = strlen (std_prefix);

  if (! strncmp (path, std_prefix, len)
      && (IS_DIR_SEPARATOR(path[len])
          || path[len] == '\0')
      && key != 0)
    {

If I change this code so that it checks against the configured install location instead of the new relocated location, then update_path does indeed translate the paths. It appears to be enough to add

  static const char *old_prefix = PREFIX;

and change that section of code to read as follows:

  const int len = strlen (old_prefix);

  if (! strncmp (path, old_prefix, len)
      && (IS_DIR_SEPARATOR(path[len])
          || path[len] == '\0')
      && key != 0)
    {

In this case the output from -print-search-dirs is fully translated, with the exception of the install path, which is due to always reporting the configured path at line 6306 of gcc.c:

  printf (_("install: %s%s\n"), standard_exec_prefix, machine_suffix);

standard_exec_prefix is a constant which refers to the originally installed folder; changing this to gcc_exec_prefix displays the relocated install location.

At line 6149 in gcc.c, standard_exec_prefix is used to access the specs file:

  /* We need to check standard_exec_prefix/just_machine_suffix/specs
     for any override of as, ld and libraries.  */
  specs_file = alloca (strlen (standard_exec_prefix)
                       + strlen (just_machine_suffix) + sizeof ("specs"));
  strcpy (specs_file, standard_exec_prefix);
  strcat (specs_file, just_machine_suffix);
  strcat (specs_file, "specs");
  if (access (specs_file, R_OK) == 0)
    read_specs (specs_file, TRUE);

Changing this to gcc_exec_prefix removes the last of the attempts to access anything in the old path as far as gcc is concerned. There are still some attempts to find include files in the configured install directory, which I believe are related to cc1, cc1plus and c-incpath.c, which I'm currently attempting to grok.

Dave
Re: "Experimental" features in releases
This has been very enlightening information. Is it documented anywhere which gcc features should not be trusted or are known to have faults?

Regards,
Ivan

Richard Guenther wrote:
> It has always been the case that additional options not enabled at any
> regular -O level get less testing and more likely have bugs. So for
> mission-critical functionality I would strongly suggest staying with
> -O2 and not relying on not-thoroughly-tested combinations of
> optimization options.
>
> So from my point of view, the situation with -ftree-loop-linear is
> fine - it's ICEing after all, not silently producing wrong code. For
> experimental options (where I would include all options not enabled by
> -O[123s]) known wrong-code bugs should be fixed.
>
> That's my 2c,
> Richard.
Re: "Experimental" features in releases
[apologies that this will come out threaded slightly wrong; none of my handy mail clients feel like letting me edit the References and In-Reply-To headers - apparently it's not cool anymore]

> Dale Johannesen wrote:
>
> > I wasn't aware that it was supposed to be experimental either, and it
> > wasn't explained that way when it went in (Sep 2004).  (Incomplete or
> > buggy would not be surprising, but it sounds now like we're talking
> > about a fatally flawed design, which is different.)

So let me clarify here by explaining how tree-loop-linear works, and what is broken. It's a bunch of parts, but can generally be broken into (in order):

1. Converting loops to perfect nests
   (convert_loop_to_perfect_nest/can_convert_loop_to_perfect_nest)
2. Verification that a loop is a perfect nest
3. gcc loop information -> iteration space info converter
   (gcc_loop_to_lambda_loop)
4. iteration space info manipulator/transformer (lambda_*)
5. iteration space info -> gcc loop code generator
   (lambda_loop_to_gcc_loop)

The vast majority of the code is in the last three pieces, and it may have some bugs and ICEs, but it is algorithmically sound and these are just regular bugs [1]. For example, we have one ICE due to an assert for something I never had a chance to implement. This is not a major deal to implement (a week); I just haven't had the motivation because it doesn't hit that often.

There are a bunch of bugs, wrong code generation, and ICEs caused by the first two pieces. This is because:

1. The transformation it makes is not always legal, but
2. It doesn't verify all the actually necessary safety conditions before
   transforming the loop to a perfect nest (single exits, verifying that
   the place it moves code to executes under the same conditions, etc.),
   and
3. It doesn't always update the newly created loop correctly anyway.

Thus, it is algorithmically unsound in its current form :). Again, this is *only* the piece that attempts to convert loops to perfect nests.

This is, in fact, not terribly surprising, since the algorithm used was the result of Sebastian and I sitting at my whiteboard for 30 minutes trying to figure out what we'd need to do to make swim happy :). At one point, the only things it would interchange were incredibly simple loops, which worked out okay, and some simple perfect nests (which is where swim falls). Everything was more or less happy. As our optimizers got better at cleaning up code, it seems to have started to transform more and more perfect nests that violate the safety conditions, but it doesn't know about it because the analysis necessary to see this is just not there.

It could be made sound by implementing these analyses, but doing so is just as much work as implementing something that isn't a complete hack, like loop distribution. At that point, you may as well just do that (and in fact, the analysis you need to implement is the basis of one of the common loop distribution algorithms).

As there have been rumblings and people who said they were working on loop distribution at the last summit, I have simply bided my time (i.e., making sure all the bugs noted are really the same bugs I know about, etc.), in the hope that marking it experimental (which I distinctly remember it being, but apparently it is not, to my surprise) until loop distribution was implemented would suffice, since nobody really wants to drop our SPEC scores by that much, etc. This evil plan has not worked out, because:

1. Nobody has implemented loop distribution.
2. Mark marked one of the bugs about perfect nest conversion as P1.

Since nobody has stepped forward to do #1 in the time since the summit, and it has reached the point where these bugs are piling up, I told Mark (I don't remember whether this part was mentioned) that my fix for the PR in question, which is a 4.2 P1, would be to remove the perfect nest conversion portion. This would leave -ftree-loop-linear in 4.2, but make it not useful for increasing SPEC scores. It would be useful for interchanging perfectly nested loops.

> My understanding from clarifications from others is that this loop nest
> bit is somehow not quite right, but that most of it is solid.  So, I
> think that I may have read more into Dan's mail than he intended.  We
> shouldn't jump to drastic conclusions, but there certainly is something
> to be clarified here.

I apologize that the mail was shorter than this one and may have been a bit confusing. This is simply a side effect of having written it while in the middle of simultaneously moving and starting a new job. It is only now that I have had time to sit down and write a detailed description.

> --
> Mark Mitchell
> CodeSourcery
> [EMAIL PROTECTED]
> (650) 331-3385 x713

[1] Note that, before anyone asks, doing iteration space code generation on loops that are not perfect nests is possible, but is a much more computationally expensive idea. Plus, in the general case, it requires generation of tons of guard code.
Re: bootstrap failure on i686-pc-linux-gnu
On 17/04/2006, at 9:55 PM, Ben Elliston wrote:

> Hi Geoff
>
> I'm seeing a bootstrap failure on x86 Linux that looks to be due to
> your change (noted below):
>
> /home/bje/build/gcc-clean/./gcc/xgcc [...] -msse -c \
>     /home/bje/source/gcc-clean/gcc/config/i386/crtfastmath.c \
>     -o crtfastmath.o
> /home/bje/source/gcc-clean/gcc/config/i386/crtfastmath.c:110: internal
> compiler error: in prune_unused_types_update_strings, at
> dwarf2out.c:14009
> Please submit a full bug report, with preprocessed source if
> appropriate.
> See <http://gcc.gnu.org/bugs.html> for instructions.
>
> 2006-04-17  Geoffrey Keating  <[EMAIL PROTECTED]>
>
> 	* dwarf2out.c (free_AT): Delete.
> 	(remove_AT): Update string ref counts.
> 	(remove_child_TAG): Don't call free_die.
> 	(free_die): Delete.
> 	(break_out_includes): Don't call free_die on DW_TAG_GNU_BINCL
> 	or DW_TAG_GNU_EINCL.
> 	(prune_unused_types_walk_attribs): Reset string refcounts.

Does this help?

@@ -13802,9 +13777,8 @@
       s->refcount++;
       /* Avoid unnecessarily putting strings that are used less than
          twice in the hash table.  */
-      if (s->refcount == 2
-          || (s->refcount == 1
-              && (DEBUG_STR_SECTION_FLAGS & SECTION_MERGE) != 0))
+      if (s->refcount
+          == ((DEBUG_STR_SECTION_FLAGS & SECTION_MERGE) ? 1 : 2))
        {
          void ** slot;
          slot = htab_find_slot_with_hash (debug_str_hash, s->str,