Re: GCC 4.6 performance regressions
On 10 February 2011 05:18, Quentin Neill wrote:
> On Wed, Feb 9, 2011 at 2:42 AM, Jonathan Wakely wrote:
>> On 9 February 2011 08:34, Sebastian Pop wrote:
>>>
>>> For example x264 defines CFLAGS="-O4 -ffast-math $CFLAGS", and so
>>> building this benchmark with CFLAGS="-O2" would have no effect.
>>
>> Why not?
>>
>> Ignoring the fact -O3 is the highest level for GCC, the manual says:
>> "If you use multiple -O options, with or without level numbers, the
>> last such option is the one that is effective."
>> http://gcc.gnu.org/onlinedocs/gcc/Optimize-Options.html
>>
>> And CFLAGS="-fno-fast-math -O2" would cancel the effects of -ffast-math too.
>>
> Because the makefile can override CFLAGS in the environment (or in a
> make variable) at any time, and GCC wouldn't even see it.

Yes, I know how make works, but the example Sebastian gave is a case
where CFLAGS from the command-line or environment will be appended to
the make variable, allowing the default compiler flags to be
overridden.

Obviously not all makefiles are written that way (although most aren't
written like your example either) but I was referring to a specific
case.
Re: ICE in get_constraint_for_component_ref
On Thu, Feb 10, 2011 at 6:23 AM, Mohamed Shafi wrote:
> Hi all,
>
> I am trying to port a private target in GCC 4.5.1. Following are the
> properties of the target
>
> #define BITS_PER_UNIT 32
> #define BITS_PER_WORD 32
> #define UNITS_PER_WORD 1
>
>
> #define CHAR_TYPE_SIZE 32
> #define SHORT_TYPE_SIZE 32
> #define INT_TYPE_SIZE 32
> #define LONG_TYPE_SIZE 32
> #define LONG_LONG_TYPE_SIZE 32
>
>
>
> I am getting an ICE
> internal compiler error: in get_constraint_for_component_ref, at
> tree-ssa-structalias.c:3031
>
> For the following testcase:
>
> struct fb_cmap {
>   int start;
>   int len;
>   int *green;
> };
>
> extern struct fb_cmap fb_cmap;
>
> void directcolor_update_cmap(void)
> {
>   fb_cmap.green[0] = 34;
> }
>
> The following is the output of debug_tree of the argument thats given
> for the function get_constraint_for_component_ref
>
> type type size
> unit size
> align 32 symtab 0 alias set -1 canonical type
> 0x2b6a4554a498 precision 32 min -2147483648> max
> pointer_to_this >
> unsigned PQI size unit size
>
> align 32 symtab 0 alias set -1 canonical type 0x2b6a45559930>
>
> arg 0 type size
> unit size
> align 32 symtab 0 alias set -1 canonical type
> 0x2b6a45602888 fields context
>
> chain >
> used public external common BLK file pr28675.c line 7 col 23
> size unit size 0x2b6a455fc488 3>
> align 32
> chain type
> public static QI file pr28675.c line 9 col 6 align 32
> initial result D.1200>
> (mem:QI (symbol_ref:PQI ("directcolor_update_cmap") [flags
> 0x3] ) [0 S1
> A32])
> struct-function 0x2b6a455453f0>>
> arg 1
> unsigned PQI file pr28675.c line 4 col 7 size 0x2b6a4553c460 32> unit size
> align 32 offset_align 32
> offset
> bit offset context
> >
> pr28675.c:11:10>
>
> I was wondering if this ICE is due to the fact that this is a 32bit
> char target ? Can somebody help me with pointers to debug this issue?

Try fixing the * 8 in bitpos_of_field to use BITS_PER_UNIT.

> Regards,
> Shafi
>
Re: loop hoisting fails
On 09/02/11 15:57, Ian Lance Taylor wrote:
> For your processor it sounds like you should make a constant more
> expensive than a register for an outer code of SET. You're right that
> the cost should really depend on the destination of the set but
> unfortunately I don't know if you will see that.
>
> I agree that costs are unfortunately not very well documented and the
> middle-end does not use them in the most effective manner. It's still
> normally the right mechanism to use to control what combine does.

Thanks for the help. Is there any bug open or anyone working to improve
handling of costs?
Re: ICE in get_constraint_for_component_ref
On 10 February 2011 15:57, Richard Guenther wrote:
> On Thu, Feb 10, 2011 at 6:23 AM, Mohamed Shafi wrote:
>> [...]
>> I was wondering if this ICE is due to the fact that this is a 32bit
>> char target ? Can somebody help me with pointers to debug this issue?
>
> Try fixing the * 8 in bitpos_of_field to use BITS_PER_UNIT.
>

That did the trick. Looking at the code, I assume that this is the
proper fix and hence should be committed to the trunk and the 4.5
branch. Will that be done?

Shafi
Re: ICE in get_constraint_for_component_ref
On Thu, Feb 10, 2011 at 12:42 PM, Mohamed Shafi wrote:
> On 10 February 2011 15:57, Richard Guenther wrote:
>> On Thu, Feb 10, 2011 at 6:23 AM, Mohamed Shafi wrote:
>>> [...]
>>> I was wondering if this ICE is due to the fact that this is a 32bit
>>> char target ? Can somebody help me with pointers to debug this issue?
>>
>> Try fixing the * 8 in bitpos_of_field to use BITS_PER_UNIT.
>>
>
> That did the trick. Looking at the code i assume that this is proper
> and hence should be committed in the trunk and 4.5 branch. Will that
> be done?

I'll include it in one of my next bootstraps/tests and commit it.

Richard.

> Shafi
>
Re: ICE in get_constraint_for_component_ref
On 10 February 2011 17:16, Richard Guenther wrote:
> On Thu, Feb 10, 2011 at 12:42 PM, Mohamed Shafi wrote:
>> On 10 February 2011 15:57, Richard Guenther wrote:
>>> On Thu, Feb 10, 2011 at 6:23 AM, Mohamed Shafi wrote:
>>>> [...]
>>>
>>> Try fixing the * 8 in bitpos_of_field to use BITS_PER_UNIT.
>>>
>>
>> That did the trick. Looking at the code i assume that this is proper
>> and hence should be committed in the trunk and 4.5 branch. Will that
>> be done?
>
> I'll include it in one of my next bootstraps/tests and commit it.
>

Thanks Richard :)

Shafi
Re: hints on debugging memory corruption...
Joern Rennecke writes:

> Quoting Tom Tromey:
>
>>> "Basile" == Basile Starynkevitch writes:
>>
>> Basile> So I need to understand who is writing the 0x101 in that field.
>
>
>> One thing to watch out for is that the memory can be recycled. I've
>> been very confused whenever I've forgotten this. I have a hack for the
>> GC (appended -- ancient enough that it probably won't apply) that makes
>> it easy to notice when an object you are interested in is collected.
>> IIRC I apply this before the first run, call ggc_watch_object for the
>> thing I am interested in, and then see in what GC cycle the real one is
>> allocated.
>
> If what you are looking for survives such a change, postponing garbage
> collection so it won't happen till the crash can make things simpler.

For the sake of archiving these tricks, how do you postpone garbage
collection in practice?

--
Dodji
Re: loop hoisting fails
On 09/02/11 15:57, Ian Lance Taylor wrote:
> For your processor it sounds like you should make a constant more
> expensive than a register for an outer code of SET. You're right that
> the cost should really depend on the destination of the set but
> unfortunately I don't know if you will see that.

Increasing the cost of constants slightly for an outer code of SET
actually works: it blocks cse from doing the transformation. But then
gcse comes along and does the transformation without consulting costs,
which makes the cost change pretty useless.

Also, saying that 0 with a set is slightly more expensive than a
register is a mistake in itself, since it depends on the destination of
the 0. But then again, I guess this is a GCC constraint which everyone
probably has to live with.
Vector permutation only deals with # of vector elements same as mask?
Hi,
I noticed that vector permutation gets more use in GCC
4.6, which is great. It is now used to handle negative step
by reversing vector elements.

However, after reading the related code, I understood
that it only works when the # of vector elements is
the same as that of the mask vector, as in the following code.

perm_mask_for_reverse (tree-vect-stmts.c)
...
  mask_type = get_vectype_for_scalar_type (mask_element_type);
  nunits = TYPE_VECTOR_SUBPARTS (vectype);
  if (!mask_type
      || TYPE_VECTOR_SUBPARTS (vectype) != TYPE_VECTOR_SUBPARTS (mask_type))
    return NULL;
...

For PowerPC altivec, the mask_type is V16QI. It means that the
compiler can only permute the V16QI type. But given the capability of
the altivec vperm instruction, it can permute any 128-bit type
(V8HI, V4SI, etc). We just need to convert in/out of V16QI from the
given types, plus a bit of extra work in producing the mask.

Do I understand correctly, or am I missing something here?

Thanks,
Bingfeng Mei
Re: loop hoisting fails
"Paulo J. Matos" writes:

> On 09/02/11 15:57, Ian Lance Taylor wrote:
>
>> For your processor it sounds like you should make a constant more
>> expensive than a register for an outer code of SET. You're right that
>> the cost should really depend on the destination of the set but
>> unfortunately I don't know if you will see that.
>
> Increasing the cost of constants slightly for outer code set actually
> works. It blocks cse from doing the transformation but then gcse comes
> and does the transformation without consulting costs. That makes the
> cost change pretty useless.

Bother. I've encountered that problem before and I think I used a
sledgehammer (a local patch). It's definitely a bug that gcse doesn't
consider costs.

Ian
Re: loop hoisting fails
On 10/02/11 16:04, Ian Lance Taylor wrote:
> Bother. I've encountered that problem before and I think I used a
> sledgehammer (a local patch). It's definitely a bug that gcse doesn't
> consider costs.

At least I am happy that you confirm this. :) Have you reported a bug
for this before?
Re: Proposal to move Valgrind annotations from "valgrind" to "misc" --enable-checking option
2011/2/8 Hans-Peter Nilsson:
> On Thu, 27 Jan 2011, Laurynas Biveinis wrote:
>> Thus I propose to separate the two. To avoid introducing another
>> --enable-checking option, let's move the annotations to the "misc"
>> checking and also enable "misc" too if "valgrind" is requested. Both
>> these options are disabled for releases, so no performance loss there.
>>
>> There are two drawbacks I can think of. First, if one wants Valgrind
>> annotations but does not have the required headers, then the compiler
>> will be built without them - silently (currently
>> --enable-checking=valgrind fails if headers are not found). Second,
>> the compiler binary will be built slightly different if "misc" is
>> enabled depending on the presence or absence of those headers. I
>> believe these are minor enough.

[...]

> If people want your "misc" changes but failing without headers,
> add "--enable-valgrind-annotations".

I think this is a good idea. At gc-improv I will go with my original
plan of moving the annotations to misc, and will add
--enable-valgrind-annotations to give a hard error if the headers are
not available.

Thanks,
--
Laurynas
Re: Vector permutation only deals with # of vector elements same as mask?
Hi,

"Bingfeng Mei" wrote on 10/02/2011 05:35:45 PM:

>
> Hi,
> I noticed that vector permutation gets more use in GCC
> 4.6, which is great. It is used to handle negative step
> by reversing vector elements now.
>
> However, after reading the related code, I understood
> that it only works when the # of vector elements is
> the same as that of mask vector in the following code.
>
> perm_mask_for_reverse (tree-vect-stmts.c)
> ...
> mask_type = get_vectype_for_scalar_type (mask_element_type);
> nunits = TYPE_VECTOR_SUBPARTS (vectype);
> if (!mask_type
> || TYPE_VECTOR_SUBPARTS (vectype) != TYPE_VECTOR_SUBPARTS (mask_type))
> return NULL;
> ...
>
> For PowerPC altivec, the mask_type is V16QI. It means that
> compiler can only permute V16QI type. But given the capability of
> altivec vperm instruction, it can permute any 128-bit type
> (V8HI, V4SI, etc). We just need convert in/out V16QI from
> given types and a bit more extra work in producing mask.
>
> Do I understand correctly or miss something here?

Yes, you are right. The support of reverse access is somewhat limited.
Please see vect_transform_slp_perm_load() in tree-vect-slp.c for an
example of all type permutation support.

But, anyway, reverse accesses are not supported for altivec's load
realignment scheme.

Ira

>
> Thanks,
> Bingfeng Mei
>
Re: loop hoisting fails
"Paulo J. Matos" writes:

> On 10/02/11 16:04, Ian Lance Taylor wrote:
>>
>> Bother. I've encountered that problem before and I think I used a
>> sledgehammer (a local patch). It's definitely a bug that gcse doesn't
>> consider costs.
>>
>
> At least I am happy that you confirm this. :) Have you reported a bug
> for this before?

No, because it's inherently target specific, and when I encountered it I
was working on a private target. The issue can't be reliably fixed
without some test cases.

Ian
Re: loop hoisting fails
On 02/09/2011 07:07 AM, Ian Lance Taylor wrote:
> "Paulo J. Matos" writes:
>
>> But then this is combined by cse into:
>>
>> (set (mem/s:QI (reg:QI 41)) (const_int 0))
>>
>> and bammm, same problem. No loop hoisting. What's the best way to
>> handle this? Any suggestions?
>
> You need to set TARGET_RTX_COSTS to indicate that this operation is
> relatively expensive. That should stop combine from generating it.

If constants are never valid as the source of a store, then you
could try something like

  register_operand (operands[0], )
  || register_operand (operands[1], )

in the extra-predicate field of your move insns. This is not unlike
the check for two memories that many ports use at the moment. Cf.
movsi_internal for i386.

That will prevent combine, or anyone else for that matter, from
re-combining the constant load.

r~
Re: hints on debugging memory corruption...
On 02/10/2011 06:32 AM, Dodji Seketeli wrote:
> For the sake of archiving these tricks how do you postpone garbage
> collection in practise?

Set --param ggc-min-heapsize to a very large value.

r~
Re: math-68881.h vs -ffast-math
On 02/09/2011 03:39 PM, Vincent Rivière wrote:
> The file gcc/config/m68k/math-68881.h is distributed with GCC. It is
> about inlining the libm functions using FPU instructions on m68k
> targets.
>
> But -ffast-math seems to serve the same purpose, even better.
>
> My question: Is math-68881.h still useful for some purpose ?

You're certainly correct that incorporating the functions into the
compiler directly is better than an external header file, not least
because e.g. the Fortran compiler can use the builtins but not the
header file.

It looks like not all of the functions in math-68881.h have been
transferred into the m68k.md file. Almost all of the functions have
a builtin equivalent though -- look at the i386.md file for pointers.

Patches to add the patterns to support the missing builtins one at a
time would be welcome.

r~
Re: hints on debugging memory corruption...
On Thu, 10 Feb 2011 15:32:39 +0100
Dodji Seketeli wrote:

> Joern Rennecke writes:
>
> > Quoting Tom Tromey :
> >
> >>> "Basile" == Basile Starynkevitch writes:
> >>
> >> Basile> So I need to understand who is writing the 0x101 in that field.
> >
> >
> >> One thing to watch out for is that the memory can be recycled. I've
> >> been very confused whenever I've forgotten this. I have a hack for the
> >> GC (appended -- ancient enough that it probably won't apply) that makes
> >> it easy to notice when an object you are interested in is collected.
> >> IIRC I apply this before the first run, call ggc_watch_object for the
> >> thing I am interested in, and then see in what GC cycle the real one is
> >> allocated.
> >
> > If what you are looking for survives such a change, postponing garbage
> > collection so it won't happen till the crash can make things simpler.
>
> For the sake of archiving these tricks how do you postpone garbage
> collection in practise?
>

BTW, postponing garbage collection is completely impossible for me (at
most, I could disable it in the gdb debugger, but not in real MELT
runs), since the GC (I mean the ggc_collect() routine) is called at
arbitrary moments by the MELT runtime. It is called from the MELT
garbage collector, routine melt_garbcoll(), which takes the appropriate
measures, together with the MELT translator, to do that safely.

Thanks for all the help. I did find out what was wrong: it was my
(incorrect) understanding of get_loop_body. I thought (incorrectly)
that it returned a GTY-ed pointer.

In my opinion, that function has a bizarre property: it returns a
calloc-ed array of basic_block-s, which are themselves GTY-ed (that is,
managed by the Ggc collector with the help of GTY annotations for
gengtype).

What I find bizarre is that get_loop_body returns a manually managed
memory chunk (an array, actually) of Ggc-ed pointers. (As everyone can
guess, I do like the idea of a garbage collector, and my insane wish is
that GCC had much more GTY-ed data. I do know that this position of
mine is against the majority.) I would find it much more logical (or at
least more elegant to my eyes) if get_loop_body returned (for instance)
a GTY-ed vector [or any other GTY ((variable)) thing] of loop-s
pointers. Strangely, it doesn't!

In MELT parlance, I cannot simply make a MELT primitive which invokes
get_loop_body. It has to be interfaced by a MELT function which builds
a MELT tuple of MELT boxed basic_block-s. Now, this function needs
support from the MELT runtime to permit mutation of MELT boxed
basic_block-s. So I had to generate the meltgc_basicblock_updatebox in
file gcc/meltrunsup.h [that file is MELT generated. MELT is now able to
generate all the boxing, upboxing, hashmapping... of any GTY-ed
ctypes]. So I had to improve the MELT generator (file
melt/warmelt-outobj.melt) to generate those updatebox routines. Once
the generated code was better, I could commit it (using git) to the
MELT branch.

Now, I have to merge the latest trunk into the MELT branch, but since I
switched to git, I am very scared to do that, and I am not sure I
understand the reliable procedure for doing so. (I was able to merge
the trunk into MELT once using git, but that was with the kind help of
Andreas Schwab and Dodji Seketeli, and I am not sure I understood it
all.) Some of my uses of git (on GCC & MELT) give me bizarre (hence
scary) messages. I am really ashamed to be a git newbie. I am sort of
able to use it on some of my own (non-GCC) code, but I am very scared
of messing up GCC with git, and I use it on GCC MELT with fear.

Regards.
-- Basile STARYNKEVITCH http://starynkevitch.net/Basile/ email: basilestarynkevitchnet mobile: +33 6 8501 2359 8, rue de la Faiencerie, 92340 Bourg La Reine, France *** opinions {are only mine, sont seulement les miennes} ***
Re: RTL Expand pass - Difference in float and int datatypes
On 02/09/2011 08:55 AM, Anitha Boyapati wrote:
> Direct-conditional branch
>
>> (jump_insn 9 8 34 3 gt.c:4 (set (pc)
>> (if_then_else (gt:CC (cc0)
>> (const_int 0 [0x0]))
>> (label_ref 12)
>> (pc))) -1 (nil))
>
> Reverse-conditional Branch
>
>> (jump_insn 9 8 34 3 gt.c:4 (set (pc)
>> (if_then_else (gt:CC (cc0)
>> (const_int 0 [0x0]))
>> (pc))) -1 (nil))
>> (label_ref 14)

You do not have to support reverse-conditional branches in your port,
and frankly I encourage you not to do so. These patterns are, or ought
to be, long obsolete.

> The latter pattern is supposed to emit assembly which tests for the
> reverse-condition. For instance if the former pattern emits assembly
> instruction like "brgt " then the latter pattern is supposed to
> emit instruction like "brle "

See reverse_condition_maybe_unordered -- the true inverse of GT is UNLE.
That is, UNORDERED || LE.

Support as many of the UNORDERED comparisons as possible and you'll
obviate the need for the reverse-conditional patterns, and also
prevent the generation of conditions that cannot be implemented
on your target.

From what I can determine re avr32 fcmp.s, you cannot support any
of the compound unordered comparisons directly. You can only handle
ORDERED and UNORDERED via the overflow (V) bit; there are no branches
that combine V with any other flag in a way that's useful for floats.

r~
Re: hints on debugging memory corruption...
Richard Henderson writes:

> On 02/10/2011 06:32 AM, Dodji Seketeli wrote:
>> For the sake of archiving these tricks how do you postpone garbage
>> collection in practise?
>
> Set --param ggc-min-heapsize to a very large value.

That wouldn't work for pieces of code that explicitly call ggc_collect,
would it?

--
Dodji
Re: hints on debugging memory corruption...
On 02/10/2011 10:58 AM, Dodji Seketeli wrote:
>> Set --param ggc-min-heapsize to a very large value.
>
> That wouldn't work for pieces of code that explicitly call
> ggc_collect, would it?
>

Sure it does. The first thing that ggc_collect does is determine
if enough work has been done to warrant a collect.

r~
gcc-4.5-20110210 is now available
Snapshot gcc-4.5-20110210 is now available on
  ftp://gcc.gnu.org/pub/gcc/snapshots/4.5-20110210/
and on various mirrors, see http://gcc.gnu.org/mirrors.html for details.

This snapshot has been generated from the GCC 4.5 SVN branch
with the following options: svn://gcc.gnu.org/svn/gcc/branches/gcc-4_5-branch revision 170024

You'll find:

gcc-4.5-20110210.tar.bz2            Complete GCC (includes all of below)
  MD5=d3c4a9e61562347f46b62f047ceac2d9
  SHA1=df4cca0ad0c5cb261f29d82a9458e8717c3efc36

gcc-core-4.5-20110210.tar.bz2       C front end and core compiler
  MD5=e6c04428ce838c66dc049d02c4a648de
  SHA1=bf919b22bf531e7939d6ce8f4e2b3182c8d993b5

gcc-ada-4.5-20110210.tar.bz2        Ada front end and runtime
  MD5=63744d18a12448d2e5d741ccdd708542
  SHA1=82a8114f7d0724aa57f5a8566be1164ba9f94221

gcc-fortran-4.5-20110210.tar.bz2    Fortran front end and runtime
  MD5=154592e46a49b6cb5c7d58a7151c062e
  SHA1=275c4e5300f4fc5b3273be486655d6da5f06464d

gcc-g++-4.5-20110210.tar.bz2        C++ front end and runtime
  MD5=2adb2858572affddfac543f7c0220cd7
  SHA1=944b088ab2717f7f91fdc51fa48d7d237445627a

gcc-go-4.5-20110210.tar.bz2         Go front end and runtime
  MD5=cc03f409c2e86ecb957093ab8623bbd9
  SHA1=0156117d17412d4e66ac611e2a24e9cde1b7491d

gcc-java-4.5-20110210.tar.bz2       Java front end and runtime
  MD5=451dd3e57b909228acaace6df7b8d050
  SHA1=67f709d43106d4be82fa1b47955e4655574c1306

gcc-objc-4.5-20110210.tar.bz2       Objective-C front end and runtime
  MD5=2d951e87413da0ca07c9a1d0090cbca6
  SHA1=e82b28863a4a62cc771a1e56f3ffd4cb9a43da98

gcc-testsuite-4.5-20110210.tar.bz2  The GCC testsuite
  MD5=439bbbaec39c819f1e7eae3e44c4f29f
  SHA1=3e36dccc6fb1773cd27762f3f9e04177bff46822

Diffs from 4.5-20110203 are available in the diffs/ subdirectory.

When a particular snapshot is ready for public consumption the LATEST-4.5
link is updated and a message is sent to the gcc list. Please do not use
a snapshot before it has been announced that way.
Re: RTL Expand pass - Difference in float and int datatypes
On 11 February 2011 00:20, Richard Henderson wrote:
> On 02/09/2011 08:55 AM, Anitha Boyapati wrote:

>> Reverse-conditional Branch
>>
>>> (jump_insn 9 8 34 3 gt.c:4 (set (pc)
>>> (if_then_else (gt:CC (cc0)
>>> (const_int 0 [0x0]))
>>> (pc))) -1 (nil))
>>> (label_ref 14)
>
> You do not have to support reverse-conditional branches in your port,
> and frankly I encourage you not to do so. These patterns are, or ought
> to be, long obsolete.
>
>> The latter pattern is supposed to emit assembly which tests for the
>> reverse-condition. For instance if the former pattern emits assembly
>> instruction like "brgt " then the latter pattern is supposed to
>> emit instruction like "brle "
>
> See reverse_condition_maybe_unordered -- the true inverse of GT is UNLE.
> That is, UNORDERED || LE.
>
> Support as many of the UNORDERED comparisons as possible and you'll
> obviate the need for the reverse-conditional patterns, and also
> prevent the generation of conditions that cannot be implemented
> on your target.
>

Does this mean that the requirement for the reverse-conditional pattern
mostly arises because of unordered comparisons? It is true that our
current machine description does not handle the conditions 'unordered',
'unlt', 'ungt', 'unge', 'unle'...

> From what I can determine re avr32 fcmp.s, you cannot support any
> of the compound unordered comparisons directly. You can only handle
> ORDERED and UNORDERED via the overflow (V) bit; there are no branches
> that combine V with any other flag in a way that's useful for floats.

There is "brvs", which detects whether the unordered/overflow bit is set
and then jumps to the given location. So after a floating-point
comparison, when the branch is called (we are not using cbranch; the
branch pattern looks exactly like the one described in the Code
Iterators section of the GCC internals manual), we first check for an
unordered comparison using 'brvs' and then emit br.

Since an unordered comparison is always false in IEEE 754, the emitted
code will normally look like:

Direct-Conditional Branches:

brvs+6 // jump to fall through
br // cond is eq, lt, le, gt, ge
... // fall through
:

But for a reverse-condition, 'brvs' should jump to the label (and not to
the fall-through).

Reverse-Conditional Branches:

brvs// jump to LABEL for unordered
br //
...
:
...

This is where we ran into the question of when and why a
reverse-conditional branch is generated. From what I see in the
cond-optab branch, it looks like these are going to be deprecated in
version 4.5.

Going by what you said, we will first try supporting unordered
comparisons in code_iterators and play around with it for some time to
understand it better :-)

Thanks a lot Richard!

Anitha
libgcc question
Am I doing something wrong, or is there a problem with libgcc?

I'm compiling code for an ARM based micro. I'm using gcc 4.5.1,
configured for arm-eabi-none, C compiler only. The target is a
standalone embedded device: no OS, nothing, not even a C library, just
bare metal. The compiler (and the linker, since gcc is being used to
start the linker) gets the flags

  -ffreestanding -static -static-libgcc -nostdlib

Everything works fine until I want to do a 64-bit division. Then the
linking fails, telling me that I have undefined references to memcpy,
abort, __exidx_start and __exidx_end.

Telling the linker to create an output despite the missing references
reveals that the resulting object file contains the actual 64-bit
division from libgcc, as expected. But it also contains about 4KB worth
of functions related to unwinding, which are never referenced anywhere
(i.e. the libgcc division routine does not call or use *any* of the
functions there). There is all sorts of code in there to deal with the
(nonexistent) floating-point coprocessor, throwing exceptions and other
magic.

So, a function containing a single call to __aeabi_uldivmod results in
about 4 KB of unused code being sucked in from libgcc.a (some of which
could not even be executed by the target processor), with 4 undefined
references, of which __exidx_start and __exidx_end are, as far as I
know, not even standard library functions.

Is this a bug in libgcc, have I massively misconfigured the compilation
of gcc itself, or am I doing something horribly wrong but can't see the
obvious?

Zoltan
Re: loop hoisting fails
On 10/02/11 17:59, Richard Henderson wrote:
> If constants are never valid as the source of a store,

They are, but it really depends on which registers they are going to.
If the destination belongs to a certain class it is OK; for all the
others it is not. It is tricky even to define costs when you are given:
rtx_costs of (const_int 0), outer_code SET.

This might be very, very cheap or very, very expensive for certain
classes of registers.
Re: loop hoisting fails
On 10/02/11 16:04, Ian Lance Taylor wrote:
> Bother. I've encountered that problem before and I think I used a
> sledgehammer (a local patch). It's definitely a bug that gcse doesn't
> consider costs.

I think I might also try patching my local gcc. I guess the trick is to
check the cost of the alternative before making the replacement in
gcse, right? Is it possible to get an idea of how you did it?