plugins directory?
Hello All, This is a discussion complementing http://gcc.gnu.org/ml/gcc-patches/2009-06/msg01587.html. So plugins here means GPLv3 licensed GCC plugins. First, I believe that progressively, there will be some plugins dedicated to the compilation of some major specific free software, mostly for application specific warnings. For example, it could happen that the Linux kernel community, or KDE/QT community, or Gnome/GTK community, or Apache community will progressively develop plugins to help their development. BTW, the treehydra effort from Mozilla (that is Taras Glek) is illustrative of my intuition. I am alone in having this belief (that application specific GCC plugins would appear, mostly for static analysis or better diagnostic purposes, not for code generation.)? If that will happen, most of the plugins will indeed be GCC version specific (so a gcckde plugin binary -the gcckde-4.7.2.so file- would indeed be specific to gcc 4.7.2; the same plugin for gcc 4.7.3 would indeed need a recompilation at least). I am supposing a user of some -e.g. Linux- distribution which happens to package commonly used application specific GCC plugins that the user already installed.. Let's suppose the user is developing a KDE application. He wants the KDE/QT plugin to check some stuff about his code. First, I believe that this user does not want to change his makefile when gcc bumps from 4.7.2 to 4.7.3. As a concrete example, as a Debian/Sid=Unstable user, I never had to change a Makefile because gcc bumped from 4.3.2 to 4.3.3. I believe this is practically very important for plugins to be used (and for application specific plugins to flourish). So, what is the Makefile that such user would use? It definitely cannot contain a hardcoded plugin path like e.g. CXXFLAGS=-fplugin=/usr/lib/gcc/4.7.2/plugins/kdegcc.so -O At the very least, the hardcoded plugin path should be somehow computed, e.g. CXXFLAGS= -fplugin=$(shell pkg-config -gccplugin kde) -O Alternatively, one could some Debian package would offer a gcc-for-kde command (probably a shell script) to compile KDE applications. However, we could suppose that some computation (to find where & which exact kdegcc.so is loaded) is done by GCC. Concretely, I was thinking of CXXFLAGS= -fplugin=kdegcc -O and then we should provide some machinery to find the kdegcc.so plugin at the appropriate place. BTW, this already happens in GCC for other stuff: the gcc driver does find and manage the cc1 executable! I would prefer some discussion to happen, instead of blindly proposing patches on gcc-patches@ and have them rejected until I find a consensus. Do you think that 1. We should not care about where system wide plugins are located, and leave the -fplugin command as it is. 2. We should have a path of GCC plugins, i.e. a sequence of directories where plugins are searched. a. should this path be setable only by a -fplugin-path option b. should this path be also settable thru e.g. an environment variable like GCC_PLUGINS_PATH c. what is the default value of this path Notice that dlopen already uses LD_LIBRARY_PATH, but I feel it is not entirely appropriate for our purposes (because plugins are not libraries, even if they share the .so ...). I think we should reach a consesus quickly: the stage 1 is closing soon, and such a patch has to happen very soon! Regards. -- Basile STARYNKEVITCH http://starynkevitch.net/Basile/ email: basilestarynkevitchnet mobile: +33 6 8501 2359 8, rue de la Faiencerie, 92340 Bourg La Reine, France *** opinions {are only mines, sont seulement les miennes} ***
Re: How to deal with unrecognizable RTL code
daniel.tian wrote: > But I have already defined the PROMOTE_MODE which will convert all the > QImode, HImode to SImode in computation. ? That's not what the code you posted says: > PROMOTE_MODE, stolen from mips, is defined in my port: > #define PROMOTE_MODE(MODE, UNSIGNEDP, TYPE) \ > if (GET_MODE_CLASS (MODE) == MODE_INT \ > && GET_MODE_SIZE (MODE) < UNITS_PER_WORD) \ >{ \ > (MODE) = Pmode; \ >} The docs for PROMOTE_MODE suggest using `word_mode'; it may be that on mips, Pmode and word_mode are the same, but they aren't on your machine? > And the RTL mentioned, including > " SUBREG " should not exist after reload. Now all the instructions > pattern, which modes are smaller than SImode, formed in SUBREG. Is one of your MD macros not handling the post-reload REG_OK_STRICT macro correctly perhaps? cheers, DaveK
Re: (known?) Issue with bitmap iterators
Daniel Berlin wrote: > Also, what do you expect the semantics to be? Since we don't expect an iterator to return the same bit twice when iterating in any case, the ideal would be that it shouldn't matter what happens to bits that the iterator has already passed. > In particular, are new bits past the current index iterated over, or > do you expect to iterate over the bitmap as it existed at the time you > started iteration? That would be an ecumenical matter! Err, I mean ... maybe the best solution (particularly in terms of preventing future bugs) would be for opening an iterator to put the bitmap into a read-only mode that causes bitmap_clear_bit or bitmap_set_bit to fail, and that automatically clears when the iterator runs off the end? cheers, DaveK
Re: plugins directory?
On Mon, Jun 22, 2009 at 1:20 PM, Basile STARYNKEVITCH wrote: > Hello All, > > This is a discussion complementing > http://gcc.gnu.org/ml/gcc-patches/2009-06/msg01587.html. So plugins here > means GPLv3 licensed GCC plugins. > > First, I believe that progressively, there will be some plugins dedicated to > the compilation of some major specific free software, mostly for application > specific warnings. > For example, it could happen that the Linux kernel community, or KDE/QT > community, or Gnome/GTK community, or Apache community will progressively > develop plugins to help their development. BTW, the treehydra effort from > Mozilla (that is Taras Glek) is illustrative of my intuition. > > I am alone in having this belief (that application specific GCC plugins > would appear, mostly for static analysis or better diagnostic purposes, not > for code generation.)? > > If that will happen, most of the plugins will indeed be GCC version specific > (so a gcckde plugin binary -the gcckde-4.7.2.so file- would indeed be > specific to gcc 4.7.2; the same plugin for gcc 4.7.3 would indeed need a > recompilation at least). > > I am supposing a user of some -e.g. Linux- distribution which happens to > package commonly used application specific GCC plugins that the user already > installed.. Let's suppose the user is developing a KDE application. He wants > the KDE/QT plugin to check some stuff about his code. > > First, I believe that this user does not want to change his makefile when > gcc bumps from 4.7.2 to 4.7.3. As a concrete example, as a > Debian/Sid=Unstable user, I never had to change a Makefile because gcc > bumped from 4.3.2 to 4.3.3. I believe this is practically very important for > plugins to be used (and for application specific plugins to flourish). > > So, what is the Makefile that such user would use? It definitely cannot > contain a hardcoded plugin path like e.g. > CXXFLAGS=-fplugin=/usr/lib/gcc/4.7.2/plugins/kdegcc.so -O > At the very least, the hardcoded plugin path should be somehow computed, > e.g. > CXXFLAGS= -fplugin=$(shell pkg-config -gccplugin kde) -O > Alternatively, one could some Debian package would offer a gcc-for-kde > command (probably a shell script) to compile KDE applications. > > However, we could suppose that some computation (to find where & which exact > kdegcc.so is loaded) is done by GCC. Concretely, I was thinking of > CXXFLAGS= -fplugin=kdegcc -O > and then we should provide some machinery to find the kdegcc.so plugin at > the appropriate place. BTW, this already happens in GCC for other stuff: the > gcc driver does find and manage the cc1 executable! > > I would prefer some discussion to happen, instead of blindly proposing > patches on gcc-patches@ and have them rejected until I find a consensus. > > Do you think that > > 1. We should not care about where system wide plugins are located, and leave > the -fplugin command as it is. > > 2. We should have a path of GCC plugins, i.e. a sequence of directories > where plugins are searched. > a. should this path be setable only by a -fplugin-path option > b. should this path be also settable thru e.g. an environment variable like > GCC_PLUGINS_PATH > c. what is the default value of this path > > Notice that dlopen already uses LD_LIBRARY_PATH, but I feel it is not > entirely appropriate for our purposes (because plugins are not libraries, > even if they share the .so ...). > > I think we should reach a consesus quickly: the stage 1 is closing soon, and > such a patch has to happen very soon! On the contrary. If we put in place such a mechanism we have to maintain it forever. It is hard to tell what is the correct solution without distributors facing the problem (there are no existing plugins that can be packaged or are packaged now). So, let's wait until the problem happens and we know all of its aspects and deal with it then. Richard.
Re: (known?) Issue with bitmap iterators
On Mon, Jun 22, 2009 at 1:45 PM, Dave Korn wrote: > Daniel Berlin wrote: > >> Also, what do you expect the semantics to be? > > Since we don't expect an iterator to return the same bit twice when > iterating in any case, the ideal would be that it shouldn't matter what > happens to bits that the iterator has already passed. > >> In particular, are new bits past the current index iterated over, or >> do you expect to iterate over the bitmap as it existed at the time you >> started iteration? > > That would be an ecumenical matter! Err, I mean ... maybe the best solution > (particularly in terms of preventing future bugs) would be for opening an > iterator to put the bitmap into a read-only mode that causes bitmap_clear_bit > or bitmap_set_bit to fail, and that automatically clears when the iterator > runs off the end? Heh, that sounds useful. Keep bitmaps forced readonly during iterating over them but be able to actually verify it. Might need some new exit-from-iterating magic though. Richard. > > cheers, > DaveK > >
Re: an opencl to ati stream converter
Hi, Soufiane Thanks for your interest in GCC and OpenCL. There currently is a GCC Summer of Code project working on OpenCL and members of the GCC community working on OpenCL projects. Hopefully you can collaborate with one of them. David On Sat, Jun 20, 2009 at 5:01 PM, soufiane BAGHDADI wrote: > Hello, > > I am Soufiane BAGHDADI, an Algerian student, I worked before on some > parsers (lex&yacc, javacc) and I loved this domain, so I want to > participate in the gcc project as it is a great open source compiler > and as I have a lot of time this summer. > I am working on creating an OpenCL for ATI graphics cards, but I don't > want to do something that is done before and this is why I send you > this message! > > my approach is to create an OpenCL to ATI stream converter then i will > call the stream converter to generate the binary file > i found also that there is a pseudo assembly language that is called > ATI IL (ATI Intermediate Language) and i don't know How efficient it > is? is better or worse than ATI stream? > > So please answer my questions: > is there a working OpenCL compiler that supports ATI graphics cards > and that is implemented in GCC? > Can I work with you in this project? if yes, can you tell me who to contact? > > Soufiane B. >
Re: (known?) Issue with bitmap iterators
Richard Guenther wrote: > On Mon, Jun 22, 2009 at 1:45 PM, Dave > Korn wrote: >> Daniel Berlin wrote: >> >>> Also, what do you expect the semantics to be? >> Since we don't expect an iterator to return the same bit twice when >> iterating in any case, the ideal would be that it shouldn't matter what >> happens to bits that the iterator has already passed. >> >>> In particular, are new bits past the current index iterated over, or >>> do you expect to iterate over the bitmap as it existed at the time you >>> started iteration? >> That would be an ecumenical matter! Err, I mean ... maybe the best solution >> (particularly in terms of preventing future bugs) would be for opening an >> iterator to put the bitmap into a read-only mode that causes bitmap_clear_bit >> or bitmap_set_bit to fail, and that automatically clears when the iterator >> runs off the end? > > Heh, that sounds useful. Keep bitmaps forced readonly during > iterating over them but be able to actually verify it. > > Might need some new exit-from-iterating magic though. I took a look. I don't think it would be hideously hacky to do something like ... #define EXECUTE_IF_SET_IN_BITMAP(BITMAP, MIN, BITNUM, ITER) \ for ((BITMAP)->ro_flag = true,\ bmp_iter_set_init (&(ITER), (BITMAP), (MIN), &(BITNUM));\ (BITMAP)->ro_flag = bmp_iter_set (&(ITER), &(BITNUM)); \ bmp_iter_next (&(ITER), &(BITNUM))) cheers, DaveK
Re: (known?) Issue with bitmap iterators
I took a look. I don't think it would be hideously hacky to do something like ... #define EXECUTE_IF_SET_IN_BITMAP(BITMAP, MIN, BITNUM, ITER) \ for ((BITMAP)->ro_flag = true, \ bmp_iter_set_init (&(ITER), (BITMAP), (MIN), &(BITNUM));\ (BITMAP)->ro_flag = bmp_iter_set (&(ITER), &(BITNUM));\ bmp_iter_next (&(ITER), &(BITNUM))) You should add a BREAK_FROM_EXECUTE_IF_SET(BITMAP) macro too, however. Paolo
Unnecessary regmoves in modulo scheduler?
Hello, Ayal, It may be a stupid a question :-). After reading the modulo scheduling code in GCC, I have a question about necessity of generating reg moves. The reg move is supposed to break register lifetime > II in absence of rotating register file. However, in current GCC implementation, it seems unnecessary since extensive dependencies are drawn between nodes in GCC. Examining the DDG generated (a DDG based on sms-6.c is appended at the end of the mail), I found a true register dependency is always accompanied with a cross-iteration anti dependency. This should guarantee the true dependence cannot have lifetime longer than II. A --> B true dep (e.g., insn 36 -> insn 38 in the DDG) B --> A anti dep with distance = 1(e.g., insn 38 -> insn 36 in the DDG) The second dependdency should lead to : Sched_Time(A) + II >(=) Sched_Time(B) which means Sched_Time(B) - Sched_Time(A) <(=) II and no need for reg move. Similarly, an anti register dependency is always accompanied with a cross-iteration true dependency. A --> B anti dep (e.g., insn 36 -> insn 53 in the DDG) B --> A true dep with distance = 1(e.g., insn 53 -> insn 36 in the DDG) We can reach similar conclusion. I wonder what other scenario would require to generate reg moves. Am I missing some obvious points? Thanks in advance. Cheers, Bingfeng Mei Broadcom UK DDG from sms-6.c SMS loop num: 1, file: sms-6.c, line: 9 Node num: 0 (insn 36 35 37 3 sms-6.c:11 (set (reg:SI 113) (mem:SI (reg:SI 108 [ ivtmp.44 ]) [4 S4 A32])) 184 {*movwsi} (nil)) OUT ARCS: [36 -(A,0,0)-> 53] [36 -(T,6,0)-> 38] IN ARCS: [38 -(A,0,1)-> 36] [53 -(T,1,1)-> 36] Node num: 1 (insn 37 36 38 3 sms-6.c:11 (set (reg:SI 114) (mem:SI (reg:SI 109 [ ivtmp.42 ]) [5 S4 A32])) 184 {*movwsi} (nil)) OUT ARCS: [37 -(A,0,0)-> 52] [37 -(T,6,0)-> 38] IN ARCS: [38 -(A,0,1)-> 37] [52 -(T,1,1)-> 37] Node num: 2 (insn 38 37 39 3 sms-6.c:11 (set (reg:SI 115) (mult:SI (reg:SI 113) (reg:SI 114))) 262 {mulsi3} (expr_list:REG_DEAD (reg:SI 114) (expr_list:REG_DEAD (reg:SI 113) (nil OUT ARCS: [38 -(A,0,1)-> 37] [38 -(A,0,1)-> 36] [38 -(T,8,0)-> 39] IN ARCS: [39 -(A,0,1)-> 38] [36 -(T,6,0)-> 38] [37 -(T,6,0)-> 38] Node num: 3 (insn 39 38 40 3 sms-6.c:11 (set (mem:SI (reg:SI 107 [ ivtmp.45 ]) [3 S4 A32]) (reg:SI 115)) 184 {*movwsi} (expr_list:REG_DEAD (reg:SI 115) (nil))) OUT ARCS: [39 -(A,0,1)-> 38] [39 -(O,1,0)-> 69] [39 -(A,0,0)-> 54] IN ARCS: [54 -(T,1,1)-> 39] [51 -(O,1,1)-> 39] [47 -(O,1,1)-> 39] [43 -(O,1,1)-> 39] [38 -(T,8,0)-> 39] Node num: 4 (insn 40 39 41 3 sms-6.c:12 (set (reg:SI 116) (mem:SI (plus:SI (reg:SI 108 [ ivtmp.44 ]) (const_int 4 [0x4])) [4 S4 A32])) 184 {*movwsi} (nil)) OUT ARCS: [40 -(A,0,0)-> 53] [40 -(T,6,0)-> 42] IN ARCS: [42 -(A,0,1)-> 40] [53 -(T,1,1)-> 40] Node num: 5 (insn 41 40 42 3 sms-6.c:12 (set (reg:SI 117) (mem:SI (plus:SI (reg:SI 109 [ ivtmp.42 ]) (const_int 4 [0x4])) [5 S4 A32])) 184 {*movwsi} (nil)) OUT ARCS: [41 -(A,0,0)-> 52] [41 -(T,6,0)-> 42] IN ARCS: [42 -(A,0,1)-> 41] [52 -(T,1,1)-> 41] Node num: 6 (insn 42 41 43 3 sms-6.c:12 (set (reg:SI 118) (mult:SI (reg:SI 116) (reg:SI 117))) 262 {mulsi3} (expr_list:REG_DEAD (reg:SI 117) (expr_list:REG_DEAD (reg:SI 116) (nil OUT ARCS: [42 -(A,0,1)-> 41] [42 -(A,0,1)-> 40] [42 -(T,8,0)-> 43] IN ARCS: [43 -(A,0,1)-> 42] [40 -(T,6,0)-> 42] [41 -(T,6,0)-> 42] Node num: 7 (insn 43 42 44 3 sms-6.c:12 (set (mem:SI (plus:SI (reg:SI 107 [ ivtmp.45 ]) (const_int 4 [0x4])) [3 S4 A32]) (reg:SI 118)) 184 {*movwsi} (expr_list:REG_DEAD (reg:SI 118) (nil))) OUT ARCS: [43 -(A,0,1)-> 42] [43 -(O,1,0)-> 69] [43 -(A,0,0)-> 54] [43 -(O,1,1)-> 39] IN ARCS: [54 -(T,1,1)-> 43] [51 -(O,1,1)-> 43] [47 -(O,1,1)-> 43] [42 -(T,8,0)-> 43] Node num: 8 (insn 44 43 45 3 sms-6.c:13 (set (reg:SI 119) (mem:SI (plus:SI (reg:SI 108 [ ivtmp.44 ]) (const_int 8 [0x8])) [4 S4 A32])) 184 {*movwsi} (nil)) OUT ARCS: [44 -(A,0,0)-> 53] [44 -(T,6,0)-> 46] IN ARCS: [46 -(A,0,1)-> 44] [53 -(T,1,1)-> 44] Node num: 9 (insn 45 44 46 3 sms-6.c:13 (set (reg:SI 120) (mem:SI (plus:SI (reg:SI 109 [ ivtmp.42 ]) (const_int 8 [0x8])) [5 S4 A32])) 184 {*movwsi} (nil)) OUT ARCS: [45 -(A,0,0)-> 52] [45 -(T,6,0)-> 46] IN ARCS: [46 -(A,0,1)-> 45] [52 -(T,1,1)-> 45] Node num: 10 (insn 46 45 47 3 sms-6.c:13 (set (reg:SI 121) (mult:SI (reg:SI 119) (reg:SI 120))) 262 {mulsi3} (expr_list:REG_DEAD (reg:SI 120) (expr_list:REG_DEAD (reg:SI 119) (nil OUT ARCS: [46 -(A,0,1)-> 45] [46 -(A,0,1)-> 44] [46 -(T,8,0)-> 47] IN ARCS: [47 -(A,0,1)-> 46] [44 -(T,6,0)-> 46] [45 -(T,6,0)-> 46] Node num: 11 (insn 47 46 48 3 sms-6.c:13 (set (mem:SI (plus:SI (reg:SI 107 [ ivtmp.45 ])
Re: (known?) Issue with bitmap iterators
Daniel Berlin wrote: On Sat, Jun 20, 2009 at 10:54 AM, Jeff Law wrote: Imagine a loop like this EXECUTE_IF_SET_IN_BITMAP (something, 0, i, bi) { bitmap_clear_bit (something, i) [ ... whatever code we want to process i, ... ] } This code is unsafe. No, this is known, and in fact, has been a source of "interesting" bugs in the past since it doesn't segfault, but often, as you've discovered, starts wandering into the free list happily iterating over elements from bitmaps of dataflows past. Making it safe is a little tricky, basically, you need to know whether the element you are currently iterating over disappears. At the very worst, you could make pre-delete hooks and have the iterators register for them or something. At best, you can probably set a bit in the bitmap saying it's being iterated over, and then add a tombstone bit, which lets you mark elements as deleted without actually deleting them until the end of iteration when they are in the middle of iteration or something. Also, what do you expect the semantics to be? In particular, are new bits past the current index iterated over, or do you expect to iterate over the bitmap as it existed at the time you started iteration? Could we simply pre-calculate the next index early and store it in the iterator struct before entering the processing loop for index 'I'? . This would allow the current index to be deleted without impacting the next iteration. The changes shouldn't be very significant. Of course, everything falls apart if you willy nilly change other things in the list, especially clearing the 'next' bit, but changing the bit at index 'I' must be the most common case by far... Andrew
gcc 4.3.2 vectorizes access to volatile array
gcc-4.3.2 seems to produce bad code when accessing an array of small 'volatile' objects -- it may try to access multiple such objects in a 'parallel' fashion. E.g., instead of reading two consecutive 'volatile short's sequentially it reads a single 32-bit longword. This may crash e.g., when accessing a memory-mapped device which allows only 16-bit accesses. If I compile this code fragment void volarrcpy(short *d, volatile short *s, int n) { int i; for (i=0; i
Re: git mirror at gcc.gnu.org
On 06/20/2009 04:38 AM, Bernie Innocenti wrote: On 06/20/09 04:59, Jason Merrill wrote: Any thoughts on what to do about the non-branch directories under branches/? The complete set is ARM apple csl dead gcj ibm ix86 suse ubuntu. The only solution I can think of would be to specifically enumerate which branches we want to mirror in git with separate git-svn fetch lines rather than use the branches= line. Don't we want to mirror all branches? Enumerating them all would be a lot of work... I was thinking to do that programatically. i.e. git config --unset svn-remote.svn.branches for f in `svn ls svn://gcc.gnu.org/svn/gcc/branches|egrep -v '^(ARM|apple|csl|dead|gcj|ibm|ix86|redhat|suse|ubuntu)/'`; do f=${f%/} git config --get-all svn-remote.svn.fetch | fgrep "branches/$f:" > /dev/null || git config --add svn-remote.svn.fetch branches/$f:refs/remotes/$f done for d in ARM apple csl dead gcj ibm ix86 redhat suse ubuntu; do for f in `svn ls svn://gcc.gnu.org/svn/gcc/branches/$d`; do f=$d/${f%/} git config --get-all svn-remote.svn.fetch | fgrep "branches/$f:" > /dev/null || git config --add svn-remote.svn.fetch branches/$f:refs/remotes/$f done done Perhaps we could report this missing feature to the git-svn folks and ask them to fix it for us? I'm not sure it's possible in general to tell whether a particular directory in SVN is a branch or not. Jason
Re: (known?) Issue with bitmap iterators
Richard Guenther wrote: On Sat, Jun 20, 2009 at 4:54 PM, Jeff Law wrote: Imagine a loop like this EXECUTE_IF_SET_IN_BITMAP (something, 0, i, bi) { bitmap_clear_bit (something, i) [ ... whatever code we want to process i, ... ] } This code is unsafe. If bit I happens to be the only bit set in the current bitmap word, then bitmap_clear_bit will free the current element and return it to the element free list where it can be recycled. So assume the bitmap element gets recycled for some other bitmap... We then come around to the top of the loop where EXECUTE_IF_SET_IN_BITMAP will call bmp_iter_set which can reference the recycled element when it tries to advance to the next element via bi->elt1 = bi->elt1->next. So we start iterating on elements of a completely different bitmap. You can assume this is not good. Our documentation clearly states that I is to remain unchanged, but ISTM that the bitmap we're iterating over needs to remain unchanged as well. Is this a known issue, or am I just the lucky bastard who tripped over it and now gets to audit all the EXECUTE_IF_SET_IN_BITMAP loops? It is known (but maybe not appropriately documented) that deleting bits in the bitmap you iterate over is not safe. If it would be me I would see if I could make it safe though. It's not a huge deal -- what bothers me is that it's not documented. Someone thought enough to document that the loop index shouldn't be modified in the loop, but didn't bother to mention that the bitmap itself shouldn't be modified in the loop. I'd lean towards the idea of making the bitmap readonly -- hopefully everyone already knows they can't set bits in the bitmap and expect the iterator to work, so let's keep things consistent with clearing bits. If we can back that up with some checking code, then I think we'd be in good shape. jeff
Re: (known?) Issue with bitmap iterators
Richard Guenther wrote: > > It is known (but maybe not appropriately documented) that deleting > > bits in the bitmap you iterate over is not safe. If it would be me I would > > see if I could make it safe though. On Mon, Jun 22, 2009 at 10:06:38AM -0700, Jeff Law wrote: > It's not a huge deal -- what bothers me is that it's not documented. > Someone thought enough to document that the loop index shouldn't be > modified in the loop, but didn't bother to mention that the bitmap > itself shouldn't be modified in the loop. As a general rule there is a performance cost for making iterators on a data structure safe with respect to modifications of that data structure. I'm not in a position to say what the right solution is in this case, but passes that iterate over bitmaps without modifying those bitmaps shouldn't be penalized. One solution sometimes used is two sets of iterators, with a slower version that's safe under modification.
Re: gcc 4.3.2 vectorizes access to volatile array
Till Straumann wrote: > gcc-4.3.2 seems to produce bad code when > accessing an array of small 'volatile' > objects -- it may try to access multiple > such objects in a 'parallel' fashion. > E.g., instead of reading two consecutive > 'volatile short's sequentially it reads > a single 32-bit longword. This may crash > e.g., when accessing a memory-mapped device > which allows only 16-bit accesses. > > If I compile this code fragment > > void volarrcpy(short *d, volatile short *s, int n) > { > int i; > for (i=0; id[i] = s[i]; > } > > > with '-O3' (the critical option seems to be '-ftree-vectorize') > then gcc-4.3.2 produces quite complicated code > but the essential section is (powerpc) > > .L7: >lhz 0,0(11) >addi 11,11,2 >lwzx 0,4,9 >stwx 0,3,9 >addi 9,9,4 >bdnz .L7 > > or i386 > > .L7: >movw(%ecx), %ax >movl(%esi,%edx,4), %eax >movl%eax, (%ebx,%edx,4) >incl%edx >addl$2, %ecx >cmpl%edx, -20(%ebp) >ja .L7 > > > Disassembled back into C-code, this reads > > uint32_t *dst_l = (uint32_t*)d; > uint32_t *src_l = (uint32_t*)s; > > for (i=0; id[i] = s[i]; >dst_l[i] = src_l[i]; > } > > This code seems neither optimal nor correct. > Besides reading half of the locations twice > which violates the semantics of volatile > objects accessing such objects in a 'vectorized' > way (in this case: instead of reading > two adjacent short addresses gcc emits > a single 32-bit read) seems illegal to me. > > Similar behavior seems to be present in 4.3.3. > > Does anybody have some insight? Should I file > a bug report? I can't reproduce this with "GCC: (GNU) 4.3.3 20081110 (prerelease)" .L8: movzwl (%ecx), %eax addl$1, %ebx addl$2, %ecx movw%ax, (%edx) addl$2, %edx cmpl%ebx, 16(%ebp) jg .L8 I think you should upgrade. Andrew.
Re: gcc 4.3.2 vectorizes access to volatile array
Andrew Haley wrote: Till Straumann wrote: gcc-4.3.2 seems to produce bad code when accessing an array of small 'volatile' objects -- it may try to access multiple such objects in a 'parallel' fashion. E.g., instead of reading two consecutive 'volatile short's sequentially it reads a single 32-bit longword. This may crash e.g., when accessing a memory-mapped device which allows only 16-bit accesses. If I compile this code fragment void volarrcpy(short *d, volatile short *s, int n) { int i; for (i=0; i I can't reproduce this with "GCC: (GNU) 4.3.3 20081110 (prerelease)" .L8: movzwl (%ecx), %eax addl$1, %ebx addl$2, %ecx movw%ax, (%edx) addl$2, %edx cmpl%ebx, 16(%ebp) jg .L8 I think you should upgrade. Andrew. OK, try this then: void c(char *d, volatile char *s) { int i; for ( i=0; i<32; i++ ) d[i]=s[i]; } (gcc --version: gcc (Ubuntu 4.3.3-5ubuntu4) 4.3.3) gcc -m32 -c -S -O3 produces an unrolled sequence: movzbl (%ecx), %eax leal20(%ebx), %edx movl(%ecx), %eax movl%eax, (%edi) movzbl 1(%ecx), %eax movl4(%ecx), %eax movl%eax, 4(%edi) movzbl 2(%ecx), %eax movl4(%ebx), %eax movl%eax, 4(%esi) movzbl 3(%ecx), %eax movl8(%ebx), %eax movl%eax, 8(%esi) ... < snip >... The 64-bit version even uses SSE registers to load the volatile data: (gcc -c -S -O3) .L7: movzbl (%rsi), %eax movdqu (%rsi), %xmm0 movdqa %xmm0, (%rdi) movzbl 1(%rsi), %eax movdqu (%rdx), %xmm0 movdqa %xmm0, 16(%rdi) Not sure an upgrade helps ;-) -- Till
Re: gcc 4.3.2 vectorizes access to volatile array
On Mon, Jun 22, 2009 at 11:14 AM, Till Straumann wrote: > Andrew Haley wrote: >> >> Till Straumann wrote: >> >>> >>> gcc-4.3.2 seems to produce bad code when >>> accessing an array of small 'volatile' >>> objects -- it may try to access multiple >>> such objects in a 'parallel' fashion. >>> E.g., instead of reading two consecutive >>> 'volatile short's sequentially it reads >>> a single 32-bit longword. This may crash >>> e.g., when accessing a memory-mapped device >>> which allows only 16-bit accesses. >>> >>> If I compile this code fragment >>> >>> void volarrcpy(short *d, volatile short *s, int n) >>> { >>> int i; >>> for (i=0; i>> d[i] = s[i]; >>> } >>> >>> >>> with '-O3' (the critical option seems to be '-ftree-vectorize') >>> then gcc-4.3.2 produces quite complicated code >>> but the essential section is (powerpc) >>> >>> .L7: >>> lhz 0,0(11) >>> addi 11,11,2 >>> lwzx 0,4,9 >>> stwx 0,3,9 >>> addi 9,9,4 >>> bdnz .L7 >>> >>> or i386 >>> >>> .L7: >>> movw (%ecx), %ax >>> movl (%esi,%edx,4), %eax >>> movl %eax, (%ebx,%edx,4) >>> incl %edx >>> addl $2, %ecx >>> cmpl %edx, -20(%ebp) >>> ja .L7 >>> >>> >>> Disassembled back into C-code, this reads >>> >>> uint32_t *dst_l = (uint32_t*)d; >>> uint32_t *src_l = (uint32_t*)s; >>> >>> for (i=0; i>> d[i] = s[i]; >>> dst_l[i] = src_l[i]; >>> } >>> >>> This code seems neither optimal nor correct. >>> Besides reading half of the locations twice >>> which violates the semantics of volatile >>> objects accessing such objects in a 'vectorized' >>> way (in this case: instead of reading >>> two adjacent short addresses gcc emits >>> a single 32-bit read) seems illegal to me. >>> >>> Similar behavior seems to be present in 4.3.3. >>> >>> Does anybody have some insight? Should I file >>> a bug report? >>> >> >> I can't reproduce this with "GCC: (GNU) 4.3.3 20081110 (prerelease)" >> >> .L8: >> movzwl (%ecx), %eax >> addl $1, %ebx >> addl $2, %ecx >> movw %ax, (%edx) >> addl $2, %edx >> cmpl %ebx, 16(%ebp) >> jg .L8 >> >> I think you should upgrade. >> >> Andrew. >> > > OK, try this then: > > void > c(char *d, volatile char *s) > { > int i; > for ( i=0; i<32; i++ ) > d[i]=s[i]; > } > > > (gcc --version: gcc (Ubuntu 4.3.3-5ubuntu4) 4.3.3) ^ That may be too old. Gcc 4.3.4 revision 148680 generates: .L5: leaq(%rsi,%rdx), %rax movzbl (%rax), %eax movb%al, (%rdi,%rdx) addq$1, %rdx cmpq$32, %rdx jne .L5 -- H.J.
Re: (known?) Issue with bitmap iterators
Joe Buck wrote: > As a general rule there is a performance cost for making iterators > on a data structure safe with respect to modifications of that data > structure. I'm not in a position to say what the right solution is > in this case, but passes that iterate over bitmaps without modifying > those bitmaps shouldn't be penalized. One solution sometimes used is > two sets of iterators, with a slower version that's safe under > modification. But then we'll run into the same bug again when someone uses the wrong kind, or changes the usage of a bitmap without changing which type it is. I think making them all safe, but under --enable-checking only, with the safety code compiled out when it's disabled might be a nice solution. cheers, DaveK
Re: (known?) Issue with bitmap iterators
Paolo Bonzini wrote: > >> I took a look. I don't think it would be hideously hacky to do >> something like ... >> >> #define EXECUTE_IF_SET_IN_BITMAP(BITMAP, MIN, BITNUM, ITER)\ >> for ((BITMAP)->ro_flag = true,\ >> bmp_iter_set_init (&(ITER), (BITMAP), (MIN), &(BITNUM));\ >>(BITMAP)->ro_flag = bmp_iter_set (&(ITER), &(BITNUM));\ >>bmp_iter_next (&(ITER), &(BITNUM))) > > You should add a BREAK_FROM_EXECUTE_IF_SET(BITMAP) macro too, however. > > Paolo :) I should, but I'm way too loaded at the moment to actually generate this patch... already got two simultaneous testruns going that will tie up my PC for a couple of days. Sorry. cheers, DaveK
Re: gcc 4.3.2 vectorizes access to volatile array
H.J. Lu wrote: > On Mon, Jun 22, 2009 at 11:14 AM, Till > Straumann wrote: >> Andrew Haley wrote: >>> Till Straumann wrote: >>> gcc-4.3.2 seems to produce bad code when accessing an array of small 'volatile' objects -- it may try to access multiple such objects in a 'parallel' fashion. E.g., instead of reading two consecutive 'volatile short's sequentially it reads a single 32-bit longword. This may crash e.g., when accessing a memory-mapped device which allows only 16-bit accesses. If I compile this code fragment void volarrcpy(short *d, volatile short *s, int n) { int i; for (i=0; i>>> d[i] = s[i]; } with '-O3' (the critical option seems to be '-ftree-vectorize') then gcc-4.3.2 produces quite complicated code but the essential section is (powerpc) .L7: lhz 0,0(11) addi 11,11,2 lwzx 0,4,9 stwx 0,3,9 addi 9,9,4 bdnz .L7 or i386 .L7: movw(%ecx), %ax movl(%esi,%edx,4), %eax movl%eax, (%ebx,%edx,4) incl%edx addl$2, %ecx cmpl%edx, -20(%ebp) ja .L7 Disassembled back into C-code, this reads uint32_t *dst_l = (uint32_t*)d; uint32_t *src_l = (uint32_t*)s; for (i=0; i>>> d[i] = s[i]; dst_l[i] = src_l[i]; } This code seems neither optimal nor correct. Besides reading half of the locations twice which violates the semantics of volatile objects accessing such objects in a 'vectorized' way (in this case: instead of reading two adjacent short addresses gcc emits a single 32-bit read) seems illegal to me. Similar behavior seems to be present in 4.3.3. Does anybody have some insight? Should I file a bug report? >>> I can't reproduce this with "GCC: (GNU) 4.3.3 20081110 (prerelease)" >>> >>> .L8: >>>movzwl (%ecx), %eax >>>addl$1, %ebx >>>addl$2, %ecx >>>movw%ax, (%edx) >>>addl$2, %edx >>>cmpl%ebx, 16(%ebp) >>>jg .L8 >>> >>> I think you should upgrade. >>> >>> Andrew. >>> >> OK, try this then: >> >> void >> c(char *d, volatile char *s) >> { >> int i; >> for ( i=0; i<32; i++ ) >> d[i]=s[i]; >> } >> >> >> (gcc --version: gcc (Ubuntu 4.3.3-5ubuntu4) 4.3.3) > ^ > > That may be too old. Gcc 4.3.4 revision 148680 > generates: > > .L5: > leaq(%rsi,%rdx), %rax > movzbl (%rax), %eax > movb%al, (%rdi,%rdx) > addq$1, %rdx > cmpq$32, %rdx > jne .L5 4.4.0 20090307 generates truly bizarre code, though: gcc -m32 -c -S -O3 x.c c: pushl %ebp movl%esp, %ebp pushl %ebx movl12(%ebp), %edx movl8(%ebp), %ebx movl%edx, %ecx orl %ebx, %ecx andl$3, %ecx leal4(%ebx), %eax je .L10 .L2: xorl%eax, %eax .p2align 4,,7 .p2align 3 .L5: leal(%edx,%eax), %ecx movzbl (%ecx), %ecx movb%cl, (%ebx,%eax) addl$1, %eax cmpl$32, %eax jne .L5 popl%ebx popl%ebp ret .p2align 4,,7 .p2align 3 .L10: leal4(%edx), %ecx cmpl%ecx, %ebx jbe .L11 .L7: movzbl (%edx), %ecx movl(%edx), %ecx movl%ecx, (%ebx) movzbl 1(%edx), %ecx movl4(%edx), %ecx movl%ecx, 4(%ebx) movzbl 2(%edx), %ecx movl8(%edx), %ecx movl%ecx, 4(%eax) movzbl 3(%edx), %ecx movl12(%edx), %ecx movl%ecx, 8(%eax) movzbl 4(%edx), %ecx movl16(%edx), %ecx movl%ecx, 12(%eax) movzbl 5(%edx), %ecx movl20(%edx), %ecx movl%ecx, 16(%eax) movzbl 6(%edx), %ebx leal24(%edx), %ecx movl24(%edx), %ebx movl%ebx, 20(%eax) movzbl 7(%edx), %edx movl4(%ecx), %edx movl%edx, 24(%eax) popl%ebx popl%ebp ret .p2align 4,,7 .p2align 3 .L11: cmpl%edx, %eax jae .L2 jmp .L7
Re: gcc 4.3.2 vectorizes access to volatile array
That's roughly the same that 4.3.3 produces. I had not quoted the full assembly code but just the essential part that is executed when source and destination are 4-byte aligned and are more than 4-bytes apart. Otherwise (not longword-aligned) the (correct) code labeled '.L5' is executed. -- Till Andrew Haley wrote: H.J. Lu wrote: On Mon, Jun 22, 2009 at 11:14 AM, Till Straumann wrote: Andrew Haley wrote: Till Straumann wrote: gcc-4.3.2 seems to produce bad code when accessing an array of small 'volatile' objects -- it may try to access multiple such objects in a 'parallel' fashion. E.g., instead of reading two consecutive 'volatile short's sequentially it reads a single 32-bit longword. This may crash e.g., when accessing a memory-mapped device which allows only 16-bit accesses. If I compile this code fragment void volarrcpy(short *d, volatile short *s, int n) { int i; for (i=0; i I can't reproduce this with "GCC: (GNU) 4.3.3 20081110 (prerelease)" .L8: movzwl (%ecx), %eax addl$1, %ebx addl$2, %ecx movw%ax, (%edx) addl$2, %edx cmpl%ebx, 16(%ebp) jg .L8 I think you should upgrade. Andrew. OK, try this then: void c(char *d, volatile char *s) { int i; for ( i=0; i<32; i++ ) d[i]=s[i]; } (gcc --version: gcc (Ubuntu 4.3.3-5ubuntu4) 4.3.3) ^ That may be too old. Gcc 4.3.4 revision 148680 generates: .L5: leaq(%rsi,%rdx), %rax movzbl (%rax), %eax movb%al, (%rdi,%rdx) addq$1, %rdx cmpq$32, %rdx jne .L5 4.4.0 20090307 generates truly bizarre code, though: gcc -m32 -c -S -O3 x.c c: pushl %ebp movl%esp, %ebp pushl %ebx movl12(%ebp), %edx movl8(%ebp), %ebx movl%edx, %ecx orl %ebx, %ecx andl$3, %ecx leal4(%ebx), %eax je .L10 .L2: xorl%eax, %eax .p2align 4,,7 .p2align 3 .L5: leal(%edx,%eax), %ecx movzbl (%ecx), %ecx movb%cl, (%ebx,%eax) addl$1, %eax cmpl$32, %eax jne .L5 popl%ebx popl%ebp ret .p2align 4,,7 .p2align 3 .L10: leal4(%edx), %ecx cmpl%ecx, %ebx jbe .L11 .L7: movzbl (%edx), %ecx movl(%edx), %ecx movl%ecx, (%ebx) movzbl 1(%edx), %ecx movl4(%edx), %ecx movl%ecx, 4(%ebx) movzbl 2(%edx), %ecx movl8(%edx), %ecx movl%ecx, 4(%eax) movzbl 3(%edx), %ecx movl12(%edx), %ecx movl%ecx, 8(%eax) movzbl 4(%edx), %ecx movl16(%edx), %ecx movl%ecx, 12(%eax) movzbl 5(%edx), %ecx movl20(%edx), %ecx movl%ecx, 16(%eax) movzbl 6(%edx), %ebx leal24(%edx), %ecx movl24(%edx), %ebx movl%ebx, 20(%eax) movzbl 7(%edx), %edx movl4(%ecx), %edx movl%edx, 24(%eax) popl%ebx popl%ebp ret .p2align 4,,7 .p2align 3 .L11: cmpl%edx, %eax jae .L2 jmp .L7
Re: Should -Wjump-misses-init be in -Wall?
On Sat, 20 Jun 2009, Ian Lance Taylor wrote: > That said, > I'm perfectly amenable to moving the new warning to -Wextra or just > turning it on only with -Wc++-compat. I don't personally care that > much, actually. I also agree with Robert's comments that all warnings are about valid C, with -Wall we diagnose what we subjectively feel is dubious coding practice. Not everyone will agree with what -Wall contains, that's not a reason to freeze it. Furthermore I am not impressed by a total of four warnings between binutils and GCC. Those are large packages and not necessarily representative of the average package among a gnu/linux distro. Now if someone does a test and shows that building the world exposes hundreds or even dozens of these warnings, **and** none of them are actual bugs, then I would reevaluate my opinion. Finally, the compromise fallback position should be to move it to -W/-Wextra (like -Wsign-compare in the past). As a dubious coding practice, I don't think we should banish it to only -Wc++-compat. --Kaveh
Re: Should -Wjump-misses-init be in -Wall?
On Mon, Jun 22, 2009 at 04:51:17PM -0700, Kaveh R. GHAZI wrote: > I also agree with Robert's comments that all warnings are about valid C, > with -Wall we diagnose what we subjectively feel is dubious coding > practice. Not everyone will agree with what -Wall contains, that's not a > reason to freeze it. Right, but it's a cost-benefit tradeoff. > Now if someone does a test and shows that building the world exposes > hundreds or even dozens of these warnings, **and** none of them are actual > bugs, then I would reevaluate my opinion. I think that this should be the standard: a warning belongs in -Wall if it tends to expose bugs. If it doesn't, then it's just somebody's idea of proper coding style but with no evidence in support of its correctness. A -Wall warning should expose bugs, and should be easy to silence in correct code.
Re: Should -Wjump-misses-init be in -Wall?
Joe Buck wrote: On Mon, Jun 22, 2009 at 04:51:17PM -0700, Kaveh R. GHAZI wrote: I also agree with Robert's comments that all warnings are about valid C, with -Wall we diagnose what we subjectively feel is dubious coding practice. Not everyone will agree with what -Wall contains, that's not a reason to freeze it. Right, but it's a cost-benefit tradeoff. Now if someone does a test and shows that building the world exposes hundreds or even dozens of these warnings, **and** none of them are actual bugs, then I would reevaluate my opinion. I think that this should be the standard: a warning belongs in -Wall if it tends to expose bugs. If it doesn't, then it's just somebody's idea of proper coding style but with no evidence in support of its correctness. A -Wall warning should expose bugs, and should be easy to silence in correct code. To understand what you are saying, we need to know what bug means, since it can have two meanings: 1. An actual error, that could show up right now in certain circumstances 2. An error reseulting in undefined behavior in the standard, but for the current version of gcc, it cannot actually cause any real misbehavior, but some future version of gcc might take advantage of this error status and do something weird. For me it is enough if warnings expose case 2 situations, even if they find few if any case 1 situations.
Re: Should -Wjump-misses-init be in -Wall?
Gabriel Dos Reis writes: > Historically, many C programmers have resisted that idea (even when we could > argue that it really is bad style programming.) They have?(!) This warning warns about: goto L; { int x = 3; L: ... } but not about: goto L; { int x; L: ... } right? So... is the idea that there might be code that assigns to x after L:, before the first use of x, making the warning superfluous? -Miles -- Twice, adv. Once too often.
Re: Should -Wjump-misses-init be in -Wall?
On Mon, Jun 22, 2009 at 10:43 PM, Miles Bader wrote: > Gabriel Dos Reis writes: >> Historically, many C programmers have resisted that idea (even when we could >> argue that it really is bad style programming.) > > They have?(!) > > This warning warns about: goto L; { int x = 3; L: ... } > but not about: goto L; { int x; L: ... } > right? Yes. > > So... is the idea that there might be code that assigns to x after L:, > before the first use of x, making the warning superfluous? Some of it are like that; some of it are when 'x' is not used after L. Some of it are more convoluted. Please note that I do consider that bad programming style. My 'middle ground' proposal was to enforce warn at -Wall for C99 -- because you're like to find more declarations (and therefore initializations) in the middle of blocks that are jumped over, than you would in C90. And have it at -Wextra for C90 -- and of course it will be part of -Wc++-compat. -- Gaby
Re: Should -Wjump-misses-init be in -Wall?
On Mon, Jun 22, 2009 at 09:45:52PM -0400, Robert Dewar wrote: > Joe Buck wrote: >> I think that this should be the standard: a warning belongs in -Wall if >> it tends to expose bugs. If it doesn't, then it's just somebody's idea >> of proper coding style but with no evidence in support of its correctness. >> >> A -Wall warning should expose bugs, and should be easy to silence in >> correct code. > > To understand what you are saying, we need to know what bug means, since > it can have two meanings: > > 1. An actual error, that could show up right now in certain circumstances > > 2. An error resulting in undefined behavior in the standard, but > for the current version of gcc, it cannot actually cause any real > misbehavior, but some future version of gcc might take advantage > of this error status and do something weird. > > For me it is enough if warnings expose case 2 situations, even if > they find few if any case 1 situations. I agree, but I think this warning should be in -Wc++-compat, not -Wall or even -Wextra. Why? I'd argue the warning is useless for C code, unless you care about C++ style. There were five places in binutils that triggered -Wjump-misses-init warnings. Not one of them was a real bug, even using Robert's case 2 definition. I believe the same is true of the three places in gcc where the warning triggered. So far, no one has generated a C testcase having undefined behaviour where -Wjump-misses-init warns but -Wuninitialized (already in -Wall) doesn't, when optimizing. If such a testcase is found, I'm guessing it probably should be filed as a -Wuninitialized bug. In C, an auto variable initialization is just an assignment. (I'm of course aware that arrays can be initialized and their size set, structs and unions initialized, but by and large, in C, an initialization is simply an assignment.) So, why single out the initial assignment? If skipping it deserves a warning then skipping other assignments deserves a warning too, which would be ridiculous. -- Alan Modra Australia Development Lab, IBM
Re: plugins directory?
Richard Guenther wrote: On Mon, Jun 22, 2009 at 1:20 PM, Basile STARYNKEVITCH wrote: I would prefer some discussion to happen, instead of blindly proposing patches on gcc-patches@ and have them rejected until I find a consensus. Do you think that 1. We should not care about where system wide plugins are located, and leave the -fplugin command as it is. In practice that would mean every plugin use to be wrapped, e.g. by a CFLAGS=-fplugin=$(shell pkg-config -gccplugins gtkgcc) -O in a Makefile 2. We should have a path of GCC plugins, i.e. a sequence of directories where plugins are searched. a. should this path be setable only by a -fplugin-path option b. should this path be also settable thru e.g. an environment variable like GCC_PLUGINS_PATH c. what is the default value of this path Notice that dlopen already uses LD_LIBRARY_PATH, but I feel it is not entirely appropriate for our purposes (because plugins are not libraries, even if they share the .so ...). I think we should reach a consensus quickly: the stage 1 is closing soon, and such a patch has to happen very soon! On the contrary. If we put in place such a mechanism we have to maintain it forever. It is hard to tell what is the correct solution without distributors facing the problem (there are no existing plugins that can be packaged or are packaged now). I am not sure to agree on that argument. My observation is that most of the (free, Linux) programs permitting plugins do define some conventions., which are coded inside the application. (Of course the convention should be configurable). In some of the applications, the convention is provided by wrapping the executable in a shell script (ooffice, firefox). In other applications, the convention is wired in the binary (so is provided by the source file): java, GTK,QT, ... It should be noted that we have just to define a convention (and of course implement it, but the implementation is trivial). And I don't understand why is maintaining such a convention such a big burden. If we don't define or propose anything, I would be afraid that either plugins could be less used, or that various distributions would have various widely incompatible conventions. Regards. -- Basile STARYNKEVITCH http://starynkevitch.net/Basile/ email: basilestarynkevitchnet mobile: +33 6 8501 2359 8, rue de la Faiencerie, 92340 Bourg La Reine, France *** opinions {are only mines, sont seulement les miennes} ***