Re: Use of FLAGS_REGNUM clashes with generates insn
Quoting "Paulo J. Matos" :

> My addition instruction sets all the flags. So I have:

This is annoying, but can be handled. Been there, done that. dse.c needs a small patch, which I intend to submit sometime in the future.

> And all my (define_insn "*mov..." are tagged with a (clobber (reg:CC RCC)). This generates all kinds of trouble since GCC generates moves internally without the clobber that fail to match.

I don't think that can be overcome without cc0. Unless you want to hide your flags register altogether.
Question on _GLIBCXX_HOSTED macro libstdc++ and libsupc++
Hi,

In libstdc++-v3/libsupc++/eh_term_handler.cc, the demangler code is pulled in by default, depending on whether _GLIBCXX_HOSTED is defined. The demangler and the exception terminate handler are really big, especially for an embedded system.

Secondly, _GLIBCXX_HOSTED is now defined if --enable-hosted-libstdcxx is given (which it is by default). This option also controls whether libstdc++.a itself is built for the target system.

So, for an embedded system, how could I provide the earlier "silent death" handler by defining _GLIBCXX_HOSTED, also with libstdc++ built?

Any suggestions? Thanks in advance. FYI, all of the above concerns a cross-toolchain.

-- 
Best Regards.
Re: Question on _GLIBCXX_HOSTED macro libstdc++ and libsupc++
On 23 September 2011 09:14, Amker.Cheng wrote: > Hi, > > In libstdc++-v3/libsupc++/eh_term_handler.cc, it says by default the > demangler things are pulled in, > according to whether _GLIBCXX_HOSTED is defined. the demangler > exception terminating handler > are really big, especially for embedded system. > > Secondly, _GLIBCXX_HOSTED is now defined if --enable-hosted-libstdcxx > is given(by default it is). > This option also controls whether libstdc++.a itself is built for target > system. > > So, for an embedded system, how could I provide the earlier "silent > death" handler by defining _GLIBCXX_HOSTED, > also with libstdc++ built? > > Any suggestion? Thanks in advance. > FYI, all above are talking about cross-toolchain. (Any reason this wasn't sent to the libstdc++ list?) http://gcc.gnu.org/bugzilla/show_bug.cgi?id=43852 proposes a "quiet mode" which would reduce code size by disabling some of the code in eh_term_handler.cc and pure.cc - would that do what you want? I've not had time to do anything about it, but I think Sebastian (CC'd) has a copyright assignment in place now, and he's provided a patch implementing it.
Re: Question on _GLIBCXX_HOSTED macro libstdc++ and libsupc++
> (Any reason this wasn't sent to the libstdc++ list?) > > http://gcc.gnu.org/bugzilla/show_bug.cgi?id=43852 proposes a "quiet > mode" which would reduce code size by disabling some of the code in > eh_term_handler.cc and pure.cc - would that do what you want? > > I've not had time to do anything about it, but I think Sebastian > (CC'd) has a copyright assignment in place now, and he's provided a > patch implementing it. > Sorry for missing the list, cced now. It is exactly what I meant, thanks very much. -- Best Regards.
Re: Volatile qualification on pointer and data
On 22/09/11 22:15, Richard Guenther wrote: Btw, I think this is an old bug that has been resolved. Did you make sure to test a recent 4.6 branch snapshot or svn head? Should have tested git head. Compiling git head now to check the current status of this issue. -- PMatos
Re: Volatile qualification on pointer and data
On 23/09/11 12:33, Paulo J. Matos wrote:
> On 22/09/11 22:15, Richard Guenther wrote:
>> Btw, I think this is an old bug that has been resolved. Did you make sure to test a recent 4.6 branch snapshot or svn head?
>
> Should have tested git head. Compiling git head now to check the current status of this issue.

Git head 36181f98f doesn't compile (x86_64, --enable-checking=all, GCC 4.5.2):

gcc -c -g -fkeep-inline-functions -DIN_GCC -W -Wall -Wwrite-strings -Wcast-qual -Wstrict-prototypes -Wmissing-prototypes -Wmissing-format-attribute -pedantic -Wno-long-long -Wno-variadic-macros -Wno-overlength-strings -Wold-style-definition -Wc++-compat -fno-common -DHAVE_CONFIG_H -I. -I. -I../../../repositories/gcc/gcc -I../../../repositories/gcc/gcc/. -I../../../repositories/gcc/gcc/../include -I../../../repositories/gcc/gcc/../libcpp/include -I../../../repositories/gcc/gcc/../libdecnumber -I../../../repositories/gcc/gcc/../libdecnumber/bid -I../libdecnumber ../../../repositories/gcc/gcc/fold-const.c -o fold-const.o
../../../repositories/gcc/gcc/fold-const.c: In function ‘fold_overflow_warning’:
../../../repositories/gcc/gcc/fold-const.c:326:5: warning: format not a string literal and no format arguments
../../../repositories/gcc/gcc/fold-const.c: In function ‘fold_checksum_tree’:
../../../repositories/gcc/gcc/fold-const.c:13803:3: error: invalid application of ‘sizeof’ to incomplete type ‘struct tree_type’

-- 
PMatos
Re: Use of FLAGS_REGNUM clashes with generates insn
On 23/09/11 08:21, Joern Rennecke wrote:
> Quoting "Paulo J. Matos" :
>> My addition instruction sets all the flags. So I have:
>
> This is annoying, but can be handled. Been there, done that. dse.c needs a small patch, which I intend to submit sometime in the future.

Ok. Actually I was quite happy with my solution too, which avoided having to change the core. However, it was not heavily tested.

>> And all my (define_insn "*mov..." are tagged with a (clobber (reg:CC RCC)). This generates all kinds of trouble since GCC generates moves internally without the clobber that fail to match.
>
> I don't think that can be overcome without cc0. Unless you want to hide your flags register altogether.

That's seriously annoying. The idea was to ditch cc0 and explicitly represent CC in a register to perform optimizations like splitting add and addc for a double-word addition. If hiding my flags register means going back to cc0, then that seems to be the only way to go unless I get it to work somehow. If you have anything else in mind to get it to work, let me know.

What I currently have in mind is a backend macro listing all the moves that clobber CC_REG; whenever GCC generates a move, it queries the macro to find out whether the move requires the clobber and emits it if required. However, I am unsure how deep the rabbit hole goes.

-- 
PMatos
Re: RFC: Improving support for known testsuite failures
On Thu, Sep 22, 2011 at 20:06, Hans-Peter Nilsson wrote: > On Thu, 8 Sep 2011, Diego Novillo wrote: > >> On Thu, Sep 8, 2011 at 04:31, Richard Guenther >> wrote: >> >> > I think it would be more useful to have a script parse gcc-testresults@ >> > postings from the various autotesters and produce a nice webpage >> > with revisions and known FAIL/XPASSes for the target triplets that >> > are tested. >> >> Sure, though that describes a different tool. I'm after a tool that >> will 'exit 0' if the testsuite finished with nominal results. > > Not to stop you from (partly) reinventing the wheel, but that's > pretty much what contrib/regression/btest-gcc.sh already does, > though you have to feed it a baseline a set of processed .sum > files which could (for a calling script or a modified > btest-gcc.sh) live in, say, contrib/target-results/. > It handles "duplicate" test names by marking it as failing if > any of them has failed. Works good enough. Yeah, I actually considered using it by extracting the actual .sum file processing out of it (I was not interested in it running the build nor the tests). However, I also needed to add support for marking flaky tests and putting an expiration date on failures. Additionally, I needed versioned failure manifests, and I could not justify storing in SVN multiple directories with 12Mb worth of .sum files in them. The small manifest file also has the local advantage of serving as release documentation for what we expect to fail and why. Diego.
Re: Volatile qualification on pointer and data
On 22/09/11 22:15, Richard Guenther wrote: Btw, I think this is an old bug that has been resolved. Did you make sure to test a recent 4.6 branch snapshot or svn head? My hopes were high but unfortunately it is not fixed yet. git head 36181f98 still generates the same unexpected code. Cheers, -- PMatos
Re: Use of FLAGS_REGNUM clashes with generates insn
Quoting "Paulo J. Matos" :

> That's seriously annoying. The idea was to ditch cc0 and explicitly represent CC in a register to perform optimizations like splitting add and addc for a double word addition. If by hiding my register flags means going back to cc0, then it seems that the only way to go unless I get it to work somehow. If you having anything else in mind to get it to work let me know.

Hiding the flags register would mean it is not represented in the rtl at all. You can have combined compare-branch instructions. Of course, going that route would mean that the model you present to GCC is even further from the hardware than one that uses cc0.

> What I currently have in mind is to have a backend macro listing all the move for which a move clobber CC_REG, then whenever GCC generates a move, it queries the macro to know if the move requires clobbering and emits the clobber if required. However, I am unsure how deep the rabbit hole goes.

Oh, so you do have variants that can do without the clobber. If you can make all the reloads without introducing explicit flag clobbers, then it should work. But you can't just pull a flag clobber out of thin air. You should have some way to generate valid code when the flags register is unavailable / must be saved. Then you can use peephole2 to add flag clobbers where the flags register is available.

Or you can use machine_dependent_reorg or another machine-specific pass inserted with the pass manager to rewrite clobber-free instructions into ones that have a hardware equivalent; but you must make sure that your data flow remains sound in the process.
Re: RFC: DWARF Extensions for Separate Debug Info Files ("Fission")
On Fri, 23 Sep 2011 02:21:44 +0200, Cary Coutant wrote: > * .debug_pubtypes - Public types for use in building the > .gdb_index section at link time. This section will have an > extended format to allow it to represent both types in the > .debug_dwo_info section and type units in .debug_types. ^^^ = .dwo_info , maybe both .debug_info and .dwo_info > * .dwo_abbrev - Defines the abbreviation codes used by the > .debug_dwo_info section. ^^^ = .dwo_info I find this .dwo_* setup is great for rapid development rebuilds but it should remain optional as the currently used DWARF final separate .debug info file is smaller than all the .dwo files together. In the case of the final linked .debug builds (rpm/deb/...) one does not consider the build speed as important. It probably does not make sense to merge + convert .dwo files back to a single .debug file for the rpm/deb/... build performance reasons. Thanks, Jan
Re: Use of FLAGS_REGNUM clashes with generates insn
On Fri, 23 Sep 2011 09:30:48 -0400, amylaar wrote: > Hiding the flags register would mean it is not represented in the rtl at > all. You can have combined compare-branch instructions. Of course, > going that route would mean that the model you present to GCC is even > further from the hardware than one that uses cc0. > Got it! That seems that it would go against the whole point of replacing cc0 for CC_REGNUM in my specific case. Oh well... >> What I currently have in mind is to have a backend macro listing all >> the move for which a move clobber CC_REG, then whenever GCC generates a >> move, it queries the macro to know if the move requires clobbering and >> emits the clobber if required. However, I am unsure how deep the rabbit >> hole goes. > > Oh, so you do have variants that can do without the clobber. Actually I don't... My explanation was supposed to be referring to a general solution. In my case, the macro would list all moves since all moves clobber CC. > If you can > make all the reloads without introducing explicit flag clobbers, that it > should work. Unfortunately I can't. > But you can't just pull a flag clobber out of thin air. Understood. > You should have > some way to generate valid code when the flags register is unavailable / > must be saved. Then you can use peephole2 to add flag clobbers where > the flags register is available. > > Or you can use machine_dependent_reorg or another machine-specific pass > inserted with the pass manager to rewrite clobber-free instructions into > ones that have a hardware equivalent; but you must make sure that your > data flow remains sound in the process. I think your last suggestion of having a pass to rewrite the clobber free instructions into one with a hardware equivalent seems the one to go for me. Thanks for the suggestions, -- PMatos
misbehaviour with md5_process_bytes and maybe in optimization
Hello,

I recently asked for some help because I ran into a problem when using md5_process_bytes (in libiberty/md5.c):
http://gcc.gnu.org/ml/gcc-help/2011-09/msg00126.html,
http://gcc.gnu.org/ml/gcc-help/2011-09/msg00127.html
and it appears that there is a bug in md5_process_bytes. The bug can lead to a miscomputed md5 result.

It took me some time to make the bug reproducible, but I was finally able to do so. It only appears in a very particular situation. I have written a small gcc plugin that reproduces it (see attachment).

The bad news is that the bug only appears when using libiberty compiled with -g -O0 (it works well with -O2). That is quite sad, because it could mean another bug in an optimization function.

I have attached a README which details how to use the plugin and explains the bug. I have tried to explain it as well as possible (and I apologize for my very bad English).

The bug appears when:
1) We use libiberty compiled with -O0.
2) We first call md5_process_bytes with a buffer of less than 64 bytes (we call its size len1).
3) We make a new call to md5_process_bytes with a buffer of size len2 such that:
   len2 > 127 + 65 (so the test on line 228 of md5.c will be true)
   (128 - len1) % __alignof__ (md5_uint32) != 0 (so the condition on line 238 is true)
   (len2 - (128 - len1)) % 64 == 0 (so the loop on line 239 exits with len = 64; this leads to the bug because, on line 249, (len & ~63) = 64 and we shift the buffer without processing the data).

Please, can you reproduce the bug? Is there any useful information I can add? Must I contact somebody from libiberty (I don't know the status of this library; is it part of gcc or of another project?)

I already sent a patch correcting this issue (it does not address the fact that we don't get the bug with an optimised libiberty):
http://gcc.gnu.org/ml/gcc-patches/2011-09/msg01098.html.
It has not been reviewed; could someone review it?

Thanks!
Pierre Vittet

Attachment: md5sum_plugin.tar.gz (application/gzip)
Re: misbehaviour with md5_process_bytes and maybe in optimization
Pierre Vittet writes: > The bug appears when: > 1) We use libiberty compiled with -O0 > 2) We first call md5_process_bytes with a less than 64 bits buffer (we > call his size len1). > 3) We make a new call of md5_process_bytes with a buffer which has a > size len2 such as: > len2 > 127 + 65 (so test in line 228 of md5.C will be true) > 128 -len1 != Mulint with Mulint % __alignof__ (md5_uint32) != 0 (so > condition on line 238 is true) > len2 - (128 - len1) = Mul64 and Mul64 such as Mul %64=0 (so the loop of > line 239 is broken with len = 64, this leads to the bug as, line 249, > (len & ~63) = 64 and we shift the buffer without processing the data). The line numbers you mention do not correspond to any version of libiberty/md5.c that I can see. Can you list the exact line for each line number you mention, so that your explanation is easier to follow? Thanks. Ian
Re: passing arguments to gcc build in eclipse
OK, sorry. Thanks for replying!

Andrew Haley wrote:
>
> On 09/16/2011 11:30 AM, pankajsejwal wrote:
>>
>> I have build gcc and imported it on eclipse and started to debug it from main
>> but after a few steps it stops and sends "malloc.c" not found error and asks
>> to give a source path to it.
>> I believe the problem is because of the arguments that it requires to
>> proceed for example "" as gcc takes some arguments to work on in
>> terminal.
>>
>> Can someone please tell me the error I am facing and it i am correct can u
>> tell me how to pass arguments to the built code that it can recognize it as
>> a .C file.
>
> This is not an appropriate message for gcc@gcc.gnu.org, which is
> only about the development of gcc itself.
>
> Most of us don't use Eclipse. I think you'd be much better advised
> to direct this to an Eclipse-specific list, where the experts will be.
>
> Andrew.
Wrong documentation of TARGET_ADDR_SPACE_SUBSET_P
Hi,

I notice that the following description differs from how spu & m32c use the hook.

In the internals manual:

[Target Hook] bool TARGET_ADDR_SPACE_SUBSET_P (addr_space_t superset, addr_space_t subset)
Define this to return whether the subset named address space is contained within the superset named address space. Pointers to a named address space that is a subset of another named address space will be converted automatically without a cast if used together in arithmetic operations. Pointers to a superset address space can be converted to pointers to a subset address space via explicit casts.

In the spu & m32c ports:

m32c_addr_space_subset_p (addr_space_t subset, addr_space_t superset)
spu_addr_space_subset_p (addr_space_t subset, addr_space_t superset)

I believe the documentation is wrong: the first argument is the subset and the second one is the superset. Should I submit a patch?

Cheers,
Bingfeng Mei
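For illustration, a minimal sketch of the (subset, superset) argument order that the two port functions quoted above use. Everything here is an assumption made for the example: the typedef is a stand-in for GCC's real addr_space_t, and the two address spaces and the demo_ function names are hypothetical, not taken from the spu or m32c ports.

```c
#include <assert.h>
#include <stdbool.h>

/* Stand-in for GCC's addr_space_t; the real type comes from GCC's headers. */
typedef unsigned char addr_space_t;

/* Hypothetical address spaces of an imaginary target. */
enum {
    DEMO_ADDR_SPACE_GENERIC = 0, /* small default space */
    DEMO_ADDR_SPACE_FAR = 1      /* larger space containing the default one */
};

/* First argument: candidate subset.  Second argument: candidate superset.
   This mirrors the parameter order of the port functions quoted above. */
bool demo_addr_space_subset_p(addr_space_t subset, addr_space_t superset)
{
    if (subset == superset)
        return true; /* every space is a subset of itself */
    return subset == DEMO_ADDR_SPACE_GENERIC && superset == DEMO_ADDR_SPACE_FAR;
}

/* Small usage check: swapping the arguments changes the answer. */
void demo_addr_space_self_check(void)
{
    assert(demo_addr_space_subset_p(DEMO_ADDR_SPACE_GENERIC, DEMO_ADDR_SPACE_FAR));
    assert(!demo_addr_space_subset_p(DEMO_ADDR_SPACE_FAR, DEMO_ADDR_SPACE_GENERIC));
}
```

With an asymmetric relation like this one, a caller that follows the documented (superset, subset) order would get the opposite answer from an implementation written in the ports' order, which is why the mismatch matters.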
Re: RFC: DWARF Extensions for Separate Debug Info Files ("Fission")
>> * .debug_pubtypes - Public types for use in building the >> .gdb_index section at link time. This section will have an >> extended format to allow it to represent both types in the >> .debug_dwo_info section and type units in .debug_types. > ^^^ > = .dwo_info , maybe both .debug_info and .dwo_info > > >> * .dwo_abbrev - Defines the abbreviation codes used by the >> .debug_dwo_info section. > ^^^ > = .dwo_info Thanks, I've fixed the wiki page. > I find this .dwo_* setup is great for rapid development rebuilds but it should > remain optional as the currently used DWARF final separate .debug info file is > smaller than all the .dwo files together. In the case of the final linked > .debug builds (rpm/deb/...) one does not consider the build speed as > important. > It probably does not make sense to merge + convert .dwo files back to a single > .debug file for the rpm/deb/... build performance reasons. Yes, we'll definitely make this a compile-time option. While I haven't finished designing the package format for collecting all the .dwo files, I do plan on having the packaging tool do at least duplicate type elimination to reduce the size of the package file. -cary
Re: RFC: DWARF Extensions for Separate Debug Info Files ("Fission")
> The Apple approach has both the features of the Sun/HP implementation as well > as the ability to create a standalone debug info file. Thanks for the clarifications. I based my comments on a description you sent me a couple of years ago, and I apologize for any oversimplifications I introduced. > The compiler puts DWARF in the .o file, the linker adds some records in the > executable which help us to understand where files/function/symbols landed in > the final executable[1]. Did you intend to add a footnote? > If the user runs our gdb or lldb on one of these binaries, the debugger will > read the DWARF directly out of the .o files on the fly. Because the linker > doesn't need to copy around/update/modify the DWARF, link times are very > fast. If the developer decides to debug the program, no extra steps are > required - the debugger can be started up & used with the debug info still in > the .o files. We're trying to achieve something very similar, but we have the additional goal of separating the info from the .o files because of our distributed build environment. I also wanted to attempt to standardize the approach, instead of having each vendor go in separate directions. Thanks, -cary
Re: misbehaviour with md5_process_bytes and maybe in optimization
Thanks for your interest,

I just checked revision 179127 of GCC. The last revision of the file is 177700; it has not been changed for 6 weeks.

My file is the same as this one:
http://gcc.gnu.org/viewcvs/trunk/libiberty/md5.c?revision=177700&view=markup

In libiberty/md5.c, the function md5_process_bytes starts at line 203.

On 23/09/2011 17:13, Ian Lance Taylor wrote:
> Pierre Vittet writes:
>
>> The bug appears when:
>> 1) We use libiberty compiled with -O0
>> 2) We first call md5_process_bytes with a less than 64 bits buffer (we
>> call his size len1).
>> 3) We make a new call of md5_process_bytes with a buffer which has a
>> size len2 such as:
>> len2 > 127 + 65 (so test in line 228 of md5.C will be true)

Line 228 is the following: if (len > 64)

>> 128 -len1 != Mulint with Mulint % __alignof__ (md5_uint32) != 0 (so
>> condition on line 238 is true)

Line 238 is the following: if (UNALIGNED_P (buffer))

>> len2 - (128 - len1) = Mul64 and Mul64 such as Mul %64=0 (so the loop of
>> line 239 is broken with len = 64, this leads to the bug as, line 249,
>> (len & ~63) = 64 and we shift the buffer without processing the data).

Line 239 is the following: while (len > 64)
Line 249: buffer = (const void *) ((const char *) buffer + (len & ~63));

> The line numbers you mention do not correspond to any version of
> libiberty/md5.c that I can see. Can you list the exact line for each
> line number you mention, so that your explanation is easier to follow?
> Thanks.

I give about the same explanation in the README (which is in the archive attached to my previous mail), but there I do not use line numbers; I quote the code directly. It might be easier to try the plugin with gdb, but that requires compiling libiberty.a with -O0.

>
> Ian
>
Re: misbehaviour with md5_process_bytes and maybe in optimization
Pierre Vittet writes:

> Thanks for your interest,
>
> I just checked revision 179127 of GCC. Last revision is 177700, it has
> not been change for 6 weeks.
>
> My file is the same as this one:
> http://gcc.gnu.org/viewcvs/trunk/libiberty/md5.c?revision=177700&view=markup
>
> in libiberty/md5.c, function md5_process_bytes start line 203.
>
> On 23/09/2011 17:13, Ian Lance Taylor wrote:
>> Pierre Vittet writes:
>>
>>> The bug appears when:
>>> 1) We use libiberty compiled with -O0
>>> 2) We first call md5_process_bytes with a less than 64 bits buffer (we
>>> call his size len1).
>>> 3) We make a new call of md5_process_bytes with a buffer which has a
>>> size len2 such as:
>>> len2 > 127 + 65 (so test in line 228 of md5.C will be true)
> line 228 is the following:if (len > 64)
>>> 128 -len1 != Mulint with Mulint % __alignof__ (md5_uint32) != 0 (so
>>> condition on line 238 is true)
> line 238 is the following: if (UNALIGNED_P (buffer))
>>> len2 - (128 - len1) = Mul64 and Mul64 such as Mul %64=0 (so the loop of
>>> line 239 is broken with len = 64, this leads to the bug as, line 249,
>>> (len & ~63) = 64 and we shift the buffer without processing the data).
>
> line 239 is the following: while (len > 64)
> line 249: buffer = (const void *) ((const char *) buffer + (len & ~63));
>>
>> The line numbers you mention do not correspond to any version of
>> libiberty/md5.c that I can see. Can you list the exact line for each
>> line number you mention, so that your explanation is easier to follow?
>> Thanks.
>
> I give about the same explanation in the README (which is in the
> attached archive of my previous mail) but I does not use line number but
> direct quote of the code. It mights be more easy to try the plugin with
> gdb but it needs to compile libiberty.a with -O0.

Thanks, I think I have it sorted out now. It does not happen on x86 glibc-based systems at -O2 because at -O2 the glibc headers #define _STRING_ARCH_unaligned, so the problematic code is not compiled or executed.
The error was introduced by this change:

2005-07-03  Steve Ellcey

        PR other/13906
        * md5.c (md5_process_bytes): Check alignment.

Thanks for noticing this problem, analyzing it, and reporting it. I committed this patch to mainline to fix the problem. Bootstrapped on x86_64-unknown-linux-gnu.

Ian

2011-09-23  Ian Lance Taylor

        * md5.c (md5_process_bytes): Correct handling of unaligned buffer.

Index: md5.c
===================================================================
--- md5.c (revision 179127)
+++ md5.c (working copy)
@@ -1,6 +1,6 @@
 /* md5.c - Functions to compute MD5 message digest of files or memory blocks
    according to the definition of MD5 in RFC 1321 from April 1992.
-   Copyright (C) 1995, 1996 Free Software Foundation, Inc.
+   Copyright (C) 1995, 1996, 2011 Free Software Foundation, Inc.

    NOTE: This source is derived from an old version taken from the GNU C
    Library (glibc).
@@ -245,9 +245,11 @@ md5_process_bytes (const void *buffer, s
         }
       else
 #endif
-      md5_process_block (buffer, len & ~63, ctx);
-      buffer = (const void *) ((const char *) buffer + (len & ~63));
-      len &= 63;
+        {
+          md5_process_block (buffer, len & ~63, ctx);
+          buffer = (const void *) ((const char *) buffer + (len & ~63));
+          len &= 63;
+        }
     }

   /* Move remaining bytes in internal buffer. */
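The effect of the missing braces can be sketched with a small byte-counting model. This is only a model under stated assumptions, not the libiberty code: the function names are hypothetical, no hashing is done, and the `unaligned` flag stands in for the UNALIGNED_P (buffer) test. Each function returns how many of the `len` input bytes are accounted for, either hashed as full 64-byte blocks or left as a tail to be copied into the internal buffer.

```c
#include <assert.h>
#include <stddef.h>

/* Control flow before the patch: the final two statements of the
   "if (len > 64)" body run after BOTH branches. */
size_t md5_model_before_patch(size_t len, int unaligned)
{
    size_t hashed = 0;
    if (len > 64) {
        if (unaligned) {
            while (len > 64) { /* exits with 1 <= len <= 64 */
                hashed += 64;
                len -= 64;
            }
        } else {
            hashed += len & ~(size_t)63;
        }
        /* buffer += len & ~63;  when len == 64 this skips 64 unhashed bytes */
        len &= 63;
    }
    return hashed + len;
}

/* Control flow after the patch: the pointer and length updates belong
   to the else branch only. */
size_t md5_model_after_patch(size_t len, int unaligned)
{
    size_t hashed = 0;
    if (len > 64) {
        if (unaligned) {
            while (len > 64) {
                hashed += 64;
                len -= 64;
            }
        } else {
            hashed += len & ~(size_t)63;
            len &= 63;
        }
    }
    return hashed + len;
}

/* Usage check: an unaligned buffer whose length is a multiple of 64. */
void md5_model_self_check(void)
{
    assert(md5_model_before_patch(128, 1) == 64);  /* 64 bytes dropped */
    assert(md5_model_after_patch(128, 1) == 128);  /* nothing dropped */
}
```

In the model, bytes are lost only when the buffer is unaligned and the remaining length is a multiple of 64, which matches the conditions described in the report; aligned input is handled identically by both versions.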
Re: Incorrect optimized (-O2) linked list code with 4.3.2
On Mon, Sep 12, 2011 at 10:10 AM, pavan tc wrote:
> Hi,
>
> I would like to know if there have been issues with optimized linked
> list code with GCC 4.3.2. [optimization flag : -O2]
>
> The following is the inlined code that has the problem:
>
> static inline void
> list_add_tail (struct list_head *new, struct list_head *head)
> {
>         new->next = head;
>         new->prev = head->prev;
>
>         new->prev->next = new;
>         new->next->prev = new;
> }
>
> The above code has been used in the loop as below:
>
>         pool = GF_CALLOC (count, padded_sizeof_type, gf_common_mt_long);
>         if (!pool) {
>                 GF_FREE (mem_pool);
>                 return NULL;
>         }
>
>         for (i = 0; i < count; i++) {
>                 list = pool + (i * (padded_sizeof_type));
>                 INIT_LIST_HEAD (list);
>                 list_add_tail (list, &mem_pool->list);
>
>         }
>
> '&mem_pool-> list' is used as the list head. mem_pool is a pointer to type :
> struct mem_pool {
>         struct list_head list;
>         int hot_count;
>         int cold_count;
>         gf_lock_t lock;
>         unsigned long padded_sizeof_type;
>         void *pool;
>         void *pool_end;
>         int real_sizeof_type;
> };
>
> 'list' is the new member being added to the tail of the list pointed to by
> head.
> It is a pointer to type:
> struct list_head {
>         struct list_head *next;
>         struct list_head *prev;
> };
>
> The generated assembly for the loop (with the inlined list_add_tail())
> is as below:
>
> 40a1c: e8 0f 03 fd ff      callq 10d30 <__gf_calloc@plt>
> 40a21: 48 85 c0            test %rax,%rax
> 40a24: 48 89 c7            mov %rax,%rdi
> 40a27: 0f 84 bf 00 00 00   je 40aec
> 40a2d: 48 8b 73 08         mov 0x8(%rbx),%rsi
> 40a31: 4d 8d 44 24 01      lea 0x1(%r12),%r8
> 40a36: 31 c0               xor %eax,%eax
> 40a38: b9 01 00 00 00      mov $0x1,%ecx
> 40a3d: 0f 1f 00            nopl (%rax)
> 40a40: 49 0f af c5         imul %r13,%rax          <=== loop start
> 40a44: 48 8d 04 07         lea (%rdi,%rax,1),%rax
> 40a48: 48 89 18            mov %rbx,(%rax)         # list->next = head
> 40a4b: 48 89 06            mov %rax,(%rsi)         # head->prev->next = list
> 40a4e: 48 8b 10            mov (%rax),%rdx         # rdx holds list->next
> 40a51: 48 89 70 08         mov %rsi,0x8(%rax)      # list->prev = head->prev;
> 40a55: 48 89 42 08         mov %rax,0x8(%rdx)      # list->next->prev = list
> 40a59: 48 89 c8            mov %rcx,%rax
> 40a5c: 48 83 c1 01         add $0x1,%rcx
> 40a60: 4c 39 c1            cmp %r8,%rcx
> 40a63: 75 db               jne 40a40
>
> In the assembly above, %rbx holds the address of 'head'.
> %rsi holds the value of head->prev. This is assigned outside the loop and the
> compiler classifies it as a loop invariant, which is where, I think,
> the problem is.
> This line of code should have been inside the loop.
> - %rsi still holds the value of head->prev that was assigned
> outside the loop.
>
> The following experiments eliminate the problem:
>
> 1. Using 'volatile' on the address that 'head' points to.
> 2. Using a function call (logging calls, for example) inside the loop.
> 3. Using the direct libc calloc instead of the GF_CALLOC.
> [GF_CALLOC does some accounting when accounting is enabled. Calls vanilla
> libc calloc() otherwise].
>
> So, anything that necessitates a different usage of %rsi seems to be
> correcting the behaviour.
>
> 4. Using gcc 4.4.3 [ The obvious solution would then be to use 4.4.3,
> but I would like to understand if this is a known problem with 4.3.2.
> Small programs written to emulate this problem do not exhibit the
> erroneous behaviour.]
>
> Please let me know if any more details about this behaviour are required.
> I'll be glad to provide them.

Use -fno-strict-aliasing. Your code invokes undefined behavior.

> TIA,
> Pavan
>
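As a point of comparison, here is a minimal sketch of what the loop is expected to produce when every access goes through a struct list_head lvalue. It does not try to pinpoint the undefined behavior in the original code, which depends on declarations not shown in the message. The initializer is assumed to behave like the Linux-style INIT_LIST_HEAD (an empty list points at itself), and the demo_ function name is hypothetical.

```c
#include <assert.h>
#include <stddef.h>

struct list_head {
    struct list_head *next;
    struct list_head *prev;
};

/* Assumed Linux-style initializer: an empty list points at itself. */
static void init_list_head(struct list_head *head)
{
    head->next = head;
    head->prev = head;
}

static void list_add_tail(struct list_head *node, struct list_head *head)
{
    node->next = head;
    node->prev = head->prev;  /* must see the tail stored by the previous call */
    node->prev->next = node;
    node->next->prev = node;  /* stores the new tail into head->prev */
}

/* Appends nodes[0..count-1] to head, then walks the list.
   Returns 1 if the nodes appear in insertion order and the list is closed. */
int demo_list_order_ok(struct list_head *head, struct list_head *nodes, size_t count)
{
    struct list_head *p;
    size_t i;

    init_list_head(head);
    for (i = 0; i < count; i++) {
        init_list_head(&nodes[i]);
        list_add_tail(&nodes[i], head);
    }

    p = head->next;
    for (i = 0; i < count; i++) {
        if (p != &nodes[i])
            return 0;
        p = p->next;
    }
    return p == head && (count == 0 || head->prev == &nodes[count - 1]);
}
```

Each call writes head->prev, so the value read at the start of the next iteration differs from the one read before the loop. That is the dependency the quoted assembly loses by keeping head->prev in %rsi for the whole loop.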
Re: A case that PRE optimization hurts performance
On Fri, Sep 16, 2011 at 4:00 AM, Jiangning Liu wrote:
> Hi Richard,
>
> I slightly changed the case to be like below,
>
> int f(char *t) {
>     int s=0;
>
>     while (*t && s != 1) {
>         switch (s) {
>         case 0: /* path 1 */
>             s = 2;
>             break;
>         case 2: /* path 2 */
>             s = 3; /* changed */
>             break;
>         default: /* path 3 */
>             if (*t == '-')
>                 s = 2;
>             break;
>         }
>         t++;
>     }
>
>     return s;
> }
>
> "-O2" is still worse than "-O2 -fno-tree-pre".
>
> "-O2 -fno-tree-pre" result is
>
> f:
>         pushl %ebp
>         xorl %eax, %eax
>         movl %esp, %ebp
>         movl 8(%ebp), %edx
>         movzbl (%edx), %ecx
>         jmp .L14
>         .p2align 4,,7
>         .p2align 3
> .L5:
>         movl $2, %eax
> .L7:
>         addl $1, %edx
>         cmpl $1, %eax
>         movzbl (%edx), %ecx
>         je .L3
> .L14:
>         testb %cl, %cl
>         je .L3
>         testl %eax, %eax
>         je .L5
>         cmpl $2, %eax
>         .p2align 4,,5
>         je .L17
>         cmpb $45, %cl
>         .p2align 4,,5
>         je .L5
>         addl $1, %edx
>         cmpl $1, %eax
>         movzbl (%edx), %ecx
>         jne .L14
>         .p2align 4,,7
>         .p2align 3
> .L3:
>         popl %ebp
>         .p2align 4,,2
>         ret
>         .p2align 4,,7
>         .p2align 3
> .L17:
>         movb $3, %al
>         .p2align 4,,3
>         jmp .L7
>
> While "-O2" result is
>
> f:
>         pushl %ebp
>         xorl %eax, %eax
>         movl %esp, %ebp
>         movl 8(%ebp), %edx
>         pushl %ebx
>         movzbl (%edx), %ecx
>         jmp .L14
>         .p2align 4,,7
>         .p2align 3
> .L5:
>         movl $1, %ebx
>         movl $2, %eax
> .L7:
>         addl $1, %edx
>         testb %bl, %bl
>         movzbl (%edx), %ecx
>         je .L3
> .L14:
>         testb %cl, %cl
>         je .L3
>         testl %eax, %eax
>         je .L5
>         cmpl $2, %eax
>         .p2align 4,,5
>         je .L16
>         cmpb $45, %cl
>         .p2align 4,,5
>         je .L5
>         cmpl $1, %eax
>         setne %bl
>         addl $1, %edx
>         testb %bl, %bl
>         movzbl (%edx), %ecx
>         jne .L14
>         .p2align 4,,7
>         .p2align 3
> .L3:
>         popl %ebx
>         popl %ebp
>         ret
>         .p2align 4,,7
>         .p2align 3
> .L16:
>         movl $1, %ebx
>         movb $3, %al
>         jmp .L7
>
> You may notice that register ebx is introduced, and some more instructions
> around ebx are generated as well. i.e.
>
>         setne %bl
>         testb %bl, %bl
>
> I agree with you that in theory PRE does the right thing to minimize the
> computation cost on gimple level. However, the problem is the cost of
> converting comparison result to a bool value is not considered, so it
> actually makes binary code worse. For this case, as I summarized below, to
> complete the same functionality "With PRE" is worse than "Without PRE" for
> all three paths,
>
> * Without PRE,
>
> Path1:
>         movl $2, %eax
>         cmpl $1, %eax
>         je .L3
>
> Path2:
>         movb $3, %al
>         cmpl $1, %eax
>         je .L3
>
> Path3:
>         cmpl $1, %eax
>         jne .L14
>
> * With PRE,
>
> Path1:
>         movl $1, %ebx
>         movl $2, %eax
>         testb %bl, %bl
>         je .L3
>
> Path2:
>         movl $1, %ebx
>         movb $3, %al
>         testb %bl, %bl
>         je .L3
>
> Path3:
>         cmpl $1, %eax
>         setne %bl
>         testb %bl, %bl
>         jne .L14
>
> Do you have any more thoughts?

It seems to me that with PRE all the testb %bl, %bl should be evaluated at compile-time considering the preceding movl $1, %ebx. Am I missing something?

Richard.

> Thanks,
> -Jiangning
>
>> -Original Message-
>> From: Richard Guenther [mailto:richard.guent...@gmail.com]
>> Sent: Tuesday, August 02, 2011 5:23 PM
>> To: Jiangning Liu
>> Cc: gcc@gcc.gnu.org
>> Subject: Re: A case that PRE optimization hurts performance
>>
>> On Tue, Aug 2, 2011 at 4:37 AM, Jiangning Liu wrote:
>> > Hi,
>> >
>> > For the following simple test case, PRE optimization hoists computation
>> > (s!=1) into the default branch of the switch statement, and finally causes
>> > very poor code generation. This problem occurs in both X86 and ARM, and I
>> > believe it is also a problem for other targets.
>> >
>> > int f(char *t) {
>> >     int s=0;
>> >
>> >     while (*t && s != 1) {
>> >         switch (s) {
>> >         case 0:
>> >             s = 2;
>> >             break;
>> >         case 2:
>> >             s = 1;
>> >             break;
>> >         default:
>> >             if (*t == '-')
>> >                 s = 1;
>> >             break;
>> >         }
>> >         t++;
>> >     }
>> >
>> >     return s;
>> > }
>> >
Re: RFC: DWARF Extensions for Separate Debug Info Files ("Fission")
On Sep 23, 2011, at 10:58 AM, Cary Coutant wrote:

>> The compiler puts DWARF in the .o file, the linker adds some records in the
>> executable which help us to understand where files/function/symbols landed
>> in the final executable[1].
>
> Did you intend to add a footnote?

Yeah, I realized after I sent the email - it didn't seem interesting enough to warrant a separate followup.

The records that our linker puts in the executable are in the form of stabs entries. There are a handful of stabs records created - file start, file end, function start, function end, symbol, pointer to a .o file, maybe one or two others. We chose that format because it was trivial to support and we already had tools for stripping these records out of the executable once the dSYM had been created.

Once a dSYM has been created with all of the DWARF collected in a single file, our DWARF is parseable by any debug info consumer with minimal changes -- they need to know to look in a separate file for the DWARF from the main executable, but the format itself is unchanged. Supporting the debug-information-in-.o-files is more involved; I don't know if any of the third-party debuggers on our platform work with it.

> We're trying to achieve something very similar, but we have the
> additional goal of separating the info from the .o files because of
> our distributed build environment. I also wanted to attempt to
> standardize the approach, instead of having each vendor go in separate
> directions.

Yeah, if your regular build environment involves distributed compilation, and the .o files need to be copied to a central system for the linker, then I can see why you're pursuing this approach. For us, the most common usage is single-computer compilation & linking -- where the linker never pages in the debug info sections from the .o files so their size is not particularly important.

J
Re: [Dwarf-Discuss] RFC: DWARF Extensions for Separate Debug Info Files ("Fission")
Hi Jason,

Jason Molenda wrote:
> On Sep 23, 2011, at 10:58 AM, Cary Coutant wrote:
>
>>> The compiler puts DWARF in the .o file, the linker adds some records in the
>>> executable which help us to understand where files/function/symbols landed
>>> in the final executable[1].
>> Did you intend to add a footnote?
>
> Yeah, I realized after I sent the email - it didn't seem interesting enough
> to warrant a separate followup.
>
> The records that our linker puts in the executable are in the form of stabs
> entries. There are a handful of stabs records created - file start, file
> end, function start, function end, symbol, pointer to a .o file, maybe one or
> two others. We chose that format because it was trivial to support and we
> already had tools for stripping these records out of the executable once the
> dSYM had been created.

I don't remember the exact details, but the problem I recall with the Darwin scheme is that it builds an incomplete index in the Mach-O symbol table. IIRC, it was missing things that a user might want to lookup by-name in the debugger, like static functions or variables, and type names with external linkage. Without a reasonably complete index, the debugger can't know where to find the definitions of certain things, and that forces the user to navigate using other information, like source file name or global function definitions to force the debug information in the object to be read.

Of course, the current DWARF indexes (like pubnames/pubtypes) have the same problem, and some compilers do a really bad job at generating those sections. But at least when there's a single .debug_info section, the debugger can decide to ignore the indexes and "skim" the full debug information. The compilers on IRIX did a better job at generating indexes, so the debugger could find by-name static functions/objects.

> Once a dSYM has been created with all of the DWARF collected in a single
> file, our DWARF is parseable by any debug info consumer with minimal changes
> -- they need to know to look in a separate file for the DWARF from the main
> executable, but the format itself is unchanged. Supporting the
> debug-information-in-.o-files is more involved, I don't know if any of the
> third-party debuggers on our platform work with it.

TotalView supports debug information in .o files on Darwin, and has since day one. Perhaps you recall all those email exchanges you and I had several years back. It was a modest amount of work, given that we already supported debug information in .o files on the Sun and HP platforms.

I seem to recall one of the sore spots for us on Darwin was getting good address information for certain DWARF location operations, like DW_OP_addr. Fortran was particularly messy because some compilers didn't supply a linkage name attribute, so the debugger had to make several guesses at the name, and look things up by trial and error.

Cheers, John D.

>> We're trying to achieve something very similar, but we have the
>> additional goal of separating the info from the .o files because of
>> our distributed build environment. I also wanted to attempt to
>> standardize the approach, instead of having each vendor go in separate
>> directions.
>
> Yeah, if your regular build environment involves distributed compilation, and
> the .o files need to be copied to a central system for the linker, then I can
> see why you're pursuing this approach. For us, the most common usage is
> single-computer compilation & linking -- where the linker never pages in the
> debug info sections from the .o files so their size is not particular
> important.
>
> J
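The trial-and-error lookup John describes for Fortran can be pictured as generating a short list of candidate linker-level names from the source name and trying each in turn. The sketch below is only an illustration under assumptions: `fortran_name_candidates` and `MAX_NAME` are hypothetical names, and the four decorations (one trailing underscore, two trailing underscores, plain lowercase, plain uppercase) are conventions commonly seen across Fortran compilers, not a record of what TotalView actually tried.

```c
#include <ctype.h>
#include <stdio.h>
#include <string.h>

#define MAX_NAME 64   /* hypothetical upper bound on a candidate's length */

/* Fill out[0..3] with guesses at the linker-level name of a Fortran
   routine whose source-level name is source_name, most likely guess
   first. Returns the number of candidates written (always 4). */
int fortran_name_candidates(const char *source_name, char out[4][MAX_NAME])
{
    char lower[MAX_NAME - 2];   /* leaves room for a "__" suffix */
    char upper[MAX_NAME - 2];
    size_t i, n = strlen(source_name);

    if (n >= sizeof lower)      /* truncate over-long names */
        n = sizeof lower - 1;
    for (i = 0; i < n; i++) {
        lower[i] = (char)tolower((unsigned char)source_name[i]);
        upper[i] = (char)toupper((unsigned char)source_name[i]);
    }
    lower[n] = '\0';
    upper[n] = '\0';

    snprintf(out[0], MAX_NAME, "%s_", lower);    /* lowercase + "_"  */
    snprintf(out[1], MAX_NAME, "%s__", lower);   /* lowercase + "__" */
    snprintf(out[2], MAX_NAME, "%s", lower);     /* lowercase only   */
    snprintf(out[3], MAX_NAME, "%s", upper);     /* uppercase only   */
    return 4;
}
```

A debugger would then probe its symbol table with each candidate until one resolves. When the compiler does emit a linkage name attribute, none of this guessing is needed.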
AIX libstdc++ missing symbols
My latest bootstrap of GCC on AIX failed due to missing symbols in libstdc++ expected by libgmpxx:

exec(): 0509-036 Cannot load program /tmp/20110922/./gcc/cc1plus because of the following errors:
        0509-130 Symbol resolution failed for /usr/gnu/lib/libgmpxx.a(libgmpxx.so.4) because:
        0509-136   Symbol _ZNSt6localeD1Ev (number 4) is not exported from dependent module /tmp/20110922/powerpc-ibm-aix5.3.0.0/libstdc++-v3/src/.libs/libstdc++.a(libstdc++.so.6).
        0509-136   Symbol _ZNSt6localeC1ERKS_ (number 6) is not exported from dependent module /tmp/20110922/powerpc-ibm-aix5.3.0.0/libstdc++-v3/src/.libs/libstdc++.a(libstdc++.so.6).
        0509-136   Symbol _ZNSt8ios_base4InitD1Ev (number 10) is not exported from dependent module /tmp/20110922/powerpc-ibm-aix5.3.0.0/libstdc++-v3/src/.libs/libstdc++.a(libstdc++.so.6).
        0509-136   Symbol _ZNSt8ios_base4InitC1Ev (number ) is not exported from dependent module /tmp/20110922/powerpc-ibm-aix5.3.0.0/libstdc++-v3/src/.libs/libstdc++.a(libstdc++.so.6).

Any idea what has changed and why those symbols no longer are exported by libstdc++?
This seems like a libstdc++ ABI change if they really disappeared. Thanks, David
Re: AIX libstdc++ missing symbols
On 09/24/2011 12:23 AM, David Edelsohn wrote:
> My latest bootstrap of GCC on AIX failed due to missing symbols in
> libstdc++ expected by libgmpxx:

On x86_64-linux both are still exported, and for sure nobody worked on the code itself. I would say it's a compiler issue.

Paolo.
gcc-4.6-20110923 is now available
Snapshot gcc-4.6-20110923 is now available on
  ftp://gcc.gnu.org/pub/gcc/snapshots/4.6-20110923/
and on various mirrors, see http://gcc.gnu.org/mirrors.html for details.

This snapshot has been generated from the GCC 4.6 SVN branch
with the following options: svn://gcc.gnu.org/svn/gcc/branches/gcc-4_6-branch revision 179133

You'll find:

 gcc-4.6-20110923.tar.bz2             Complete GCC

  MD5=85f2513ed81259e02029c7b20e0a53bb
  SHA1=bdef841f21d3e2753bc7f5fad8505eef500456b3

Diffs from 4.6-20110916 are available in the diffs/ subdirectory.

When a particular snapshot is ready for public consumption the LATEST-4.6
link is updated and a message is sent to the gcc list. Please do not use
a snapshot before it has been announced that way.