MELT 0.9.9 rc1 plugin release candidate for GCC 4.6, 4.7, 4.8
Dear All,

It is my pleasure to announce the long-awaited MELT 0.9.9 rc1 plugin, release candidate 1, for GCC 4.6, 4.7, and 4.8. This is a significant improvement over previous MELT plugins.

In short, mixing MELT and C/C++ code at will is now much easier than before. In particular, with hooks, MELT can generate almost arbitrary C/C++ functions coded in MELT, so MELT can interact much more with GCC: you can now code in MELT nearly arbitrary functions callable from C/C++ code. This also makes it much easier to code arbitrary callbacks for external libraries in MELT.

You can download the gzipped source tarball from
  http://gcc-melt.org/melt-0.9.9-rc1-plugin-for-gcc-4.6-or-4.7-or-4.8.tar.gz
This is a gzipped source tarball of 5691552 bytes, md5sum 2d9e420c63e55fce0e3d6f7a846d, extracted from MELT branch svn rev. 199574 on June 1st, 2013.

NEWS for 0.9.9 MELT plugin for GCC 4.6 & 4.7 & 4.8
[[june, 01st, 2013]]

This is a significant release. A lot of new features are appearing. Much more ability to mix arbitrary C/C++ & MELT code in any way and in both directions!

This is the last MELT release supporting GCC 4.6 and GCC compilers in C only. Future MELT releases will be C++ only (i.e. emit C++ code), for GCC 4.7 & 4.8 or later...

Language improvements
=====================

***
* Code chunks can contain void side-effecting expressions of :VOID ctype, translated to some C/C++ block. Hence code chunks can even be indirectly nested. Within the macro-string, write $(...) for the expression. You may want to make it a PROGN ending with (VOID) to force it to be void. Sub-expressions of :VOID ctype inside code chunks are evaluated at the place where they appear, and are expanded to C/C++ blocks at their occurrence.

***
* Expression chunks are expanded into C/C++ expressions. Syntax is (EXPR_CHUNK...)
For instance, to get the pid of the cc1 running your MELT extension, use
   (expr_chunk getpid_chk :long #{/*$GETPID_CHK*/ (long)getpid()}#)
Important notice: any sub-expression appearing in some EXPR_CHUNK is evaluated beforehand, as for primitives, so it will always be evaluated.

***
* Ability to emit some C code in the implementation part. Syntax (CIMPLEMENT ) This is a companion to the existing CHEADER. Useful to declare some specific non-inlined C++ function or static variable with CHEADER and to define it with CIMPLEMENT.

***
* New ability to add module "static" value variables in MELT (DEFVAR ) The variable so defined is (in the generated C/C++ code) a pointer [to some MELT value] inside a static array. Garbage collection support is generated. The variable cannot be exported with EXPORT_VALUES; you need to export functions accessing or changing it. Once defined, you can use SETQ to assign to such a module variable (and also DEFHOOK with :VAR).

***
* New ability to define hooks, that is C/C++ functions coded in MELT. Syntax is (DEFHOOK [:var ] ) The optional given with a :var annotation should be a module variable previously defined with DEFVAR. For example, with the following code

   (defvar varlist)
   (setq varlist (list))
   (defvar varhook)
   (defhook appendnumhk (:long n) (:long outlen) :void
      :var varhook
      (list_append varlist (constant_box n))
      (setq outlen (list_length varlist)))

you get two generated extern "C" functions in the emitted C/C++ code

   void melthook_APPENDNUMHK (melt_ptr_t hook, long n, long* outlen);
and
   void melthookproc_APPENDNUMHK (long n, long* outlen);

which you could use e.g. in some other code_chunk-s. The first function melthook_APPENDNUMHK needs the hook value `appendnumhk' as its first argument; the second function melthookproc_APPENDNUMHK is generated because a :var annotation has been given, and uses the hook value automagically stored in that `varhook' module variable.
Many functions previously coded in C/C++ inside melt-runtime.c have been migrated to predefined hooks coded in MELT inside melt/warmelt-hooks.melt etc...

Hooks are a very important addition: with them you can mix C/C++ & MELT code at will.

Runtime improvements
====================

***
* To register your MELT pass, use the INSTALL_MELT_PASS_IN_GCC function instead of the obsolete INSTALL_MELT_GCC_PASS primitive. Notice that INSTALL_MELT_PASS_IN_GCC is incompatible with the older INSTALL_MELT_GCC_PASS, because you need to pass the :after keyword instead of the "after" cstring, etc, etc...

***
* Many plugin events are interfaced (using hook machinery). Some of them are incompatible with previous MELT releases.

***
* Add
Re: Excessive calls to iterate_phdr during exception handling
On 29/05/2013 9:41 AM, Ian Lance Taylor wrote:
> On Tue, May 28, 2013 at 9:02 PM, Ryan Johnson wrote:
>> Maybe I misunderstood... there's currently a (very small) cache (unwind-dw2-fde-dip.c) that lives behind the loader mutex. It contains 8 entries and each entry holds the start and end addresses for one loaded object, along with a pointer to the eh_frame_header. The cache resides in static storage, and so accessing it is always safe.
>>
>> I think what you're saying is that the p_eh_frame_hdr field could end up with a dangling pointer due to a dlclose call?
>
> Yes, that can happen.
>
>> If so, my argument is that, as long as the cache is up to date as of the start of unwind, any attempt to access a dangling p_eh_frame_hdr means that in-use code was dlclosed, in which case unwind is guaranteed to fail anyway. The failure would just have different symptoms with such a cache in place. Am I missing something?
>
> I think you're right about that. But what happens if the entry is not in the cache? Or, do you mean you want to look in the cache before calling dl_iterate_phdr? That should be safe but of course you still need a lock as multiple threads can be manipulating the cache at the same time.

OK, here's a proof of concept I threw together as a preload library of sorts. You can compile it as a .o and link it into the app directly, or compile it as a .so and LD_PRELOAD it into an existing app. Linking directly against the .so doesn't seem to work for some reason. However the app incorporates it, though, I confirmed that it removes the bottleneck.

The short version is that it overrides _Unwind_Find_FDE and allocates a 4kB global FDE table cache (~85 entries), which threads search lock-free. Per-entry checksums ensure that threads don't try to use inconsistent entries in the event they race with an updater. Calls to dlclose blow away the cache. Whenever a thread misses in the cache, it calls dl_iterate_phdr to serve the miss and, if a suitable FDE table is found, also inserts a new cache entry for it.
In the event that this also fails (if the FDE table is not sorted, or if it's an obsolete object that uses registered unwind info rather than eh_frame), it falls back to the original implementation.

The .h just cobbles together various definitions from the gcc sources (with citations) so the code can compile stand-alone; the .cpp started from the original _Unwind_Find_FDE and tweaked it to add the caching.

I'm not sure of the best way to incorporate something like this into gcc, but it turned out surprisingly self-contained, which should at least help uptake. Some of the uglier parts of the code (the dlsym hacks and the extra mutex) would be unnecessary in a properly integrated solution.

FYI, for a bit I had a bug where readers grabbed the mutex as well, and performance was almost as good as lock-free because the critical section was so much shorter than before (FDE table search having been moved outside). Still, I was only using a few threads in my tests, so the lock-free method is probably better.

Thoughts?
Ryan

/* This file is where we get to mess with things */
#include "debug-find-fde.h"
#include
#include
#include
#include

#ifdef __cplusplus
extern "C" {
#endif
#if 0
}
#endif

struct fde_table {
    signed initial_loc __attribute__ ((mode (SI)));
    signed fde __attribute__ ((mode (SI)));
};

/* Conceptually, the cache contains a sorted concatenation of all FDE
   entries we know about. However, the tables we're willing to
   consider are already sorted in their respective eh_frame_hdr
   sections in memory, so we instead record the location of each such
   table.

   The header cache is sorted, and readers use binary search over the
   range 0..hdr_cache_sz to find a lower bound; in the absence of
   concurrent updates (see below) the returned entry will contain PC,
   if any entry does. From there, the reader performs a second binary
   search, this time on the FDE table, and returns whatever it finds.

   Each call to dlopen updates the cache on success, a fairly
   non-invasive action.
   Calls to dlclose clear the cache before unloading the library and
   rebuild it afterward, so unwinding threads will have to fall back
   to the old way during the interim.

   Cache updates involve moving elements around, which can confuse
   readers if they have the bad luck to unwind during a cache reorg:

   1. The binary search may end at the wrong lower bound if entries
      shift around during the search. This will cause a miss where
      there should have been a hit. There are already fallbacks in
      place to cover cache misses that can arise for a number of
      reasons, so we choose to live with an occasional extra miss.

   2. Binary search may end at (what appears to be) the proper lower
      bound, but the reader retrieves an inconsistent cache entry due
      to a race with some writer. This could send the reader into
      la-la land if undetected, so we protect the reader-accessed
      parts of each cache ent
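
The reader-side protocol described in this message (binary search for a lower bound, then a per-entry checksum check before trusting the entry) might be sketched roughly as below. This is not the attached code: the struct layout, the checksum formula, and the names hdr_cache_entry, entry_checksum, entry_fill and cache_lookup are all assumptions made for illustration. The sketch is single-threaded and shows only the validation logic; a real lock-free reader would also need atomic loads and memory barriers.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical cache entry: the address range of one loaded object,
   a stand-in for the pointer to its sorted FDE table, and a checksum
   computed over the other three fields. */
struct hdr_cache_entry {
    uintptr_t pc_low;
    uintptr_t pc_high;
    uintptr_t table;
    uintptr_t checksum;
};

/* Illustrative checksum only; any function of the three fields that a
   torn (half-updated) entry is unlikely to satisfy would do. */
static uintptr_t entry_checksum(uintptr_t lo, uintptr_t hi, uintptr_t table)
{
    return lo ^ (hi << 1) ^ (table >> 1) ^ (uintptr_t)0x9e3779b9u;
}

/* Writer side: fill in an entry and stamp it with its checksum. */
static void entry_fill(struct hdr_cache_entry *e, uintptr_t lo,
                       uintptr_t hi, uintptr_t table)
{
    e->pc_low = lo;
    e->pc_high = hi;
    e->table = table;
    e->checksum = entry_checksum(lo, hi, table);
}

/* Reader side: find the last entry whose pc_low <= pc in the sorted
   array entries[0..n), copy it, and accept it only if the checksum
   matches and pc really lies inside the range. Returns 1 on a hit,
   0 on a miss (the caller would then fall back to the slow path). */
static int cache_lookup(const struct hdr_cache_entry *entries, size_t n,
                        uintptr_t pc, struct hdr_cache_entry *out)
{
    size_t lo = 0, hi = n;

    assert(out != NULL);
    while (lo < hi) {
        size_t mid = lo + (hi - lo) / 2;
        if (entries[mid].pc_low <= pc)
            lo = mid + 1;
        else
            hi = mid;
    }
    if (lo == 0)
        return 0;                     /* pc is below every cached range */

    struct hdr_cache_entry snap = entries[lo - 1];   /* private copy */
    if (snap.checksum != entry_checksum(snap.pc_low, snap.pc_high, snap.table))
        return 0;                     /* inconsistent entry: treat as miss */
    if (pc >= snap.pc_high)
        return 0;                     /* pc falls in a gap between objects */

    *out = snap;
    return 1;
}
```

Treating a checksum mismatch as an ordinary miss matches the design choice stated in the comment: misses already have a fallback, so an occasional extra one is cheap compared with following an inconsistent entry.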
gcc 4.8.1: -O2 warnings one element off
Hi list.

I think I've found a bug in how gcc checks whether array indices are in range.

Here's my test-code:

---8<-8<-8<-
/*
 * file: elementTest.c
 * command-line:
 * arm-none-eabi-gcc -O2 elementTest.c -o elementTest
 */

#include <stdint.h>

uint8_t eightMembers[8];

int main(int argc, const char *argv[])
{
	uint8_t i;

	for(i = 0; i < 8; i++)
	{
		eightMembers[i] = 0;
	}

	for(i = 0; i < 9; i++)
	{
		eightMembers[i] = 0;
	}

	for(i = 0; i < 10; i++)
	{
		eightMembers[i] = 0;
	}
	return(0);
}
--->8->8->8-

Here's my result:

---8<-8<-8<-
elementTest.c: In function 'main':
elementTest.c:27:19: warning: iteration 8u invokes undefined behavior [-Waggressive-loop-optimizations]
   eightMembers[i] = 0;
                   ^
elementTest.c:25:2: note: containing loop
  for(i = 0; i < 10; i++)
  ^
--->8->8->8-

I would expect gcc to complain when it meets the second loop as well as the third loop, but it didn't detect that there is something wrong with the second loop.

...Is this a bug?

Love
Jens
Re: gcc 4.8.1: -O2 warnings one element off
On 06/01/2013 10:08 PM, Jens Bauer wrote:
> I would expect gcc to complain when it meets the second loop as well as the
> third loop, but it didn't detect that there is something wrong with the
> second loop.
>
> ...Is this a bug ?

C allows you to index one element beyond the end of an array. So, this

  uint8_t eightMembers[8];
  uint8_t *p = &eightMembers[8];

is legal, but

  uint8_t eightMembers[8];
  uint8_t *p = &eightMembers[9];

is not. In both cases you cannot actually use the memory at *p. I think gcc is detecting the indexing but not the access.

Andrew.
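
To make the distinction above concrete, here is a minimal sketch of the usual legitimate use of a one-past-the-end pointer: it is formed and compared against, but never dereferenced. The function name fill_all is an assumption for illustration and does not come from the thread.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

static uint8_t eightMembers[8];

/* Fill the whole array, using the one-past-the-end pointer only as a
   loop bound. Returns the number of elements written. */
static size_t fill_all(uint8_t value)
{
    uint8_t *const end = &eightMembers[8];  /* legal to form and compare */
    size_t written = 0;

    for (uint8_t *p = eightMembers; p != end; p++) {
        *p = value;      /* p always points at a real element here */
        written++;
    }
    assert(written == sizeof eightMembers);
    return written;
}
```

Writing `*end = value;` after the loop would be the out-of-bounds access that the second loop in the original test performs at i == 8.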
Re: gcc 4.8.1: -O2 warnings one element off
On Sat, Jun 01, 2013 at 11:08:21PM +0200, Jens Bauer wrote:
> Here's my result:
>
> ---8<-8<-8<-
> elementTest.c: In function 'main':
> elementTest.c:27:19: warning: iteration 8u invokes undefined behavior
> [-Waggressive-loop-optimizations]
>eightMembers[i] = 0;
>^
> elementTest.c:25:2: note: containing loop
> for(i = 0; i < 10; i++)
> ^
> --->8->8->8-
>
> I would expect gcc to complain when it meets the second loop as well as the
> third loop, but it didn't detect that there is something wrong with the
> second loop.
>
> ...Is this a bug ?

It is not a bug. The warning isn't guaranteed to report all such cases of undefined behavior, and due to lack of infrastructure in GCC 4.8 it can't be reported if certain passes discover the undefined behavior.

GCC 4.9 warns on both of the bad loops, but even in 4.9, if the loop doesn't have constant bounds or has multiple exits etc., GCC will not warn and just might optimize based on the fact that undefined behavior doesn't happen in correct code. Lack of warning doesn't mean code is bug free.

	Jakub
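
Since the warning cannot be relied on when the loop bound is not a compile-time constant, one defensive option is to clamp a runtime bound explicitly before indexing. The sketch below is only an illustration of that idea; the function clear_first and its parameters are hypothetical and not part of the thread.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Clear the first n elements of buf, never touching more than
   capacity elements even if the caller passes a larger n.
   Returns the number of elements actually cleared. */
static size_t clear_first(uint8_t *buf, size_t capacity, size_t n)
{
    assert(buf != NULL || capacity == 0);

    if (n > capacity)
        n = capacity;          /* runtime bound the compiler cannot see */

    for (size_t i = 0; i < n; i++)
        buf[i] = 0;
    return n;
}
```

With this shape, a call such as clear_first(eightMembers, 8, 10) clears 8 elements instead of running past the end, whether or not the compiler would have diagnosed the overrun.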
Re: gcc 4.8.1: -O2 warnings one element off
On Sat, 01 Jun 2013 22:12:51 +0100, Andrew Haley wrote:
> In both cases you cannot actually use the memory at *p. I think gcc is
> detecting the indexing but not the access.

That makes sense!

On Sat, 1 Jun 2013 23:19:45 +0200, Jakub Jelinek wrote:
> It is not a bug, the warning isn't guaranteed to report all such cases of
> undefined behavior, and due to lack of infrastructure in GCC 4.8 can't be
> reported if certain passes discover the undefined behavior.
> GCC 4.9 warns on both of the bad loops, but even in 4.9, if the loop
> doesn't have constant bounds or has multiple exits etc., GCC will not warn
> and just might optimize based on the fact that undefined behavior doesn't
> happen in correct code. Lack of warning doesn't mean code is bug free.

...If that was the case, I'd just turn all warnings off. ;)

Thank you both, for the excellent answers. =)

Love
Jens
gcc-4.7-20130601 is now available
Snapshot gcc-4.7-20130601 is now available on
  ftp://gcc.gnu.org/pub/gcc/snapshots/4.7-20130601/
and on various mirrors, see http://gcc.gnu.org/mirrors.html for details.

This snapshot has been generated from the GCC 4.7 SVN branch
with the following options: svn://gcc.gnu.org/svn/gcc/branches/gcc-4_7-branch revision 199587

You'll find:

 gcc-4.7-20130601.tar.bz2             Complete GCC

  MD5=eecd3732d5b466ccaf5d7807babcb4e2
  SHA1=49a6e259b68ae1496fe68f5aff6e452dfbc32a9c

Diffs from 4.7-20130525 are available in the diffs/ subdirectory.

When a particular snapshot is ready for public consumption the LATEST-4.7
link is updated and a message is sent to the gcc list. Please do not use
a snapshot before it has been announced that way.
Re: I would like to provide a mirror for GCC
Hi Timo,

On Fri, 24 May 2013, timo.ja...@go-part.com wrote:
> I was wondering if you are still accepting mirrors for GCC.
>
> http://gcc.gnu.org/mirrors.html
>
> If you are, I will like to provide one with HTTP, FTP and Rsync access via
> a VPS located in Australia, USA, UK, Philippines or Japan or any
> combinations of those locations.

Yes, we are happy to accept mirrors. Procedure-wise, just go ahead and start mirroring ftp://gcc.gnu.org/pub/gcc and then send us a patch for http://gcc.gnu.org/mirrors.html and I'll take care of that (feel free to copy me).

Thanks,
Gerald
Generating data at link time
Dear all,

I'm trying to generate a (very simple) special data section at link-time. It seems that LTO is too heavy, though I'm also not quite familiar with it.

I think that what I first need is a new output format which wraps the standard ELF, so can anyone help point out where I can get started?

Also, if I want to generate data, is there any tool lighter than GCC, and compatible with it, that can be used to deal with the platform-dependent stuff?
Re: Generating data at link time
On Sat, Jun 1, 2013 at 5:59 PM, YU Chenkan wrote:
> Dear all,
>
> I'm trying to generate a (very simple) special data section at
> link-time. It seems that LTO is too heavy, though I'm also not quite
> familiar with it.
>
> I think that what I first need is a new output format which wraps the
> standard ELF, so can anyone help to point out where I can get started?
>
> Also if I want to generate data, is there any tool lighter than and
> compatible with GCC can be used to deal with the platform dependent
> stuff?

What kind of data section? Could you use a plugin with the linker (GNU ld supports plugins) which creates that?

Thanks,
Andrew Pinski