MELT 0.9.9 rc1 plugin release candidate for GCC 4.6, 4.7, 4.8

2013-06-01 Thread Basile Starynkevitch
Dear All,

It is my pleasure to announce the long-awaited MELT 0.9.9 rc1 plugin release
candidate for GCC 4.6, 4.7, and 4.8. This is a significant improvement
over previous MELT plugins.  In short, mixing MELT and C/C++ code at will
is much easier than before. In particular, with hooks you can now code in
MELT nearly arbitrary functions callable from C/C++ code, so MELT can
interact much more closely with GCC. This also makes it much easier to code
arbitrary callbacks for external libraries in MELT.

You can download the gzipped source tarball from

   http://gcc-melt.org/melt-0.9.9-rc1-plugin-for-gcc-4.6-or-4.7-or-4.8.tar.gz

This is a gzipped source tarball of 5691552 bytes, with md5sum
2d9e420c63e55fce0e3d6f7a846d,
extracted from MELT branch svn rev. 199574 on June 1st, 2013.



NEWS for 0.9.9 MELT plugin for GCC 4.6 & 4.7 & 4.8
[[june, 01st, 2013]]

This is a significant release with a lot of new features, notably a
much greater ability to mix arbitrary C/C++ & MELT code in any way and
in both directions!

This is the last MELT release supporting GCC 4.6 and GCC compilers in
C only. Future MELT releases will be C++ only (i.e. emit C++
code), for GCC 4.7 & 4.8 or later...


   Language improvements
   =====================

   ***
   * Code chunks can contain void side-effecting expressions of :VOID
     ctype, translated to some C/C++ block. Hence code chunks can even
     be indirectly nested. Within the macro-string, write $(...) for
     the expression. You may want to make it a PROGN ending with
     (VOID) to force it to be void.  Sub-expressions -of :VOID ctype-
     inside code chunks are evaluated at the place of appearance, and
     are expanded to C/C++ blocks at their occurrence.


   ***
   * Expression chunks are expanded into C/C++ expressions. Syntax is

(EXPR_CHUNK...)

   For instance, to get the pid of the cc1 running your MELT extension, use
  (expr_chunk getpid_chk :long #{/*$GETPID_CHK*/ (long)getpid()}#)

   Important notice: any sub-expression appearing in some EXPR_CHUNK
   is evaluated beforehand, as for primitives, so it will always be
   evaluated.


   ***
   * Ability to emit some C code in the implementation part. Syntax
 (CIMPLEMENT )

   This is a companion to the existing CHEADER. Useful to declare some
   specific non-inlined C++ function or static variable with CHEADER
   and to define it with CIMPLEMENT.



   ***
   * New ability to add module "static" value variables in MELT
   
(DEFVAR )
   
   The variable so defined is (in the generated C/C++ code)
   a pointer [to some MELT value] inside a static array. Garbage
   collection support is generated.  The variable cannot be
   exported with EXPORT_VALUES; you need to export functions accessing
   or changing it.  Once defined, you can use SETQ to assign to such a
   module variable (and also DEFHOOK with :VAR).


   ***
   * New ability to define hooks, that is C/C++ functions coded in
 MELT. Syntax is 

(DEFHOOK 
 [:var ] )
 
The optional variable given with a :var annotation should be
a module variable previously defined with DEFVAR.
For example, with the following code

(defvar varlist)
(setq varlist (list))
(defvar varhook)
(defhook appendnumhk (:long n) (:long outlen) :void
  :var varhook
  (list_append varlist (constant_box n))
  (setq outlen (list_length varlist)))

you get two generated extern "C" functions in the emitted C/C++ code
void melthook_APPENDNUMHK (melt_ptr_t hook, long n, long* outlen);
and
void melthookproc_APPENDNUMHK (long n, long* outlen);

which you could use e.g. in some other code_chunk-s. The first
function melthook_APPENDNUMHK needs the hook value `appendnumhk' as
its first argument; the second function melthookproc_APPENDNUMHK
is generated because a :var annotation has been given, and uses
the hook value automagically stored in that `varhook' module
variable.

Many functions previously coded in C/C++ inside the melt-runtime.c
have been migrated to predefined hooks coded in MELT inside
melt/warmelt-hooks.melt etc...

Hooks are a very important addition: with them you can mix C/C++ & MELT
code at will.


   Runtime improvements
   ====================

   *** 
   * To register your MELT pass, use the INSTALL_MELT_PASS_IN_GCC
   function instead of the obsolete INSTALL_MELT_GCC_PASS
   primitive. Notice that INSTALL_MELT_PASS_IN_GCC is incompatible
   with the older INSTALL_MELT_GCC_PASS, because you need to pass
   the :after keyword instead of the "after" cstring, etc, etc...


   ***
   * Many plugin events are interfaced (using hook machinery). 
   Some of them are incompatible with previous MELT releases.


   ***
   * Add

Re: Excessive calls to iterate_phdr during exception handling

2013-06-01 Thread Ryan Johnson

On 29/05/2013 9:41 AM, Ian Lance Taylor wrote:

On Tue, May 28, 2013 at 9:02 PM, Ryan Johnson wrote:

Maybe I misunderstood... there's currently a (very small) cache
(unwind-dw2-fde-dip.c) that lives behind the loader mutex. It contains 8
entries and each entry holds the start and end addresses for one loaded
object, along with a pointer to the eh_frame_header. The cache resides in
static storage, and so accessing it is always safe.

I think what you're saying is that the p_eh_frame_hdr field could end up
with a dangling pointer due to a dlclose call?

Yes, that can happen.


If so, my argument is that, as long as the cache is up to date as of the
start of unwind, any attempt to access a dangling p_eh_frame_hdr means that
in-use code was dlclosed, in which case unwind is guaranteed to fail anyway.
The failure would just have different symptoms with such a cache in place.

Am I missing something?

I think you're right about that.  But what happens if the entry is not
in the cache?  Or, do you mean you want to look in the cache before
calling dl_iterate_phdr?  That should be safe but of course you still
need a lock as multiple threads can be manipulating the cache at the
same time.

OK, here's a proof of concept I threw together as a preload library of
sorts. You can compile it as a .o and link it into the app directly, or
compile it as a .so and LD_PRELOAD it into an existing app. Linking
directly against the .so doesn't seem to work for some reason. However
the app incorporates it, I confirmed that it removes the
bottleneck.


The short version is that it overrides _Unwind_Find_FDE and allocates a 
4kB global FDE table cache (~85 entries), which threads search 
lock-free. Per-entry checksums ensure that threads don't try to use 
inconsistent entries in the event they race with an updater. Calls to 
dlclose blow away the cache. Whenever a thread misses in cache, it calls
dl_iterate_phdr to serve the miss and, if a suitable FDE table is found,
also inserts a new cache entry for it. In the event that this also fails
(if the FDE table is not sorted, or if it's an obsolete object that uses 
registered unwind info rather than eh_frame), it falls back to the 
original implementation.
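
The per-entry checksum idea can be sketched roughly as follows. This is
only an illustration and not the code from the attached proof of concept:
the struct layout, the XOR checksum and the names cache_entry, entry_store
and entry_lookup are all assumptions. It is also single-threaded; a real
lock-free version would need atomic or volatile accesses (and suitable
barriers) so that the reader's copy of an entry is well defined.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical cache entry; the layout is illustrative only. */
struct cache_entry {
    uintptr_t start;     /* first address covered by the loaded object */
    uintptr_t end;       /* one past the last covered address */
    uintptr_t fde_table; /* address of the object's sorted FDE table */
    uintptr_t checksum;  /* derived from the three fields above */
};

/* XOR of the fields with a nonzero constant, so that an all-zero
   (cleared) entry never validates. */
uintptr_t entry_checksum(uintptr_t start, uintptr_t end, uintptr_t table)
{
    return start ^ end ^ table ^ (uintptr_t)0x5a5a5a5a;
}

/* Writer side: fill in an entry and stamp it with its checksum. */
void entry_store(struct cache_entry *e, uintptr_t start, uintptr_t end,
                 uintptr_t table)
{
    assert(start < end);
    e->start = start;
    e->end = end;
    e->fde_table = table;
    e->checksum = entry_checksum(start, end, table);
}

/* Reader side: take a private copy, validate the copy, and only then
   trust its fields. Returns the FDE table address, or 0 for a miss. */
uintptr_t entry_lookup(const struct cache_entry *e, uintptr_t pc)
{
    struct cache_entry snap = *e;

    if (snap.checksum != entry_checksum(snap.start, snap.end, snap.fde_table))
        return 0; /* inconsistent or cleared entry: treat as a miss */
    if (pc < snap.start || pc >= snap.end)
        return 0; /* consistent entry, but it does not cover pc */
    return snap.fde_table;
}
```

A reader that gets 0 back treats the lookup as a cache miss and falls
back to dl_iterate_phdr, as described above.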


The .h just cobbles together various definitions from the gcc sources 
(with citations) so the code can compile stand-alone; the .cpp started 
from the original _Unwind_Find_FDE and tweaked it to add the caching.


I'm not sure the best way to incorporate something like this into gcc, 
but it turned out surprisingly self-contained, which should at least 
help uptake. Some of the uglier parts of the code (the dlsym hacks and 
the extra mutex) would be unnecessary in a properly integrated solution.


FYI, for a bit I had a bug where readers grabbed the mutex as well, and 
performance was almost as good as lock-free because the critical section 
was so much shorter than before (FDE table search having been moved 
outside). Still, I was only using a few threads in my tests, so the 
lock-free method is probably better.


Thoughts?
Ryan

/* This file is where we get to mess with things
 */
#include "debug-find-fde.h"

#include 
#include 
#include 
#include 

#ifdef __cplusplus
extern "C" {
#endif
#if 0
}
#endif


struct fde_table {
  signed initial_loc __attribute__ ((mode (SI)));
  signed fde __attribute__ ((mode (SI)));
};

/* Conceptually, the cache contains a sorted concatenation of all FDE
   entries we know about. However, the tables we're willing to
   consider are already sorted in their respective eh_frame_hdr
   sections in memory, so we instead record the location of each such
   table.

   The header cache is sorted, and readers use binary search over the
   range 0..hdr_cache_sz to find a lower bound; in the absence of
   concurrent updates (see below) the returned entry will contain PC,
   if any entry does. From there, the reader performs a second binary
   search, this time on the FDE table, and returns whatever it finds.

   Each call to dlopen updates the cache on success, a fairly
   non-invasive action. Calls to dlclose clear the cache before
   unloading the library and rebuild it afterward, so unwinding
   threads will have to fall back to the old way during the interim.

   Cache updates involve moving elements around, which can confuse
   readers if they have the bad luck to unwind during a cache reorg:

   1. The binary search may end at the wrong lower bound if entries
   shift around during the search. This will cause a miss where there
   should have been a hit. There are already fallbacks in place to
   cover cache misses that can arise for a number of reasons, so we
   choose to live with an occasional extra miss.

   2. Binary search may end at (what appears to be) the proper lower
   bound, but the reader retrieves an inconsistent cache entry due to
   a race with some writer. This could send the reader into la-la land
   if undetected, so we protect the reader-accessed parts of each
   cache ent

gcc 4.8.1: -O2 warnings one element off

2013-06-01 Thread Jens Bauer
Hi list.

I think I've found a bug in how gcc checks whether array indices are in range.

Here's my test-code:

---8<-8<-8<-
/*
 * file: elementTest.c
 * command-line:
 *   arm-none-eabi-gcc -O2 elementTest.c -o elementTest
 */

#include <stdint.h>

uint8_t eightMembers[8];

int main(int argc, const char *argv[])
{
    uint8_t i;

    for(i = 0; i < 8; i++)
    {
        eightMembers[i] = 0;
    }

    for(i = 0; i < 9; i++)
    {
        eightMembers[i] = 0;
    }

    for(i = 0; i < 10; i++)
    {
        eightMembers[i] = 0;
    }

    return(0);
}
--->8->8->8-

Here's my result:

---8<-8<-8<-
elementTest.c: In function 'main':
elementTest.c:27:19: warning: iteration 8u invokes undefined behavior [-Waggressive-loop-optimizations]
   eightMembers[i] = 0;
   ^
elementTest.c:25:2: note: containing loop
  for(i = 0; i < 10; i++)
  ^
--->8->8->8-

I would expect gcc to complain when it meets the second loop as well as the 
third loop, but it didn't detect that there is something wrong with the second 
loop.

...Is this a bug ?


Love
Jens


Re: gcc 4.8.1: -O2 warnings one element off

2013-06-01 Thread Andrew Haley
On 06/01/2013 10:08 PM, Jens Bauer wrote:
> I would expect gcc to complain when it meets the second loop as well as the 
> third loop, but it didn't detect that there is something wrong with the 
> second loop.
> 
> ...Is this a bug ?

C allows you to form a pointer one element beyond the end of an array.  So, this

  uint8_t eightMembers[8];
  uint8_t *p = &eightMembers[8];

is legal, but

  uint8_t eightMembers[8];
  uint8_t *p = &eightMembers[9];

is not.

In both cases you cannot actually use the memory at *p.  I think gcc is
detecting the indexing but not the access.
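
A minimal sketch of that one-past-the-end rule in use: the address
&eightMembers[8] serves as the end marker of a loop and is never
dereferenced. The names N_MEMBERS and fill_members are hypothetical and
are not part of the original test case.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define N_MEMBERS 8

uint8_t eightMembers[N_MEMBERS];

/* Set every element to `value` and return how many were written. */
size_t fill_members(uint8_t value)
{
    /* Legal: a pointer one past the last element may be formed and
       compared against, as long as it is never dereferenced. */
    uint8_t *end = &eightMembers[N_MEMBERS];
    size_t written = 0;

    /* Pointer subtraction involving the one-past-the-end pointer is
       also well defined. */
    assert(end - eightMembers == N_MEMBERS);

    for (uint8_t *p = eightMembers; p != end; ++p) {
        *p = value;
        ++written;
    }
    return written;
}
```

Writing through `end` itself, or forming &eightMembers[9], would be
undefined behavior.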

Andrew.



Re: gcc 4.8.1: -O2 warnings one element off

2013-06-01 Thread Jakub Jelinek
On Sat, Jun 01, 2013 at 11:08:21PM +0200, Jens Bauer wrote:
> Here's my result:
> 
> ---8<-8<-8<-
> elementTest.c: In function 'main':
> elementTest.c:27:19: warning: iteration 8u invokes undefined behavior 
> [-Waggressive-loop-optimizations]
>eightMembers[i] = 0;
>^
> elementTest.c:25:2: note: containing loop
>   for(i = 0; i < 10; i++)
>   ^
> --->8->8->8-
> 
> I would expect gcc to complain when it meets the second loop as well as the 
> third loop, but it didn't detect that there is something wrong with the 
> second loop.
> 
> ...Is this a bug ?

It is not a bug, the warning isn't guaranteed to report all such cases of
undefined behavior, and due to lack of infrastructure in GCC 4.8 can't be
reported if certain passes discover the undefined behavior.
GCC 4.9 warns on both of the bad loops, but even in 4.9, if the loop
doesn't have constant bounds or has multiple exits etc., GCC will not warn
and just might optimize based on the fact that undefined behavior doesn't
happen in correct code.  Lack of warning doesn't mean code is bug free.
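
Since the warning is best-effort, a loop whose bound is only known at
run time gets no help from the compiler. One defensive option, sketched
here under the assumption that clamping the count is acceptable to the
caller (the names members, N_MEMBERS and clear_members are hypothetical),
is to check the bound explicitly:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define N_MEMBERS 8

uint8_t members[N_MEMBERS];

/* Zero the first `count` elements, clamping `count` to the array size.
   Returns the number of elements actually cleared. */
size_t clear_members(size_t count)
{
    if (count > N_MEMBERS)
        count = N_MEMBERS;

    /* Invariant the loop below relies on. */
    assert(count <= N_MEMBERS);

    for (size_t i = 0; i < count; i++)
        members[i] = 0;
    return count;
}
```

With the clamp in place the loop stays inside the array for any value
of `count`, whether or not the compiler could have diagnosed the problem.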

Jakub


Re: gcc 4.8.1: -O2 warnings one element off

2013-06-01 Thread Jens Bauer
On Sat, 01 Jun 2013 22:12:51 +0100, Andrew Haley wrote:
> In both cases you cannot actually use the memory at *p.  I think gcc is
> detecting the indexing but not the access.

That makes sense!

On Sat, 1 Jun 2013 23:19:45 +0200, Jakub Jelinek wrote:
> It is not a bug, the warning isn't guaranteed to report all such cases of
> undefined behavior, and due to lack of infrastructure in GCC 4.8 can't be
> reported if certain passes discover the undefined behavior.
> GCC 4.9 warns on both of the bad loops, but even in 4.9, if the loop
> doesn't have constant bounds or has multiple exits etc., GCC will not warn
> and just might optimize based on the fact that undefined behavior doesn't
> happen in correct code.  Lack of warning doesn't mean code is bug free.
...If that was the case, I'd just turn all warnings off. ;)

Thank you both, for the excellent answers. =)


Love
Jens


gcc-4.7-20130601 is now available

2013-06-01 Thread gccadmin
Snapshot gcc-4.7-20130601 is now available on
  ftp://gcc.gnu.org/pub/gcc/snapshots/4.7-20130601/
and on various mirrors, see http://gcc.gnu.org/mirrors.html for details.

This snapshot has been generated from the GCC 4.7 SVN branch
with the following options: svn://gcc.gnu.org/svn/gcc/branches/gcc-4_7-branch 
revision 199587

You'll find:

 gcc-4.7-20130601.tar.bz2 Complete GCC

  MD5=eecd3732d5b466ccaf5d7807babcb4e2
  SHA1=49a6e259b68ae1496fe68f5aff6e452dfbc32a9c

Diffs from 4.7-20130525 are available in the diffs/ subdirectory.

When a particular snapshot is ready for public consumption the LATEST-4.7
link is updated and a message is sent to the gcc list.  Please do not use
a snapshot before it has been announced that way.


Re: I would like to provide a mirror for GCC

2013-06-01 Thread Gerald Pfeifer
Hi Timo,

On Fri, 24 May 2013, timo.ja...@go-part.com wrote:
> I was wondering if you are still accepting mirrors for GCC.
> 
> http://gcc.gnu.org/mirrors.html
> 
> If you are, I will like to provide one with HTTP, FTP and Rsync access via
> a VPS located in Australia, USA, UK, Philippines or Japan or any
> combinations of those locations.

Yes, we are happy to accept mirrors.  Procedure-wise, just go ahead
and start mirroring ftp://gcc.gnu.org/pub/gcc and then send us a patch
for http://gcc.gnu.org/mirrors.html and I'll take care of that (feel
free to copy me).

Thanks,
Gerald


Generating data at link time

2013-06-01 Thread YU Chenkan
Dear all,

  I'm trying to generate a (very simple) special data section at
link-time. It seems that LTO is too heavy, though I'm also not quite
familiar with it.

  I think that what I first need is a new output format which wraps the
standard ELF, so can anyone help to point out where I can get started?

  Also if I want to generate data, is there any tool lighter than GCC,
yet compatible with it, that can be used to deal with the
platform-dependent stuff?



Re: Generating data at link time

2013-06-01 Thread Andrew Pinski
On Sat, Jun 1, 2013 at 5:59 PM, YU Chenkan wrote:
> Dear all,
>
>   I'm trying to generate a (very simple) special data section at
> link-time. It seems that LTO is too heavy, though I'm also not quite
> familiar with it.
>
>   I think that what I first need is a new output format which wraps the
> standard ELF, so can anyone help to point out where I can get started?
>
>   Also if I want to generate data, is there any tool lighter than GCC,
> yet compatible with it, that can be used to deal with the
> platform-dependent stuff?


What kind of data section?  Could you use a plugin with the linker
(GNU ld supports plugins) which creates that?


Thanks,
Andrew Pinski