Re: [RFC patch] spindep: add cross cache lines checking

2012-03-07 Thread Alex Shi
On Tue, 2012-03-06 at 09:32 +, Arnd Bergmann wrote:
> On Tuesday 06 March 2012, Alex Shi wrote:
> > I have one concern and one question here.
> > Concern: the lock may be in a well designed 'packed' struct where it is
> > safe from the cross-line issue, but __alignof__ will still return 1:
> > 
> > struct abc{
> > raw_spinlock_t lock1;
> > char a;
> > char b;
> > }__attribute__((packed));
> > 
> > Since the lock is the first member of the struct, it is usually well placed.
> 
> No, it's actually not. The structure has an external alignment of 1, so
> if you have an array of these or put it into another struct like
> 
> struct xyz {
>   char x;
>   struct abc a;
> };
> 
> then it will be misaligned. There is no such thing as a well designed 'packed'
> struct. The only reason to use packing is to describe structures we have no
> control over such as hardware layouts and on-wire formats that have unusual
> alignments, and those will never have a spinlock on them.

Understood, thanks. So is the following the check that you wanted?
===
diff --git a/include/linux/rwlock.h b/include/linux/rwlock.h
index bc2994e..64828a3 100644
--- a/include/linux/rwlock.h
+++ b/include/linux/rwlock.h
@@ -21,10 +21,12 @@
 do {   \
static struct lock_class_key __key; \
\
+   BUILD_BUG_ON(__alignof__(lock) == 1);   \
__rwlock_init((lock), #lock, &__key);   \
 } while (0)
 #else
 # define rwlock_init(lock) \
+   BUILD_BUG_ON(__alignof__(lock) == 1);   \
do { *(lock) = __RW_LOCK_UNLOCKED(lock); } while (0)
 #endif
 
diff --git a/include/linux/spinlock.h b/include/linux/spinlock.h
index 7df6c17..df8a992 100644
--- a/include/linux/spinlock.h
+++ b/include/linux/spinlock.h
@@ -96,11 +96,13 @@
 do {   \
static struct lock_class_key __key; \
\
+   BUILD_BUG_ON(__alignof__(lock) == 1);   \
__raw_spin_lock_init((lock), #lock, &__key);\
 } while (0)
 
 #else
 # define raw_spin_lock_init(lock)  \
+   BUILD_BUG_ON(__alignof__(lock) == 1);   \
do { *(lock) = __RAW_SPIN_LOCK_UNLOCKED(lock); } while (0)
 #endif
 
===

Btw,
1. Is this an __alignof__ bug in the default gcc on my fc15, Ubuntu 11.10, etc.?

struct sub {
int  raw_lock;
char a;
};
struct foo {
struct sub z;
int slk;
char y;
}__attribute__((packed));

struct foo f1;

__alignof__(f1.z.raw_lock) is 4, but its address actually can align on
one byte. 

 
> 
>   Arnd
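
Arnd's point about the external alignment of a packed struct can be checked with a small stand-alone program. This is only a sketch, not kernel code: demo_spinlock_t is a hypothetical stand-in for raw_spinlock_t (assumed here to wrap a 4-byte int), and the asserted values assume a typical ABI where int has 4-byte alignment.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for raw_spinlock_t; assumed to wrap a 4-byte int. */
typedef struct { int raw; } demo_spinlock_t;

/* Same layout as the struct abc quoted above. */
struct abc {
    demo_spinlock_t lock1;
    char a;
    char b;
} __attribute__((packed));

/* Arnd's counter-example: one char in front of the packed struct. */
struct xyz {
    char x;
    struct abc a;
};

/* packed drops the external alignment of struct abc to 1 ... */
_Static_assert(__alignof__(struct abc) == 1, "packed struct has alignment 1");
/* ... so no padding is inserted between x and a. */
_Static_assert(offsetof(struct xyz, a) == 1, "abc starts right after x");

/* Byte offset of the lock from the start of a struct xyz. */
static size_t lock_offset_in_xyz(void)
{
    return offsetof(struct xyz, a) + offsetof(struct abc, lock1);
}

/* Nonzero if that offset is not a multiple of the lock type's alignment. */
static int lock_is_misplaced_in_xyz(void)
{
    return lock_offset_in_xyz() % __alignof__(demo_spinlock_t) != 0;
}

static void packed_layout_demo(void)
{
    assert(lock_offset_in_xyz() == 1);
    assert(lock_is_misplaced_in_xyz());
}
```

Calling packed_layout_demo() from a test main should pass silently: the lock sits at offset 1 even though it is the first member of struct abc.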




Re: User directed Function Multiversioning (MV) via Function Overloading

2012-03-07 Thread Richard Guenther
On Wed, Mar 7, 2012 at 1:42 AM, Sriraman Tallam  wrote:
> Hi,
>
> User directed Function Multiversioning (MV) via Function Overloading
> ===
>
> I have created a set of patches to add support for user directed
> function MV via function overloading. This was discussed in this
> thread previously:
> http://gcc.gnu.org/ml/gcc-patches/2011-04/msg02285.html
>
> Two patches have been created now to support this:
> * The patch with the front-end changes to support versioned functions is:
>  http://codereview.appspot.com/5752064/
>
> * The patch to add runtime CPU type detection support is here:
>  http://codereview.appspot.com/5754058/

Please post the patches to gcc-patches.

> With this support, here is an example of writing a program with
> function versions:
>
> int foo ();  /* Default version */
> int foo () __attribute__ ((targetv("arch=corei7"))); /* Specialized for corei7 */
> int foo () __attribute__ ((targetv("arch=amdfam10"))); /* Specialized for amdfam10 */

I don't like specifying 'arch' at all.  Instead you _always_ want architecture
feature tests, not architecture tests.  For example, does amdfam10 also cover
bdver1?  [It can't! bdver1 no longer has 3dnow!, but that's entirely
surprising for a user.]

Thus, only allow feature specifications.  [Why not re-use the existing
'target' attribute?]

I'll have a look at the patches once posted.

Richard.

>
> int main ()
> {
>  int (*p)() = &foo;
>  return foo () + (*p)();
> }
>
> int foo ()
> {
>  return 0;
> }
>
> int __attribute__ ((targetv("arch=corei7")))
> foo ()
> {
>  ...
>  return 0;
> }
>
> int __attribute__ ((targetv("arch=amdfam10")))
> foo ()
> {
>  ...
>  return 0;
> }
>
> The above example has foo defined 3 times, but all 3 definitions of
> foo are different versions of the same function. The call to foo in
> main, directly and via a pointer, are calls to the multi-versioned
> function foo which is dispatched to the right foo at run-time.
>
> Function versions must have the same signature but must differ in the
> specifier string provided to a new attribute called "targetv", which
> is nothing but the target attribute with an extra specification to
> indicate a version. Any number of versions can be created using the
> targetv attribute but it is mandatory to have one function without the
> attribute, which is treated as the default version. The front-end
> support is available in this patch:
>  http://codereview.appspot.com/5752064/
>
> The front-end treats multiple definitions of foo with the same
> signature but with different targetv attributes as legitimate
> candidates for overloading. Also, all the function versions of one
> function are grouped together. Then, calls to foo and pointer access
> of foo will be replaced by an IFUNC function (foo.ifunc) which will
> call the dispatcher code at run-time to figure out the right version
> to execute. For the above example, the following functions will be
> created :
>
> * _Z3foov.ifunc : ifunc dispatcher for multi-versioned function foo and
>  aliases to _Z3foov.resolver. All calls and pointer accesses to foo are
>  replaced by a call or pointer access to this function.
> * _Z3foov.resolver : The code to determine which version to execute at
>  run-time.
> * _Z3foov : The default version of foo.
> * _Z3foov.arch_corei7 : The corei7 version of foo.
> * _Z3foov.arch_amdfam10 : The amdfam10 version of foo.
>
> Note that using IFUNC  blocks inlining of versioned functions. I had
> implemented an optimization earlier to do hot path cloning to allow
> versioned functions to be inlined. Please see :
> http://gcc.gnu.org/ml/gcc-patches/2011-04/msg02285.html
> In the next iteration, I plan to merge these two. With that, hot code
> paths with versioned functions will be cloned so that versioned
> functions can be inlined.
>
> The version dispatch itself happens in a newly created pass added to
> be one of the initial lowering passes. The pass communicates with the
> target to determine the appropriate predicates to use to figure out
> which version to dispatch at run-time. The predicates are target
> builtins which determine the platform type at run-time and are added
> in this patch :
> http://codereview.appspot.com/5754058/
>
> The following features are being developed for the next iteration:
>
> 1) Support for hot path cloning to inline versioned functions.
> 2) Specifying multiple versions in a single function definition.
>
> This will be done using the following syntax:
> int foo ()
> __attribute__ ((targetv (("arch=corei7"),("arch=amdfam10"), ("arch=core2"))));
>
> which means the same body of foo must be cloned for corei7, amdfam10, and core2.
>
> 3) Specifying ISA types in the attribute. Only "arch=" is supported now.
>
> For example,
> int foo ()
> __attribute__ ((targetv ("popcnt,ssse3")));
>
> means the version is only to be executed when popcount and ssse3
> instructions are available.
>
> 4) Other dispatching mechanism.
>
> IFUNC i
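
The feature-based dispatch argued for in the reply above can be tried by hand, without the proposed targetv attribute. The following is only a sketch under stated assumptions and is not the mechanism in the patches: it uses the existing target attribute together with the __builtin_cpu_init and __builtin_cpu_supports builtins, which are assumed to be available (they are found in later GCC releases on x86, not in 4.7). The popcount_* names and select_popcount are hypothetical.

```c
#include <assert.h>

/* Portable fallback: clears the lowest set bit until none remain. */
static int popcount_generic(unsigned int x)
{
    int n = 0;
    while (x) {
        x &= x - 1;
        n++;
    }
    return n;
}

#if defined(__x86_64__) || defined(__i386__)
/* Built with POPCNT enabled for this one function only. */
__attribute__((target("popcnt")))
static int popcount_hw(unsigned int x)
{
    return __builtin_popcount(x);
}
#endif

typedef int (*popcount_fn)(unsigned int);

/* Hand-written "resolver": pick a version from a CPU feature test. */
static popcount_fn select_popcount(void)
{
#if defined(__x86_64__) || defined(__i386__)
    __builtin_cpu_init();
    if (__builtin_cpu_supports("popcnt"))
        return popcount_hw;
#endif
    return popcount_generic;
}

static void dispatch_demo(void)
{
    popcount_fn count = select_popcount();

    /* Whichever version was selected must agree with the fallback. */
    assert(count(0) == 0);
    assert(count(0xF0F0u) == 8);
    assert(count(0xF0F0u) == popcount_generic(0xF0F0u));
}
```

The selection keys on the feature ("popcnt") rather than on a CPU model name, so a newer CPU that keeps the feature picks the specialized version without any change to the source.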

Re: [RFC patch] spindep: add cross cache lines checking

2012-03-07 Thread Arnd Bergmann
On Wednesday 07 March 2012, Alex Shi wrote:

> Understood, thanks. So is the following the check that you wanted?
> ===
> diff --git a/include/linux/rwlock.h b/include/linux/rwlock.h
> index bc2994e..64828a3 100644
> --- a/include/linux/rwlock.h
> +++ b/include/linux/rwlock.h
> @@ -21,10 +21,12 @@
>  do { \
>   static struct lock_class_key __key; \
>   \
> + BUILD_BUG_ON(__alignof__(lock) == 1);   \
>   __rwlock_init((lock), #lock, &__key);   \
>  } while (0)
>  #else
>  # define rwlock_init(lock)   \
> + BUILD_BUG_ON(__alignof__(lock) == 1);   \
>   do { *(lock) = __RW_LOCK_UNLOCKED(lock); } while (0)
>  #endif

I think the check should be (__alignof__(lock) < __alignof__(rwlock_t)),
otherwise it will still pass when you have a structure with
attribute((packed,aligned(2))).
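
As a rough illustration of the difference between the two conditions, here is a stand-alone sketch. It is not the kernel macro: demo_rwlock_t is a hypothetical stand-in for rwlock_t (assumed to have 4-byte alignment), the two macros are made up for this example, and a member-level packed,aligned(2) attribute is used as a simple way to get an lvalue whose alignment is 2.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for rwlock_t; assumed to have 4-byte alignment. */
typedef struct { int raw; } demo_rwlock_t;

struct holder {
    char tag;
    /* Member-level packed + aligned(2): the member's alignment is 2, not 1. */
    demo_rwlock_t lock __attribute__((packed, aligned(2)));
};

/* The condition from the patch: only catches an alignment of exactly 1. */
#define LOCK_ALIGN_IS_ONE(lock) (__alignof__(lock) == 1)

/* The stricter condition: anything below the lock type's own alignment. */
#define LOCK_UNDERALIGNED(lock) \
    (__alignof__(lock) < __alignof__(demo_rwlock_t))

static struct holder h;

/* Both conditions are compile-time constants, as BUILD_BUG_ON requires. */
_Static_assert(!LOCK_ALIGN_IS_ONE(h.lock), "the == 1 test lets this through");
_Static_assert(LOCK_UNDERALIGNED(h.lock), "the < test rejects it");

static void align_check_demo(void)
{
    assert(__alignof__(h.lock) == 2);
    assert(offsetof(struct holder, lock) == 2);
}
```

With the == 1 condition this lock would be accepted even though it can sit on an address that is only 2-byte aligned; the < comparison flags it.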

> 1. Is this an __alignof__ bug in the default gcc on my fc15, Ubuntu 11.10, etc.?
> 
> struct sub {
> int  raw_lock;
> char a;
> };
> struct foo {
> struct sub z;
> int slk;
> char y;
> }__attribute__((packed));
> 
> struct foo f1;
> 
> __alignof__(f1.z.raw_lock) is 4, but its address actually can align on
> one byte. 

That looks like correct behavior, because the alignment of raw_lock inside of
struct sub is still 4. But it does mean that there can be cases where the
compile-time check is not sufficient, so we might want the run-time check
as well, at least under some config option.

Arnd
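
A run-time check of the kind mentioned above could look roughly like the sketch below. It is an illustration only: addr_is_aligned is a hypothetical helper, and struct foo_shifted is a variation of the struct foo from the mail with a leading char added so that the misalignment is deterministic.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

struct sub {
    int raw_lock;
    char a;
};

/* Like struct foo in the mail, with a leading char to force an odd offset. */
struct foo_shifted {
    char pad;
    struct sub z;
    int slk;
    char y;
} __attribute__((packed));

/* Run-time test: is the address a multiple of the required alignment? */
static int addr_is_aligned(const void *p, size_t align)
{
    return ((uintptr_t)p % align) == 0;
}

static void runtime_check_demo(void)
{
    /* The object itself gets 8-byte alignment so the result is deterministic. */
    static struct foo_shifted f1 __attribute__((aligned(8)));
    const char *base = (const char *)&f1;
    const void *lock_addr = base + offsetof(struct foo_shifted, z)
                                 + offsetof(struct sub, raw_lock);

    /* Compile time: the member reports the alignment it has inside struct sub. */
    assert(__alignof__(f1.z.raw_lock) == __alignof__(int));
    /* Run time: the address is one byte past an aligned boundary. */
    assert(addr_is_aligned(base, 8));
    assert(!addr_is_aligned(lock_addr, __alignof__(int)));
}
```

This is the case the compile-time check cannot see: the alignment reported for raw_lock comes from struct sub, while the packing that misplaces it is applied one level up.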


Re: [RFC patch] spindep: add cross cache lines checking

2012-03-07 Thread Alex Shi

> I think the check should be (__alignof__(lock) < __alignof__(rwlock_t)),
> otherwise it will still pass when you have structure with 
> attribute((packed,aligned(2)))


reasonable!

> 
>> 1. Is this an __alignof__ bug in the default gcc on my fc15, Ubuntu 11.10, etc.?
>>
>> struct sub {
>> int  raw_lock;
>> char a;
>> };
>> struct foo {
>> struct sub z;
>> int slk;
>> char y;
>> }__attribute__((packed));
>>
>> struct foo f1;
>>
>> __alignof__(f1.z.raw_lock) is 4, but its address actually can align on
>> one byte. 
> 
> That looks like correct behavior, because the alignment of raw_lock inside of
> struct sub is still 4. But it does mean that there can be cases where the
> compile-time check is not sufficient, so we might want the run-time check
> as well, at least under some config option.


what's your opinion of this, Ingo?


Re: [RFC patch] spindep: add cross cache lines checking

2012-03-07 Thread Ingo Molnar

* Alex Shi  wrote:

> > I think the check should be (__alignof__(lock) < 
> > __alignof__(rwlock_t)), otherwise it will still pass when 
> > you have structure with attribute((packed,aligned(2)))
> 
> reasonable!
> 
> >> 1. Is this an __alignof__ bug in the default gcc on my fc15, Ubuntu 11.10, etc.?
> >>
> >> struct sub {
> >> int  raw_lock;
> >> char a;
> >> };
> >> struct foo {
> >> struct sub z;
> >> int slk;
> >> char y;
> >> }__attribute__((packed));
> >>
> >> struct foo f1;
> >>
> >> __alignof__(f1.z.raw_lock) is 4, but its address actually can align on
> >> one byte. 
> > 
> > That looks like correct behavior, because the alignment of 
> > raw_lock inside of struct sub is still 4. But it does mean 
> > that there can be cases where the compile-time check is not 
> > sufficient, so we might want the run-time check as well, at 
> > least under some config option.
> 
> what's your opinion of this, Ingo?

Dunno. How many real bugs have you found via this patch?

Thanks,

Ingo


Re: GCC 4.7.0 Release Candidate available from gcc.gnu.org

2012-03-07 Thread NightStrike
On Fri, Mar 2, 2012 at 8:44 AM, Richard Guenther  wrote:
>
> GCC 4.7.0 Release Candidate available from gcc.gnu.org
>
> The first release candidate for GCC 4.7.0 is available from
>
>  ftp://gcc.gnu.org/pub/gcc/snapshots/4.7.0-RC-20120302
>
> and shortly its mirrors.  It has been generated from SVN revision 184777.
>
> I have so far bootstrapped and tested the release candidate on
> x86_64-linux.  Please test it and report any issues to bugzilla.
>
> If all goes well, I'd like to release 4.7.0 in about three weeks.


Building gmp/mpfr/mpc in tree fails in the configure-stage1-mpc step
with the current version of mpfr version 3.1.0, out since last
October, and mpc, version 0.9, out since Feb of 2011.  I'm guessing
the sources moved or something.

For instance, just to get the configure step to pass, I had to change
the last line of the configure step in the generated Makefile from
this:

  --disable-shared --with-gmp-include=$$r/$(HOST_SUBDIR)/gmp
--with-gmp-lib=$$r/$(HOST_SUBDIR)/gmp/.libs
--with-mpfr-include=$$s/mpfr
--with-mpfr-lib=$$r/$(HOST_SUBDIR)/mpfr/.libs

to this:

  --disable-shared --with-gmp-include=$$r/$(HOST_SUBDIR)/gmp
--with-gmp-lib=$$r/$(HOST_SUBDIR)/gmp/.libs
--with-mpfr-include=$$s/mpfr/src
--with-mpfr-lib=$$r/$(HOST_SUBDIR)/mpfr/.libs


The key change is adding /src to the end of --with-mpfr-include, giving $$s/mpfr/src.


That gets the build further, but it still doesn't work.

This same problem affects 4.6.


GCC 4.7.0 and C++ atomics

2012-03-07 Thread Sebastian Huber

Hello,

I ran the GCC testsuite for GCC 4.7.0 20120307

http://gcc.gnu.org/ml/gcc-testresults/2012-03/msg00782.html

I got a lot of errors like this:

FAIL: g++.dg/simulate-thread/atomics-1.C  -O0 -g  (test for excess errors)
Excess errors:
/home/sh/rtems-testing/gcc/b-arm-gcc/arm-rtemseabi4.11/libstdc++-v3/include/bits/atomic_base.h:458: undefined reference to `__sync_synchronize'
/home/sh/rtems-testing/gcc/b-arm-gcc/arm-rtemseabi4.11/libstdc++-v3/include/bits/atomic_base.h:458: undefined reference to `__sync_synchronize'
/home/sh/rtems-testing/gcc/b-arm-gcc/arm-rtemseabi4.11/libstdc++-v3/include/bits/atomic_base.h:458: undefined reference to `__sync_synchronize'
/home/sh/rtems-testing/gcc/b-arm-gcc/arm-rtemseabi4.11/libstdc++-v3/include/bits/atomic_base.h:458: undefined reference to `__sync_synchronize'
/home/sh/rtems-testing/gcc/b-arm-gcc/arm-rtemseabi4.11/libstdc++-v3/include/bits/atomic_base.h:438: undefined reference to `__sync_synchronize'
/tmp/ccgSyPe7.o:/home/sh/rtems-testing/gcc/b-arm-gcc/arm-rtemseabi4.11/libstdc++-v3/include/bits/atomic_base.h:438: more undefined references to `__sync_synchronize' follow


Who is supposed to provide this function?

--
Sebastian Huber, embedded brains GmbH

Address : Obere Lagerstr. 30, D-82178 Puchheim, Germany
Phone   : +49 89 18 90 80 79-6
Fax : +49 89 18 90 80 79-9
E-Mail  : sebastian.hu...@embedded-brains.de
PGP : Public key available on request.

This message is not a business communication within the meaning of the EHUG.


Re: GCC 4.7.0 Release Candidate available from gcc.gnu.org

2012-03-07 Thread Marc Glisse

On Wed, 7 Mar 2012, NightStrike wrote:


> Building gmp/mpfr/mpc in tree fails in the configure-stage1-mpc step
> with the current version of mpfr version 3.1.0, out since last
> October, and mpc, version 0.9, out since Feb of 2011.  I'm guessing
> the sources moved or something.


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=50461
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=51935
(second one has a patch)

--
Marc Glisse


Re: User directed Function Multiversioning (MV) via Function Overloading

2012-03-07 Thread Andi Kleen
Richard Guenther  writes:
>
> I don't like specifying 'arch' at all.  Instead you _always_ want architecture
> feature tests, not architecture tests.  Because, does amdfam10 also cover
> bdver1?  [it can't! bdver1 does no longer have 3dnow! but that's entirely
> surprising for a user]

There's still a case for checking for specific models: some
microarchitectures prefer specific code paths over others, without
necessarily using special features.

-Andi
-- 
a...@linux.intel.com -- Speaking for myself only


questions about dependence analysis in CG

2012-03-07 Thread Peng Zhao





Hello,

I am a little confused by the code in sched-deps.c.

1. What is the purpose of the reg_note and ds_t? I see the function dk_to_ds,
and the comment in sched-int.h: "Dependence on instruction can be of multiple
types (e.g. true and output). This fields enhance REG_NOTE_KIND information of
the dependence." What is enhanced in going from REG_DEP_TRUE to DEP_TRUE?

In struct _dep, I can only find dependences that are related to REG_NOTE (or
registers). Even ds_t only has DEP_TRUE/DEP_OUTPUT/DEP_ANTI. Does this mean that
gcc doesn't differentiate register dependences from memory dependences?

Where can I find information about memory dependences, such as a write and a
read on element a[100]? Sometimes a memory dependence should be handled
differently from a register dependence in CG, for example in the latency between
the producer and the consumer.

2. What different purposes do haifa_note_mem_dep and haifa_note_dep serve?

thanks


Re: GCC 4.7.0 and C++ atomics

2012-03-07 Thread Andrew MacLeod

On 03/07/2012 10:11 AM, Sebastian Huber wrote:

Hello,

I run the GCC testsuite for GCC 4.7.0 20120307

http://gcc.gnu.org/ml/gcc-testresults/2012-03/msg00782.html

I got a lot of errors like this:

FAIL: g++.dg/simulate-thread/atomics-1.C  -O0 -g  (test for excess errors)
Excess errors:
/home/sh/rtems-testing/gcc/b-arm-gcc/arm-rtemseabi4.11/libstdc++-v3/include/bits/atomic_base.h:458: undefined reference to `__sync_synchronize'
/home/sh/rtems-testing/gcc/b-arm-gcc/arm-rtemseabi4.11/libstdc++-v3/include/bits/atomic_base.h:458: undefined reference to `__sync_synchronize'
/home/sh/rtems-testing/gcc/b-arm-gcc/arm-rtemseabi4.11/libstdc++-v3/include/bits/atomic_base.h:458: undefined reference to `__sync_synchronize'
/home/sh/rtems-testing/gcc/b-arm-gcc/arm-rtemseabi4.11/libstdc++-v3/include/bits/atomic_base.h:458: undefined reference to `__sync_synchronize'
/home/sh/rtems-testing/gcc/b-arm-gcc/arm-rtemseabi4.11/libstdc++-v3/include/bits/atomic_base.h:438: undefined reference to `__sync_synchronize'
/tmp/ccgSyPe7.o:/home/sh/rtems-testing/gcc/b-arm-gcc/arm-rtemseabi4.11/libstdc++-v3/include/bits/atomic_base.h:438: more undefined references to `__sync_synchronize' follow


Who is supposed to provide this function? expand_mem_thread_fence

Looks like it is defaulting to emitting the synchronize_libfunc call,
which is supposed to be provided by libgcc, I think...


rth, you are familiar with how this part is supposed to hook up properly...

I traced the code in expand_mem_thread_fence, and the sync_synchronize
call is being emitted by:


else if (synchronize_libfunc != NULL_RTX)
  emit_library_call (synchronize_libfunc, LCT_NORMAL, VOIDmode, 0);

Presumably something just isn't being linked into the executable, or
maybe isn't being built into libgcc?

Andrew





Re: GCC 4.7.0 and C++ atomics

2012-03-07 Thread Joel Sherrill

On 03/07/2012 12:44 PM, Andrew MacLeod wrote:

On 03/07/2012 10:11 AM, Sebastian Huber wrote:

Hello,

I run the GCC testsuite for GCC 4.7.0 20120307

http://gcc.gnu.org/ml/gcc-testresults/2012-03/msg00782.html

I got a lot of errors like this:

FAIL: g++.dg/simulate-thread/atomics-1.C  -O0 -g  (test for excess errors)
Excess errors:
/home/sh/rtems-testing/gcc/b-arm-gcc/arm-rtemseabi4.11/libstdc++-v3/include/bits/atomic_base.h:458: undefined reference to `__sync_synchronize'
/home/sh/rtems-testing/gcc/b-arm-gcc/arm-rtemseabi4.11/libstdc++-v3/include/bits/atomic_base.h:458: undefined reference to `__sync_synchronize'
/home/sh/rtems-testing/gcc/b-arm-gcc/arm-rtemseabi4.11/libstdc++-v3/include/bits/atomic_base.h:458: undefined reference to `__sync_synchronize'
/home/sh/rtems-testing/gcc/b-arm-gcc/arm-rtemseabi4.11/libstdc++-v3/include/bits/atomic_base.h:458: undefined reference to `__sync_synchronize'
/home/sh/rtems-testing/gcc/b-arm-gcc/arm-rtemseabi4.11/libstdc++-v3/include/bits/atomic_base.h:438: undefined reference to `__sync_synchronize'
/tmp/ccgSyPe7.o:/home/sh/rtems-testing/gcc/b-arm-gcc/arm-rtemseabi4.11/libstdc++-v3/include/bits/atomic_base.h:438: more undefined references to `__sync_synchronize' follow

Who is supposed to provide this function? expand_mem_thread_fence


Looks like it is defaulting to emitting the synchronize_libfunc call,
which is supposed to be provided by libgcc, I think...

rth, you are familiar with how this part is supposed to hook up properly...

I traced the code in expand_mem_thread_fence, and the sync_synchronize
call is being emitted by:

else if (synchronize_libfunc != NULL_RTX)
  emit_library_call (synchronize_libfunc, LCT_NORMAL, VOIDmode, 0);

presumably something just isn't being linked to the executable?  or
maybe not being built into libgcc?

I haven't looked into this specific case but often it is a case of
a multilib variant not being covered by the existing code.

Andrew






--
Joel Sherrill, Ph.D.             Director of Research & Development
joel.sherr...@oarcorp.com        On-Line Applications Research
Ask me about RTEMS: a free RTOS  Huntsville AL 35805
Support Available                (256) 722-9985




Re: GCC 4.7.0 and C++ atomics

2012-03-07 Thread Richard Henderson
On 03/07/12 10:44, Andrew MacLeod wrote:
> rth, you are familiar with how this part is supposed to hook up properly...
> 
> I traced the code in expand_mem_thread_fence, and the sync_synchronize is 
> being emitted by:
> 
> else if (synchronize_libfunc != NULL_RTX)
>   emit_library_call (synchronize_libfunc, LCT_NORMAL, VOIDmode, 0);
> 
> presumably something just isn't being linked to the executable?  or maybe not 
> being built into libgcc?

These functions are in libgcc, but only built for linux.
See libgcc/config/arm/t-linux-eabi.

  if (TARGET_AAPCS_BASED)
    synchronize_libfunc = init_one_libfunc ("__sync_synchronize");

I assume this is because the default fallback, lacking an optab,
for synchronize is a no-op, and this is actively incorrect for
many ARM cpu revisions.

I also assume that rtems is now encountering this because of the
switch from arm-elf to arm-elf-eabi.  In order to finish the port
to the eabi, rtems will need to provide this symbol somehow.

If rtems is always universally built explicitly targeting a
specific cpu revision, then this can be as simple as

void
__sync_synchronize (void)
{
#if defined(arm revisions supporting dmb)
  asm volatile("dmb" : : : "memory");
#else
  asm volatile("" : : : "memory");
#endif
}

Otherwise you may need help from the rtems kernel, as linux does.
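
To make the placeholder condition above a bit more concrete, here is one possible sketch. It rests on assumptions that are not from this thread: the function is given the hypothetical name port_sync_synchronize so that it builds on any host (a real port would export it as __sync_synchronize), and the list of predefined macros only covers the ARMv7 profiles and is not meant to be exhaustive.

```c
#include <assert.h>

/*
 * Hypothetical name for this sketch; a real port would export the
 * function as __sync_synchronize so that the libcall emitted by GCC
 * resolves at link time.
 */
static void port_sync_synchronize(void)
{
#if defined(__ARM_ARCH_7A__) || defined(__ARM_ARCH_7R__) || defined(__ARM_ARCH_7M__)
    /* ARMv7 profiles provide the DMB data memory barrier instruction. */
    __asm__ volatile("dmb" : : : "memory");
#else
    /* Assumption: no hardware barrier needed; only stop compiler reordering. */
    __asm__ volatile("" : : : "memory");
#endif
}

static int payload;
static int ready;

static void fence_demo(void)
{
    payload = 42;
    port_sync_synchronize();    /* order the payload store before the flag */
    ready = 1;
    assert(ready == 1 && payload == 42);
}
```

Whether the fallback branch is acceptable depends on the CPU revisions the port targets; as noted above, a plain no-op is actively incorrect on many of them.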


r~


Re: gcc-4.3-20120304 is now available

2012-03-07 Thread Gerald Pfeifer
On Sun, 4 Mar 2012, gccad...@gcc.gnu.org wrote:
> Snapshot gcc-4.3-20120304 is now available on
>   ftp://gcc.gnu.org/pub/gcc/snapshots/4.3-20120304/

What happened here?  Some collateral damage while getting the
GCC 4.7 branch set up?

Gerald


Re: [RFC patch] spindep: add cross cache lines checking

2012-03-07 Thread Alex Shi
On Wed, 2012-03-07 at 14:39 +0100, Ingo Molnar wrote:
> * Alex Shi  wrote:
> 
> > > I think the check should be (__alignof__(lock) < 
> > > __alignof__(rwlock_t)), otherwise it will still pass when 
> > > you have structure with attribute((packed,aligned(2)))
> > 
> > reasonable!
> > 
> > >> 1. Is this an __alignof__ bug in the default gcc on my fc15, Ubuntu 11.10, etc.?
> > >>
> > >> struct sub {
> > >> int  raw_lock;
> > >> char a;
> > >> };
> > >> struct foo {
> > >> struct sub z;
> > >> int slk;
> > >> char y;
> > >> }__attribute__((packed));
> > >>
> > >> struct foo f1;
> > >>
> > >> __alignof__(f1.z.raw_lock) is 4, but its address actually can align on
> > >> one byte. 
> > > 
> > > That looks like correct behavior, because the alignment of 
> > > raw_lock inside of struct sub is still 4. But it does mean 
> > > that there can be cases where the compile-time check is not 
> > > sufficient, so we might want the run-time check as well, at 
> > > least under some config option.
> > 
> > what's your opinion of this, Ingo?
> 
> Dunno. How many real bugs have you found via this patch?

None. I guess such stupid code got shot down in lkml review. But if the patch
goes in, it will help block stupid code during development.
> 
> Thanks,
> 
>   Ingo




Why are libgcc.a and libgcc_eh.a compiled with -fvisibility=hidden?

2012-03-07 Thread Ollie Wild
For reasons outside the scope of this discussion, we're experimenting
with statically linking libgcc.a and libgcc_eh.a into dynamically
linked applications which depend on libc but no other dynamic
libraries.  To make this work, libc needs to access a few functions
for stack unwinding inside pthread_cancel.  With suitable
modifications, everything works, except for one problem: libgcc_eh.a
is compiled with -fvisibility=hidden.

Now, I can put together a hack in our local source tree to remove this
... but why is this the case at all?  It might make sense for
libgcc_s.so, but that's compiled with default visibility (and the set
of explicitly visible symbols is broken)?  The only other use case I
can think of is for shared libraries which (for some reason) want to
embed private copies of these libraries, but on x86_64, libgcc*.a get
compiled by default without -fPIC, so that doesn't even work.

So ... is there a valid reason for this, or is this just an accident
of history?  AFAICT, this behavior dates back to 2007 as of r120429
(http://gcc.gnu.org/viewcvs/trunk/libgcc/static-object.mk?view=markup&pathrev=120429).

If no one knows a valid reason for this, I'll submit a patch to remove
it from trunk.  Otherwise, I'll just modify it locally.

Thanks,
Ollie


Re: [RFC patch] spindep: add cross cache lines checking

2012-03-07 Thread Alex Shi
> > 1. Is this an __alignof__ bug in the default gcc on my fc15, Ubuntu 11.10, etc.?
> > 
> > struct sub {
> > int  raw_lock;
> > char a;
> > };
> > struct foo {
> > struct sub z;
> > int slk;
> > char y;
> > }__attribute__((packed));
> > 
> > struct foo f1;
> > 
> > __alignof__(f1.z.raw_lock) is 4, but its address actually can align on
> > one byte. 
> 
> That looks like correct behavior, because the alignment of raw_lock inside of
> struct sub is still 4. But it does mean that there can be cases where the
> compile-time check is not sufficient, so we might want the run-time check
> as well, at least under some config option.

According to the gcc documentation, it seems __alignof__ should return 1
when the object can end up aligned on a single byte. That would make it
useful for the intended check. Any comments from the gcc guys?

http://gcc.gnu.org/onlinedocs/gcc/Alignment.html
The keyword __alignof__ allows you to inquire about how an object is
aligned, or the minimum alignment usually required by a type. Its syntax
is just like sizeof.





Re: questions about dependence analysis in CG

2012-03-07 Thread Ian Lance Taylor
Peng Zhao  writes:

> 1. what is the purpose of the reg_note and ds_t?

The reg-note lives in the RTL.  ds_t is the same information represented
as a bitflag.  E.g., see how the bits are accumulated in a ds_t variable
in ask_dependency_caches.

> Where can I find information about memory dependences such as a write
> and read on elementa[100] ? Sometimes meory dependence should be
> handled differenly with reg dependece in CG, such as the latency
> between the producer and the consumer.

The reg-notes do handle some memory dependencies.  Also, see
true_dependence and friends in alias.c.  Richard Guenther probably has a
better answer.

> 2. What different purpose are haifa_note_mem_dep and haifa_note_dep for?

To add a memory dependency.

Ian


Re: Why are libgcc.a and libgcc_eh.a compiled with -fvisibility=hidden?

2012-03-07 Thread Ian Lance Taylor
Ollie Wild  writes:

> So ... is there a valid reason for this, or is this just an accident
> of history?  AFICT, this behavior dates back to 2007 as of r120429
> (http://gcc.gnu.org/viewcvs/trunk/libgcc/static-object.mk?view=markup&pathrev=120429).

No, that's not right.  That change just moves the libgcc build out of
gcc over to libgcc as a top-level directory.  It didn't introduce the
visibility hiding.  I think that dates back to revision 50063.

2002-02-26  Jakub Jelinek  

* configure.in (libgcc_visibility): Substitute.
* configure: Rebuilt.
* mklibgcc.in: If libgcc_visibility = yes, make libgcc.a global
defined symbols .hidden.


The patch was discussed here:
http://gcc.gnu.org/ml/gcc-patches/2002-02/msg01856.html .

However, I don't see any discussion of why this change was made.

CC'ed Jakub to see if he remembers.

Ian


Re: Why are libgcc.a and libgcc_eh.a compiled with -fvisibility=hidden?

2012-03-07 Thread Eric Botcazou
> So ... is there a valid reason for this, or is this just an accident
> of history?  AFICT, this behavior dates back to 2007 as of r120429
> (http://gcc.gnu.org/viewcvs/trunk/libgcc/static-object.mk?view=markup&pathrev=120429).

At least on some platforms, you cannot have more than one libgcc_eh in an 
entire link, otherwise you cannot propagate exceptions across modules.

-- 
Eric Botcazou


Restricting with Multilib

2012-03-07 Thread Mohamed Shafi
Hi,

The target that I am porting needs a cpu command-line option, i.e.
it doesn't have a default option. Currently it takes 3 variants, say
cpu1, cpu2, cpu3.

So when I enable the multilib option

MULTILIB_OPTIONS = mcpu=1/mcpu=2/mcpu=3

I get the following libgcc variants:

cpu1/libgcc
cpu2/libgcc
cpu3/libgcc
libgcc

That includes a variant for each cpu and a default version. Is there
any way to restrict GCC from building the default version?

Regards,
Shafi


Re: Why are libgcc.a and libgcc_eh.a compiled with -fvisibility=hidden?

2012-03-07 Thread Ian Lance Taylor
Eric Botcazou  writes:

>> So ... is there a valid reason for this, or is this just an accident
>> of history?  AFICT, this behavior dates back to 2007 as of r120429
>> (http://gcc.gnu.org/viewcvs/trunk/libgcc/static-object.mk?view=markup&pathrev=120429).
>
> At least on some platforms, you cannot have more than one libgcc_eh in an 
> entire link, otherwise you cannot propagate exceptions across modules.

True, but not, as far as I can see, an explanation for why the symbols
are hidden.  Hiding the symbols doesn't fix the problem of having
multiple libgcc_eh on those platforms.

Ian


Re: [RFC patch] spindep: add cross cache lines checking

2012-03-07 Thread Ingo Molnar

* Alex Shi  wrote:

> On Wed, 2012-03-07 at 14:39 +0100, Ingo Molnar wrote:
> > * Alex Shi  wrote:
> > 
> > > > I think the check should be (__alignof__(lock) < 
> > > > __alignof__(rwlock_t)), otherwise it will still pass when 
> > > > you have structure with attribute((packed,aligned(2)))
> > > 
> > > reasonable!
> > > 
> > > >> 1. Is this an __alignof__ bug in the default gcc on my fc15, Ubuntu 11.10, etc.?
> > > >>
> > > >> struct sub {
> > > >> int  raw_lock;
> > > >> char a;
> > > >> };
> > > >> struct foo {
> > > >> struct sub z;
> > > >> int slk;
> > > >> char y;
> > > >> }__attribute__((packed));
> > > >>
> > > >> struct foo f1;
> > > >>
> > > >> __alignof__(f1.z.raw_lock) is 4, but its address actually can align on
> > > >> one byte. 
> > > > 
> > > > That looks like correct behavior, because the alignment of 
> > > > raw_lock inside of struct sub is still 4. But it does mean 
> > > > that there can be cases where the compile-time check is not 
> > > > sufficient, so we might want the run-time check as well, at 
> > > > least under some config option.
> > > 
> > > what's your opinion of this, Ingo?
> > 
> > Dunno. How many real bugs have you found via this patch?
> 
> None. Guess stupid code was shot in lkml reviewing. But if the 
> patch in, it is helpful to block stupid code in developing.

The question is, if in the last 10 years not a single such case 
made it through to today's 15 million lines of kernel code, why 
should we add the check now?

If it was a simple build time check then maybe, but judging by 
the discussion it does not seem so simple, does it?

Thanks,

Ingo


Re: Why are libgcc.a and libgcc_eh.a compiled with -fvisibility=hidden?

2012-03-07 Thread Eric Botcazou
> True, but not, as far as I can see, an explanation for why the symbols
> are hidden.  Hiding the symbols doesn't fix the problem of having
> multiple libgcc_eh on those platforms.

Yes, it does, as it prevents libgcc_eh from being linked in shared libraries, 
thus forcing you to use libgcc_s.so, at least on those platforms.

-- 
Eric Botcazou


Re: Why are libgcc.a and libgcc_eh.a compiled with -fvisibility=hidden?

2012-03-07 Thread Ian Lance Taylor
Eric Botcazou  writes:

>> True, but not, as far as I can see, an explanation for why the symbols
>> are hidden.  Hiding the symbols doesn't fix the problem of having
>> multiple libgcc_eh on those platforms.
>
> Yes, it does, as it prevents libgcc_eh from being linked in shared libraries, 
> thus forcing you to use libgcc_s.so, at least on those platforms.

I'm sorry, I'm still missing something. I don't understand how having
hidden symbols prevents libgcc_eh from being linked into shared
libraries.  I mean, what is stopping you (aside from the fact that
libgcc_eh is not compiled with -fPIC)?  After all, you can link hidden
symbols into a shared library; the library will use those symbols but
will not expose them for use by other libraries.

And it seems to me that if you link libgcc_eh into your main executable,
then things actually would work if the symbols were not hidden--all the
shared libraries would use the version in the main executable.

Ian
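
The behaviour of hidden symbols described above can be seen in a tiny example. This is a sketch with hypothetical names (demo_unwind_helper, demo_public_entry) and has nothing to do with the real libgcc_eh sources; it only shows that code inside a library can call a hidden function while only the default-visibility function is part of the exported interface.

```c
#include <assert.h>

/* Hidden: usable inside the shared object it is linked into, not exported. */
__attribute__((visibility("hidden")))
int demo_unwind_helper(int frames)
{
    return frames + 1;
}

/* Default visibility: part of the library's exported interface. */
__attribute__((visibility("default")))
int demo_public_entry(int frames)
{
    /* The library's own code can still call the hidden symbol. */
    return demo_unwind_helper(frames);
}

static void visibility_demo(void)
{
    assert(demo_public_entry(3) == 4);
}
```

On an ELF system, one way to look at this would be to build the file with gcc -shared -fPIC and list the dynamic symbols with nm -D; only demo_public_entry should show up as a defined, exported function.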