[Bug libstdc++/65033] New: C++11 atomics: is_lock_free result does not always match the real lock-free property

2015-02-11 Thread bin.x.fan at oracle dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=65033

Bug ID: 65033
   Summary: C++11 atomics: is_lock_free result does not always
match the real lock-free property
   Product: gcc
   Version: 4.9.2
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: libstdc++
  Assignee: unassigned at gcc dot gnu.org
  Reporter: bin.x.fan at oracle dot com

Hi,

The is_lock_free result for an object of type atomic<s3_t>, where s3_t is a
C-style struct with size=3 and alignment=1, does not always match what
libatomic.so actually does for atomic operations on this object. I think that
either there is a bug in the g++ header <atomic>, so that the g++ 4.9.2
implementation does not conform to the C++11 standard, or there is a bug in
libatomic.so.

Here is the source code

-bash-4.1$ cat struct3.cc
#include <atomic>
#include 
using namespace std;
#define N 10
struct s3_t {
  char a[3];
};
atomic<s3_t> array[N];
s3_t obj;

int main()
{
  int i;
  for (i=0;i [...]
  [1] libat_lock_n(ptr = 0x216c6, n = 3U), line 64 in "lock.c"
  [2] libat_store(n = 3U, mptr = 0x216c6, vptr = 0xffbff580, smodel = 5), line 100 in "gstore.c"
  [3] std::atomic::store(this = 0x216c6, __i = STRUCT, _m = memory_order_seq_cst), line 199 in "atomic"
  [4] std::atomic_store_explicit(__a = 0x216c6, __i = STRUCT, __m = memory_order_seq_cst), line 828 in "atomic"
  [5] std::atomic_store(__a = 0x216c6, __i = STRUCT), line 895 in "atomic"
  [6] main(), line 22 in "struct3.cc"

So one of the following two things could be happening here:
1. g++ makes the lock-free property per-object, which does not conform to the
C++11 standard, and reports it incorrectly through atomic_is_lock_free, or
2. g++ tries to make the lock-free property per-type, but the libatomic.so
implementation does not match. Also, without changing the alignment, I doubt
that a size=3, alignment=1 atomic object can always be lock-free on SPARC or
x86.


[Bug libstdc++/65033] C++11 atomics: is_lock_free result does not always match the real lock-free property

2015-02-12 Thread bin.x.fan at oracle dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=65033

--- Comment #5 from Bin Fan <bin.x.fan at oracle dot com> ---
(In reply to Jason Merrill from comment #3)
> (In reply to Bin Fan from comment #0)
> > 2. g++ tries to make lock-free property per-type, but the libatomic.so
> > implementation does not match.
> 
> This.  We always pass a null pointer to libatomic and do not pass any
> information about the alignment of the type.  rth suggested that we might
> try passing a fake, minimally-aligned pointer instead of null as a way of
> communicating the alignment without adding a new entry point.

So after the fix, will atomic_is_lock_free always return 0 for size=3, align=1
atomic struct objects?

I understand that libatomic currently tries to make an atomic object lock-free
if its memory location fits in a window of a certain size. So for atomic
operations such as atomic_store, where the actual address is passed in, the
operation can still be either lock-free or locked, right? I'm wondering whether
this conforms to the standard, since the lock-free property is still
per-object, or whether it can be seen as an optimization, i.e. the
atomic_is_lock_free query for the object returns 0, but atomic operations on
the object could still be lock-free.
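
The window rule described above can be sketched as follows. This is only an
illustration and not libatomic's code: the helper names fits_in_window and
count_fitting are hypothetical, and the window size is a parameter because the
thread does not state it. It shows why the answer depends on the address of
each object rather than on its type.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper (not a libatomic function): returns true when the
   object [addr, addr + size) lies entirely inside one naturally aligned
   window of `window` bytes.  `window` must be a power of two.  */
static bool fits_in_window (uintptr_t addr, size_t size, size_t window)
{
  if (size == 0 || size > window)
    return false;
  uintptr_t offset = addr & (window - 1);   /* position inside the window */
  return offset + size <= window;
}

/* Counts how many elements of an array of `count` objects of `size` bytes,
   starting at `base`, fit in a window.  Elements are assumed to be packed
   back to back, as for a size=3, alignment=1 struct.  */
static size_t count_fitting (uintptr_t base, size_t size, size_t count,
                             size_t window)
{
  size_t fitting = 0;
  for (size_t i = 0; i < count; i++)
    if (fits_in_window (base + i * size, size, window))
      fitting++;
  return fitting;
}
```

Assuming an 8-byte window, fits_in_window (0x216c6, 3, 8) is false, because
offset 6 plus 3 bytes runs past the window, while fits_in_window (0x216c0, 3, 8)
is true. The address 0x216c6 is the one that reaches libat_lock_n in the stack
trace of comment #0.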


[Bug c/65083] New: Can not indirectly call some C11 atomic library functions

2015-02-16 Thread bin.x.fan at oracle dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=65083

Bug ID: 65083
   Summary: Can not indirectly call some C11 atomic library
functions
   Product: gcc
   Version: 4.9.2
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: c
  Assignee: unassigned at gcc dot gnu.org
  Reporter: bin.x.fan at oracle dot com

C11 defines these as actual functions, not generic functions or macros:

  atomic_thread_fence
  atomic_signal_fence
  atomic_flag_test_and_set
  atomic_flag_test_and_set_explicit
  atomic_flag_clear
  atomic_flag_clear_explicit

Users should be able to take their addresses and call them indirectly. However,
GCC does not provide definitions of these functions in libatomic.so, so the
user cannot take the address of these functions.

Here is an example:

-bash-4.1$ gcc -v
Using built-in specs.
COLLECT_GCC=/net/dv104/export/tools/gcc/4.9.2/sparc-S2/bin/gcc.bin
COLLECT_LTO_WRAPPER=/net/dv104/export/tools/gcc/4.9.2/sparc-S2/libexec/gcc/sparc-sun-solaris2.10/4.9.2/lto-wrapper
Target: sparc-sun-solaris2.10
Configured with: ../gcc-4.9.2/configure
--prefix=/net/dv104/export/tools/gcc/4.9.2/sparc-S2
--enable-languages=c,c++,fortran
--with-gmp=/net/dv104/export/tools/gcc/4.9.2/sparc-S2
--with-mpfr=/net/dv104/export/tools/gcc/4.9.2/sparc-S2
--with-mpc=/net/dv104/export/tools/gcc/4.9.2/sparc-S2
Thread model: posix
gcc version 4.9.2 (GCC) 

-bash-4.1$ cat t.c
#include <stdatomic.h>
void (*func_ptr) (memory_order order);
int main()
{
 func_ptr = &atomic_thread_fence;
 (*func_ptr)(memory_order_seq_cst);
 return 0;
}
-bash-4.1$ gcc t.c -latomic
t.c: In function 'main':
t.c:5:15: error: 'atomic_thread_fence' undeclared (first use in this function)
  func_ptr = &atomic_thread_fence;
  ^
t.c:5:15: note: each undeclared identifier is reported only once for each function it appears in
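
One possible workaround, which is not proposed in the report itself, is to wrap
the operations in ordinary functions whose addresses can be taken. The sketch
below assumes a <stdatomic.h> that accepts direct calls to these operations;
the wrapper and pointer names are hypothetical.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical wrappers: ordinary functions, so they have addresses even
   when <stdatomic.h> implements the operations only as macros.  */
static void fence_wrapper (memory_order order)
{
  atomic_thread_fence (order);
}

static bool flag_test_and_set_wrapper (volatile atomic_flag *flag)
{
  return atomic_flag_test_and_set (flag);
}

/* Function pointers that can be passed around and called indirectly.  */
static void (*fence_ptr) (memory_order) = fence_wrapper;
static bool (*flag_tas_ptr) (volatile atomic_flag *) = flag_test_and_set_wrapper;
```

A call such as (*fence_ptr)(memory_order_seq_cst) then goes through the wrapper
instead of requiring the address of atomic_thread_fence itself.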


[Bug libstdc++/66842] New: libatomic uses multiple locks for locked atomics

2015-07-11 Thread bin.x.fan at oracle dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=66842

Bug ID: 66842
   Summary: libatomic uses multiple locks for locked atomics
   Product: gcc
   Version: 4.9.2
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: libstdc++
  Assignee: unassigned at gcc dot gnu.org
  Reporter: bin.x.fan at oracle dot com
  Target Milestone: ---

Hi GCC folks,

I'm opening this bug to report an issue that may or may not be a real bug. I
notice that GCC libatomic uses multiple locks for a locked atomic object whose
size is greater than 64 bytes. The granularity seems to be 64 bytes, because
one more lock is taken for every 64 bytes added to the size.

It seems that this is meant to protect overlapping locked atomic objects. If
locked atomic objects never overlap, then a more efficient way to do locked
atomic operations would be to protect each object with just one lock that is
hashed from its address.

Accessing a member of an atomic struct object is undefined behavior in the C11
standard. So, does GCC support it as an extension, or is using multiple locks
unnecessary and therefore a performance bug?

Here is my code to illustrate the issue. I interpose pthread_mutex_lock to
count how many times it is called. My GCC version is 4.9.2, and its target is
x86_64-unknown-linux-gnu. The libatomic.so I use comes with the GCC 4.9.2
installation.

-bash-4.2$ cat libmythread.c
#define _GNU_SOURCE
#include <assert.h>
#include <dlfcn.h>
#include <pthread.h>
#include <stdio.h>

static int counter = 0;

/* Interposed wrapper: count each call, then forward to the real function. */
int pthread_mutex_lock (pthread_mutex_t *mutex)
{
  static int (*real_pthread_mutex_lock)(pthread_mutex_t *) = NULL;
  if (real_pthread_mutex_lock == NULL) {
    real_pthread_mutex_lock = dlsym (RTLD_NEXT, "pthread_mutex_lock");
  }
  assert (real_pthread_mutex_lock);
  counter++;
  return real_pthread_mutex_lock (mutex);
}

void display_nlocks ()
{
  printf ("pthread_mutex_lock is called %d times\n", counter);
  return;
}
-bash-4.2$ cat c11_locked_atomics.c
#include <stdatomic.h>

#ifndef SIZE
#define SIZE 1024
#endif

typedef struct {
  char a[SIZE];
} lock_obj_t;

extern void display_nlocks ();

int main()
{
  lock_obj_t v2 = {0};
  _Atomic lock_obj_t v1 = ATOMIC_VAR_INIT(v2);
  v2 = atomic_load (&v1);   /* locked load; the locks taken are counted */
  display_nlocks ();
  return 0;
}

-bash-4.2$ gcc -shared -ldl -fPIC libmythread.c -o libmythread.so
-bash-4.2$ gcc -latomic c11_locked_atomics.c -DSIZE=2048 -L./ -Wl,-rpath=./ -lmythread
-bash-4.2$ LD_PRELOAD=./libmythread.so a.out
pthread_mutex_lock is called 32 times
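
The count above is consistent with one lock per 64 bytes, since 2048 / 64 = 32.
A minimal model of that arithmetic is sketched below. The 64-byte granularity
is taken from the observation in this report, and locks_for_size is a
hypothetical name, not a libatomic function.

```c
#include <stddef.h>

#define LOCK_GRANULARITY 64   /* assumed number of bytes covered by one lock */

/* Hypothetical model: number of locks taken for an object of `size` bytes,
   one per started 64-byte chunk, and never fewer than one.  */
static size_t locks_for_size (size_t size)
{
  if (size == 0)
    return 1;
  return (size + LOCK_GRANULARITY - 1) / LOCK_GRANULARITY;
}
```

Under this model locks_for_size (2048) is 32, matching the output above, and an
object of 64 bytes or less takes a single lock.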

[Bug c/66842] libatomic uses multiple locks for locked atomics

2015-07-13 Thread bin.x.fan at oracle dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=66842

--- Comment #2 from Bin Fan <bin.x.fan at oracle dot com> ---
I couldn't find a category for libatomic, and my understanding is that C and
C++ share the libatomic library.

(In reply to Jonathan Wakely from comment #1)
> This obviously isn't a libstdc++ bug because you're not even using C++!


[Bug c++/66842] libatomic uses multiple locks for locked atomics

2015-07-15 Thread bin.x.fan at oracle dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=66842

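Bin Fan <bin.x.fan at oracle dot com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
          Component|c                           |c++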

--- Comment #4 from Bin Fan <bin.x.fan at oracle dot com> ---
Since I haven't seen any response from the C side so far, I have changed the
example to C++ code and changed the category to c++. Could the C++ folks take a
look?

-bash-4.2$ cat c++11_locked_atomics.cpp 
#include <atomic>
using namespace std;

#ifndef SIZE
#define SIZE 1024
#endif

typedef struct {
  char a[SIZE];
} lock_obj_t;

extern "C" {
extern void display_nlocks ();
}

int main()
{
  lock_obj_t v2 = {0};
  atomic<lock_obj_t> v1 = ATOMIC_VAR_INIT(v2);
  v2 = atomic_load (&v1);   /* locked load; the locks taken are counted */
  display_nlocks ();
  return 0;
}

gcc -shared -ldl -fPIC libmythread.c -o libmythread.so -g
g++ -std=c++11 -latomic c++11_locked_atomics.cpp -DSIZE=2048 -g -L./ -Wl,-rpath=./ -lmythread
+ LD_PRELOAD=./libmythread.so
+ a.out
pthread_mutex_lock is called 32 times

The g++ version is still 4.9.2.


[Bug libstdc++/65033] C++11 atomics: is_lock_free result does not always match the real lock-free property

2015-07-16 Thread bin.x.fan at oracle dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=65033

--- Comment #9 from Bin Fan <bin.x.fan at oracle dot com> ---
I verified that this bug is fixed in 5.1.0. However, it is only fixed in g++,
so in 5.1.0 gcc and g++ now report different results:

-bash-4.1$ cat is_lock_free.c
#include <stdatomic.h>
#include 
#define N 10
typedef struct {
  char a[3];
} s3_t;
_Atomic s3_t array[N];
s3_t obj;

int main()
{
  int i;
  for (i=0;i [...]
#include 
using namespace std;
#define N 10
struct s3_t {
  char a[3];
};
atomic<s3_t> array[N];
s3_t obj;

int main()
{
  int i;
  for (i=0;i [...]

[Bug c++/66842] libatomic uses multiple locks for locked atomics

2015-07-31 Thread bin.x.fan at oracle dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=66842

--- Comment #6 from Bin Fan <bin.x.fan at oracle dot com> ---
(In reply to Richard Henderson from comment #5)
> When libatomic was first written, it wasn't clear what the restrictions
> from the various languages would be, nor even if that was the best of
> ideas -- things that would Just Work lock-free would fail on other,
> less popular platforms.
> 
> Thus libatomic is written such that accesses to the same object, via
> different aliased pages, will work.  

Could you clarify what "aliased pages" means? Do you mean that the same object
is mapped into two or more different processes at different virtual addresses,
and that the locks in libatomic are also shared by those processes? Or
something else?

> Thus locks are created on a per-cacheline basis covering one page.

This makes sense if the above understanding of aliased pages is correct.
However, what if the memory is not mapped at page boundaries? Then the object
may have a different page offset in each mapping, and therefore it is still
protected by different locks.

And this does not explain why a locked object is protected by multiple locks.
If memory is always mapped at page boundaries, then the offset of the object in
the page will always be the same, so one lock should work. If memory is not
mapped at page boundaries, then if an object is mapped into two
"non-overlapped" address ranges inside a page, multiple locks still would not
work.
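
The point about page offsets can be made concrete with a small model. It
assumes, purely for illustration, that the lock is chosen from the address's
offset within a 4096-byte page at 64-byte granularity. The name lock_index and
both constants are assumptions, and the real hash in libatomic may differ.

```c
#include <stdint.h>

#define PAGE_SIZE_ASSUMED 4096u   /* assumed page size in bytes */
#define CACHELINE_ASSUMED 64u     /* assumed bytes covered by one lock */

/* Hypothetical model: index of the lock covering `addr`, derived only from
   the address's offset inside its page.  */
static unsigned lock_index (uintptr_t addr)
{
  return (unsigned) ((addr % PAGE_SIZE_ASSUMED) / CACHELINE_ASSUMED);
}
```

Under these assumptions, two mappings of the same object that share a page
offset, e.g. 0x10140 and 0x7f140, select the same lock (index 5). A mapping
that is not page-aligned relative to the first, e.g. 0x7f940, selects a
different one (index 37), which is the case the question above is about.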

> 
> This does lead to inefficiencies wrt a more straight-forward solution,
> but very careful thought needs to go into changing it.

Besides aliased pages, does libatomic consider supporting nested locked atomic
objects? For example, should the following work?

typedef struct {
  _Atomic locked1_t obj1;
  /* other fields */
} locked2_t;

_Atomic locked2_t obj2;

atomic_store(&obj2, ...)
atomic_load(&obj2.obj1, ...)