[Bug c++/31289] gcc412 elides code in operator delete() override method

2007-03-20 Thread jima at cadence dot com


--- Comment #3 from jima at cadence dot com  2007-03-21 01:32 ---
Can you point me to a description of the C++ aliasing rules?  
Is the memcpy() workaround specific to GCC, or is memcpy
part of the standard with special rules regarding aliasing?
(I ask because our code base has to work on a half-dozen
platforms, many of which don't use GCC).

Will GCC optimize the memcpy() and intermediate variable ("n") away?
With gcc323 this code comes out to about 6 instructions, so even 1 or 2
more would be a large percentage hit...

Thanks for your help!


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=31289



[Bug c++/31289] gcc412 elides code in operator delete() override method

2007-03-20 Thread jima at cadence dot com


--- Comment #5 from jima at cadence dot com  2007-03-21 02:31 ---
(In reply to comment #4)
> (In reply to comment #3)
>... Why do you think 1 or 2 more
> instructions will be a hit, most of the problems you are going to hit into now
> is cache issues anyways unless you do a better malloc pool in the first case 
> as
> you will have non consecative memory that would really like to be close
> together.  Also you alloc the memory and then you add it to your free list but
> never actually free the memory so say if you alloc 100 10 sized objects and
> free those, and then alloc 50 20 sized objects, now your memory usage is twice
> than what it should be.

This was a contrived demo program.  The real application mallocs large 
fixed-sized chunks and sub-divides them to fill a free-list with many 
objects at once.  On average, new & delete execute only a few instructions 
(if the object has no virtuals!).  This is similar to what the Linux malloc
does internally, but without the indirect function calls.

It is true that lots of memory can be stranded after different sized objects
are allocated and freed.  In our system, applications occasionally call a
method which detects if any chunks contain no live blocks, and frees those
chunks.  This mostly cures the problem  because each chunk usually
contains objects of only one size, so chunks with any live
objects tend to be fairly well occupied.  However the apps. have to actually
use this feature :-)

We also reduce the number of unique sizes by rounding the requested size up to
some multiple [a compile-time constant calculation].

I agree that pre-allocating many objects may reduce cache locality.
On the other hand, LIFO recycling means that we re-use the hottest memory
first, and applications should use only a few different sizes during a given
phase.  So the active blocks will tend to be adjacent (in multiple sets,
one set for each unique size after rounding).  I haven't tried to measure
cache locality effect, however.

We think this all pays off for specific application areas which spend a
significant portion of their time allocating & freeing small objects, 
such as for inlined code to to do graph operations, hash-table 
inserts & deletes, etc.


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=31289



[Bug c++/31289] gcc412 elides code in operator delete() override method

2007-03-21 Thread jima at cadence dot com


--- Comment #7 from jima at cadence dot com  2007-03-21 16:13 ---
(In reply to comment #6)
> ... the proper way is to start a new object lifetime using placement new.

Are you referring only to internal operations in MyHeap::Malloc (in the demo
program), or do you mean that the overall scheme can be re-coded somehow
to use placement new without any casts?

If the latter, I don't understand how that is possible, because the type of
the object being allocated is not known -- the API of override operator new
is (void *, size_t).

Note that the interface we are trying to preserve (in use for many years) is
that application programmers simply derive from a special base class when
declaring their own classes, and then objects will use the special allocator.
We can not require application coders to explicitly write placement-new
(or call any kind of special allocator interface); that would be unworkable
for human reasons, and also would prevent using generic template functions
which must use ordinary "new" to allocate objects of template-parameter type.


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=31289



[Bug c++/31289] gcc412 elides code in operator delete() override method

2007-03-21 Thread jima at cadence dot com


--- Comment #8 from jima at cadence dot com  2007-03-21 20:58 ---
I tried using placement-new and explicit destructor calls to 
delineate the lifetime of the allocator's internal class objects explicitly, 
such that their lifetimes do not overlap with any references to the
same storage via pointers of other types (that is, during the lifetime of the
high-level object being allocated).

Can someone point out why this doesn't work?  See the following program ...

#include 
#include 
#include 

struct Block {
  Block * next;
  char rest_of_storage[128-sizeof(Block *)]; // max obj sizeof is 128
};
static Block * free_head = 0;

class C {
public:
  void * operator new(size_t bytes_needed) {
if (free_head == 0) { abort(); }
Block * storage = free_head;
free_head = storage->next;

// Explicitly call the destructor for Block 'storage' to end its lifetime.
// (cf C++ 06 draft @12.4-13).  C++ says the lifetime of the new 'C' being 
// allocated begins when its constructor is called, which has not yet 
// happened.  Therefore the lifetimes of the Block and the C it is
// aliased to do not overlap, so the compiler should not re-order
// references to pass this barrier point.
storage->~Block();

return storage;
  }
  void operator delete(void * ptr_to_free,  size_t how_big_it_is) {
// The lifetime of the C ended at start of its destructor, which occurred 
// before we get here.  So references to the C should not be re-ordered
// to pass that previous point (the start of its destructor).  
// The following placement-new begins the lifetime of Block 'storage'.  
// Therefore accesses to the old C and the new Block should not mix up.
Block * storage = new(ptr_to_free) Block;
storage->next = free_head;
free_head = storage;
// The lifetime of the Block 'storage' continues until it is ended
// in a later call to operator new.
  }
  C() {}
  virtual ~C() {} // virtual dtor required for problem to appear (why?)
};

int main() {
  setbuf(stdout,NULL); setbuf(stderr,NULL); // disable buffering

  free_head = new Block; // Put one object on the free-list
  free_head->next = 0;

  C *obj = new C;
  delete obj;

  if (free_head->next != 0) {
printf("ERROR: Expected only one block on free list\n");
return 1;
  }
  return 0;
}


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=31289