[Bug c++/31289] gcc412 elides code in operator delete() override method
--- Comment #3 from jima at cadence dot com 2007-03-21 01:32 --- Can you point me to a description of the C++ aliasing rules? Is the memcpy() workaround specific to GCC, or is memcpy part of the standard with special rules regarding aliasing? (I ask because our code base has to work on a half-dozen platforms, many of which don't use GCC). Will GCC optimize the memcpy() and intermediate variable ("n") away? With gcc323 this code comes out to about 6 instructions, so even 1 or 2 more would be a large percentage hit... Thanks for your help! -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=31289
[Bug c++/31289] gcc412 elides code in operator delete() override method
--- Comment #5 from jima at cadence dot com 2007-03-21 02:31 --- (In reply to comment #4) > (In reply to comment #3) >... Why do you think 1 or 2 more > instructions will be a hit, most of the problems you are going to hit into now > is cache issues anyways unless you do a better malloc pool in the first case > as > you will have non consecative memory that would really like to be close > together. Also you alloc the memory and then you add it to your free list but > never actually free the memory so say if you alloc 100 10 sized objects and > free those, and then alloc 50 20 sized objects, now your memory usage is twice > than what it should be. This was a contrived demo program. The real application mallocs large fixed-sized chunks and sub-divides them to fill a free-list with many objects at once. On average, new & delete execute only a few instructions (if the object has no virtuals!). This is similar to what the Linux malloc does internally, but without the indirect function calls. It is true that lots of memory can be stranded after different sized objects are allocated and freed. In our system, applications occasionally call a method which detects if any chunks contain no live blocks, and frees those chunks. This mostly cures the problem because each chunk usually contains objects of only one size, so chunks with any live objects tend to be fairly well occupied. However the apps. have to actually use this feature :-) We also reduce the number of unique sizes by rounding the requested size up to some multiple [a compile-time constant calculation]. I agree that pre-allocating many objects may reduce cache locality. On the other hand, LIFO recycling means that we re-use the hottest memory first, and applications should use only a few different sizes during a given phase. So the active blocks will tend to be adjacent (in multiple sets, one set for each unique size after rounding). I haven't tried to measure cache locality effect, however. We think this all pays off for specific application areas which spend a significant portion of their time allocating & freeing small objects, such as for inlined code to to do graph operations, hash-table inserts & deletes, etc. -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=31289
[Bug c++/31289] gcc412 elides code in operator delete() override method
--- Comment #7 from jima at cadence dot com 2007-03-21 16:13 --- (In reply to comment #6) > ... the proper way is to start a new object lifetime using placement new. Are you referring only to internal operations in MyHeap::Malloc (in the demo program), or do you mean that the overall scheme can be re-coded somehow to use placement new without any casts? If the latter, I don't understand how that is possible, because the type of the object being allocated is not known -- the API of override operator new is (void *, size_t). Note that the interface we are trying to preserve (in use for many years) is that application programmers simply derive from a special base class when declaring their own classes, and then objects will use the special allocator. We can not require application coders to explicitly write placement-new (or call any kind of special allocator interface); that would be unworkable for human reasons, and also would prevent using generic template functions which must use ordinary "new" to allocate objects of template-parameter type. -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=31289
[Bug c++/31289] gcc412 elides code in operator delete() override method
--- Comment #8 from jima at cadence dot com 2007-03-21 20:58 --- I tried using placement-new and explicit destructor calls to delineate the lifetime of the allocator's internal class objects explicitly, such that their lifetimes do not overlap with any references to the same storage via pointers of other types (that is, during the lifetime of the high-level object being allocated). Can someone point out why this doesn't work? See the following program ... #include #include #include struct Block { Block * next; char rest_of_storage[128-sizeof(Block *)]; // max obj sizeof is 128 }; static Block * free_head = 0; class C { public: void * operator new(size_t bytes_needed) { if (free_head == 0) { abort(); } Block * storage = free_head; free_head = storage->next; // Explicitly call the destructor for Block 'storage' to end its lifetime. // (cf C++ 06 draft @12.4-13). C++ says the lifetime of the new 'C' being // allocated begins when its constructor is called, which has not yet // happened. Therefore the lifetimes of the Block and the C it is // aliased to do not overlap, so the compiler should not re-order // references to pass this barrier point. storage->~Block(); return storage; } void operator delete(void * ptr_to_free, size_t how_big_it_is) { // The lifetime of the C ended at start of its destructor, which occurred // before we get here. So references to the C should not be re-ordered // to pass that previous point (the start of its destructor). // The following placement-new begins the lifetime of Block 'storage'. // Therefore accesses to the old C and the new Block should not mix up. Block * storage = new(ptr_to_free) Block; storage->next = free_head; free_head = storage; // The lifetime of the Block 'storage' continues until it is ended // in a later call to operator new. } C() {} virtual ~C() {} // virtual dtor required for problem to appear (why?) }; int main() { setbuf(stdout,NULL); setbuf(stderr,NULL); // disable buffering free_head = new Block; // Put one object on the free-list free_head->next = 0; C *obj = new C; delete obj; if (free_head->next != 0) { printf("ERROR: Expected only one block on free list\n"); return 1; } return 0; } -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=31289