Re: Can realloc be marked as a mallloc-like function?

2007-07-16 Thread Wolfram Gloger
Hi,

First, I assume we are talking about C realloc here, not just a
"realloc-like" function which may have other semantics and for which
__attribute_malloc__ may not be appropriate.

> > It looks like gcc assumes a functon marked with DECL_IS_MALLOC won't
> > return an address which can alias something else. But it isn't true
> > for realloc. Now, the qestions are
> >
> > 1. Can gcc make such an assumption?
> 
> No, it can't.  The returned memory may alias the original memory.

After realloc(p, s) has returned non-NULL, the "original object"
doesn't exist any more, hence there can't be any aliases.

> > 2. Can realloc be marked as DECL_IS_MALLOC.
> 
> ... with DECL_IS_MALLOC the following
> 
> int *p;
> p = malloc (4);
> *p = 0;
> p = realloc (p, 4);
> *p = 1;
> 
> will have VOPs that do not prevent re-ordering of the two stores.

By that reasoning, consider:

 int *p;
 p = malloc (4);
 *p = 0;
 free(p);
 p = malloc (4); /* this is very likely to return the same address as before */
 *p = 1;

What prevents the reordering of the stores in this case?
Should we also remove __attribute_malloc__ from malloc :-)?
IMHO this is an object lifetime issue not an aliasing issue.

> > BTW, glibc also marks realloc with __attribute_malloc__.
> 
> Which is wrong as well.

I disagree.

Of course, the gcc developers get to define the semantics of
__attribute_malloc__, but according to the gcc manual, the attribute
only refers to the _result_ of the attributed function, hence I would
intuitively expect that I can safely mark:

int *destroy_something_and_allocate_anotherthing(int *p)
{
free(p);
return malloc(sizeof(int)); /* again very likely to return the
same as the previous p */
}

as __attribute_malloc__.

Regards,
Wolfram.


Re: Can realloc be marked as a mallloc-like function?

2007-07-17 Thread Wolfram Gloger
> On Mon, Jul 16, 2007 at 02:39:39PM -0000, Wolfram Gloger wrote:
> > > int *p;
> > > p = malloc (4);
> > > *p = 0;
> > > p = realloc (p, 4);
> > > *p = 1;
> > 
> > By that reasoning, consider:
> > 
> >  int *p;
> >  p = malloc (4);
> >  *p = 0;
> >  free(p);
> >  p = malloc (4);
> >  *p = 1;
> 
> Except that in the first sequence, the value of *p is retained across
> the reallocation, so "*p = 0" is not dead.  The two examples aren't
> really the same at all.

I'm not saying they are the same.  But, how does gcc determine that it
can move "*p = 0" after realloc()?  Does it treat realloc() specially
and insert a run-time check to look at its result to compare it with
the old pointer?  That would seem like a dubious feature totally
independent of __attribute_malloc__.

Surely you agree that in my second example, "*p = 0" _cannot_ be moved
after the call to destroy_something_and_allocate_anotherthing(p)?

Regards,
Wolfram.


Re: Something is broken in repack

2007-12-14 Thread Wolfram Gloger
Hi,

> Note that delta following involves patterns something like
> 
>allocate (small) space for delta
>for i in (1..depth) {
>   allocate large space for base
>   allocate large space for result
>   .. apply delta ..
>   free large space for base
>   free small space for delta
>}
> 
> so if you have some stupid heap algorithm that doesn't try to merge and 
> re-use free'd spaces very aggressively (because that takes CPU time!),

ptmalloc2 (in glibc) _per arena_ is basically best-fit.  This is the
best known general strategy, but it certainly cannot be the best in
every case.

> you 
> might have memory usage be horribly inflated by the heap having all those 
> holes for all the objects that got free'd in the chain that don't get 
> aggressively re-used.

It depends how large 'large' is -- if it exceeds the mmap() threshold
(settable with mallopt(M_MMAP_THRESHOLD, ...))
the 'large' spaces will be allocated with mmap() and won't cause
any internal fragmentation.
It might pay to experiment with this parameter if it is hard to
avoid the alloc/free large space sequence.

> Threaded memory allocators then make this worse by probably using totally 
> different heaps for different threads (in order to avoid locking), so they 
> will *all* have the fragmentation issue.

Indeed.

Could someone perhaps try ptmalloc3
(http://malloc.de/malloc/ptmalloc3-current.tar.gz) on this case?

Thanks,
Wolfram.



Re: Something is broken in repack

2007-12-14 Thread Wolfram Gloger
Hi,

> Maybe an malloc/free/mmap wrapper that records the requested sizes and
> alloc/free order and dumps them to file so that one can make a compact
> git-free standalone test case for the glibc maintainers might be a good
> thing.

I already have such a wrapper:

http://malloc.de/malloc/mtrace-20060529.tar.gz

But note that it does interfere with the thread scheduling, so it
can't record the exact same allocation pattern as when not using the
wrapper.

Regards,
Wolfram.


Re: Something is broken in repack

2007-12-14 Thread Wolfram Gloger
Hi,

> >>if (progress->total) {
> >>unsigned percent = n * 100 / progress->total;
> >>if (percent != progress->last_percent || progress_update) {
> >> +  struct mallinfo m = mallinfo();
> >>progress->last_percent = percent;
> >> -  fprintf(stderr, "%s: %3u%% (%u/%u)%s%s",
> >> -  progress->title, percent, n,
> >> -  progress->total, tp, eol);
> >> +  fprintf(stderr, "%s: %3u%% (%u/%u) %u/%uMB%s%s",
> >> +  progress->title, percent, n, progress->total,
> >> +  m.uordblks >> 18, m.fordblks >> 18,
> >> +  tp, eol);
> > 
> > Note: I didn't know what unit of memory those blocks represents, so the 
> > shift is most probably wrong.
> > 
> 
> Me neither, but it appears to me as if hblkhd holds the actual memory
> consumed by the process. It seems to store the information in bytes,
> which I find a bit dubious unless glibc has some internal multiplier.

mallinfo() will only give you the used memory for the main arena.
When you have separate arenas (likely when concurrent threads have
been used), the only way to get the full picture is to call
malloc_stats(), which prints to stderr.

Regards,
Wolfram.



Re: Something is broken in repack

2007-12-14 Thread Wolfram Gloger
Hi,

> Uh what?  Someone crank out his copy of "The Art of Computer
> Programming", I think volume 1.  Best fit is known (analyzed and proven
> and documented decades ago) to be one of the worst strategies for memory
> allocation.  Exactly because it leads to huge fragmentation problems.

Well, quoting http://gee.cs.oswego.edu/dl/html/malloc.html:

"As shown by Wilson et al, best-fit schemes (of various kinds and
approximations) tend to produce the least fragmentation on real loads
compared to other general approaches such as first-fit."

See [Wilson 1995] ftp://ftp.cs.utexas.edu/pub/garbage/allocsrv.ps for
more details and references.

Regards,
Wolfram.