On Fri, Jun 12, 2009 at 08:08:40AM +0200, Claudio Jeker wrote:

> > What's the reason to move to RB trees? In general they are slower,
> > have larger memory overhead and cause more memory fragmentation than
> > hash tables. The last thing is fixed by using pools, but the other two
> > things remain. 
> > 
> 
> This is mostly theoretical. Most hash tables are badly sized and have
> often bad hashing algorithms that tend to cause long linear list.

So fix the sizing and use a proper hash algorithm. Indeed, it's more
difficult if sizing is hard.

I'm not against trees, but I like to see proper reasoning to choose
one above the other. Trees execute log(n) compare functions for any
operation on it. If your dataset is large, that is something which can
really hurt, compared to average single hash computation plus a single
compare for properly sized hash tables. 

> It is also almost impossible to resize the tables easily.

Yes, that is a major drawback.

> > Is it just to be able to grow the cache without extra work?
> > 
> 
> AFAIK the whole work was done to make the cache more sane. The current
> version is just insane enough that Bob was crying, shouting and playing
> with red wine bottles during c2k9.

That's not enough reason to change the data structure. Again, there
can be good reasons, but choosing trees for sets where in-order
iteration is not needed should be validated. Not just by a gut
feeling, but by measurememnts or analysis of implementations.

        -Otto

Reply via email to