On Fri, Jun 12, 2009 at 08:08:40AM +0200, Claudio Jeker wrote: > > What's the reason to move to RB trees? In general they are slower, > > have larger memory overhead and cause more memory fragmentation than > > hash tables. The last thing is fixed by using pools, but the other two > > things remain. > > > > This is mostly theoretical. Most hash tables are badly sized and have > often bad hashing algorithms that tend to cause long linear list.
So fix the sizing and use a proper hash algorithm. Indeed, it's more difficult if sizing is hard. I'm not against trees, but I like to see proper reasoning to choose one above the other. Trees execute log(n) compare functions for any operation on it. If your dataset is large, that is something which can really hurt, compared to average single hash computation plus a single compare for properly sized hash tables. > It is also almost impossible to resize the tables easily. Yes, that is a major drawback. > > Is it just to be able to grow the cache without extra work? > > > > AFAIK the whole work was done to make the cache more sane. The current > version is just insane enough that Bob was crying, shouting and playing > with red wine bottles during c2k9. That's not enough reason to change the data structure. Again, there can be good reasons, but choosing trees for sets where in-order iteration is not needed should be validated. Not just by a gut feeling, but by measurememnts or analysis of implementations. -Otto