On 2010-03-28 21:11, Primal Wijesekera wrote:
> I am a master student in UBC CS dept. I along with one of my lab mates are
> trying to implement the Cassandra on top of a B-Tree implementation rather
> than of DHT approach that we have right now. We hope to do benchmarking the
> two approaches and really want to see which one scales better.
>
> In the lab we already have a project (which is not yet completed) on
> developing a Distributed B-Tree on top of a Sinfonia like system. We would be
> trying to integrate the Cassandra source with the B-tree preserving the rest
> of the Cassandra logic.
>
> Since we are still in its very early stage of this experiment, thought of
> getting your expert thoughts and comments on this and we were wondering
> whether this could be a potential GSoc project as well.

I'm sorry, but it doesn't make much sense to run Cassandra on top of a
B-tree. Reorganizing indexes on every write goes against one of Cassandra's
primary design goals: streaming writes to disk as efficiently as possible.

http://wiki.apache.org/cassandra/FAQ#reads_slower_writes

Additionally, there are *so many* other systems that already use B-trees.
Why add one to Cassandra?

You may want to look at Project Voldemort, which can already distribute data
across servers similarly to Cassandra but (optionally) with B-tree-based
storage on each box. MongoDB also supports sharded data with B-tree-based
indexes. Finally, HBase can loosely be described as a distributed B-tree.

-- 
David Strauss
   | da...@fourkitchens.com
Four Kitchens
   | http://fourkitchens.com
   | +1 512 454 6659 [office]
   | +1 512 870 8453 [direct]
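
To illustrate the write-path point in the reply above, here is a minimal sketch in Python. It is not Cassandra's code, and every name in it (AppendOnlyStore, InPlaceSortedStore, flush_threshold, shifts) is hypothetical. The first class buffers writes in memory and flushes them as immutable sorted runs, so a write never touches data already written. The second keeps one sorted structure up to date on every write; a sorted array stands in for the index here, whereas a real B-tree would split and rebalance pages instead of shifting elements.

```python
import bisect


class AppendOnlyStore:
    """Hypothetical log-structured write path: buffer, then flush sorted runs."""

    def __init__(self, flush_threshold=4):
        self.memtable = {}                 # in-memory buffer of recent writes
        self.flush_threshold = flush_threshold
        self.runs = []                     # immutable sorted runs, standing in for on-disk files

    def put(self, key, value):
        # A write only touches memory; existing runs are never modified.
        self.memtable[key] = value
        if len(self.memtable) >= self.flush_threshold:
            self.flush()

    def flush(self):
        # Write the buffer out as one new sorted run (sequential I/O on a real disk).
        if self.memtable:
            self.runs.append(sorted(self.memtable.items()))
            self.memtable = {}

    def get(self, key):
        # Reads pay the cost: check the buffer, then each run from newest to oldest.
        if key in self.memtable:
            return self.memtable[key]
        for run in reversed(self.runs):
            i = bisect.bisect_left(run, (key,))
            if i < len(run) and run[i][0] == key:
                return run[i][1]
        return None


class InPlaceSortedStore:
    """Hypothetical contrast: one sorted structure reorganized on every write."""

    def __init__(self):
        self.keys = []
        self.values = []
        self.shifts = 0                    # elements moved to keep the order intact

    def put(self, key, value):
        i = bisect.bisect_left(self.keys, key)
        if i < len(self.keys) and self.keys[i] == key:
            self.values[i] = value         # overwrite in place
            return
        self.shifts += len(self.keys) - i  # everything after position i must move
        self.keys.insert(i, key)
        self.values.insert(i, value)
```

The trade-off the sketch shows is that the append-only store makes writes cheap and pushes work onto reads, which may have to consult several runs, while the in-place store keeps reads to a single lookup at the cost of reorganizing data during writes.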