On 2010-03-28 21:11, Primal Wijesekera wrote:
> I am a master's student in the UBC CS dept. A lab mate and I are trying to
> implement Cassandra on top of a B-tree rather than the DHT approach it
> uses right now. We hope to benchmark the two approaches and see which one
> scales better.
> 
> In the lab we already have a project (not yet completed) on developing a
> distributed B-tree on top of a Sinfonia-like system. We would try to
> integrate the Cassandra source with the B-tree while preserving the rest
> of the Cassandra logic.
> 
> Since we are still at a very early stage of this experiment, we thought of
> getting your expert thoughts and comments on it, and we were wondering
> whether this could be a potential GSoC project as well.

I'm sorry, but it doesn't make much sense to run Cassandra on top of a
B-tree. Reorganizing indexes when writing goes against one of
Cassandra's primary design goals: streaming writes to disk as
efficiently as possible.

http://wiki.apache.org/cassandra/FAQ#reads_slower_writes
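
To make the write-path point concrete, here is a rough sketch of the
log-structured idea. It is a toy illustration only, not Cassandra's actual
code: the class name, the flush threshold, and the in-memory "runs" standing
in for on-disk files are all assumptions made for the example. Writes land in
an in-memory table and are later emitted as a new sorted, immutable run, so
nothing already written is ever reorganized the way B-tree pages would be.

```python
import bisect


class ToyLogStructuredStore:
    """Toy sketch of a log-structured write path (hypothetical names)."""

    def __init__(self, flush_threshold=2):
        self.flush_threshold = flush_threshold
        self.memtable = {}  # recent writes, held in memory
        self.runs = []      # immutable sorted runs, oldest first

    def put(self, key, value):
        # A write only touches the in-memory table; no existing run changes.
        self.memtable[key] = value
        if len(self.memtable) >= self.flush_threshold:
            self.flush()

    def flush(self):
        # Emit the memtable as one new sorted run (a sequential write on disk).
        if self.memtable:
            self.runs.append(tuple(sorted(self.memtable.items())))
            self.memtable = {}

    def get(self, key):
        # Reads pay the cost instead: check memory, then runs newest-first.
        if key in self.memtable:
            return self.memtable[key]
        for run in reversed(self.runs):
            i = bisect.bisect_left(run, (key,))
            if i < len(run) and run[i][0] == key:
                return run[i][1]
        return None
```

Overwriting a key just adds a newer entry that shadows the old one on read;
the older run stays as it was. That is roughly why writes are cheap here and
reads do more work.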

Additionally, there are *so many* other systems that already use B-trees.
Why add one to Cassandra?

You may want to look at Project Voldemort, which can already distribute
data across servers similarly to Cassandra but (optionally) with
B-tree-based storage on each box. MongoDB also supports sharded data
with B-tree-based indexes. Finally, HBase is a distributed B-tree.

-- 
David Strauss
   | da...@fourkitchens.com
Four Kitchens
   | http://fourkitchens.com
   | +1 512 454 6659 [office]
   | +1 512 870 8453 [direct]
