Re: Cassandra on top of B-Tree
On 2010-03-28 21:11, Primal Wijesekera wrote:
> I am a master student in UBC CS dept. I along with one of my lab mates are
> trying to implement the Cassandra on top of a B-Tree implementation rather
> than of DHT approach that we have right now. We hope to do benchmarking the
> two approaches and really want to see which one scales better.
>
> In the lab we already have a project (which is not yet completed) on
> developing a Distributed B-Tree on top of a Sinfonia like system. We would be
> trying to integrate the Cassandra source with the B-tree preserving the rest
> of the Cassandra logic.
>
> Since we are still in its very early stage of this experiment, thought of
> getting your expert thoughts and comments on this and we were wondering
> whether this could be a potential GSoc project as well.

I'm sorry, but it doesn't make much sense to run Cassandra on top of a
B-tree. Reorganizing indexes when writing goes against one of
Cassandra's primary design goals: streaming writes to disk as
efficiently as possible.

http://wiki.apache.org/cassandra/FAQ#reads_slower_writes

Additionally, there are *so many* other systems that do use B-tree
already. Why add it to Cassandra?

You may want to look at Project Voldemort, which can already distribute
data across servers similarly to Cassandra but (optionally) with
B-tree-based storage on each box. MongoDB also supports sharded data
with B-tree-based indexes. Finally, HBase is a distributed B-tree.

--
David Strauss | da...@fourkitchens.com
Four Kitchens | http://fourkitchens.com | +1 512 454 6659 [office] | +1 512 870 8453 [direct]
Re: writing and reading data
On 2010-04-05 02:23, S Ahmed wrote:
> For starters, I want to learn how keys are read and written from disk.

See "read" and "write":
http://wiki.apache.org/cassandra/ArchitectureOverview

--
David Strauss | da...@fourkitchens.com
Four Kitchens | http://fourkitchens.com | +1 512 454 6659 [office] | +1 512 870 8453 [direct]
Re: bloomfilters
On 2010-04-07 20:34, Peter Schüller wrote:
> Re-sizing a bloom filter implies re-creating it from scratch.

Not necessarily. Depending on your hash, you can sometimes shrink
without regeneration (and without other penalties). It's also sometimes
possible to enlarge the bloom filter without regeneration at the cost of
increased false positives you wouldn't have if you regenerated.

Here's a trivial counterexample to your statement, starting with a bloom
filter with two positions: odd and even. I can shrink it down to a
single bit with an "or" operation. I can enlarge it to modulo 10 by
filling in the odd bits if my current odd bit is filled, and likewise
filling in the even bits if my current even bit is filled.

In the case of the shrink operation, I'm not regenerating or losing any
accuracy. In the case of the enlarge operation, I'll get considerably
more false positives than I would on regeneration, but my operation is
still correct.

Even with complex or cryptographic hashes, a bloom filter based on
using, say, the first X bits might be expandable or shrinkable without
regeneration.

--
David Strauss | da...@fourkitchens.com
Four Kitchens | http://fourkitchens.com | +1 512 454 6659 [office] | +1 512 870 8453 [direct]
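
The odd/even counterexample in the message above can be sketched in a few lines of Python. This is only an illustration, not Cassandra's bloom filter: it assumes a toy filter with a single hash, `key % size`, on integer keys, and the names `ModFilter`, `shrink`, and `enlarge` are hypothetical. Under that assumption, shrinking to a divisor of the current size is an "or" fold, and enlarging to a multiple of the current size copies each old bit into every new position that maps back to it.

```python
class ModFilter:
    """Toy single-hash filter (hypothetical): an integer key sets bit key % size."""

    def __init__(self, size, bits=None):
        self.size = size
        self.bits = list(bits) if bits is not None else [False] * size

    def add(self, key):
        self.bits[key % self.size] = True

    def might_contain(self, key):
        return self.bits[key % self.size]

    def shrink(self, new_size):
        """Fold down to a divisor of the current size by OR-ing bits together."""
        if self.size % new_size != 0:
            raise ValueError("new_size must divide the current size")
        folded = [False] * new_size
        for i, bit in enumerate(self.bits):
            # (key % size) % new_size == key % new_size when new_size divides size
            folded[i % new_size] |= bit
        return ModFilter(new_size, folded)

    def enlarge(self, new_size):
        """Grow to a multiple of the current size without seeing the keys again."""
        if new_size % self.size != 0:
            raise ValueError("new_size must be a multiple of the current size")
        # New bit j is set whenever the old bit it maps back to (j % size) was set.
        grown = [self.bits[j % self.size] for j in range(new_size)]
        return ModFilter(new_size, grown)


if __name__ == "__main__":
    odd_even = ModFilter(2)
    for key in (3, 7):
        odd_even.add(key)

    single_bit = odd_even.shrink(1)   # same bits as a filter rebuilt at size 1
    mod_ten = odd_even.enlarge(10)    # every odd position is now set

    print(single_bit.bits)
    print(mod_ten.bits)
    print(mod_ten.might_contain(5))   # 5 was never added
```

In this sketch the shrunk filter has exactly the bits a rebuild at the smaller size would produce, while the enlarged filter reports every odd key as possibly present, which is the extra false-positive cost the message describes. Neither operation can introduce a false negative.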
Re: Announcing Riptano professional Cassandra support and services
On 2010-04-26 19:58, Jonathan Ellis wrote:
> Short version: Matt Pfeil and I have founded http://riptano.com to
> provide production Cassandra support, training, and professional
> services. Yes, we're hiring.
>
> Long version:
> http://spyced.blogspot.com/2010/04/and-now-for-something-completely.html
>
> We're happy to answer questions on- or off-list.

Does this mean you're no longer with Rackspace?

--
David Strauss | da...@fourkitchens.com
Four Kitchens | http://fourkitchens.com | +1 512 454 6659 [office] | +1 512 870 8453 [direct]
Re: Generated code?
On 2010-06-15 03:58, Masood Mortazavi wrote:
> Hi,
>
> My assumption is that what one finds in
>
> interface/thrift/gen-java
>
> is actually generated code.
>
> If so, why is it checked in as source under SVN?
>
> (Certainly, the avro generated code doesn't seem to be checked in.)
>
> Regards,
> Masood
>

It simplifies the end user's build process. If the code isn't in
Subversion, then you'd need to get all the Thrift dependencies and do
the generation yourself just to build Cassandra.

Sure, there are other methods that don't involve checking into
Subversion, but they're more complex.

--
David Strauss | da...@fourkitchens.com | +1 512 577 5827 [mobile]
Four Kitchens | http://fourkitchens.com | +1 512 454 6659 [office] | +1 512 870 8453 [direct]
Re: Cassandra Hack-a-thon in Austin
On Wed, 2010-08-25 at 09:58 -0500, Eric Evans wrote:
> Some of us from Rackspace are going offsite and heads-down for a day of
> hacking to see how much of the Avro support in Cassandra we can get
> knocked out. We'll be at Austin Cowork
> (http://www.coworkaustin.com/about.php) on September 1 from 8am to 6pm,
> and we have space enough for 4 more if anyone is interested.
>
> If you're in the Austin area and would like to join us, shoot me an
> email and give me an idea which area(s) you think you might like to work
> on. The bigger areas I see are:
>
> * RPC method implementations
> * Functional tests
> * Client support

I'd like to stop by and work on client support for PHP, Python, or C++.

--
David Strauss | da...@fourkitchens.com | +1 512 577 5827 [mobile]
Four Kitchens | http://fourkitchens.com | +1 512 454 6659 [office] | +1 512 870 8453 [direct]