Re: Recommended sort mechanism and partitioner

2010-10-15 Thread Tyler Hobbs
i) Yes. ii) Well, you don't actually want to use version 1 UUIDs for keys here. Although they mostly increase in byte order over time, it's only for the first 8 bytes. Instead, you can use something like 'timestamp-foo', where 'foo' might be a randomly generated string or something unique per
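To illustrate the 'timestamp-foo' idea, here is a minimal Python sketch. The helper names (make_key, random_suffix), the microsecond timestamps, and the zero-padding to 20 digits are assumptions made for the example, not something stated in the thread. The padding is there so that comparing the keys as strings agrees with comparing the timestamps as numbers.

```python
import uuid


def random_suffix() -> str:
    # Hypothetical 'foo' part: a short random string to keep keys unique
    # when two writes share the same timestamp.
    return uuid.uuid4().hex[:8]


def make_key(timestamp_micros: int, suffix: str) -> str:
    # Zero-pad the timestamp (an assumption of this sketch) so that
    # lexicographic order of the keys matches chronological order.
    return f"{timestamp_micros:020d}-{suffix}"


# Three example timestamps in microseconds, deliberately out of order.
timestamps = [1287150000000001, 999, 1287150000000000]
keys = [make_key(ts, random_suffix()) for ts in timestamps]

# Sorting the keys as plain strings puts them in timestamp order.
sorted_keys = sorted(keys)
```

Without the padding, a key starting with '999-' would sort after one starting with '1287150000000000-', which is why the sketch fixes the width.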

Re: Recommended sort mechanism and partitioner

2010-10-15 Thread Wicked J
Tyler, thanks for answering my question. Could you please clarify point (c)? i] Are you saying that if I move to a second row (identified by a rowKey in Cassandra) after I hit 10 million column values for the 1st row, only then will the second row be written to a new node in the cluster? meaning all th

Re: Recommended sort mechanism and partitioner

2010-10-15 Thread Tyler Hobbs
a) 10 mil sounds fine. Just watch out for compaction. Huge rows can kill you there, from my understanding. b) Use RandomPartitioner unless you absolutely have to use something else. c) If you're inserting all along one row and only moving to another row when you hit 10 mil, you're only going to
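As a rough illustration of points (b) and (c), the following Python sketch models how a hash-based partitioner places rows. RandomPartitioner derives a row's position from an MD5 hash of its row key; everything else here (the four-node ring, the evenly spaced node tokens, and the helper names token and owner) is a simplified assumption for the example and not Cassandra's actual code.

```python
import hashlib
from bisect import bisect_left


def token(row_key: str) -> int:
    # Simplified model: treat the MD5 digest of the row key as an
    # unsigned 128-bit integer.
    return int.from_bytes(hashlib.md5(row_key.encode("utf-8")).digest(), "big")


# Hypothetical ring of 4 nodes with evenly spaced tokens.
NODE_TOKENS = [k * 2**128 // 4 for k in range(1, 5)]


def owner(row_key: str, node_tokens=NODE_TOKENS) -> int:
    # A row belongs to the first node whose token is >= the row's token,
    # wrapping around to node 0 at the end of the ring.
    return bisect_left(node_tokens, token(row_key)) % len(node_tokens)


# Every column written under one row key maps to the same node ...
single_row_owner = owner("row-1")

# ... while many distinct row keys spread over the ring.
owners = {owner(f"row-{i}") for i in range(1000)}
```

In this model, the placement depends only on the row key, so all columns of one row share an owner regardless of how many columns the row holds.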

Re: Recommended sort mechanism and partitioner

2010-10-15 Thread Paul Prescod
I wrote some thoughts about this on my blog. I think it's still mostly correct:
* http://www.ayogo.com/techblog/2010/04/sorting-in-cassandra/
On Fri, Oct 15, 2010 at 11:14 AM, Wicked J wrote:
> Hi,
> I'm using TimeUUID/Sort by column name mechanism. The column value can
> contain text data (in

Recommended sort mechanism and partitioner

2010-10-15 Thread Wicked J
Hi, I'm using the TimeUUID/sort-by-column-name mechanism. The column value can contain text data (in the future it may contain image data as well), leading to the possibility of a row outgrowing the RAM capacity. Given this background, my questions are: a] How many columns are recommended against one row