Re: ballpark low cardinality range for secondary indexes

2011-04-08 Thread Ed Anuff
Well, the Amazon paper is good at describing the nature of the problem, but to solve it you'll probably want to use ZooKeeper. The paper is useful in understanding exactly what you need to lock on and what you don't while updating the index, so you can avoid slowing things down any more than is ne

Re: ballpark low cardinality range for secondary indexes

2011-04-08 Thread Adi
Thanks for the suggestions, Ed. Your blog post is quite helpful in deciding on and implementing CF inverted indexes. Our data definitely leans towards an external CF: it has high cardinality (1000s for one column, millions for another), multiple columns need to be indexed, and it needs sorted order. Hope that a

Re: ballpark low cardinality range for secondary indexes

2011-04-08 Thread Ed Anuff
If you're just indexing on a single column value and the values have low cardinality (in, say, the tens), I'd have a wide row for each distinct value, containing the set of keys of the rows that hold that value. For higher levels of cardinality or if you're indexing on multiple columns, there
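
A rough sketch of the wide-row layout described for the low-cardinality case, modeled in plain Python rather than against a real Cassandra client. The class name `WideRowIndex` and its methods are hypothetical, and the dict of sets only stands in for an index column family with one row per indexed value; it says nothing about locking or consistency while the index is updated.

```python
from collections import defaultdict


class WideRowIndex:
    """In-memory stand-in for an index CF: one wide row per indexed value.

    Hypothetical illustration only; a real index row would hold the row
    keys as column names in a column family, not in a Python set.
    """

    def __init__(self):
        # indexed value -> set of keys of the rows that hold that value
        self._rows = defaultdict(set)

    def add(self, value, row_key):
        """Record that the data row `row_key` holds `value`."""
        self._rows[value].add(row_key)

    def remove(self, value, row_key):
        """Drop `row_key` from the index row for `value`, if present."""
        keys = self._rows.get(value)
        if keys is not None:
            keys.discard(row_key)

    def lookup(self, value):
        """Return the row keys indexed under `value`, in sorted order."""
        return sorted(self._rows.get(value, ()))


index = WideRowIndex()
index.add("CA", "user:17")
index.add("CA", "user:3")
index.add("NY", "user:42")
index.remove("CA", "user:17")
print(index.lookup("CA"))
```

With cardinality in the tens, this amounts to a few dozen index rows, each of which may grow very wide.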

ballpark low cardinality range for secondary indexes

2011-04-08 Thread Adi
I am trying to decide whether to use secondary indexes or an inverted index column family for a use case. Is there any suggested ballpark range for low cardinality for which secondary indexes are suitable? Meaning, at what range should using a secondary index be ruled in or out: cardinality of