Well, the Amazon paper is good at describing the nature of the
problem, but to solve it you'll probably want to use ZooKeeper. The
paper is useful in understanding exactly what you need to lock on and
what you don't while updating the index, so you can avoid slowing
things down any more than is ne
Thanks for the suggestions, Ed. Your blog post is quite helpful in deciding
on and implementing CF inverted indexes.
Our data definitely leans towards an external CF: it has high cardinality
(1000s for one column, millions for another), multiple columns need to be
indexed, and the results need to be in sorted order.
Hope that a
If you're just indexing on a single column value and the values have
low cardinality (say, in the tens), I'd have a wide row for each
distinct value that contains the set of keys for rows that contain
that value. For higher levels of cardinality or if you're indexing on
multiple columns, there
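
A minimal sketch of the wide-row layout described in this reply, using a plain in-memory dict as a stand-in for the index column family. The class and method names (InvertedIndex, add, update, lookup) are hypothetical and are not part of any Cassandra client API; a real implementation would write these rows to an index CF and would need to handle concurrent updates, which this sketch ignores.

```python
from collections import defaultdict


class InvertedIndex:
    """In-memory stand-in for an index CF: one wide row per indexed value."""

    def __init__(self):
        # indexed value -> set of row keys whose column holds that value
        self.rows = defaultdict(set)

    def add(self, value, row_key):
        # Equivalent to adding one column (named by row_key) to the wide row.
        self.rows[value].add(row_key)

    def remove(self, value, row_key):
        # discard() is a no-op if the key was never indexed under this value.
        self.rows[value].discard(row_key)

    def update(self, old_value, new_value, row_key):
        # Changing the indexed column means moving the key between wide rows.
        self.remove(old_value, row_key)
        self.add(new_value, row_key)

    def lookup(self, value):
        # Return keys in sorted order, similar to a column slice over the row.
        return sorted(self.rows.get(value, ()))
```

With tens of distinct values there are only tens of index rows, so each lookup is a single row read; the trade-off is that each row grows with the number of data rows holding that value.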
I am trying to decide whether to use secondary indexes or use an inverted
index column family for a use case. Is there a suggested ballpark range
of low cardinality for which secondary indexes are suitable?
Meaning at what range should using a secondary index be ruled in or out:
cardinality of