I don't see anything inherently wrong with your proposal; it would almost
certainly be beneficial in certain scenarios. We use what could be called
"static compression" (Golomb-esque encodings) for some data types on our
Cassandra clusters. It's useful for representing things like full precision
d
There is a lot of overhead in the serialized data itself (just have a look at
an sstable file).
It would be great to be able to compress at the byte array level rather than
at the string level.
Regards,
Terje
On 1 Feb 2011, at 03:15, "David G. Boney" wrote:
> In Cassandra, strings are stored as UTF-8.
Is the partitioner the only code that does comparisons on the keys of a column
family? What about get_range_slices()? Does it use only the partitioner's
comparison method?
-
Sincerely,
David G. Boney
dbon...@semanticartifacts.com
http://www.semanticartifacts.com
On Jan 31, 2011, a
In Cassandra, strings are stored as UTF-8. In arithmetic coding compression,
the modeling is separate from the coding. A standard arrangement is to have a
0-order model (frequencies of individual bytes), a 1-order model (frequencies
of two-byte occurrences), and a 2-order model (frequencies of three-byte
occurrences).
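The modeling half described above could be sketched as follows. This is an illustrative assumption of mine, not an existing Cassandra class: it only accumulates the 0-, 1-, and 2-order byte-frequency tables that an arithmetic coder would consume; the coder itself is omitted.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch of the 0-/1-/2-order byte-frequency models
// described above. Contexts are packed into an int key (one byte per
// position); the arithmetic coder that would use these is not shown.
public class ByteModels {
    public final Map<Integer, Integer> order0 = new HashMap<>(); // single byte -> count
    public final Map<Integer, Integer> order1 = new HashMap<>(); // two-byte sequence -> count
    public final Map<Integer, Integer> order2 = new HashMap<>(); // three-byte sequence -> count

    // Accumulate frequencies from one buffer (e.g. a UTF-8 string's bytes).
    public void update(byte[] data) {
        for (int i = 0; i < data.length; i++) {
            order0.merge(data[i] & 0xFF, 1, Integer::sum);
            if (i >= 1) order1.merge(key(data, i - 1, 2), 1, Integer::sum);
            if (i >= 2) order2.merge(key(data, i - 2, 3), 1, Integer::sum);
        }
    }

    // Pack len bytes starting at start into one int key.
    private static int key(byte[] d, int start, int len) {
        int k = 0;
        for (int i = 0; i < len; i++) k = (k << 8) | (d[start + i] & 0xFF);
        return k;
    }
}
```

Keeping the model in a structure like this is what makes the separation useful: the same coder can be driven by whichever order of model the data justifies.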
On 31 January 2011 04:41, David G. Boney wrote:
> I propose a simple idea for compression using a compressed string datatype.
>
> The compressed string datatype could be implemented for column family keys by
> creating a compressed string ordered partitioner.
I propose a simple idea for compression using a compressed string datatype.
The compressed string datatype could be implemented for column family keys by
creating a compressed string ordered partitioner. The compressed string ordered
partitioner works by decompressing the string and then applying the usual
ordered comparison to the uncompressed string.
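The decompress-then-compare step of the proposal could be sketched as a key comparator. This is a hypothetical illustration, not the proposed implementation: Deflate stands in for whatever string compressor the datatype would actually use, and the class name is my own.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;
import java.util.Comparator;
import java.util.zip.DataFormatException;
import java.util.zip.Deflater;
import java.util.zip.Inflater;

// Hypothetical sketch of the proposed ordered partitioner's comparison:
// decompress both keys, then compare the underlying UTF-8 strings so the
// compressed keys sort in the same order as their plaintext.
public class CompressedStringComparator implements Comparator<byte[]> {
    @Override
    public int compare(byte[] a, byte[] b) {
        return decompress(a).compareTo(decompress(b));
    }

    // Compress a string's UTF-8 bytes with Deflate (stand-in compressor).
    static byte[] compress(String s) {
        Deflater d = new Deflater();
        d.setInput(s.getBytes(StandardCharsets.UTF_8));
        d.finish();
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] buf = new byte[256];
        while (!d.finished()) out.write(buf, 0, d.deflate(buf));
        return out.toByteArray();
    }

    // Recover the original string from a compressed key.
    static String decompress(byte[] data) {
        try {
            Inflater inf = new Inflater();
            inf.setInput(data);
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            byte[] buf = new byte[256];
            while (!inf.finished()) out.write(buf, 0, inf.inflate(buf));
            return new String(out.toByteArray(), StandardCharsets.UTF_8);
        } catch (DataFormatException e) {
            throw new IllegalArgumentException("not a valid compressed key", e);
        }
    }
}
```

The obvious cost, and presumably the crux of the discussion, is that every comparison pays for a decompression, so how often the partitioner (and paths like get_range_slices) compares keys matters.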