Would it be possible to support this in a more general case by providing a
distributed |= operator over arbitrary byte strings (like the + operator on
counter columns), which would allow distributed bloom filters as well?
Tim Wintle
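
For illustration, here is a minimal in-memory sketch of why a |= merge over byte strings would be enough for a distributed bloom filter. Nothing here is Cassandra API. The filter size, hash count, hashing scheme, and every function name are assumptions made up for this example. The point is only that OR is commutative, associative, and idempotent, so replicas can apply updates in any order and still converge.

```python
import hashlib

M_BITS = 1024   # filter size in bits (hypothetical choice)
K_HASHES = 3    # number of hash functions (hypothetical choice)


def _positions(item: bytes):
    """Derive K_HASHES bit positions from salted SHA-256 digests."""
    for i in range(K_HASHES):
        digest = hashlib.sha256(bytes([i]) + item).digest()
        yield int.from_bytes(digest[:8], "big") % M_BITS


def new_filter() -> bytearray:
    """Return an empty filter: M_BITS zero bits."""
    return bytearray(M_BITS // 8)


def add(bits: bytearray, item: bytes) -> None:
    """Set the bits for `item` in place."""
    for pos in _positions(item):
        bits[pos // 8] |= 1 << (pos % 8)


def might_contain(bits, item: bytes) -> bool:
    """True if every bit for `item` is set (false positives are possible)."""
    return all(bits[pos // 8] & (1 << (pos % 8)) for pos in _positions(item))


def merge_or(a, b) -> bytes:
    """Byte-wise OR of two equal-length byte strings (the proposed |=)."""
    if len(a) != len(b):
        raise ValueError("byte strings must have equal length")
    return bytes(x | y for x, y in zip(a, b))


# Two replicas each see a different write, then their values are merged.
replica_a = new_filter()
replica_b = new_filter()
add(replica_a, b"10.0.0.1")
add(replica_b, b"10.0.0.2")
merged = merge_or(replica_a, replica_b)
```

After the merge, `might_contain(merged, ...)` reports both addresses as present, whichever replica saw which write.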
On Fri, Jun 29, 2012 at 6:31 AM, Chris Burroughs
wrote:
> Well
Well, I obviously think it would be handy. If this gets proposed and
ends up using stream-lib, don't be shy about asking for help.
On a more general note, it would be great to see the special-case
Counter code become more general atomic-operation code.
On 06/13/2012 01:15 PM, Utku Can Topçu wrot
On 06/13/2012 01:00 PM, Yuki Morishita wrote:
> The above implementation and most of the other ones (including stream-lib)
> implement the optimized version of the algorithm which counts up to 10^9, so
> may need some work.
>
> Another alternative is a self-learning bitmap
> (http://ect.bell-labs.c
Hi Yuki,
I think I should have used the word "discussion" instead of "proposal" in the
subject line. I have a fair amount of a design in mind, but I think it's
not yet ripe enough to formalize. I'll try to simplify it and open a Jira
ticket.
But first I'm wondering if there would be any excitement
You can open a JIRA ticket at https://issues.apache.org/jira/browse/CASSANDRA
with your proposal.
Just for the input:
I once implemented a HyperLogLog counter for internal use in Cassandra, but
it turned out I didn't need it, so I just put it in a gist. You can find it here:
https://gist.github.
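
For readers unfamiliar with the technique, a minimal HyperLogLog sketch follows. It is not the gist mentioned above and not the stream-lib implementation. The register count, the use of SHA-256 as the hash, and all names are assumptions for this example. It shows the basic estimator with the small-range correction, plus the register-wise max merge that makes the structure attractive in a distributed setting.

```python
import hashlib
import math


class HyperLogLog:
    """Toy HyperLogLog with 2**p registers (p is an illustrative choice)."""

    def __init__(self, p: int = 10):
        if p < 7:
            raise ValueError("the alpha formula used below assumes p >= 7")
        self.p = p
        self.m = 1 << p
        self.registers = [0] * self.m

    def add(self, item: bytes) -> None:
        # First 64 bits of SHA-256 serve as the hash (an assumption; any
        # well-mixed 64-bit hash would do).
        h = int.from_bytes(hashlib.sha256(item).digest()[:8], "big")
        index = h >> (64 - self.p)               # top p bits pick a register
        rest = h & ((1 << (64 - self.p)) - 1)    # remaining 64 - p bits
        # Rank is the 1-based position of the leftmost 1-bit in `rest`.
        rank = (64 - self.p) - rest.bit_length() + 1
        self.registers[index] = max(self.registers[index], rank)

    def merge(self, other: "HyperLogLog") -> None:
        """Register-wise max, so merging is order-independent."""
        if self.p != other.p:
            raise ValueError("cannot merge sketches with different p")
        self.registers = [max(a, b)
                          for a, b in zip(self.registers, other.registers)]

    def count(self) -> float:
        alpha = 0.7213 / (1 + 1.079 / self.m)    # bias constant for m >= 128
        raw = alpha * self.m ** 2 / sum(2.0 ** -r for r in self.registers)
        zeros = self.registers.count(0)
        if raw <= 2.5 * self.m and zeros:
            # Small-range correction: fall back to linear counting.
            return self.m * math.log(self.m / zeros)
        return raw


# Two nodes see overlapping sets of addresses; 10000 are distinct overall.
node_a, node_b = HyperLogLog(), HyperLogLog()
for i in range(5000):
    node_a.add(f"10.0.{i // 256}.{i % 256}".encode())
for i in range(2500, 10000):
    node_b.add(f"10.0.{i // 256}.{i % 256}".encode())
node_a.merge(node_b)
estimate = node_a.count()
```

With 1024 registers the expected relative error is roughly 1.04 / sqrt(1024), about 3%, so `estimate` should land close to 10000 without storing the addresses themselves.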
Hi All,
Let's assume we have a use case where we need to count the number of
columns for a given key. Let's say the key is the URL and the column-name
is the IP address or any cardinality identifier.
The straightforward implementation seems simple: just inserting the
IP addresses as columns