Hi Kasper,
I'd suggest taking a look at Spark, Storm, or Samza (all Apache
projects) for a possible approach. Depending on your needs and your
existing infrastructure, one of them may suit you better than the others.
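
A minimal sketch of the computation itself, in plain Python and not tied to
any of those frameworks. It assumes the scores are integers with few enough
distinct values that a histogram of counts fits in memory, and it rounds each
fraction to a whole number of entries. The name top_cutoffs and its parameters
are made up for illustration. Histograms built per partition can be merged by
adding the counts, which is roughly the shape a distributed job would take.

```python
from collections import Counter


def top_cutoffs(scores, fractions=(0.001, 0.33, 0.66)):
    """Return {fraction: lowest score that still falls in the top `fraction`}.

    Hypothetical helper: `scores` is any iterable of ints, e.g. the score
    column streamed out of the table.
    """
    # One pass over the data. Counters from separate partitions can be
    # combined with `+`, so this step parallelises naturally.
    hist = Counter(scores)
    total = sum(hist.values())
    if total == 0:
        raise ValueError("no scores")

    # For each fraction, how many entries make up that top slice
    # (rounded, and never fewer than one entry).
    targets = sorted((max(1, round(f * total)), f) for f in fractions)

    cutoffs = {}
    seen = 0  # entries counted so far, walking from the highest score down
    i = 0     # index of the next target still waiting for its cut-off
    for score in sorted(hist, reverse=True):
        seen += hist[score]
        while i < len(targets) and seen >= targets[i][0]:
            cutoffs[targets[i][1]] = score
            i += 1
        if i == len(targets):
            break
    return cutoffs


if __name__ == "__main__":
    print(top_cutoffs(range(1, 1001)))
```

If the number of distinct scores is too large for an exact histogram, an
approximate quantile sketch would be the usual substitute, at the cost of
exact cut-offs.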
Steve
On Tue, Apr 1, 2014 at 2:51 AM, Kasper Petersen wrote:
Hi,
I have a large number (can be >100 million) of (id uuid, score int) entries
in Cassandra. I need to, at regular intervals of, let's say, 30-60 minutes,
find the cut-off points for the score needed to be in the top 0.1%, 33% and
66% of all scores.
What would a good approach be to this problem?
A