Re: Finding cut-off points

2014-04-01 Thread Steven A Robenalt
Hi Kasper,

I'd suggest taking a look at Spark, Storm, or Samza (all Apache projects) for a possible approach. Depending on your needs and your existing infrastructure, one of those may work better for you than the others.

Steve

On Tue, Apr 1, 2014 at 2:51 AM, Kasper Petersen wrote:
> Hi,
>

Finding cut-off points

2014-04-01 Thread Kasper Petersen
Hi,

I have a large number (can be >100 million) of (id uuid, score int) entries in Cassandra. I need to, at regular intervals of, let's say, 30-60 minutes, find the cut-off points for the score needed to be in the top 0.1%, 33% and 66% of all scores. What would a good approach be to this problem? A
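
To make the question concrete, here is a minimal single-process sketch of one way the cut-offs could be computed. Because the scores are integers, it counts how many entries have each score and then walks the distinct scores from highest to lowest. The function name `top_cutoffs`, its parameters, and the definition of "cut-off" used here (the highest score s such that at least the requested fraction of entries score s or more) are assumptions for illustration, not something stated in the thread. Reading the rows out of Cassandra is not shown; for >100 million rows the counting step would presumably be done by a distributed job rather than in one process.

```python
import math
from collections import Counter


def top_cutoffs(scores, fractions=(0.001, 0.33, 0.66)):
    """Return {fraction: cut-off score} for an iterable of integer scores.

    Hypothetical helper: the cut-off for a fraction f is taken to be the
    highest score s such that at least ceil(f * N) entries score >= s.
    """
    counts = Counter(scores)          # one pass: score -> number of entries
    total = sum(counts.values())
    if total == 0:
        return {}

    # How many entries must sit at or above each cut-off, smallest first.
    needed = sorted((max(1, math.ceil(f * total)), f) for f in fractions)

    cutoffs = {}
    seen = 0                          # entries counted so far, from the top
    i = 0
    for score in sorted(counts, reverse=True):
        seen += counts[score]
        # Every fraction whose quota is now filled gets this score.
        while i < len(needed) and seen >= needed[i][0]:
            cutoffs[needed[i][1]] = score
            i += 1
        if i == len(needed):
            break
    return cutoffs
```

Only the per-score counts need to be held in memory, so the cost depends on the number of distinct scores rather than the number of rows. Ties are resolved in favour of inclusion: if many entries share the cut-off score, more than the requested fraction may qualify.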