Re: 1, 2, 3...

2016-04-11 Thread Emīls Šolmanis
Cassandra is not good for table scan type queries (which count(*) typically is). While there are some attempts to do that (as noted below), this is a path I avoid.

Re: 1, 2, 3...

2016-04-11 Thread Jack Krupansky
th I avoid.
Sean Durity
*From:* Max C [mailto:mc_cassan...@core43.com]
*Sent:* Saturday, April 09, 2016 6:19 PM
*To:* user@cassandra.apache.org

Re: 1, 2, 3...

2016-04-11 Thread Emīls Šolmanis
pically is). While there are some attempts to do that (as noted below), this is a path I avoid.
Sean Durity
*From:* Max C [mailto:mc_cassan...@core43.com]
*Sent:* Saturday, April 09, 2016 6:19 PM

Re: 1, 2, 3...

2016-04-11 Thread Jack Krupansky
pts to do that (as noted below), this is a path I avoid.
Sean Durity
*From:* Max C [mailto:mc_cassan...@core43.com]
*Sent:* Saturday, April 09, 2016 6:19 PM
*To:* user@cassandra.apache.org
*Subject:* Re: 1, 2, 3...

RE: 1, 2, 3...

2016-04-11 Thread SEAN_R_DURITY
Subject: Re: 1, 2, 3...
Looks like this guy (Brian Hess) wrote a script to split the token range and run count(*) on each subrange: https://github.com/brianmhess/cassandra-count
- Max
On Apr 8, 2016, at 10:56 pm, Jeff Jirsa <mailto:jeff.ji...@crowdstrike.com> wrote:
SELECT COUNT(*) pr

Re: 1, 2, 3...

2016-04-09 Thread Max C
Looks like this guy (Brian Hess) wrote a script to split the token range and run count(*) on each subrange: https://github.com/brianmhess/cassandra-count
- Max
> On Apr 8, 2016, at 10:56 pm, Jeff Jirsa wrote:
>
> SELECT COUNT(*) probably works
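For readers who want to see what "split the token range and run count(*) on each subrange" could look like, here is a minimal sketch in Python. It is not taken from the cassandra-count tool linked above. It assumes the cluster uses Murmur3Partitioner (tokens from -2^63 to 2^63-1), and the function names, keyspace, table, and partition key are hypothetical placeholders. It only builds the CQL strings; running them through a driver and summing the results is left out.

```python
# Sketch only: assumes Murmur3Partitioner, whose tokens span [-2**63, 2**63 - 1].
MIN_TOKEN = -(2**63)
MAX_TOKEN = 2**63 - 1


def split_token_range(num_splits):
    """Return contiguous (start, end) bounds covering the whole token ring."""
    if num_splits < 1:
        raise ValueError("num_splits must be at least 1")
    total = MAX_TOKEN - MIN_TOKEN
    # Integer arithmetic keeps the bounds exact; the last bound is MAX_TOKEN.
    bounds = [MIN_TOKEN + (total * i) // num_splits for i in range(num_splits + 1)]
    return list(zip(bounds[:-1], bounds[1:]))


def count_queries(keyspace, table, partition_key, num_splits):
    """Build one COUNT(*) statement per token subrange (hypothetical helper)."""
    queries = []
    for i, (start, end) in enumerate(split_token_range(num_splits)):
        # The first subrange includes its lower bound; the rest are (start, end]
        # so that no token is counted twice.
        lower_op = ">=" if i == 0 else ">"
        queries.append(
            f"SELECT COUNT(*) FROM {keyspace}.{table} "
            f"WHERE token({partition_key}) {lower_op} {start} "
            f"AND token({partition_key}) <= {end}"
        )
    return queries


if __name__ == "__main__":
    for query in count_queries("my_keyspace", "my_table", "id", 4):
        print(query)
```

Each statement scans only a slice of the ring, so the per-query work stays bounded; the total row count would be the sum of the individual results. A composite partition key would need all of its columns inside token(...).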

Re: 1, 2, 3...

2016-04-08 Thread Jeff Jirsa
SELECT COUNT(*) probably works (with internal paging) on many datasets with enough time and assuming you don’t have any partitions that will kill you. No, it doesn’t count extra replicas / duplicates. The old way to do this (before paging / fetch size) was to use manual paging based on tokens/c

Re: 1, 2, 3...

2016-04-08 Thread Spencer Brown
CQL commands don't count replications - that would make any select meaningless since they would all return dups.
On Fri, Apr 8, 2016 at 6:48 PM, Jack Krupansky wrote:
> I'm afraid I don't have the solid answer to this obvious question: How do
> I get a fairly accurate count of (CQL) rows in a Ca