Re: Get column family size

2014-12-12 Thread Ryan Svihla
What version are you on (I see the key estimate in 1.2 and 2.0)? What size is your heap (ideally 8GB; it can be lower, but that requires a lot of tuning)? What kind of disk do you have (SANs are going to cause you problems)? Assuming all of those are the right answer, then you have the following options to

Re: Get column family size

2014-12-11 Thread Chamila Wijayarathna
Hi Philip, Ryan, I checked the Cassandra system.log for any issues, but it showed no errors. I tried using cfstats and it gave me https://gist.github.com/cdwijayarathna/e6b4d3d7d8c272fcfd24. It doesn't seem to have any information like number of keys. I am running Cassandra on a single node and

Re: Get column family size

2014-12-11 Thread Ryan Svihla
An estimated partition key count can be had from nodetool cfstats; however, for analytics-style queries over large data sets (such as verification of large data sets) I recommend Spark, Hive, Hadoop, and even Solr for some use cases. On Thu, Dec 11, 2014 at 3:10 PM, Philip Thompson < philip.thomp...@dat
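
For anyone who wants to pull that estimate out of cfstats output in a script, here is a minimal sketch. It assumes the output has a per-table header line ("Table:" or "Column Family:", depending on version) followed by a "Number of keys (estimate):" line; the exact labels can differ between Cassandra versions, and the SAMPLE text is made up for illustration, not real output from this thread.

```python
import re

# Assumed label for the estimate line; matched case-insensitively because
# the capitalisation differs between Cassandra versions.
KEY_ESTIMATE = re.compile(r"number of keys \(estimate\):\s*(\d+)", re.IGNORECASE)
# Assumed per-table header: "Table: name" or "Column Family: name".
TABLE_HEADER = re.compile(r"^\s*(?:Table|Column Family):\s*(\S+)", re.IGNORECASE)


def key_estimates(cfstats_output):
    """Return {table_name: estimated_partition_keys} parsed from cfstats text."""
    estimates = {}
    current = None
    for line in cfstats_output.splitlines():
        header = TABLE_HEADER.match(line)
        if header:
            current = header.group(1)
            continue
        found = KEY_ESTIMATE.search(line)
        if found and current is not None:
            estimates[current] = int(found.group(1))
    return estimates


# Made-up sample text, only to show the shape the parser expects.
SAMPLE = """\
Keyspace: corpus
    Table: word_usage
        SSTable count: 3
        Number of keys (estimate): 1280
"""

print(key_estimates(SAMPLE))
```

If the estimate line is missing from your output, the function simply returns an empty dict for that table rather than guessing.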

Re: Get column family size

2014-12-11 Thread Ryan Svihla
So that query in cqlsh actually has a default limit of 10,000 and so if you're timing out trying to retrieve only 10k rows that makes me suspect you have either a lot of data per row, or you've got a really really unhappy server. I'd check the Cassandra logs for errors, there is probably a lot more

Re: Get column family size

2014-12-11 Thread Philip Thompson
Chamila, You can find more detailed explanations in previous posts on this mailing list as to why, but a "Select count(*) from table;" query is inefficient in Cassandra for non-trivial datasets. You will need a better way to get the number of partition keys of a CF, which hopefully someone else in
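
One workaround that is not from this thread, offered only as a sketch: split the count into several smaller queries over token sub-ranges so that no single query has to scan the whole table. The code below only builds the CQL strings. It assumes the Murmur3Partitioner token range (-2^63 to 2^63-1), and the partition key column name "word" is hypothetical, since the schema of corpus.word_usage is not shown here.

```python
# Token bounds for Murmur3Partitioner (an assumption about the cluster).
MIN_TOKEN = -2**63
MAX_TOKEN = 2**63 - 1


def token_ranges(splits):
    """Split the full token range into `splits` contiguous inclusive ranges."""
    if splits < 1:
        raise ValueError("splits must be at least 1")
    step = (MAX_TOKEN - MIN_TOKEN + 1) // splits
    ranges = []
    start = MIN_TOKEN
    for i in range(splits):
        # The last range absorbs any remainder so the ranges cover everything.
        end = MAX_TOKEN if i == splits - 1 else start + step - 1
        ranges.append((start, end))
        start = end + 1
    return ranges


def count_queries(keyspace, table, partition_key, splits):
    """Build one count(*) statement per token sub-range."""
    template = ("SELECT count(*) FROM {ks}.{tbl} "
                "WHERE token({pk}) >= {lo} AND token({pk}) <= {hi};")
    return [
        template.format(ks=keyspace, tbl=table, pk=partition_key, lo=lo, hi=hi)
        for lo, hi in token_ranges(splits)
    ]


# "word" is a hypothetical partition key column name.
for query in count_queries("corpus", "word_usage", "word", 4):
    print(query)
```

Each statement counts the rows in its own slice, so the total is the sum of the individual results. A slice can still time out on a very large table, in which case a higher split count is the obvious knob to try.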

Re: Get column family size

2014-12-11 Thread Chamila Wijayarathna
Hi Philip, Yes, I'm using cqlsh. Is there any way I can solve this? Thank You! On Fri, Dec 12, 2014 at 12:26 AM, Philip Thompson < philip.thomp...@datastax.com> wrote: > I assume the query you are sending is through cqlsh. You are actually > getting a client-side timeout error, which is unclear

Re: Get column family size

2014-12-11 Thread Philip Thompson
I assume the query you are sending is through cqlsh. You are actually getting a client-side timeout error, which is unclear in 2.1.2, but I believe the error message will be more helpful as of 2.1.3. On Thu, Dec 11, 2014 at 1:52 PM, Chamila Wijayarathna < cdwijayarat...@gmail.com> wrote: > Hello

Get column family size

2014-12-11 Thread Chamila Wijayarathna
Hello all, I am trying to get the number of key-value pairs. I used the following query for this: select count(*) from corpus.word_usage ; This returns the number of key-value pairs when the CF is relatively small. But when I insert more key-value pairs, I am getting an error saying, "errors={}, last_host=127