Re: Query

2016-12-29 Thread Manoj Khangaonkar
I am not that familiar with gizzard, but with gizzard + mysql you have multiple moving parts in the system that need to be managed separately. You'll need the mysql expert for mysql and the gizzard expert to manage the distributed part. It can be argued that long term this will have higher administrat

Re: Read efficiency question

2016-12-28 Thread Manoj Khangaonkar
In the first case, the partitioning is based on key1, key2, key3. In the second case, partitioning is based on key1, key2. Additionally you have a clustering key, key3. This means within a partition you can do range queries on key3 efficiently. That is the difference. regards On Tue, Dec 27, 2016 a
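
To illustrate the difference described above, here is a minimal Python sketch. It is a toy in-memory model, not Cassandra code: the class name ToyTable and its methods are hypothetical, and it assumes a partition is simply a list of rows kept sorted by key3. It shows why a range query on the clustering key only has to touch one sorted partition.

```python
from bisect import bisect_left
from collections import defaultdict


class ToyTable:
    """Toy model: partition key is (key1, key2); rows are sorted by key3."""

    def __init__(self):
        # (key1, key2) -> list of (key3, value) tuples, kept sorted by key3
        self._partitions = defaultdict(list)

    def insert(self, key1, key2, key3, value):
        rows = self._partitions[(key1, key2)]
        idx = bisect_left(rows, (key3,))
        if idx < len(rows) and rows[idx][0] == key3:
            rows[idx] = (key3, value)        # same full key: overwrite
        else:
            rows.insert(idx, (key3, value))  # keep the partition sorted

    def range_query(self, key1, key2, low, high):
        """Return rows with low <= key3 < high from a single partition."""
        rows = self._partitions.get((key1, key2), [])
        start = bisect_left(rows, (low,))
        end = bisect_left(rows, (high,))
        return rows[start:end]


table = ToyTable()
table.insert("a", "b", 3, "x")
table.insert("a", "b", 1, "y")
table.insert("a", "b", 2, "z")
table.insert("a", "c", 2, "w")   # different partition, never scanned below
print(table.range_query("a", "b", 2, 4))
```

If key3 were part of the partition key instead, each (key1, key2, key3) combination would be its own partition, and the same range would have to be assembled from separate lookups.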

Re: Cassandra Config as per server hardware for heavy write

2016-11-23 Thread Manoj Khangaonkar
Hi, What is your write consistency setting ? regards On Wed, Nov 23, 2016 at 3:48 AM, Vladimir Yudovin wrote: > Try to build cluster with *.withPoolingOptions* > > Best regards, Vladimir Yudovin, > *Winguzone - Cloud Cassandra Hosting* > > > On Wed, 23 No

Re: Consistency when adding data to collections concurrently?

2016-11-12 Thread Manoj Khangaonkar
Hi, Instead of using a collection, consider making label a clustering column. With this, each request will essentially append a column (label) to the partition. Getting all labels would be a simple query: select label from table where partitionkey = 'value'. In general, read + update of a column
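
A rough Python sketch of the concern, under stated assumptions: it is a deterministic simulation, not driver code, and both function names are hypothetical. It models only the pattern where each client reads the stored labels and then writes the whole value back, compared with each label being its own row in the partition. It does not model any particular CQL statement.

```python
def overwrite_whole_collection(initial, additions):
    """Every client reads the same snapshot, then writes back snapshot + its label."""
    stored = frozenset(initial)
    snapshots = [stored for _ in additions]   # all clients read before any write
    for snapshot, label in zip(snapshots, additions):
        stored = snapshot | {label}           # last write wins, earlier labels are lost
    return stored


def append_rows(initial, additions):
    """Each label is a separate row, so concurrent writes do not overwrite each other."""
    rows = set(initial)
    for label in additions:
        rows.add(label)                       # independent insert per label
    return rows


print(overwrite_whole_collection([], ["red", "blue"]))
print(append_rows([], ["red", "blue"]))
```

In the first function only the last writer's label survives; in the second, both labels end up in the partition regardless of write order.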

Re: Hbase vs Cassandra

2015-05-30 Thread Manoj Khangaonkar
I wrote this up 2 years ago http://khangaonkar.blogspot.com/2013/09/cassandra-vs-hbase-which-nosql-store-do.html regards On Fri, May 29, 2015 at 12:09 PM, Ajay wrote: > Hi, > > I need some info on Hbase vs Cassandra as a data store (in general plus > specific to time series data). > > The comp

Re: Leveled Compaction Strategy with a really intensive delete workload

2015-05-24 Thread Manoj Khangaonkar
Hi, For a delete intensive workload (which translates to write intensive), is there any reason to use leveled compaction? The recommendation seems to be that leveled compaction is suited for read intensive workloads. Depending on your use case, you might be better off with date tiered or size tiered strat

Re: Multiple cassandra instances per physical node

2015-05-21 Thread Manoj Khangaonkar
+1. I agree we need to be able to run multiple server instances on one physical machine. This is especially necessary in development and test environments where one is experimenting and needs a cluster, but does not have access to multiple physical machines. If you google, you can find a few blog

Re: spout storm cassandra

2015-05-19 Thread Manoj Khangaonkar
Hi, Storm spouts are supposed to read from somewhere, preferably streams. In theory, you could write a spout that queries Cassandra and makes data available to Storm. But remember that the Cassandra data model is based on partitioning by key and the use of wide columns. So if your data mod

Re: Data model suggestions

2015-04-23 Thread Manoj Khangaonkar
for > active records, if a previous active record isn't included in the results, > that means its time to archive that record. > > On Thu, Apr 23, 2015 at 9:20 PM, Manoj Khangaonkar > wrote: > >> Hi, >> >> How do you determine if the record is no longer active

Re: Data model suggestions

2015-04-23 Thread Manoj Khangaonkar
Hi, How do you determine if the record is no longer active? Is it a periodic process that goes through every record and checks when the last update happened? regards On Thu, Apr 23, 2015 at 8:09 AM, Ali Akhtar wrote: > Hey all, > > We are working on moving a mysql based application to Cassa

Re: Cassandra use cases/Strengths/Weakness

2014-07-04 Thread Manoj Khangaonkar
These are my personal opinions based on a few months of using Cassandra. Others may have different opinions. http://khangaonkar.blogspot.com/2014/06/apache-cassandra-things-to-consider.html regards On Fri, Jul 4, 2014 at 7:37 AM, Prem Yadav wrote: > Hi, > I have seen this in a

Cassandra & MapReduce/Storm/ etc

2014-05-12 Thread Manoj Khangaonkar
Hi, Searching for Cassandra with MapReduce, I am finding that the search results are really dated, from version 0.7 and 2010/2011. Is there a good blog/article that describes how to use MapReduce on a Cassandra table? From my naive understanding, Cassandra is all about partitioning. Querying is b

Re: Cassandra slow on some reads

2014-03-14 Thread Manoj Khangaonkar
> > > > I have ~450 queries that are like this: SELECT * FROM table where key = > 'some string' and ts = some value; some value is close to present time. > > The problem: > > About 10 - 20 % of these queries take more than 5 seconds to execute, in > fact, the majority of those take around 10 second

Re: Cassandra cpp driver call to local cassandra colo

2014-03-04 Thread Manoj Khangaonkar
Hi, Your client/application will connect to one of the nodes you give it as contact points. In the Java driver this is done by calling Cluster.builder().addContactPoint(...). I suppose the C++ driver will have a similar method. For the app in DC1, provide only nodes in DC1 as contact poi

Re: Resetting a counter in CQL

2014-03-01 Thread Manoj Khangaonkar
The last time I checked, as of v2.0.0, counters did not work. Over a period of time, the counters drift from the correct value. There were several open issues and a proposal to rewrite the counter implementation. Have you checked if all the issues with counters have been fixed? regards On Fri, Fe

Re: CQL3 delete using < or > ?

2014-02-08 Thread Manoj Khangaonkar
Hi, From the CQL documentation, for the CQL delete statement, the only allowed row specifications are primary_key_name = key_value and primary_key_name IN (key_value, key_value, ...). regards On Sat, Feb 8, 2014 at 6:09 PM, Clint Kelly wrote: > Folks, > > Is there any way to perform a delete

Re: Any Limits on number of items in a collection column type

2014-01-22 Thread Manoj Khangaonkar
bytes per item. > > From: Manoj Khangaonkar > Reply-To: > Date: Wednesday, January 22, 2014 at 7:17 PM > To: > Subject: Any Limits on number of items in a collection column type > > Hi, > > On C* 2.0.0. 3 Node cluster. > > I have a column daycount list. The col

Any Limits on number of items in a collection column type

2014-01-22 Thread Manoj Khangaonkar
Hi, On C* 2.0.0. 3 Node cluster. I have a column daycount list. The column is storing a count. Every few secs a new count is appended. The total count for the day is the sum of all items in the list. My application logs indicate I wrote about 11 items to the column for a particular row. As

Re: Read/Write consistency issue

2014-01-10 Thread Manoj Khangaonkar
do > something different. What are these three numbers exactly? > old=60616 val =19 new =60635 > > > On Fri, Jan 10, 2014 at 1:50 PM, Manoj Khangaonkar > wrote: > >> Hi >> >> Using Cassandra 2.0.0. >> 3 node cluster >> Replication 2. >> Using con

Re: Read/Write consistency issue

2014-01-10 Thread Manoj Khangaonkar
to say. In the >>> article I just linked to, the author experienced similar problems, even >>> with “perfectly synchronized clocks”, whatever that means. >>> >>> >>> >>> The conclusion I’ve arrived at after reading and pondering is that if >

Read/Write consistency issue

2014-01-10 Thread Manoj Khangaonkar
Hi, Using Cassandra 2.0.0. 3 node cluster. Replication 2. Using consistency ALL for both reads and writes. I have a single thread that reads a value, updates it and writes it back to the table. The column type is bigint. Updating counts for a timestamp. With a single thread and consistency ALL, I e

Re: Sorting keys for batch reads to minimize seeks

2013-10-22 Thread Manoj Khangaonkar
Hi, Apologies if my response is a little off track, but instead of trying to squeeze the last ounce of performance out of Cassandra, have you considered putting an external in-memory cache in front of or alongside Cassandra (like redis or memcached) to cache frequently used rows? You get fast

Re: Sorting keys for batch reads to minimize seeks

2013-10-17 Thread Manoj Khangaonkar
Unless I misunderstood your statement on sorting by row keys, Cassandra partitions rows across nodes based on row keys. Sorting a random set of keys will not help. If you know that your set of keys is on a particular node, then sorting might help. But I doubt that it is a sound practice, given th
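
The point about sorting can be sketched in Python. This is an illustration only: it assumes keys are placed by hashing, uses MD5 modulo the node count as a stand-in for a real partitioner and token ring, and the function names node_for and group_by_node are hypothetical. It shows that keys which are adjacent in sorted order can still land on different nodes.

```python
import hashlib
from collections import defaultdict


def node_for(key, num_nodes):
    """Map a row key to a node index via a hash (stand-in for a real partitioner)."""
    digest = hashlib.md5(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_nodes


def group_by_node(keys, num_nodes):
    """Walk the keys in sorted order and record which node each one hashes to."""
    groups = defaultdict(list)
    for key in sorted(keys):
        groups[node_for(key, num_nodes)].append(key)
    return dict(groups)


keys = [f"user{i}" for i in range(20)]
for node, node_keys in sorted(group_by_node(keys, 3).items()):
    print(node, node_keys)
```

Because placement follows the hash rather than the key's sort order, sorting the batch by key says nothing about which node serves each read.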