Re: Using cassandra a BLOB store / web cache.

2016-01-20 Thread Mohit Anchlia
The answer to this question is very much dependent on the throughput, desired latency and access patterns (R/W or R/O). In general what I have seen working for high throughput environments is to either use a distributed file system like Ceph/Gluster or an object store like S3 and keep the pointer in th

Re: Cannot query secondary index

2014-06-13 Thread Mohit Anchlia
Some other ways to track old records are: 1) Use external queues - One queue per week or month for instance and pile up data on the queue cluster 2) Create one more table in C* to track the keys per week or month that you can scan to read the keys of the audit table. Make sure you delete the entir

Re: Cassandra blob storage

2014-03-18 Thread Mohit Anchlia
For large volume big data scenarios we don't recommend using Cassandra as a blob storage simply because of the intensive IO involved during compaction, repair etc. The Cassandra store is only well suited for metadata type storage. However, if you are fairly low volume then it's a different story, but if you

Re: Performance problem with large wide row inserts using CQL

2014-02-20 Thread Mohit Anchlia
On Thu, Feb 20, 2014 at 4:37 PM, Edward Capriolo wrote: > Recommendations in cassandra have a shelf life of about 1 to 2 years. If > you try to assert a recommendation from a year ago you stand a solid chance of > someone telling you there is now a better way. > > Cassandra once loved being a schemale

Re: Performance problem with large wide row inserts using CQL

2014-02-20 Thread Mohit Anchlia
+1 I like the Hector client, which uses the Thrift interface and exposes APIs that are similar to how Cassandra physically stores the values. On Thu, Feb 20, 2014 at 9:26 AM, Peter Lin wrote: > > I disagree with the sentiment that "thrift is not worth the trouble". > > CQL and all SQL inspired dialects li

Re: Commit log on USB flash disk?

2013-11-16 Thread Mohit Anchlia
In our testing USB tends to be slower. Something more integrated internally would give you better performance. Sent from my iPhone On Nov 16, 2013, at 8:30 AM, Dan Simpson wrote: > It doesn't seem like a great idea. The USB drives typically use dynamic wear > leveling. See this a

Re: Cass 1.1.11 out of memory during compaction ?

2013-11-03 Thread Mohit Anchlia
Post your gc logs Sent from my iPhone On Nov 3, 2013, at 6:54 AM, Oleg Dulin wrote: > Cass 1.1.11 ran out of memory on me with this exception (see below). > > My parameters are 8gig heap, new gen is 1200M. > > ERROR [ReadStage:55887] 2013-11-02 23:35:18,419 AbstractCassandraDaemon.java > (l

Re: Cassandra Heap Size for data more than 1 TB

2013-10-02 Thread Mohit Anchlia
are migrating to 1.2.X > though. We had tuned bloom filters (0.1) and AFAIK making it lower than > this won't matter. > > Thanks ! > > > On Tue, Oct 1, 2013 at 11:54 PM, Mohit Anchlia wrote: > >> Which Cassandra version are you on? Essentially heap size is func

Re: Cassandra Heap Size for data more than 1 TB

2013-10-01 Thread Mohit Anchlia
Which Cassandra version are you on? Essentially heap size is function of number of keys/metadata. In Cassandra 1.2 lot of the metadata like bloom filters were moved off heap. On Tue, Oct 1, 2013 at 9:34 PM, srmore wrote: > Does anyone know what would roughly be the heap size for cassandra with >

Re: 答复: Frequent Full GC that take > 30s

2013-09-23 Thread Mohit Anchlia
Your ParNew size is way too small. Generally 4GB ParNew (-Xmn) works out best for 16GB heap On Mon, Sep 23, 2013 at 9:05 PM, 谢良 wrote: > it looks to me that "MaxTenuringThreshold" is too small, do you have any > chance to try with a bigger one, like 4 or 8 or sth else? > > __

Re: Cassandra 1.2.9 cluster with vnodes is heavily unbalanced.

2013-09-20 Thread Mohit Anchlia
Did you start out your cluster after wiping all the sstables and commit logs? On Fri, Sep 20, 2013 at 3:42 PM, Suruchi Deodhar < suruchi.deod...@generalsentiment.com> wrote: > We have been trying to resolve this issue to find a stable configuration > that can give us a balanced cluster with equal

Re: Cassandra 1.2.9 cluster with vnodes is heavily unbalanced.

2013-09-20 Thread Mohit Anchlia
ich we have >> encountered before). >> >> I've attached the output of "nodetool ring" here. >> >> >> On Thu, Sep 19, 2013 at 8:35 PM, Mohit Anchlia wrote: >> >>> Other thing I noticed is that you are using multiple RACKS and that might

Re: Cassandra 1.2.9 cluster with vnodes is heavily unbalanced.

2013-09-19 Thread Mohit Anchlia
he column-family that had 6527744 keys before (load > is now 1.08 GB as compared to 1.05 GB before), while the smallest node now > has 71808 keys as compared to 3840 keys before (load is now 31.89 MB as > compared to 1.12 MB before). > > > On Thu, Sep 19, 2013 at 5:18 PM, Mohit

Re: Cassandra 1.2.9 cluster with vnodes is heavily unbalanced.

2013-09-19 Thread Mohit Anchlia
sing the Murmur3 partitioner with NetworkTopologyStrategy. > > Thanks, > Suruchi > > > > On Thu, Sep 19, 2013 at 3:59 PM, Mohit Anchlia wrote: > >> Can you check cfstats to see number of keys per node? >> >> >> On Thu, Sep 19, 2013 at 12:36 PM, Suruchi Deodhar < >

Re: Cassandra 1.2.9 cluster with vnodes is heavily unbalanced.

2013-09-19 Thread Mohit Anchlia
Can you check cfstats to see number of keys per node? On Thu, Sep 19, 2013 at 12:36 PM, Suruchi Deodhar < suruchi.deod...@generalsentiment.com> wrote: > Thanks for your replies. I wiped out my data from the cluster and also > cleared the commitlog before restarting it with num_tokens=256. I then

Re: row cache

2013-09-07 Thread Mohit Anchlia
I agree. We've had similar experience. Sent from my iPhone On Sep 7, 2013, at 6:05 PM, Edward Capriolo wrote: > I have found row cache to be more trouble then bene. > > The term fools gold comes to mind. > > Using key cache and leaving more free main memory seems stable and does not > have a

Re: Cassandra 1.2.4 - Unflushed data lost on restart

2013-09-06 Thread Mohit Anchlia
Are you not using RF >= 3 ? On Fri, Sep 6, 2013 at 10:14 AM, Thapar, Vishal (HP Networking) < vtha...@hp.com> wrote: > My usage requirements are such that there should be least possible data > loss even in case of a poweroff. When you say clean shutdown do you mean > Cassandra service stop? > > I

Re: Temporarily slow nodes on Cassandra

2013-09-02 Thread Mohit Anchlia
In general with LOCAL_QUORUM you should not see such an issue when one node is slow. However, it could be because clients are still sending requests to that node. Depending on what client library you are using, you could try to take that node out of your connection pool. Not knowing exact issue y

Re: Upgrade from 1.0.9 to 1.2.8

2013-08-30 Thread Mohit Anchlia
If you have multiple DCs you at least want to upgrade to 1.0.11. There is an issue where you might get errors during cross DC replication. On Fri, Aug 30, 2013 at 9:41 AM, Mike Neir wrote: > In my testing, mixing 1.0.9 and 1.2.8 seems to work fine as long as there > is no need to do streaming op

Re: Having 2 nodes with 100% Ownership ?

2013-08-12 Thread Mohit Anchlia
You need to get it to 50% on each to equally distribute the hash range. You need to 1) calculate the new tokens and 2) move the nodes to those tokens, or use vnodes. For the first option see: http://www.datastax.com/docs/0.8/install/cluster_init On Mon, Aug 12, 2013 at 12:06 PM, Morgan Segalis wrote: > Hi ever
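
The token calculation in step 1 can be sketched as follows. This is a minimal illustration, assuming RandomPartitioner (token range 0 to 2**127) and a single datacenter; the function name `balanced_tokens` is hypothetical and not part of any Cassandra tool.

```python
def balanced_tokens(node_count, ring_size=2**127):
    """Return evenly spaced initial tokens for a RandomPartitioner ring.

    Assumption: every node should own an equal share of the ring,
    so node i gets token i * ring_size / node_count.
    """
    if node_count < 1:
        raise ValueError("node_count must be at least 1")
    return [i * ring_size // node_count for i in range(node_count)]


if __name__ == "__main__":
    # For a two-node cluster each node ends up owning 50% of the range.
    for index, token in enumerate(balanced_tokens(2)):
        print(f"node {index}: {token}")
```

Each node would then be moved to its computed token with `nodetool move <token>`, one node at a time.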

Re: Cassandra nodetool repair question

2013-08-08 Thread Mohit Anchlia
But node might be streaming data as well, in that case only option is to restart node that started streaming operation Sent from my iPhone On Aug 8, 2013, at 5:56 PM, Andrey Ilinykh wrote: > nodetool repair just triggers repair procedure. You can kill nodetool after > start, it doesn't change

Automated Repair on multiple nodes

2013-08-02 Thread Mohit Anchlia
We currently run automated repairs sequentially on all the nodes. However, as we grow the cluster we now need to run repair on multiple nodes in parallel to be able to finish it within gc_grace_seconds. Before I write the script I was wondering if somebody already has a tool or a script that figure
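
As a rough sketch of the planning arithmetic behind such a script (not an existing tool), the following assumes every node's repair takes about the same time, that `nodes` is listed in ring order, and that the default `gc_grace_seconds` of 864000 (10 days) applies. The function names are hypothetical, and the spacing of nodes within a round is only a heuristic; it does not check replica overlap.

```python
import math


def min_parallel_repairs(node_count, hours_per_node, gc_grace_seconds=864000):
    """Smallest number of concurrent repairs that fits inside gc_grace_seconds."""
    window_hours = gc_grace_seconds / 3600
    if hours_per_node > window_hours:
        raise ValueError("a single repair does not fit in the window")
    # Number of back-to-back repair rounds that fit in the window.
    rounds = int(window_hours // hours_per_node)
    return math.ceil(node_count / rounds)


def repair_batches(nodes, parallelism):
    """Split nodes into rounds of at most `parallelism` nodes each.

    Nodes in the same round are taken at a fixed stride, so they sit
    far apart in the input order (assumed to be ring order).
    """
    rounds = math.ceil(len(nodes) / parallelism)
    return [nodes[r::rounds] for r in range(rounds)]


if __name__ == "__main__":
    nodes = [f"node{i}" for i in range(6)]
    for batch in repair_batches(nodes, 2):
        print(batch)
```

Each round would then run `nodetool -h <host> repair` on its nodes concurrently and wait for all of them before starting the next round.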

Re: cassandra GC cpu usage

2013-07-16 Thread Mohit Anchlia
What's your replication factor? Can you check tp stats and net stats to see if you are getting more mutations on these nodes ? Sent from my iPhone On Jul 16, 2013, at 3:18 PM, Jure Koren wrote: > Hi C* user list, > > I have a curious recurring problem with Cassandra 1.2 and what seems like a

Re: Logging Cassandra Reads/Writes

2013-07-09 Thread Mohit Anchlia
There is a new tracing feature in Cassandra 1.2 that might help you with this. On Tue, Jul 9, 2013 at 1:31 PM, Blair Zajac wrote: > No idea on the logging, I'm pretty new to Cassandra. > > Regards, > Blair > > On Jul 9, 2013, at 12:50 PM, hajjat wrote: > > > Blair, thanks for the clarification!

Re: Reduce Cassandra GC

2013-06-20 Thread Mohit Anchlia
dra (out of 4GB). > > > 2013/6/19 Mohit Anchlia > >> How much data do you have per node? >> How much RAM per node? >> How much CPU per node? >> What is the avg CPU and memory usage? >> >> On Wed, Jun 19, 2013 at 12:16 AM, Joel Samuelsson <

Re: Reduce Cassandra GC

2013-06-19 Thread Mohit Anchlia
How much data do you have per node? How much RAM per node? How much CPU per node? What is the avg CPU and memory usage? On Wed, Jun 19, 2013 at 12:16 AM, Joel Samuelsson wrote: > My Cassandra ps info: > > root 26791 1 0 07:14 ?00:00:00 /usr/bin/jsvc -user > cassandra -home /opt

Re: Reduce Cassandra GC

2013-06-18 Thread Mohit Anchlia
Is your young generation size set to 4GB? Can you paste the output of ps -ef|grep cassandra ? On Tue, Jun 18, 2013 at 8:48 AM, Joel Samuelsson wrote: > Yes, like I said, the only relevant output from that file was: > 2013-06-17T08:11:22.300+: 2551.288: [GC 870971K->216494K(4018176K), > 145.18

Re: Reduce Cassandra GC

2013-06-15 Thread Mohit Anchlia
Can you paste your gc config? Also can you take a heap dump at 2 diff points so that we can compare it? Quick thing to do would be to do a histo live at 2 points and compare. Sent from my iPhone On Jun 15, 2013, at 6:57 AM, Takenori Sato wrote: > > INFO [ScheduledTasks:1] 2013-04-15 14:00:02,74

Re: very confused by jmap dump of cassandra

2013-02-21 Thread Mohit Anchlia
Roughly how much data do you have per node? Sent from my iPhone On Feb 20, 2013, at 10:49 AM, "Hiller, Dean" wrote: > I took this jmap dump of cassandra(in production). Before I restarted the > whole production cluster, I had some nodes running compaction and it looked > like all memory had

Re: Row cache and counters

2012-12-29 Thread Mohit Anchlia
Can you post gc settings? Also check logs and see what it says Also post how many writes and reads along with avg row size Sent from my iPhone On Dec 29, 2012, at 12:28 PM, rohit bhatia wrote: > i assume u mean 8 seconds and not 8ms.. > thats pretty huge to be caused by gc. Is there lot of lo

Re: How to replace a dead *seed* node while keeping quorum

2012-09-12 Thread Mohit Anchlia
How can this be resolved in this case? On Wed, Sep 12, 2012 at 3:53 PM, Rob Coli wrote: > On Tue, Sep 11, 2012 at 4:21 PM, Edward Sargisson > wrote: > > If the downed node is a seed node then neither of the replace a dead node > > procedures work (-Dcassandra.replace_token and taking initial_to

Re: nodetool connection refused

2012-09-08 Thread Mohit Anchlia
Are both running on the same host? On Fri, Sep 7, 2012 at 11:53 PM, Manu Zhang wrote: > When I run Cassandra-trunk in Eclipse, nodetool fail to connect with the > following error > "Failed to connect to '127.0.0.1:7199': Connection refused" > But if I run in terminal, all will be fine. >

Re: Monitoring replication lag/latency in multi DC setup

2012-09-05 Thread Mohit Anchlia
of my back log? > > Although we know when a network is flaky, we are interested in knowing how > much data is piling up in local DC that needs to be transferred. > > Greatly appreciate your help. > > VR > > > On Wed, Sep 5, 2012 at 8:33 PM, Mohit Anchlia wrote: > &g

Re: Monitoring replication lag/latency in multi DC setup

2012-09-05 Thread Mohit Anchlia
As far as I know Cassandra doesn't use an internal queueing mechanism specific to replication. Cassandra sends the write to the remote DC and after that it's up to the TCP/IP stack to deal with buffering. If requests start to time out, Cassandra would use HH up to a certain time. For longer outage you would h

Re: Expanding cluster to include a new DR datacenter

2012-08-27 Thread Mohit Anchlia
strategy_options I should be using the DC name > from properfy file snitch right? Ours is “Fisher” and “TierPoint” so > that’s what I used.**** > > ** ** > > *From:* Mohit Anchlia [mailto:mohitanch...@gmail.com] > *Sent:* Monday, August 27, 2012 1:21 PM > > *To:* user@ca

Re: Expanding cluster to include a new DR datacenter

2012-08-27 Thread Mohit Anchlia
> ** ** > > On 25/08/2012, at 6:53 PM, Bryce Godfrey > wrote: > > > > > > Yes > > > > [default@unknown] describe cluster; > > Cluster Information: > > Snitch: org.apache.cassandra.locator.PropertyFileSnitch > &g

Re: Decreasing the number of nodes in the ring

2012-08-26 Thread Mohit Anchlia
use nodetool decommission and nodetool removetoken On Sun, Aug 26, 2012 at 5:31 PM, Senthilvel Rangaswamy wrote: > We have a cluster of 9 nodes in the ring. We would like SSD backed boxes. > But we may not need 9 > nodes in that case. What is the best way to downscale the cluster to 6 or > 3 nod

Re: help required to resolve super column family problems

2012-08-24 Thread Mohit Anchlia
If you are starting out new use composite column names/values or you could also use JSON style doc as a column value. On Fri, Aug 24, 2012 at 2:31 PM, Rob Coli wrote: > On Fri, Aug 24, 2012 at 4:33 AM, Amit Handa wrote: > > kindly help in resolving the following problem with respect to super >

Re: Expanding cluster to include a new DR datacenter

2012-08-24 Thread Mohit Anchlia
That's interesting can you do describe cluster? On Fri, Aug 24, 2012 at 12:11 PM, Bryce Godfrey wrote: > So I’m at the point of updating the keyspaces from Simple to > NetworkTopology and I’m not sure if the changes are being accepted using > Cassandra-cli. > > ** ** > > I issue the change:*

DSE solr HA

2012-08-12 Thread Mohit Anchlia
Going through this page and it looks like indexes are stored locally http://www.datastax.com/dev/blog/cassandra-with-solr-integration-details . My question is what happens if one of the solr nodes crashes? Is the data indexed again on those nodes? Also, if RF > 1 then is the same data being indexe

Re: Decision Making- YCSB

2012-08-10 Thread Mohit Anchlia
I agree with Edward. We always develop our own stress tool that tests each use case of interest. Every use case is different in certain ways that can only be tested using custom stress tool. On Fri, Aug 10, 2012 at 7:25 AM, Edward Capriolo wrote: > There are many YCSB forks on github that get opt

Re: Schema advice: (Single row or multiple row!?) How do I store millions of columns when I need to read a set of around 500 columns at a single read query using column names ?

2012-07-23 Thread Mohit Anchlia
On Mon, Jul 23, 2012 at 11:16 AM, Ertio Lew wrote: > I want to read columns for a randomly selected list of userIds(completely > random). I fetch the data using userIds(which would be used as column names > in case of single row or as rowkeys incase of 1 row for each user) for a > selected list o

Re: Schema advice: (Single row or multiple row!?) How do I store millions of columns when I need to read a set of around 500 columns at a single read query using column names ?

2012-07-23 Thread Mohit Anchlia
On Mon, Jul 23, 2012 at 11:00 AM, Ertio Lew wrote: > For each user in my application, I want to store a *value* that is queried > by using the userId. So there is going to be one column for each user > (userId as col Name & *value* as col Value). Now I want to store these > columns such that can

Re: Schema advice: (Single row or multiple row!?) How do I store millions of columns when I need to read a set of around 500 columns at a single read query using column names ?

2012-07-23 Thread Mohit Anchlia
On Mon, Jul 23, 2012 at 10:53 AM, Ertio Lew wrote: > Actually these columns are 1 for each entity in my application & I need to > query at any time columns for a list of 300-500 entities in one go. Can you describe your situation with small example?

Re: Schema advice: (Single row or multiple row!?) How do I store millions of columns when I need to read a set of around 500 columns at a single read query using column names ?

2012-07-23 Thread Mohit Anchlia
On Mon, Jul 23, 2012 at 10:07 AM, Ertio Lew wrote: > My major concern is that is it too bad retrieving 300-500 rows (each for a > single column) in a single read query that I should store all these(around > a hundred million) columns in a single row? You could create multiple rows and each row

Re: Cassandra Authentication

2012-06-28 Thread Mohit Anchlia
Sent from my iPad On Jun 28, 2012, at 8:45 AM, Christof Bornhoevd wrote: > Hi, > > we are using Cassandra v1.0.8 with Hector v1.0-5 and would like to move our > current system to an operational setting based on Amazon AWS. What are best > practices for addessing security for Cassandra on A

Re: Multi datacenter, WAN hiccups and replication

2012-06-26 Thread Mohit Anchlia
estion. In general I don't think you can selectively decide on HH. Besides HH should only be used when the outage is in mts, for longer outages using HH would only create memory pressure. > On Tuesday, June 26, 2012, Mohit Anchlia wrote: > >> >> On Tue, Jun 26, 2012 at 7:52

Re: Multi datacenter, WAN hiccups and replication

2012-06-26 Thread Mohit Anchlia
On Tue, Jun 26, 2012 at 7:52 AM, Karthik N wrote: > My Cassandra ring spans two DCs. I use local quorum with replication > factor=3. I do a write in DC1 with local quorum. Data gets written to > multiple nodes in DC1. For the same write to propagate to DC2 only one > copy is sent from the coordin

Re: How do I add a custom comparator class to a cassandra cluster ?

2012-05-15 Thread Mohit Anchlia
I agree with Brandon. We only use it for enhancing authz and authn modules to use LDAP that C* currently doesn't provide. On Mon, May 14, 2012 at 11:08 PM, Brandon Williams wrote: > On Tue, May 15, 2012 at 12:53 AM, Ertio Lew wrote: > > @Brandon : I just created a jira issue to request this typ

Re: How do I add a custom comparator class to a cassandra cluster ?

2012-05-14 Thread Mohit Anchlia
That's right. Create class that implements the required interface and then drop that jar in lib directory and start the cluster. On Mon, May 14, 2012 at 11:41 AM, Kirk True wrote: > Disclaimer: I've never tried, but I'd imagine you can drop a JAR > containing the class(es) into the lib directory

Re: Updating CF to reversed type

2012-05-05 Thread Mohit Anchlia
I thought so. Is there a way I can unload and load data after dropping CF and re-creating it with reversed type? On Sat, May 5, 2012 at 7:11 AM, Edward Capriolo wrote: > You can not update comparators because they effect the on disk ordering. > > On Sat, May 5, 2012 at 2:11 AM, Mohi

Updating CF to reversed type

2012-05-04 Thread Mohit Anchlia
Is it possible to update CF definition to use "reversed" type? If it's possible then what happens to the old values, do they still remain ordered in ascending order?

Re: Question regarding major compaction.

2012-05-01 Thread Mohit Anchlia
+1 On Tue, May 1, 2012 at 12:06 PM, Edward Capriolo wrote: > Also there are some tickets in JIRA to impose a max sstable size and > some other related optimizations that I think got stuck behind levelDB > in coolness factor. Not every use case is good for leveled so adding > more tools and optimi

Re: cassandra gui

2012-03-30 Thread Mohit Anchlia
at columns that falls outside of it > > ** > > *Von:* Mohit Anchlia [mailto:mohitanch...@gmail.com] > *Gesendet:* Freitag, 30. März 2012 16:57 > > *An:* user@cassandra.apache.org > *Betreff:* Re: cassandra gui > > ** ** > > On Thu, Mar 29, 2012 at 10:08 PM, Mar

Re: cassandra gui

2012-03-30 Thread Mohit Anchlia
On Thu, Mar 29, 2012 at 10:08 PM, Markus Wiesenbacher | Codefreun.de < m...@codefreun.de> wrote: > Hi, > > yes you can insert data into cassandra with apollo, just try the demo > center: http://www.codefreun.de/apolloUI/ > > You can login by just press the login-button (autologin) and play around

Re: [BETA RELEASE] Apache Cassandra 1.1.0-beta2 released

2012-03-29 Thread Mohit Anchlia
.0.0 does not generate cross-dc forwarding message at all, so you're > safe on that side. > > Is cross-dc forwarding different than replication? > -- > Sylvain > > On Thu, Mar 29, 2012 at 9:33 PM, Mohit Anchlia > wrote: > > Any updates? > > > > > >

Re: [BETA RELEASE] Apache Cassandra 1.1.0-beta2 released

2012-03-29 Thread Mohit Anchlia
Any updates? On Thu, Mar 29, 2012 at 7:31 AM, Mohit Anchlia wrote: > This is from NEWS.txt. So my question is if we are on 1.0.0-2 release do > we still need to upgrade since this impacts releases between 1.0.3-1.0.5? > - > If you are running a multi datacenter setup, you shoul

Re: [BETA RELEASE] Apache Cassandra 1.1.0-beta2 released

2012-03-29 Thread Mohit Anchlia
or any > details on the upgrade path for these versions). > The incompatibility here is only between 1.1.0-beta1 and 1.1.0-beta2. > > -- > Sylvain > > On Thu, Mar 29, 2012 at 2:50 AM, Mohit Anchlia > wrote: > > We are currently using 1.0.0-2 version. Do we still need

Re: [BETA RELEASE] Apache Cassandra 1.1.0-beta2 released

2012-03-28 Thread Mohit Anchlia
We are currently using 1.0.0-2 version. Do we still need to migrate to the latest release of 1.0 before migrating to 1.1? Looks like incompatibility is only between 1.0.3-1.0.8. On Tue, Mar 27, 2012 at 6:42 AM, Benoit Perroud wrote: > Thanks for the quick feedback. > > I will drop the schema t

Re: Performance overhead when using start and end columns

2012-03-26 Thread Mohit Anchlia
ickle.com > > On 27/03/2012, at 6:21 AM, Mohit Anchlia wrote: > > Thanks but if I do have to specify start and end columns then how much > overhead roughly would that translate to since reading metadata should be > constant overall? > > On Mon, Mar 26, 2012 at 10:18 AM, aa

Re: Performance overhead when using start and end columns

2012-03-26 Thread Mohit Anchlia
/07/04/Cassandra-Query-Plans/ > > Tl;Dr; Select columns with no start, in the natural Comparator order. > > Cheers > > >- > Aaron Morton > Freelance Developer > @aaronmorton > http://www.thelastpickle.com > > On 25/03/2012, at 2:25 PM, Mohit Anchl

Performance overhead when using start and end columns

2012-03-24 Thread Mohit Anchlia
I have rows with around 2K-50K columns but when I do a query I only need to fetch few columns between start and end columns. I was wondering what performance overhead does it cause by using slice query with start and end columns? Looking at the code it looks like when you give start and end column

Re: Frequency of Flushing in 1.0

2012-02-26 Thread Mohit Anchlia
On Sun, Feb 26, 2012 at 12:18 PM, aaron morton wrote: > Nathan Milford has a post about taking a node down > > http://blog.milford.io/2011/11/rolling-upgrades-for-cassandra/ > > The only thing I would do differently would be turn off thrift first. > > Cheers > Isn't decommission meant to do the sa

Re: Please advise -- 750MB object possible?

2012-02-22 Thread Mohit Anchlia
; > On 2/22/2012 1:34 PM, Mohit Anchlia wrote: > > Outside on the file system and a pointer to it in C* > > On Wed, Feb 22, 2012 at 10:03 AM, Rafael Almeida wrote: > >> Keep them where? >> >> -- >> *From:* Mohit Anchlia >

Re: Please advise -- 750MB object possible?

2012-02-22 Thread Mohit Anchlia
Outside on the file system and a pointer to it in C* On Wed, Feb 22, 2012 at 10:03 AM, Rafael Almeida wrote: > Keep them where? > > -- > *From:* Mohit Anchlia > *To:* user@cassandra.apache.org > *Cc:* potek...@bnl.gov > *Sent:* Wednesday, Febr
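
The "blob on the file system, pointer in C*" idea can be sketched as below. This is only an illustration under stated assumptions: a plain dict stands in for the Cassandra column family, the blob is named by its SHA-256 digest, and `store_blob` / `load_blob` are hypothetical helper names.

```python
import hashlib
from pathlib import Path


def store_blob(data, blob_root, pointer_table, key):
    """Write `data` to the file system and record only its path under `key`."""
    digest = hashlib.sha256(data).hexdigest()
    # Fan out into subdirectories so no single directory grows too large.
    path = Path(blob_root) / digest[:2] / digest
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_bytes(data)
    # Stand-in for a Cassandra write: only the small pointer is stored.
    pointer_table[key] = str(path)
    return path


def load_blob(pointer_table, key):
    """Look up the pointer for `key` and read the blob back from disk."""
    return Path(pointer_table[key]).read_bytes()
```

In a real deployment the dict write would be a column insert through the client library, and the file system would typically be a shared or distributed one so that every application node can resolve the pointer.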

Re: Please advise -- 750MB object possible?

2012-02-22 Thread Mohit Anchlia
In my opinion if you are a busy site or application keep blobs out of the database. On Wed, Feb 22, 2012 at 9:37 AM, Dan Retzlaff wrote: > Chunking is a good idea, but you'll have to do it yourself. A few of the > columns in our application got quite large (maybe ~150MB) and the failure > mode was

Re: nodetool hangs and didn't print anything with firewall

2012-02-05 Thread Mohit Anchlia
Does it work with iptables disabled? You could add log to your firewall rules to see if firewall is dropping the packets. On Sun, Feb 5, 2012 at 5:35 PM, Roshan wrote: > Hi > > I have 2 node Cassandra cluster and each linux box configured with a > firewall. The ports 7000, 7199 and 9160 are open

Re: WARN [Memtable] live ratio

2012-02-03 Thread Mohit Anchlia
hen read from. > > On Fri, Feb 3, 2012 at 10:31 AM, Mohit Anchlia wrote: >> On Fri, Feb 3, 2012 at 7:32 AM, Jonathan Ellis wrote: >>> It's a warn because it's nonsense for the JVM to report that an column >>> + overhead, takes less space than just the col

Re: WARN [Memtable] live ratio

2012-02-03 Thread Mohit Anchlia
d on WARN and ERROR. But if there is nothing to do then it probably is just an INFO. > On Tue, Jan 31, 2012 at 9:41 PM, Mohit Anchlia wrote: >> I guess this is not really a WARN in that case. >> >> On Tue, Jan 31, 2012 at 4:29 PM, aaron morton >> wrote: >>> The r

Re: WARN [Memtable] live ratio

2012-01-31 Thread Mohit Anchlia
I guess this is not really a WARN in that case. On Tue, Jan 31, 2012 at 4:29 PM, aaron morton wrote: > The ratio is the ratio of serialised bytes for a memtable to actual JVM > allocated memory. Using a ratio below 1 would imply the JVM is using less > bytes to store the memtable in memory than i

Re: WARN [Memtable] live ratio

2012-01-30 Thread Mohit Anchlia
I have the same experience. Wondering what's causing this? One thing I noticed is that this happens when the server has been idle for some time and the load then starts going up; that is when I start to see these messages. On Mon, Jan 30, 2012 at 4:54 PM, Roshan wrote: > Hi All > > Time to time I am seen this below

Re: Cassandra to Oracle?

2012-01-20 Thread Mohit Anchlia
I think the problem stems when you have data in a column that you need to run adhoc query on which is not denormalized. In most cases it's difficult to predict the type of query that would be required. Another way of solving this could be to index the fields in search engine. On Fri, Jan 20, 2012

Re: Garbage collection freezes cassandra node

2012-01-19 Thread Mohit Anchlia
What version of Java do you use? Can you try reducing NewSize and increasing the old generation? If you are on an old version of Java I also recommend upgrading it. On Thu, Jan 19, 2012 at 3:27 AM, Rene Kochen wrote: > Thanks for your comments. The application is indeed suffering from a

Re: Max records per node for a given secondary index value

2012-01-18 Thread Mohit Anchlia
You need to shard your rows On Wed, Jan 18, 2012 at 5:46 PM, Kamal Bahadur wrote: > Anyone? > > > On Wed, Jan 18, 2012 at 9:53 AM, Kamal Bahadur > wrote: >> >> Hi All, >> >> It is great to know that Cassandra column family can accommodate 2 billion >> columns per row! I was reading about how Cas
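
One common way to shard a wide row is to append a bucket number to the row key, so that a single logical row is spread over several physical rows. The sketch below assumes a fixed shard count and uses an MD5-based bucket for determinism; the key format and function names are hypothetical.

```python
import hashlib


def sharded_row_key(base_key, column_name, shard_count=16):
    """Pick the physical row for a column by hashing its name into a bucket."""
    digest = hashlib.md5(column_name.encode("utf-8")).digest()
    shard = int.from_bytes(digest[:4], "big") % shard_count
    return f"{base_key}:{shard}"


def all_shard_keys(base_key, shard_count=16):
    """List every physical row key that must be read to see the whole logical row."""
    return [f"{base_key}:{shard}" for shard in range(shard_count)]
```

Writes go to `sharded_row_key(...)`, while a full read issues a multiget over `all_shard_keys(...)`. The shard count has to stay fixed once data is written, otherwise existing columns would hash to different rows.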

Re: Unbalanced cluster with RandomPartitioner

2012-01-17 Thread Mohit Anchlia
Have you tried running repair first on each node? Also, verify using df -h on the data dirs On Tue, Jan 17, 2012 at 7:34 AM, Marcel Steinbach wrote: > Hi, > > we're using RP and have each node assigned the same amount of the token > space. The cluster looks like that: > > Address         Status

Brisk with standard C* cluster

2012-01-16 Thread Mohit Anchlia
Is it possible to add Brisk only nodes to standard C* cluster? So if we have node A,B,C with standard C* then add Brisk node D,E,F for analytics?

Installing C* on EC2

2012-01-12 Thread Mohit Anchlia
What's the best way to install C*? Any good links? Is it better to just create instances and install rpms on it first, just like regular cluster and then create image from it? I am assuming it's possible. Are there any known issues when running C* on EC2? How do other C* users deal with instance fa

Re: How to reliably achieve unique constraints with Cassandra?

2012-01-06 Thread Mohit Anchlia
ets it looks like this has been tried > before, and for various reasons was not added. It's definitely > non-trivial to get right. > > On Fri, 6 Jan 2012 13:33:02 -0800 > Mohit Anchlia wrote: >> This looks like right way to do it. But remember this still doesn't >

Re: How to reliably achieve unique constraints with Cassandra?

2012-01-06 Thread Mohit Anchlia
andra >> a month or so back on this list. >> >> -Jeremiah >> >> On 01/06/2012 02:42 PM, Bryce Allen wrote: >> > On Fri, 6 Jan 2012 10:38:17 -0800 >> > Mohit Anchlia  wrote: >> >> It could be as simple as reading before writing to make sure tha

Re: How to reliably achieve unique constraints with Cassandra?

2012-01-06 Thread Mohit Anchlia
the "tracker" CF too, no? > > > On Jan 6, 2012, at 10:38 AM, Mohit Anchlia wrote: > >> On Fri, Jan 6, 2012 at 10:03 AM, Drew Kutcharian wrote: >>> Hi Everyone, >>> >>> What's the best way to reliably have unique constraints like function

Re: How to reliably achieve unique constraints with Cassandra?

2012-01-06 Thread Mohit Anchlia
On Fri, Jan 6, 2012 at 10:03 AM, Drew Kutcharian wrote: > Hi Everyone, > > What's the best way to reliably have unique constraints like functionality > with Cassandra? I have the following (which I think should be very common) > use case. > > User CF > Row Key: user email > Columns: userId: UUID

Re: Pending on ReadStage

2012-01-06 Thread Mohit Anchlia
Are all your nodes equally balanced in terms of read requests? Are you using RandomPartitioner? Are you reading using indexes? First thing you can do is compare iostat -x output between the 2 nodes to rule out any io issues assuming your read requests are equally balanced. On Fri, Jan 6, 2012 at

Re: cassandra data to hadoop.

2011-12-24 Thread Mohit Anchlia
You could read using Cassandra client and write to HDFS using Hadoop FS Api. On Fri, Dec 23, 2011 at 11:20 PM, ravikumar visweswara wrote: > Jeremy, > > We use cloudera distribution for our hadoop cluster and may not be possible > to migrate to brisk quickly because of flume/hue dependencies. Did

Re: Garbage collection freezes cassandra node

2011-12-19 Thread Mohit Anchlia
Increasing memory in this case may not solve the problem. Share some information about your workload, cluster configuration, cache sizes etc. You can also try getting a Java heap histogram to get more info on what's on the heap. On Mon, Dec 19, 2011 at 7:35 AM, Rene Kochen wrote: > I recently se

Re: One ColumnFamily places data on only 3 out of 4 nodes

2011-12-14 Thread Mohit Anchlia
> bart@node1:~$ nodetool -h localhost getendpoints A UserDetails 4545027 > 192.168.81.5 > 192.168.81.2 > 192.168.81.3 Can you see what happens if you stop C* say on node .5 and write and read at quorum? On Wed, Dec 14, 2011 at 7:06 AM, Bart Swedrowski wrote: > > > On 14 December 2011 14:58, wro

Re: What sort of load do the tombstones create on the cluster?

2011-11-21 Thread Mohit Anchlia
On Mon, Nov 21, 2011 at 11:47 AM, Edward Capriolo wrote: > > > On Mon, Nov 21, 2011 at 3:30 AM, Philippe wrote: >> >> I don't remember your exact situation but could it be your network >> connectivity? >> I know I've been upgrading mine because I'm maxing out fastethernet on a >> 12 node cluster.

Re: Efficiency of Cross Data Center Replication...?

2011-11-20 Thread Mohit Anchlia
On Sun, Nov 20, 2011 at 4:01 AM, Boris Yen wrote: > A quick question, what if DC2 is down, and after a while it comes back on. > how does the data get sync to DC2 in this case? (assume hint is disable) > Thanks in advance. Manually, use nodetool repair in rolling fashion on all the nodes of DC2

Re: ParNew and caching

2011-11-18 Thread Mohit Anchlia
f GC logs including ParNew and other major phases recorded in the logs. Are there any significant writes, memtable flushes etc occurring during this time? How many read/sec and writes/sec? What's the size of your row and columns that you are trying to retrieve? > > On 11/18/11 2:40 PM

Re: ParNew and caching

2011-11-18 Thread Mohit Anchlia
On Fri, Nov 18, 2011 at 1:46 PM, Todd Burruss wrote: > Ok, I figured something like that.  Switching to > ConcurrentLinkedHashCacheProvider I see it is a lot better, but still > instead of the 25-30ms response times I enjoyed with no caching, I'm > seeing 500ms at 100% hit rate on the cache.  No o

Re: ParNew and caching

2011-11-18 Thread Mohit Anchlia
On Fri, Nov 18, 2011 at 9:42 AM, Sylvain Lebresne wrote: > On Fri, Nov 18, 2011 at 6:31 PM, Mohit Anchlia wrote: >> On Fri, Nov 18, 2011 at 7:47 AM, Sylvain Lebresne >> wrote: >>> On Fri, Nov 18, 2011 at 4:23 PM, Mohit Anchlia >>> wrote: >>>> On F

Re: ParNew and caching

2011-11-18 Thread Mohit Anchlia
On Fri, Nov 18, 2011 at 7:47 AM, Sylvain Lebresne wrote: > On Fri, Nov 18, 2011 at 4:23 PM, Mohit Anchlia wrote: >> On Fri, Nov 18, 2011 at 6:39 AM, Sylvain Lebresne >> wrote: >>> On Fri, Nov 18, 2011 at 1:53 AM, Todd Burruss wrote: >>>> I'm using c

Re: Data Model Design for Login Service

2011-11-18 Thread Mohit Anchlia
Secondary indexes in Cassandra are not a good fit for high-cardinality values. On Fri, Nov 18, 2011 at 7:14 AM, Dan Hendry wrote: > I they are not limited to repeating values but the Datastax docs[1] on > secondary indexes certainly seem to indicate they would be a poor fit for > this case (high rea

Re: ParNew and caching

2011-11-18 Thread Mohit Anchlia
On Fri, Nov 18, 2011 at 6:39 AM, Sylvain Lebresne wrote: > On Fri, Nov 18, 2011 at 1:53 AM, Todd Burruss wrote: >> I'm using cassandra 1.0.  Been doing some testing on using cass's cache. >>  When I turn it on (using the CLI) I see ParNew jump from 3-4ms to >> 200-300ms.  This really screws with

Re: Second Cassandra users survey

2011-11-14 Thread Mohit Anchlia
On Mon, Nov 14, 2011 at 4:44 PM, Jake Luciani wrote: > Re  Simpler "elasticity": > Latest opscenter will now rebalance cluster optimally > http://www.datastax.com/dev/blog/whats-new-in-opscenter-1-3 > Does it cause any impact on reads and writes while re-balance is in progress? How is it handled

Re: Help with Cassandra Row Caches

2011-11-11 Thread Mohit Anchlia
Can you temporarily increase the heap size and try again? On Fri, Nov 11, 2011 at 5:21 PM, Oleg Tsvinev wrote: > Hi everybody, > > We set row cache too high, 1 or so and now all our 6 nodes fail > with OOM. I believe that high row cache causes OOMs. > > Now, we trying to change row cache sizes u

Re: security

2011-11-09 Thread Mohit Anchlia
We lock down SSH access as root from any network. We also provide individual logins, including for sysadmins, and they go through LDAP authentication. Anyone who runs sudo su to become root is logged, and an alert is sent via trapsend. We use firewalls and also have a separate VLAN for datastore servers. We then open only speci

Re: Second Cassandra users survey

2011-11-06 Thread Mohit Anchlia
Transparent on-disk encryption with a pluggable key provider would also be really helpful to secure sensitive information. On Sun, Nov 6, 2011 at 9:42 AM, Aaron Turner wrote: > The intent was to have a lighter solution for common problems than > having to go with Hadoop or streaming large quantities

Re: Second Cassandra users survey

2011-11-03 Thread Mohit Anchlia
On Thu, Nov 3, 2011 at 5:46 AM, Peter Tillotson wrote: > I'm using Cassandra as a big graph database, loading large volumes of data > live and linking on the fly. Not sure if Cassandra is the right fit to model complex vertices and edges. > The number of edges grow geometrically with data added, and

Re: Cassandra cluster HW spec (commit log directory vs data file directory)

2011-10-30 Thread Mohit Anchlia
On Sun, Oct 30, 2011 at 6:53 PM, Chris Goffinet wrote: > > > On Sun, Oct 30, 2011 at 3:34 PM, Sorin Julean > wrote: >> >> Hey Chris, >> >>  Thanks for sharing all  the info. >>  I have few questions: >>  1. What are you doing with so much memory :) ? How much of it do you >> allocate for heap ? >

Re: Programmatically allow only one out of two types of rows in a CF to enter the CACHE

2011-10-29 Thread Mohit Anchlia
On Sat, Oct 29, 2011 at 11:23 AM, Aditya Narayan wrote: > @Mohit: > I have stated the example scenarios in my first post under this heading. > Also I have stated above why I want to split that data in two rows & like > Ikeda below stated, I'm too trying out to prevent the frequently accessed > row
