Re: What causes NoHostAvailableException, WriteTimeoutException, and UnavailableException?

2014-11-24 Thread Robert Coli
On Mon, Nov 24, 2014 at 3:01 PM, Parag Shah wrote: > In our case, the timeouts were happening because internode > authentication was turned on and by default the user column family in the > system_auth keyspace is replicated only on 1 node. We also had to tune the > permissions_validity_in_ms fr

Re: What causes NoHostAvailableException, WriteTimeoutException, and UnavailableException?

2014-11-24 Thread Parag Shah
Date: Monday, November 24, 2014 at 2:52 PM To: user@cassandra.apache.org Subject: Re: What causes NoHostAvailableException

Re: What causes NoHostAvailableException, WriteTimeoutException, and UnavailableException?

2014-11-24 Thread Robert Coli
On Mon, Nov 24, 2014 at 12:57 PM, Kevin Burton wrote: > I’m trying to track down some exceptions in our production cluster. I > bumped up our write load and now I’m getting a non-trivial number of these > exceptions. Somewhere on the order of 100 per hour. > > All machines have a somewhat high

Re: What causes NoHostAvailableException, WriteTimeoutException, and UnavailableException?

2014-11-24 Thread Shane Hansen
Not sure if this is what you're looking for, but api docs can be useful (I won't copy/paste the docs themselves) http://www.datastax.com/drivers/java/2.0/com/datastax/driver/core/exceptions/NoHostAvailableException.html http://www.datastax.com/drivers/java/2.0/com/datastax/driver/core/exceptions/

Re: What causes NoHostAvailableException, WriteTimeoutException, and UnavailableException?

2014-11-24 Thread Bulat Shakirzyanov
Check out Ruby Driver documentation on these topics: Error Handling Retry Policies While the documentation is for the Ruby Driver, the concepts were borrowed from and

What causes NoHostAvailableException, WriteTimeoutException, and UnavailableException?

2014-11-24 Thread Kevin Burton
I’m trying to track down some exceptions in our production cluster. I bumped up our write load and now I’m getting a non-trivial number of these exceptions. Somewhere on the order of 100 per hour. All machines have a somewhat high CPU load because they’re doing other tasks. I’m worried that per

Re: UnavailableException

2014-07-14 Thread Ruchir Jha
Yes the line is : Datacenter: datacenter1 which matches with my create keyspace command. As for the NodeDiscoveryType, we will follow it but I don't believe it to be the root of my issue here because the nodes start up atleast 6 hours before the UnavailableException and as far as adding nod

Re: UnavailableException

2014-07-14 Thread Chris Lohfink
that your DC names match up (case sensitive) > > Chris > > On Jul 11, 2014, at 9:38 AM, Ruchir Jha wrote: > >> Here's the complete stack trace: >> >> com.netflix.astyanax.connectionpool.exceptions.TokenRangeOfflineException: >> Token

Re: UnavailableException

2014-07-14 Thread Chris Lohfink
> On Jul 11, 2014, at 9:38 AM, Ruchir Jha wrote: > >> Here's the complete stack trace: >> >> com.netflix.astyanax.connectionpool.exceptions.TokenRangeOfflineException: >> TokenRangeOfflineException: [host=ny4lpcas5.fusion

Re: UnavailableException

2014-07-14 Thread Ruchir Jha
>> double check that your DC names match up (case sensitive) >>> >>> Chris >>> >>> On Jul 11, 2014, at 9:38 AM, Ruchir Jha wrote: >>> >>> Here's the complete stack trace: >>

Re: UnavailableException

2014-07-11 Thread Mark Reddy
mplete stack trace: >> >> com.netflix.astyanax.connectionpool.exceptions.TokenRangeOfflineException: >> TokenRangeOfflineException: >> [host=ny4lpcas5.fusionts.corp(10.10.20.47):9160, latency=22784(42874), >> attempts=3]UnavailableException() >> at >> com.netf

Re: UnavailableException

2014-07-11 Thread Ruchir Jha
2014, at 9:38 AM, Ruchir Jha wrote: > > Here's the complete stack trace: > > com.netflix.astyanax.connectionpool.exceptions.TokenRangeOfflineException: > TokenRangeOfflineException: > [host=ny4lpcas5.fusionts.corp(10.10.20.47):9160, latency=22784(42874), > attempts=3]Unavaila

Re: UnavailableException

2014-07-11 Thread Chris Lohfink
ck trace: > > com.netflix.astyanax.connectionpool.exceptions.TokenRangeOfflineException: > TokenRangeOfflineException: [host=ny4lpcas5.fusionts.corp(10.10.20.47):9160, > latency=22784(42874), attempts=3]UnavailableException() > at > com.netflix.astyanax.thrift.ThriftConverter.ToConnectionPoolException(ThriftConver

Re: UnavailableException

2014-07-11 Thread Ruchir Jha
Here's the complete stack trace: com.netflix.astyanax.connectionpool.exceptions.TokenRangeOfflineException: TokenRangeOfflineException: [host=ny4lpcas5.fusionts.corp(10.10.20.47):9160, latency=22784(42874), attempts=3]UnavailableException()

Re: UnavailableException

2014-07-11 Thread Prem Yadav
Please post the full exception. On Fri, Jul 11, 2014 at 1:50 PM, Ruchir Jha wrote: > We have a 12 node cluster and we are consistently seeing this exception > being thrown during peak write traffic. We have a replication factor of 3 > and a write consistency level of QUORUM. Also note there is

UnavailableException

2014-07-11 Thread Ruchir Jha
We have a 12 node cluster and we are consistently seeing this exception being thrown during peak write traffic. We have a replication factor of 3 and a write consistency level of QUORUM. Also note there is no unusual Or Full GC activity during this time. Appreciate any help. Sent from my iPhon

Cassandra Caused by: UnavailableException()

2014-03-17 Thread Alaa Zubaidi (PDF)
"socket closed" errors, and many delays, in the Reads. While debugging the issue we found the following error in our Cassandra client: Caused by: UnavailableException() Cassandra log file does not show errors. What does

Re: WriteTimeoutException instead of UnavailableException

2013-12-24 Thread Demian Berjman
Thanks Aaron!! On Mon, Dec 23, 2013 at 5:28 PM, Aaron Morton wrote: > But in some cases, from one certain node, I get an WriteTimeoutException > for a few minutes until an UnavailableException. It's like the coordinator > don't know the status of the cluster. Any clue wh

Re: WriteTimeoutException instead of UnavailableException

2013-12-23 Thread Aaron Morton
> But in some cases, from one certain node, I get an WriteTimeoutException for > a few minutes until an UnavailableException. It's like the coordinator don't > know the status of the cluster. Any clue why is this happening? Depending on how the node goes down there can be a d

WriteTimeoutException instead of UnavailableException

2013-12-17 Thread Demian Berjman
Question. I have a 5 node cluster (local with ccm). A keyspace with rf: 3. Three nodes are down. I run "nodetool ring" in the two living nodes and both see the other three nodes down. Then i do an insert with cs quorum and get an UnavailableException. It's ok. I am using Datasta

Re: UnavailableException() for keyspace

2013-02-20 Thread Abhijit Chanda
1b Up Normal 65.71 KB > 0.00% 113427455640312821154458202477256070485 * > > However, in any of the nodes, if I connect to cassandra-cli and try to > list a CF, I get unavailableException: > > *[default@unknown] use dmp_input; > * > *[d

Re: Understanding UnavailableException

2012-08-18 Thread Nick Bailey
> Last time I checked, this was not true for batch writes. The row > mutations were started sequentially (i.e., for each mutation check > availability, then kick off an asynchronous write), so it was possible > for the first to succeed, and the second to fail with an > UnavailableExce

Re: Understanding UnavailableException

2012-08-17 Thread Russell Haering
On Fri, Aug 17, 2012 at 8:00 AM, Nick Bailey wrote: > This is actually incorrect. If you get an UnavailableException, the > write was rejected by the coordinator and was not written anywhere. Last time I checked, this was not true for batch writes. The row mutations were started sequential
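The batch behaviour described in this message (check availability for one mutation, start its write, then move to the next) can be sketched as below. This is only an illustration of the description, not Cassandra code: `SequentialBatchSketch`, `Mutation`, `applyBatch`, and the local `UnavailableException` are hypothetical names, and whether any given Cassandra version still behaves this way is not confirmed here.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch of the sequential batch behaviour described in the message.
final class SequentialBatchSketch {
    // Local stand-in for the real exception; unchecked to keep the sketch short.
    static final class UnavailableException extends RuntimeException {
        UnavailableException(String message) { super(message); }
    }

    // A mutation reduced to a row key and the number of its replicas believed to be up.
    static final class Mutation {
        final String rowKey;
        final int liveReplicas;

        Mutation(String rowKey, int liveReplicas) {
            this.rowKey = rowKey;
            this.liveReplicas = liveReplicas;
        }
    }

    // Handles mutations in order: availability check first, then "start" the write.
    // Row keys whose writes were started before a rejection remain in `started`.
    static void applyBatch(List<Mutation> batch, int blockFor, List<String> started) {
        for (Mutation m : batch) {
            if (m.liveReplicas < blockFor) {
                throw new UnavailableException("row " + m.rowKey + ": only "
                        + m.liveReplicas + " live replicas, " + blockFor + " required");
            }
            started.add(m.rowKey); // stands in for kicking off the asynchronous write
        }
    }

    public static void main(String[] args) {
        List<String> started = new ArrayList<>();
        try {
            applyBatch(Arrays.asList(new Mutation("row-a", 2), new Mutation("row-b", 1)), 2, started);
        } catch (UnavailableException e) {
            System.out.println("rejected: " + e.getMessage());
        }
        System.out.println("writes already started: " + started);
    }
}
```

In this sketch the second mutation is rejected after the first write has already been started, which is the partial-application case the message warns about.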

Re: Understanding UnavailableException

2012-08-17 Thread Mohit Agarwal
r your question: > > >> UnavailableException is bit tricky. It means, that not all replicas > >> required by CL received update. Actually you do not know, whenever > update > >> was stored or not, and actually what went wrong. > >> > > This is actually incorrec

Re: Understanding UnavailableException

2012-08-17 Thread Nick Bailey
This blog post should help: http://www.datastax.com/dev/blog/how-cassandra-deals-with-replica-failure But to answer your question: >> UnavailableException is bit tricky. It means, that not all replicas >> required by CL received update. Actually you do not know, whenever update >

Re: Understanding UnavailableException

2012-08-17 Thread Mohit Agarwal
Does this mean that the coordinator sends requests to all nodes, even when it knows that sufficient number of nodes are not available, via gossip? On Fri, Aug 17, 2012 at 4:49 PM, Maciej Miklas wrote: > UnavailableException is bit tricky. It means, that not all replicas > required

Re: Understanding UnavailableException

2012-08-17 Thread Maciej Miklas
UnavailableException is a bit tricky. It means that not all replicas required by the CL received the update. You do not actually know whether the update was stored or not, or what went wrong. This is why writing with CL.ALL can be problematic. It is enough that only one replica is

Understanding UnavailableException

2012-08-17 Thread Mohit Agarwal
Hi guys, I am trying to understand what happens when an UnavailableException is thrown. a) Suppose we are doing a ConsistencyLevel.ALL write on a 3 node cluster. My understanding is that if one of the nodes is down and the coordinator node is aware of that(through gossip), then it will respond

Re: UnavailableException with 1 node down and RF=2?

2011-10-28 Thread Peter Schuller
>  Thank you for your explanations. Even with a RF=1 and one node down I don't > understand why I can't at least read the data in the nodes that are still > up? You will be able to read data for row keys that do not live on the node that is down. But for any request to a row which is on the node t

Re: UnavailableException with 1 node down and RF=2?

2011-10-28 Thread Alexandru Dan Sicoe
Hi Peter, Thank you for your explanations. Even with a RF=1 and one node down I don't understand why I can't at least read the data in the nodes that are still up? Also, why can't I at least perform writes with consistency level ANY and failover policy ON_FAIL_TRY_ALL_AVAILABLE...shouldn't the nod

Re: UnavailableException with 1 node down and RF=2?

2011-10-28 Thread Peter Schuller
> If you want to survive node failures, use an RF above 1. And then make > sure to use an appropriate consistency level. To elaborate a bit: RF, or replication factor, is the *total* number of copies of any piece of data in the cluster. So with only one copy, the data will not be available when a

Re: UnavailableException with 1 node down and RF=2?

2011-10-28 Thread Peter Schuller
> took a node down to see how it behaves. All of a sudden I couldn't write or [snip] > me.prettyprint.hector.api.exceptions.HUnavailableException: : May not be [snip] >     Default replication factor = 1 So you have an RF=1 cluster (only one copy of data) and you bring a node down. This fundamenta

Re: UnavailableException with 1 node down and RF=2?

2011-10-28 Thread Alexandru Dan Sicoe
java:219) at ch.cern.pbeast.CassandraDBClient.executeBatchInsert(CassandraDBClient.java:958) at ch.cern.test.TimeBinTester.main(TimeBinTester.java:294) Caused by: UnavailableException() at org.apache.cassandra.thrift.Cassandra$batch_mutate_result.read(Cassandra.ja

Re: UnavailableException with 1 node down and RF=2?

2011-10-27 Thread R. Verlangen
t's inappropriate to reply here, please > let > >> me know../ > >> > http://cassandra-user-incubator-apache-org.3065146.n2.nabble.com/2-node-cluster-1-node-down-overall-failure-td6936722.html > >> > http://cassandra-user-incubator-apache-org.3065146.n2.nabble.com/

Re: UnavailableException with 1 node down and RF=2?

2011-10-27 Thread Jonathan Ellis
ubator-apache-org.3065146.n2.nabble.com/2-node-cluster-1-node-down-overall-failure-td6936722.html >> >> -- >> View this message in context: >> http://cassandra-user-incubator-apache-org.3065146.n2.nabble.com/UnavailableException-with-1-node-down-and-RF-2-tp5242055p693676

Re: UnavailableException with 1 node down and RF=2?

2011-10-27 Thread Javier Canillas
The problem might be that you are setting the Consistency Level to a value greater than 1. In such cases, Cassandra will respond with an UnavailableException, since it can't achieve the level of consistency you are asking for. Remember that, when you have RF=2, CS values as AL

Re: UnavailableException with 1 node down and RF=2?

2011-10-27 Thread Jonathan Ellis
node-down-overall-failure-td6936722.html > http://cassandra-user-incubator-apache-org.3065146.n2.nabble.com/2-node-cluster-1-node-down-overall-failure-td6936722.html > > -- > View this message in context: > http://cassandra-user-incubator-apache-org.3065146.n2.nabble.com/Unavailable

Re: UnavailableException with 1 node down and RF=2?

2011-10-27 Thread RobinUs2
r-apache-org.3065146.n2.nabble.com/UnavailableException-with-1-node-down-and-RF-2-tp5242055p6936767.html Sent from the cassandra-u...@incubator.apache.org mailing list archive at Nabble.com.

Re: TimedOutException and UnavailableException from multiGetSliceQuery

2011-10-05 Thread Yuhan Zhang
ronmorton > http://www.thelastpickle.com > > On 6/10/2011, at 9:14 AM, Yuhan Zhang wrote: > > Hi all, > > I have been experiencing the unavailableException and TimedOutException on > a 3-node cassandra cluster > during a multiGetSliceQuery with 1000 columns. Since there

Re: TimedOutException and UnavailableException from multiGetSliceQuery

2011-10-05 Thread aaron morton
Cheers - Aaron Morton Freelance Cassandra Developer @aaronmorton http://www.thelastpickle.com On 6/10/2011, at 9:14 AM, Yuhan Zhang wrote: > Hi all, > > I have been experiencing the unavailableException and TimedOutException on a > 3-node cassandra cluste

TimedOutException and UnavailableException from multiGetSliceQuery

2011-10-05 Thread Yuhan Zhang
Hi all, I have been experiencing the unavailableException and TimedOutException on a 3-node cassandra cluster during a multiGetSliceQuery with 1000 columns. Since there are many keys involved in the query, I divided them into groups of 5000 rows and process each group individually in a for loop

Re: UnavailableException while storing with EACH_QUORUM and RF=3

2011-09-06 Thread Anthony Ikeda
Thanks Jonathan. On Tue, Sep 6, 2011 at 3:53 PM, Jonathan Ellis wrote: > It's linked from the vote thread: > > http://mail-archives.apache.org/mod_mbox/cassandra-dev/201109.mbox/%3ccakkz8q12k2o7zm5uy9hxnk7kyesqidwcyxbq_uzfna+yaty...@mail.gmail.com%3E > > On Tue, Sep 6, 2011 at 5:41 PM, Anthony I

Re: UnavailableException while storing with EACH_QUORUM and RF=3

2011-09-06 Thread Jonathan Ellis
It's linked from the vote thread: http://mail-archives.apache.org/mod_mbox/cassandra-dev/201109.mbox/%3ccakkz8q12k2o7zm5uy9hxnk7kyesqidwcyxbq_uzfna+yaty...@mail.gmail.com%3E On Tue, Sep 6, 2011 at 5:41 PM, Anthony Ikeda wrote: > Do you have a link to the downloadable? > Anthony > > On Tue, Sep 6,

Re: UnavailableException while storing with EACH_QUORUM and RF=3

2011-09-06 Thread Anthony Ikeda
Do you have a link to the downloadable? Anthony On Tue, Sep 6, 2011 at 3:38 PM, Anthony Ikeda wrote: > Thanks Jonathan, I'll consult with the team. > > Anthony > > > On Tue, Sep 6, 2011 at 3:34 PM, Jonathan Ellis wrote: > >> 0.8.5 is being voted on now on the dev list. I'd encourage you to te

Re: UnavailableException while storing with EACH_QUORUM and RF=3

2011-09-06 Thread Anthony Ikeda
Thanks Jonathan, I'll consult with the team. Anthony On Tue, Sep 6, 2011 at 3:34 PM, Jonathan Ellis wrote: > 0.8.5 is being voted on now on the dev list. I'd encourage you to test it. > > I do not recommend running trunk. > > On Tue, Sep 6, 2011 at 5:32 PM, Anthony Ikeda > wrote: > > Jonathan

Re: UnavailableException while storing with EACH_QUORUM and RF=3

2011-09-06 Thread Jonathan Ellis
0.8.5 is being voted on now on the dev list. I'd encourage you to test it. I do not recommend running trunk. On Tue, Sep 6, 2011 at 5:32 PM, Anthony Ikeda wrote: > Jonathan, do you know when 0.8.5 will be released? We are looking at a > production deployment soon and this fix is something that

Re: UnavailableException while storing with EACH_QUORUM and RF=3

2011-09-06 Thread Anthony Ikeda
Jonathan, do you know when 0.8.5 will be released? We are looking at a production deployment soon and this fix is something that we would need. Alternatively, what is the stability of the trunk for a production deployment. Anthony On Mon, Sep 5, 2011 at 3:35 PM, Evgeniy Ryabitskiy < evgeniy.ryab

Re: UnavailableException while storing with EACH_QUORUM and RF=3

2011-09-05 Thread Evgeniy Ryabitskiy
great thanks! Evgeny.

Re: UnavailableException while storing with EACH_QUORUM and RF=3

2011-09-05 Thread Jonathan Ellis
https://issues.apache.org/jira/browse/CASSANDRA-3082 On Mon, Sep 5, 2011 at 10:04 AM, Evgeniy Ryabitskiy wrote: > Hi, > > I'am trying to store record with EACH_QUORUM consistency and RF=3. While > same thing with RF=2 is working. > Could some one tell me why EACH_QUORUM is working with RF=2 but n

Re: UnavailableException while storing with EACH_QUORUM and RF=3

2011-09-05 Thread Evgeniy Ryabitskiy
One more thing, Cassandra version is 0.8.4. And if I try same thing from Pelops(thrift), I get UnavailableException.

UnavailableException while storing with EACH_QUORUM and RF=3

2011-09-05 Thread Evgeniy Ryabitskiy
Hi, I'm trying to store a record with EACH_QUORUM consistency and RF=3, while the same thing with RF=2 works. Could someone tell me why EACH_QUORUM works with RF=2 but not with RF >= 3? I have a 7-node cluster. All nodes are UP. Here is a simple CLI script: create keyspace kspace3 with placeme

UnavailableException on first time setup

2011-08-01 Thread Mike Stults
I have just started with Cassandra, so maybe this is a simple configuration problem? From a Java program, with consistency level set to ANY: Exception in thread "main" UnavailableException() at org.apache.cassandra.thrift.Cassandra$batch_mutate_result.read(Cassandra.java:19053)

Re: CL.ONE gives UnavailableException on ok node

2011-04-15 Thread Jonathan Ellis
Yes, if you want to keep writes available w/ RF=1 then you need to use CL.ANY. On Fri, Apr 15, 2011 at 3:48 PM, Mick Semb Wever wrote: > On Fri, 2011-04-15 at 15:43 -0500, Jonathan Ellis wrote: >> Sure sounds like you have RF=1 to me. > > Yes that's right. > > I see... so the answer here is that

Re: CL.ONE gives UnavailableException on ok node

2011-04-15 Thread Mick Semb Wever
On Fri, 2011-04-15 at 15:43 -0500, Jonathan Ellis wrote: > Sure sounds like you have RF=1 to me. Yes that's right. I see... so the answer here is that I should be using CL.ANY? (so the write goes through and hinted handoff can get it to the correct node later on). ~mck -- "The fox condemns t

Re: CL.ONE gives UnavailableException on ok node

2011-04-15 Thread Jonathan Ellis
r this the other two nodes kept throwing UnavailableExceptions like > > UnavailableException() >        at > org.apache.cassandra.service.WriteResponseHandler.assureSufficientLiveNodes(WriteResponseHandler.java:127) >        at > org.apache.cassandra.service.StorageProxy.mut

CL.ONE gives UnavailableException on ok node

2011-04-15 Thread Mick Semb Wever
Just experienced something i don't understand yet. Running a 3 node cluster successfully for a few days now, then one of the nodes went down (server required reboot). After this the other two nodes kept throwing UnavailableExceptions like UnavailableException()

Re: dropped mutations, UnavailableException, and long GC

2011-02-24 Thread Narendra Sharma
ncreasing the RPC timeout could help this, but I’m not convinced this is > the root of the problem. Note that in this case writes return with the > UnavailableException. > > - As mentioned, long GCs. We see the ParNew GC doing a lot of > smaller collections (few hundred MB)

dropped mutations, UnavailableException, and long GC

2011-02-24 Thread Jeffrey Wang
odes often see each other as dead even though none of the nodes actually go down. I suspect this may be due to long GCs. It seems like increasing the RPC timeout could help this, but I'm not convinced this is the root of the problem. Note that in this case writes return with the Unava

Re: UnavailableException when data grows

2010-10-02 Thread Peter Schuller
> And I'm still getting UnavailableException and TimedOutException when there > Cassandra daemon is doing either Compaction or Garbage collection... Have you specifically correlated this? If so, which one, or both? GC should not cause unavailable exceptions on a healthy cluster w

Re: UnavailableException when data grows

2010-10-01 Thread Rana Aich
And I'm still getting UnavailableException and TimedOutException when the Cassandra daemon is doing either compaction or garbage collection... On Thu, Sep 30, 2010 at 2:42 PM, Rana Aich wrote: > I ran the nodetool cleanup...but the scenario doesn't change... > > > On Thu

Re: UnavailableException when data grows

2010-09-30 Thread Rana Aich
I ran the nodetool cleanup...but the scenario doesn't change... On Thu, Sep 30, 2010 at 1:14 PM, Edward Capriolo wrote: > After nodetool move you have to run nodetool cleanup. > > On Thu, Sep 30, 2010 at 3:45 PM, Rana Aich wrote: > > I have arranged my initial tokens and get this result: > > Add

Re: UnavailableException when data grows

2010-09-30 Thread Edward Capriolo
After nodetool move you have to run nodetool cleanup. On Thu, Sep 30, 2010 at 3:45 PM, Rana Aich wrote: > I have arranged my initial tokens and get this result: > Address       Status     Load          Range >      Ring > > 17014118346046923173168730371588000 > 192.168.202.1 Up         208.39

Re: UnavailableException when data grows

2010-09-30 Thread Rana Aich
I have arranged my initial tokens and get this result: Address Status Load Range Ring 17014118346046923173168730371588000 192.168.202.1 Up 208.39 GB 3402823669209384634633746074317700 |<--| 192.168.202.2 Up 333.52 GB 6805647338418769269267492148

Re: UnavailableException when data grows

2010-09-27 Thread Oleg Anastasyev
Rana Aich writes: > > Yet my nodetool shows the following: > > 192.168.202.202 Down 319.94 GB 7200044730783885730400843868815072654 |<--| > 192.168.202.4 Up 382.39 GB 23719654286404067863958492664769598669 | ^ > 192.168.202.2 Up 106.81 GB 3

Re: UnavailableException when data grows

2010-09-27 Thread Benjamin Black
>> In terms of surviving the problem, a re-try on the client side might >> help assuming the problem is temporary. >> >> However,  certainly the fact that you're seeing an issue to begin with >> is interesting, and the way to avoid it would depend on what the >> pro
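The client-side retry suggested in the quoted text could look roughly like the sketch below. It is a generic bounded retry with a growing pause, not any driver's retry policy: `RetrySketch`, `withRetries`, and the parameter names are hypothetical. It also retries on every exception, which is an assumption made for brevity; a real client would likely retry only errors it considers transient and only operations that are safe to repeat.

```java
import java.util.concurrent.Callable;

// Hypothetical helper: bounded client-side retry, assuming the failure is temporary.
final class RetrySketch {
    private RetrySketch() {}

    // Runs `operation` up to `maxAttempts` times, sleeping backoffMillis * attempt
    // between failed attempts, and rethrows the last failure if all attempts fail.
    static <T> T withRetries(Callable<T> operation, int maxAttempts, long backoffMillis)
            throws Exception {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be >= 1");
        }
        Exception last = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                return operation.call();
            } catch (Exception e) {
                last = e;
                if (attempt < maxAttempts) {
                    Thread.sleep(backoffMillis * attempt); // simple linear backoff
                }
            }
        }
        throw last;
    }
}
```

A caller would wrap the write in a `Callable`, for example `RetrySketch.withRetries(() -> doInsert(row), 3, 200L)`, where `doInsert` is whatever client call is being protected (also hypothetical here). As the quoted reply notes, a retry only masks the symptom; the underlying cause still needs to be found.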

Re: UnavailableException when data grows

2010-09-27 Thread Rana Aich
in with > is interesting, and the way to avoid it would depend on what the > problem is. My understanding is that the UnavailableException > indicates that the node you are talking to was unable to read > from/write to a sufficient number of nodes to satisfy your consistency > level.

Re: UnavailableException when data grows

2010-09-27 Thread Peter Schuller
problem is. My understanding is that the UnavailableException indicates that the node you are talking to was unable to read from/write to a sufficient number of nodes to satisfy your consistency level. Presumably either because individual requests failed to return in time, or because the node c

UnavailableException when data grows

2010-09-27 Thread Rana Aich
t a one stretch before unavailable exception stops my client program. I'm writing with ConsistencyLevel.ONE. Previously I've inserted around 10 billion data with OrderPreservingPartition and sometimes got TimedOutException. But now with RandomPartition I'm getting UnavailableExcepti

Re: RE: UnavailableException with 3 nodes and RF=2

2010-09-14 Thread Aaron Morton
[mailto:martin.grabmuel...@eleven.de] Sent: 14 September 2010 09:54 To: user@cassandra.apache.org Subject: RE: UnavailableException with 3 nodes and RF=2 When you write with QUORUM, RF/2+1 of the nodes cassandra *wants to write* to have to be up. In your case, RF/2+1 = 2, that means, the two nodes responsible for
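The RF/2+1 arithmetic in this reply can be checked with a few lines of Java. The sketch below uses integer division, so RF=2 gives a quorum of 2 (no replica of a given key may be down) and RF=3 also gives 2 (one replica may be down). `QuorumMath`, `quorum`, and `tolerableFailures` are illustrative names, not Cassandra APIs.

```java
// Hypothetical helper illustrating the quorum arithmetic from the thread.
final class QuorumMath {
    private QuorumMath() {}

    // Replicas that must be available for QUORUM: floor(RF / 2) + 1.
    static int quorum(int replicationFactor) {
        if (replicationFactor < 1) {
            throw new IllegalArgumentException("replication factor must be >= 1");
        }
        return replicationFactor / 2 + 1;
    }

    // Replicas of a single key that may be down while QUORUM can still be met.
    static int tolerableFailures(int replicationFactor) {
        return replicationFactor - quorum(replicationFactor);
    }

    public static void main(String[] args) {
        for (int rf = 1; rf <= 5; rf++) {
            System.out.printf("RF=%d quorum=%d tolerable failures=%d%n",
                    rf, quorum(rf), tolerableFailures(rf));
        }
    }
}
```

Note that the count applies to the replicas responsible for a particular key, not to the total number of nodes in the cluster, which is why a 3-node cluster with RF=2 can still reject QUORUM writes for keys whose replica is the node that is down.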

RE: UnavailableException with 3 nodes and RF=2

2010-09-14 Thread Chris Jansen
. Thanks again. Chris From: Dr. Martin Grabmüller [mailto:martin.grabmuel...@eleven.de] Sent: 14 September 2010 09:54 To: user@cassandra.apache.org Subject: RE: UnavailableException with 3 nodes and RF=2 When you write with QUORUM, RF/2+1 of the nodes cassandra *wants to write* to have to be

Re: UnavailableException with 3 nodes and RF=2

2010-09-14 Thread Sylvain Lebresne
unaware > replication strategy. When I write with CL=QUORUM with all 3 nodes commit > the data fine, but when I write with the same CL with one of the nodes down > I see an UnavailableException thrown. Surely if one of the nodes in the > cluster is down another should acknowledge the wri

RE: UnavailableException with 3 nodes and RF=2

2010-09-14 Thread Dr . Martin Grabmüller
@cassandra.apache.org Subject: UnavailableException with 3 nodes and RF=2 Hi All, I’m a newbie to Cassandra so I could have a configuration issue here, I am using the latest stable release 0.6.0. I have created a cluster of 3

UnavailableException with 3 nodes and RF=2

2010-09-14 Thread Chris Jansen
fine, but when I write with the same CL with one of the nodes down I see an UnavailableException thrown. Surely if one of the nodes in the cluster is down another should acknowledge the writes and maintain the quorum, or is there something that I have misunderstood? From what I understand, in this case

Re: Two questions : Server crash during compaction and UnavailableException

2010-08-03 Thread Ilun Ahn
You're right... I missed posting crash log. I was too busy and under press of business at that time. Please understand. These are head and tail of the JVM crash log when it stopped : --- T H R E A D --- Current thread (0x002ca9903400): JavaThread "COMPACTION-POOL:1

Re: Two questions : Server crash during compaction and UnavailableException

2010-08-03 Thread Jonathan Ellis
If you have a crash log you should post at least the header rather than playing 20 questions with us. But if it's not OOM then it's likely to be a bug in the JVM, so upgrading is probably your best option. On Tue, Aug 3, 2010 at 3:49 AM, Ilun Ahn wrote: > No, I don't think the direct cause is ou

Re: Two questions : Server crash during compaction and UnavailableException

2010-08-03 Thread Ilun Ahn
No, I don't think the direct cause is out of heap space. It didn't leave any heap dump file with the option -XX:+HeapDumpOnOutOfMemoryError. My system.log for the last minute is as follows (many GCs occur): INFO [HINTED-HANDOFF-POOL:1] 2010-08-02 20:33:50,254 HintedHandOffManager.java (line 153) St

Re: Two questions : Server crash during compaction and UnavailableException

2010-08-03 Thread Ilun Ahn
2010/8/2 Peter Schuller > > First, Cassandra suddenly dies during compaction. Java core dump says > that > > the last thread run was "COMPACTION-POOL:1". > > I suspect that my business logic could lead size of columns in a column > > family per a row to be greater than two gigabytes. (but i coul

Re: Two questions : Server crash during compaction and UnavailableException

2010-08-02 Thread Peter Schuller
> First, Cassandra suddenly dies during compaction. Java core dump says that > the last thread run was  "COMPACTION-POOL:1". > I suspect that my business logic could lead size of columns in a column > family per a row to be greater than two gigabytes. (but i couldn't confirm > it yet) Are you runn

Two questions : Server crash during compaction and UnavailableException

2010-08-01 Thread il-woon Ahn
mn family per a row to be greater than two gigabytes (but I couldn't confirm it yet). Can this be a cause of the server going down, and is there any solution? (Should I wait for 0.7?) Second, it seems that my client program often gets UnavailableException from Cassandra when Cassandra is running normally. I

Re: UnavailableException on QUORUM write

2010-07-27 Thread Per Olesen
On Jul 27, 2010, at 12:23 AM, Jonathan Ellis wrote: > Can you turn on debug logging and try this patch? Yes, but..I am on vacation now, so it will be about 3 weeks from now.

Re: UnavailableException on QUORUM write

2010-07-26 Thread Jonathan Ellis
r, writeEndpoints, hintedEndpoints, consistency_level);
@@ -296,6 +297,7 @@
 }
 if (liveNodes < blockFor)
 {
+logger.debug("only " + liveNodes + " seen out of " + blockFor + " required; throwing UE");
 throw new UnavailableException();
 }
 }
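The patch above adds a debug line to a check of the form "fewer live nodes than required, so throw". A minimal standalone sketch of that kind of check follows, assuming the coordinator already knows which replicas for the key it believes are alive. Everything here is hypothetical apart from the `liveNodes < blockFor` comparison and the method name `assureSufficientLiveNodes`, which appears in a stack trace elsewhere in this listing; the signature, the enum, and the unchecked exception are simplifications for illustration.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch of an up-front availability check; not Cassandra's classes.
final class AvailabilityCheck {
    enum ConsistencyLevel { ONE, QUORUM, ALL }

    // Local stand-in for the real exception; unchecked to keep the sketch short.
    static final class UnavailableException extends RuntimeException {
        UnavailableException(String message) { super(message); }
    }

    // Number of replicas that must be alive for the given level and replication factor.
    static int blockFor(ConsistencyLevel cl, int replicationFactor) {
        switch (cl) {
            case ONE: return 1;
            case QUORUM: return replicationFactor / 2 + 1;
            case ALL: return replicationFactor;
            default: throw new IllegalArgumentException("unsupported level: " + cl);
        }
    }

    // replicaAlive holds one entry per replica of the key, as seen by the coordinator.
    static void assureSufficientLiveNodes(ConsistencyLevel cl, List<Boolean> replicaAlive) {
        int liveNodes = 0;
        for (boolean alive : replicaAlive) {
            if (alive) {
                liveNodes++;
            }
        }
        int blockFor = blockFor(cl, replicaAlive.size());
        if (liveNodes < blockFor) {
            throw new UnavailableException(
                    "only " + liveNodes + " seen out of " + blockFor + " required");
        }
    }

    public static void main(String[] args) {
        // RF=3, one replica down: QUORUM needs 2 and 2 are alive, so no exception.
        assureSufficientLiveNodes(ConsistencyLevel.QUORUM, Arrays.asList(true, true, false));
        try {
            // RF=2, one replica down: QUORUM needs 2 but only 1 is alive.
            assureSufficientLiveNodes(ConsistencyLevel.QUORUM, Arrays.asList(true, false));
        } catch (UnavailableException e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```

With RF=3 and one replica down the check passes at QUORUM, so an immediate UnavailableException in that situation suggests the coordinator's view of live nodes differs from what the operator expects, which is what the added debug line is meant to expose.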

SV: UnavailableException on QUORUM write

2010-07-21 Thread Per Olesen
>> And when one of my non-seed nodes in my 3 node cluster is down, I do NOT get >> the exception. >> Anyway, guess I need to try and reproduce it in small scale. > >Does it return w/ UE immediately, or does it wait for RPCTimeout first? It returns with UE immediately.

Re: UnavailableException on QUORUM write

2010-07-20 Thread Jonathan Ellis
On Tue, Jul 20, 2010 at 6:40 AM, Per Olesen wrote: >>Seed should only be important when joining the cluster.  You're using >>the Thrift API, right? > > Yep! > > And when one of my non-seed nodes in my 3 node cluster is down, I do NOT get > the exception. > Anyway, guess I need to try and reproduc

SV: UnavailableException on QUORUM write

2010-07-20 Thread Per Olesen
:34 AM, Per Olesen wrote: > Hi, > > Think I might have found out the problem. > I had only one seed node, and when that node is down, they all give > UnavailableException. Guess at least one seed needs to be up then? Sounds > fair. > > > /Per > _

Re: UnavailableException on QUORUM write

2010-07-20 Thread Jonathan Ellis
Seed should only be important when joining the cluster. You're using the Thrift API, right? On Tue, Jul 20, 2010 at 5:34 AM, Per Olesen wrote: > Hi, > > Think I might have found out the problem. > I had only one seed node, and when that node is down, they all give > Unavail

SV: UnavailableException on QUORUM write

2010-07-20 Thread Per Olesen
Hi, Think I might have found out the problem. I had only one seed node, and when that node is down, they all give UnavailableException. Guess at least one seed needs to be up then? Sounds fair. /Per Fra: Per Olesen [...@trifork.com] Sendt: 9. juli 2010

Re: UnavailableException on QUORUM write

2010-07-09 Thread Jonathan Ellis
this sounds like a bug, although if you've attempted any node movement or bootstrapping, that could cause the required quorum to be larger than just the number of nodes. On Fri, Jul 9, 2010 at 3:53 AM, Per Olesen wrote: > Hi, > > I am a bit confused about getting an Unavailable

Re: UnavailableException on QUORUM write

2010-07-09 Thread Per Olesen
On Jul 9, 2010, at 11:11 AM, ChingShen wrote: > Which client library do you use? Direct on thrift api using thrift.jar, in version 917130.

Re: UnavailableException on QUORUM write

2010-07-09 Thread ChingShen
Which client library do you use? Shen On Fri, Jul 9, 2010 at 4:53 PM, Per Olesen wrote: > Hi, > > I am a bit confused about getting an UnavailableException when doing a > QUORUM write. > > I have a 3 node cluster, with RF=3. When all 3 nodes are up, the QUORUM > write succ

UnavailableException on QUORUM write

2010-07-09 Thread Per Olesen
Hi, I am a bit confused about getting an UnavailableException when doing a QUORUM write. I have a 3 node cluster, with RF=3. When all 3 nodes are up, the QUORUM write succeeds. When 1 of the 3 nodes is down, the QUORUM write fails with UnavailableException. Shouldn't it be enough w

Re: UnavailableException with 1 node down and RF=2?

2010-07-01 Thread Jonathan Ellis
> > >> > Sent from my iPhone. >> > >> > On 2010-07-01, at 1:39 AM, Benjamin Black wrote: >> > >> >> .QUORUM or .ALL (they are the same with RF=2). >> >> >> >> On Wed, Jun 30, 2010 at 10:22 PM, James Golick >> >> wrote: >>

Re: UnavailableException with 1 node down and RF=2?

2010-07-01 Thread James Golick
On Wed, Jun 30, 2010 at 10:22 PM, James Golick > wrote: > >>> 4 nodes, RF=2, 1 node down. > >>> How can I get an UnavailableException in that scenario? > >>> - J. > > > > > > -- > Jonathan Ellis > Project Chair, Apache Cassandra > co-founder of Riptano, the source for professional Cassandra support > http://riptano.com >

Re: UnavailableException with 1 node down and RF=2?

2010-06-30 Thread Jonathan Ellis
Black wrote: > >> .QUORUM or .ALL (they are the same with RF=2). >> >> On Wed, Jun 30, 2010 at 10:22 PM, James Golick wrote: >>> 4 nodes, RF=2, 1 node down. >>> How can I get an UnavailableException in that scenario? >>> - J. > -- Jonatha

Re: UnavailableException with 1 node down and RF=2?

2010-06-30 Thread James Golick
>> How can I get an UnavailableException in that scenario? >> - J.

Re: UnavailableException with 1 node down and RF=2?

2010-06-30 Thread Benjamin Black
.QUORUM or .ALL (they are the same with RF=2). On Wed, Jun 30, 2010 at 10:22 PM, James Golick wrote: > 4 nodes, RF=2, 1 node down. > How can I get an UnavailableException in that scenario? > - J.

UnavailableException with 1 node down and RF=2?

2010-06-30 Thread James Golick
4 nodes, RF=2, 1 node down. How can I get an UnavailableException in that scenario? - J.