+1 for removing complexity to be able to create (and maintain!) “reasoned”
systems!
Sean Durity – Staff Systems Engineer, Cassandra
From: Reid Pinchback
Sent: Thursday, October 24, 2019 10:28 AM
To: user@cassandra.apache.org
Subject: [EXTERNAL] Re: Cassandra Rack - Datacenter Load Balancing
nd I’m half asleep. I
> prefer simple until the workload motivates complex.
>
>
>
> R
>
>
>
>
>
> *From: *Sergio
> *Reply-To: *"user@cassandra.apache.org"
> *Date: *Thursday, October 24, 2019 at 12:06 PM
> *To: *"user@cassandra.apache.org"
at 3 in the morning and I’m half asleep. I prefer simple
until the workload motivates complex.
R
From: Sergio
Reply-To: "user@cassandra.apache.org"
Date: Thursday, October 24, 2019 at 12:06 PM
To: "user@cassandra.apache.org"
Subject: Re: Cassandra Rack - Datacenter Load
estigate.
>
>
> R
>
>
>
> *From: *Sergio
> *Reply-To: *"user@cassandra.apache.org"
> *Date: *Wednesday, October 23, 2019 at 3:34 PM
> *To: *"user@cassandra.apache.org"
> *Subject: *Re: Cassandra Rack - Datacenter Load Balancing relations
>
>
ra Rack - Datacenter Load Balancing relations
Message from External Sender
Hi Reid,
Thank you very much for clearing up these concepts for me.
https://community.datastax.com/comments/1133/view.html
hat is I
>>>>> missing
>>>>>or wrong assumption? I am thinking that I will write a blog post about
>>>>> all
>>>>>my learnings so far, thank you very much for the replies Best, Sergio
>>>>>
>>>>>
>>
>>>>> No, that’s not correct. The point of racks is to help you distribute
>>>>> the replicas, not further-replicate the replicas. Data centers are what
>>>>> do
>>>>> the latter. So for example, if you wanted to be able to ensure
e one rack in each DC. In your situation I think I’d
>>>> be more tempted to consider that. Then if an AZ went away, you could fail
>>>> over your traffic to the remaining DC and still be perfectly fine.
>>>>
>>>>
>>>>
>>>>
etworkTopologyStrategy’ at:
>>>
>>> http://cassandra.apache.org/doc/latest/architecture/dynamo.html
>>>
>>>
>>>
>>> That should help you better understand how replicas distribute.
>>>
>>>
>>>
>>> As mentione
ork traffic and
>> connection handling, you can’t isolate reads from writes. You can _
>> *mostly*_ insulate the write DC from the activity within the read DC,
>> and even that isn’t an absolute because of repairs. However, your mileage
>> may vary, so do what makes se
gio
> *Reply-To: *"user@cassandra.apache.org"
> *Date: *Wednesday, October 23, 2019 at 12:50 PM
> *To: *"user@cassandra.apache.org"
> *Subject: *Re: Cassandra Rack - Datacenter Load Balancing relations
>
>
>
>
> Hi
: Sergio
Reply-To: "user@cassandra.apache.org"
Date: Wednesday, October 23, 2019 at 12:50 PM
To: "user@cassandra.apache.org"
Subject: Re: Cassandra Rack - Datacenter Load Balancing relations
Hi Reid,
Thanks for your reply. I really appreciate your exp
https://cassandra.apache.org/doc/latest/architecture/dynamo.html#networktopologystrategy
I would like to und
Hello guys!
I was reading about
https://cassandra.apache.org/doc/latest/architecture/dynamo.html#networktopologystrategy
I would like to understand a concept related to the node load balancing.
I know that Jon recommends Vnodes = 4 but right now I found a cluster with
vnodes = 256 replication
Hello,
Most drivers will handle the load balancing for you and provide policies
for configuring your desired approach for load balancing, i.e. load balance
around the entire ring or localize around a specific DC. Your clients will
leverage the driver for connections so that the client machines
Hi,
I am learning C* and its usage these days. I have a very simple, possibly naive
question about load balancing.
I know that C* can automatically balance the load itself by using tokens. But
what about connecting my cluster to a system? For example, if we have a client or a
set of clients (e.g
rd many write/read
>> request to node1
>>
>> Why did I look at CPU load and not iostat et al.?
>>
>> Because I have a very write-intensive workload with a read-only-once
>> pattern. I've read here (
>> http://www.datastax.com/docs/0.8/cluster_arc
ecture/cluster_planning) that
> heavy write in C* is more CPU bound but maybe the info may be outdated and no
> longer true
>
> Regards
>
> Duy Hai DOAN
>
>
> On Thu, Apr 24, 2014 at 10:00 PM, Michael Shuler
> wrote:
> On 04/24/2014 10:29 AM, DuyHai Doan w
ai DOAN
>
>
> On Thu, Apr 24, 2014 at 10:00 PM, Michael Shuler
> wrote:
>
> On 04/24/2014 10:29 AM, DuyHai Doan wrote:
>
> Client used = Hector 1.1-4
> Default Load Balancing connection policy
> Both nodes addresses are provided to Hector so according to its
>
> Client used = Hector 1.1-4
>> Default Load Balancing connection policy
>> Both nodes addresses are provided to Hector so according to its
>> connection policy, the client should switch alternately between both nodes
>>
>
OK, so is only one connection being
>
>> Client used = Hector 1.1-4
>> Default Load Balancing connection policy
>> Both nodes addresses are provided to Hector so according to its
>> connection policy, the client should switch alternately between both
>> nodes
>>
>
> OK, so is only one
Htop is not the only tool for this. Cassandra will hit I/O bottlenecks before
CPU (on faster CPUs). A simple solution is to check the size of the data dir
on the boxes. If they are approximately the same size, then Cassandra is writing to the
whole cluster. Check how the data dir size changes when impor
On 04/24/2014 10:29 AM, DuyHai Doan wrote:
Client used = Hector 1.1-4
Default Load Balancing connection policy
Both nodes addresses are provided to Hector so according to its
connection policy, the client should switch alternately between both nodes
OK, so is only one connection being
Hello Michael
RF = 1
Client used = Hector 1.1-4
Default Load Balancing connection policy
Both nodes addresses are provided to Hector so according to its connection
policy, the client should switch alternately between both nodes
Regards
Duy Hai DOAN
On Thu, Apr 24, 2014 at 4:37 PM
g (20 chars) as partition key so in theory, the Murmur3 partitioner
should dispatch them evenly between both nodes...
Is there any existing JIRA related to a load balancing issue with vnodes?
vnode != node
Clients connect to nodes - real servers with a network connection.
Vnodes are in
s its CPU occupation up to 60% and the other only
around 10%
The massive insertion workload consists of random data and a random string
(20 chars) as partition key so in theory, the Murmur3 partitioner should
dispatch them evenly between both nodes...
Is there any existing JIRA related to load
On Fri, Apr 26, 2013 at 3:48 AM, Sam Overton wrote:
> If that is the case then it means you accidentally started those three nodes
> with the default configuration (single-token) and then subsequently changed
> (num_tokens) and then joined them into the cluster.
This would seem to be another reas
Decommissioning those nodes isn't a problem. When you say remove all the
data, I assume you mean rm -rf my data directory (the default
/var/lib/cassandra/data
I'd done this prior to starting up the nodes, because they were installed
from the apt-get repo, which automatically starts cassandra
Some extra information you could provide which will help debug this: the
logs from those 3 nodes which have no data and the output of "nodetool ring"
Before seeing those I can only guess, but my guess would be that in the
logs on those 3 nodes you will see this: "Calculating new tokens" and this:
So, I had 7 nodes that I set up using vnodes, 256 tokens each, no problem.
I added two 512 token nodes, no problem, things seemed to balance.
The next 3 nodes I added, all at 256 tokens, have a cumulative
load of 116 MB (whereas the other nodes are at ~100GB and ~200GB (256 and
512 respe
Hector provides load balancing so that requests can be distributed across
cluster nodes based on a specified policy, like round robin. Is there
anything similar planned for CQL? I see that there is an open issue (
http://code.google.com/a/apache-extras.org/p/cassandra-jdbc/issues/detail?id=41)
to
il over?
Thanks
/Roshan
--
View this message in context:
http://cassandra-user-incubator-apache-org.3065146.n2.nabble.com/Cassandra-Hector-failover-load-balancing-not-as-expected-with-version-1-0-5-tp7581380.html
Sent from the cassandra-u...@incubator.apache.org mailing list archive at
Nabble.com.
Check nodetool ring to see what state the nodes are in, they all need to be UP.
Cheers
-
Aaron Morton
Freelance Developer
@aaronmorton
http://www.thelastpickle.com
On 15/03/2012, at 7:31 PM, Rishabh Agrawal wrote:
> Hello,
>
> I initially had a two-node cluster and now I joined
Hello,
I initially had a two-node cluster and now I joined another node by keeping
the initial_token property blank and auto_bootstrap: True. So it took one of the
nodes and shared half of its load. Now when I run nodetool ring I get
ownership as 25%, 25% and 50%, which is correct, but I wish to make
r example, can cassandra nodes be placed behind a Citrix Netscaler
>> hardware load balancer?
>>
>> I can't imagine it being a problem, but in doing so would you break any
>> cassandra functionality?
>>
>> The goal is to have the application talk to a single virtu
ality?
>
> The goal is to have the application talk to a single virtual ip and be
> directed to a random node in the cluster.
>
> I heard a little about adding the node addresses to Hector's load-balancing
> mechanism, but this doesn't seem too robust or easy to maintain.
>
> Thanks in advance.
>
> I heard a little about adding the node addresses to Hector's load-balancing
> mechanism, but this doesn't seem too robust or easy to maintain.
>
> Thanks in advance.
You will lose part of the retry / fallback functionality offered by Hector.
The job of the client lib is not only load-balancing. For example, if a node is
bootstrapping it will accept TCP connections but throw an exception which will
be communicated via Thrift. The client lib is supposed to handle
oblem, but in doing so would you break any
cassandra functionality?
The goal is to have the application talk to a single virtual ip and be
directed to a random node in the cluster.
I heard a little about adding the node addresses to Hector's load-balancing
mechanism, but this doesn'
://www.slideshare.net/benjaminblack/cassandra-summit-2010-operations-troubleshooting-intro
b
On Tue, Nov 16, 2010 at 1:56 PM, Brayton Thompson wrote:
> .7 beta 2 here
> I've been reading about load balancing and some sites seem to imply that
> using the random partitioner will ke
even distribution. The best approach is to manually select them using the approach linked above.
A
On 17 Nov, 2010, at 10:56 AM, Brayton Thompson wrote:
.7 beta 2 here
I've been reading about load balancing and some sites seem to imply that using the random partitioner will keep your nodes fairly
.7 beta 2 here
I've been reading about load balancing and some sites seem to imply that using
the random partitioner will keep your nodes fairly well balanced. I am
using a 3 node cluster. 1 seed and two others with AutoBootstrap on.
Now I have read that autobootstrap can leave your
Yes. I even tried just starting one node only, and then bootstrapping
another node. (However, at the beginning a few days ago, the cluster was
unstable and unresponsive and I had to restart the cluster. Maybe something
went wrong back then.)
Anyway, I will export all the data, and reimport it with
Not sure if this is the cause, but do all of your nodes have the same seed
list? Did you bring up the seeds first?
- Tyler
On Wed, Oct 27, 2010 at 1:46 PM, Thibaut Britz <
thibaut.br...@trendiction.com> wrote:
> Depending on the range I choose, choosing manually a token will also fail.
> (node
Depending on the range I choose, choosing manually a token will also fail.
(node will never exit bootstrap, streams doesn't list any open streams)
INFO [Thread-53] 2010-10-27 20:33:37,399 SSTableReader.java (line 120)
Sampling index for /hd2/cassandra/data/table_xyz/table_xyz-3-Data.db
INFO [Thr
Hello Tyler,
thanks for the quick answer. That's true, I should have noticed.
I also tried kicking out one node, clearing all directories and then
restarting it with the bootstrap option. It received a few files, but just
sat there in bootstrapping mode (streams always printed bootstrapping
witho
With OrderPreservingPartitioner, you have to keep the ring balanced
manually.
This is why people frequently suggest that you use RandomPartitioner unless
you absolutely have to do otherwise. With OPP, keys are *not* evenly
distributed
around the ring.
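To make the difference concrete, here is a rough Python sketch. It is not Cassandra's code: the MD5-based token function is a simplified stand-in for RandomPartitioner, and the node tokens and key names are made up for illustration. It counts how many keys land on each of three nodes when keys are placed by their own sort order versus by a hash.

```python
import hashlib
from bisect import bisect_left
from collections import Counter


def hashed_token(key):
    """Simplified stand-in for RandomPartitioner: MD5 folded into 0 .. 2**127 - 1."""
    digest = hashlib.md5(key.encode("utf-8")).digest()
    return int.from_bytes(digest, "big") % (2 ** 127)


def owner(token, node_tokens):
    """Index of the first node whose token is >= the key's token, wrapping to node 0."""
    return bisect_left(node_tokens, token) % len(node_tokens)


def count_by_node(keys, node_tokens, token_for_key):
    """Count how many keys each node index would own."""
    return Counter(owner(token_for_key(k), node_tokens) for k in keys)


# Hypothetical data: 1000 keys that all share the prefix "user".
keys = ["user%04d" % i for i in range(1000)]

# Order-preserving placement: the key itself is the token.
ordered_nodes = ["h", "p", "z"]
ordered_counts = count_by_node(keys, ordered_nodes, lambda k: k)

# Hashed placement: three evenly spaced tokens in the 2**127 space.
hashed_nodes = [i * (2 ** 127 // 3) for i in range(1, 4)]
hashed_counts = count_by_node(keys, hashed_nodes, hashed_token)

if __name__ == "__main__":
    print("ordered:", dict(ordered_counts))
    print("hashed: ", dict(hashed_counts))
```

With the ordered placement every key sorts between "p" and "z", so a single node owns all of them; the hashed placement spreads the same keys across all three nodes.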
Apparently you have lots of keys that are bet
dataset will need to be moved), now that is a lot of data movement!!
unless I have got this wrong?
--
View this message in context:
http://cassandra-user-incubator-apache-org.3065146.n2.nabble.com/Question-on-load-balancing-in-a-cluster-tp5375140p5389827.html
Sent from the cassandra-u
>> move them.
>>
>> As for the need to decommission I'm guessing it's for reasons such as
>> making it easier to avoid overlapping tokens and to avoid accepting writes
>> that will soon be moved.
>>
>> Others may be able to add more.
>>
>> Aa
ing writes
> that will soon be moved.
>
> Others may be able to add more.
>
> Aaron
>
>
> On 5 Aug 2010, at 14:49, anand_s wrote:
>
> >
> > Hi,
> >
> > Have some thoughts on load balancing on current / new nodes. I have come
> > across some p
ites
that will soon be moved.
Others may be able to add more.
Aaron
On 5 Aug 2010, at 14:49, anand_s wrote:
>
> Hi,
>
> Have some thoughts on load balancing on current / new nodes. I have come
> across some posts around this, but not sure of what is being finally
> propos
Hi,
Have some thoughts on load balancing on current / new nodes. I have come
across some posts around this, but not sure of what is being finally
proposed, so..
From what I have read, a nodebalance on a node does a decommission and
bootstrap of that node. Is there a reason why it is that
I haven't used ELB, but I've set up HAProxy to do it... appears to work well
so far.
Dave Viner
On Tue, Jul 13, 2010 at 3:30 PM, Brian Helfrich wrote:
> Hi, has anyone been able to load balance a Cassandra cluster with an AWS
> Elastic Load Balancer? I've set up an ELB with the obvious settings (
Hi, has anyone been able to load balance a Cassandra cluster with an AWS
Elastic Load Balancer? I've set up an ELB with the obvious settings (namely,
--listener "lb-port=9160,instance-port=9160,protocol=TCP") but clients
simply hang trying to load records from the ELB hostname:9160.
Thanks,
--Bria
Mubarak Seyed writes:
>
> - How does client (application) connect to cassandra cluster? Is it always for
one node (and thrift can get ring info) and send the request to connected node
This depends on the client library you use. Any cassandra node can accept client
connections and forward
Hi All,
I have a requirement: we have around 100 application server instances, and
all of them need to write/read data from a Cassandra cluster. The write data rate is
around 300k records per instance (approximately 30 million for 100 instances).
- How does client (application) connect to cassandr
Ahh, I think this is the key section I missed:
"you can still have imbalances if your Tokens do not divide up the
range evenly, so you should specify InitialToken to your first nodes
as i * (2**127 / N) for i = 1 .. N."
I'm going to reset my cluster with initial tokens like that. Thanks!
-Mike
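The quoted formula is easy to compute directly. A small Python sketch, assuming the 2**127 token space given in the quoted text; the function name is made up for illustration:

```python
def initial_tokens(node_count):
    """Return evenly spaced tokens, i * (2**127 / N) for i = 1 .. N."""
    if node_count < 1:
        raise ValueError("node_count must be at least 1")
    step = 2 ** 127 // node_count  # integer division keeps the tokens exact integers
    return [i * step for i in range(1, node_count + 1)]


if __name__ == "__main__":
    # Print one InitialToken value per node for a three-node cluster.
    for node_number, token in enumerate(initial_tokens(3), start=1):
        print("node %d: %d" % (node_number, token))
```

Each printed value would go into the InitialToken setting of one node before that node is first started.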
The sections on ring management and token selection on
http://wiki.apache.org/cassandra/Operations will help.
On Fri, Jun 4, 2010 at 2:27 PM, Mike Subelsky wrote:
> Hello everyone,
>
> One of my nodes has a much higher load (10x) than the other two nodes.
> I don't think it's because a few keys
Hello everyone,
One of my nodes has a much higher load (10x) than the other two nodes.
I don't think it's because a few keys have a lot more columns than
others -- the keys are well distributed and I'm using the random
partitioner.
Could someone point me in the direction of what should I be chec
That means they are blocking for something to be added to the task queue
On Mon, May 17, 2010 at 9:42 AM, Joost Ouwerkerk wrote:
> At any given moment at least half of those threads are in the following
> state; what does it represent?
> Name: ROW-READ-STAGE:6
> State: WAITING on
> java.util.conc
At any given moment at least half of those threads are in the following
state; what does it represent?
Name: ROW-READ-STAGE:6
State: WAITING on
java.util.concurrent.locks.abstractqueuedsynchronizer$conditionobj...@fea6030
Total blocked: 44 Total waited: 479
Stack trace:
sun.misc.Unsafe.park(Nati
On Sun, May 16, 2010 at 2:52 PM, Joost Ouwerkerk wrote:
> Meanwhile, I'm still getting TimedOutException errors when mapping this
> 30-million row table, even when retrieving no data at all. It looks like it
> is related to disk activity on "hot" nodes (when the same cassandra node has
> to handl
Hadoop doesn't make any assumptions about how input source data is
distributed. It can't 'know' that the data for the first 30 splits emitted
by the InputFormat are all stored on the same cassandra node.
The new case with the patch is CASSANDRA-1096
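One way to keep consecutive splits from all landing on the same node is to shuffle the split list before the jobs are handed out. The Python sketch below only illustrates that idea; whether the CASSANDRA-1096 patch works this way is not stated here, and the (node, range) tuples are hypothetical stand-ins for real splits.

```python
import random


def shuffle_splits(splits, seed=None):
    """Return a shuffled copy of the splits; the input list is left unchanged."""
    rng = random.Random(seed)  # a seed makes the order reproducible for testing
    shuffled = list(splits)
    rng.shuffle(shuffled)
    return shuffled


if __name__ == "__main__":
    # Hypothetical input: 30 splits, the first 10 on node0, the next 10 on node1, ...
    ordered = [("node%d" % (i // 10), i) for i in range(30)]
    for node, split_id in shuffle_splits(ordered)[:5]:
        print(node, split_id)
```

After shuffling, the first batch of map tasks is likely to touch several nodes instead of hammering one.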
Meanwhile, I'm still getting TimedOutException
Oh, very interesting. I assumed Hadoop would be smart enough to
load-balance the jobs it sends out. Guess not.
Can you submit a patch?
On Wed, May 12, 2010 at 12:32 PM, Joost Ouwerkerk wrote:
> I've been trying to improve the time it takes to map 30 million rows using a
> hadoop / cassandra cl
I've been trying to improve the time it takes to map 30 million rows using a
hadoop / cassandra cluster with 30 nodes. I discovered that since
CassandraInputFormat returns an ordered list of splits, when there are many
splits (e.g. hundreds or more) the load on cassandra is horribly unbalanced.
e
>>
>> On Thu, Mar 25, 2010 at 1:20 PM, Y Aw wrote:
>> > Hi all,
>> > I have a question about load-balancing.
>>
>> http://wiki.apache.org/cassandra/FAQ#node_clients_connect_to
>>
>> Does that help?
>
>
Yes it does...
Is there an easy way to know if a node is down or cannot reply to queries (a
simple telnet command) ?
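A telnet-style check can be scripted. The Python sketch below only tests whether a TCP connection to the node's client port succeeds; the default of 9160 is an assumption (the Thrift port used elsewhere in these threads), and the function name is made up. An open port does not prove the node can answer queries, so treat a True result as "reachable", not "healthy".

```python
import socket


def port_is_open(host, port=9160, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers connection refused, timeouts and name resolution failures.
        return False


if __name__ == "__main__":
    # Hypothetical node address; replace with a real one.
    print(port_is_open("127.0.0.1"))
```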
2010/3/25 Jeremy Dunck
> On Thu, Mar 25, 2010 at 1:20 PM, Y Aw wrote:
> > Hi all,
> > I have a question about load-balancing.
>
> http://wiki.apa
On Thu, Mar 25, 2010 at 1:20 PM, Y Aw wrote:
> Hi all,
> I have a question about load-balancing.
http://wiki.apache.org/cassandra/FAQ#node_clients_connect_to
Does that help?
Hi all,
I have a question about load-balancing.
I have easily built a cluster with two nodes, but I am wondering how my
client should connect to this cluster.
- Run queries against one node (but all data will transit through this node
and this way creates a SPOF)
- Run queries against an