Re: horizontal query scaling issues follow on

2014-07-23 Thread Benedict Elliott Smith
that is spread over multiple partitions, and so extra work needs to be done cross-cluster to service your requests as more nodes are added. I would also consider what effect the file cache may be having on your workload, as it sounds small enough to fit i

Re: horizontal query scaling issues follow on

2014-07-23 Thread Diane Griffith
ou try different client levels for the smaller cluster you may see improved performance as the data is pulled into file cache across test runs, and then when you build your larger cluster this is lost so performance appears to degrade (for instance).

Re: horizontal query scaling issues follow on

2014-07-21 Thread Diane Griffith
So I appreciate all the help so far. Upfront, it is possible the schema and data query pattern could be contributing to the problem. The schema was born out of certain design requirements. If it proves to be part of what makes the scalability crumble, then I hope it will help shape the design re

Re: horizontal query scaling issues follow on

2014-07-21 Thread Robert Coli
On Sun, Jul 20, 2014 at 6:12 PM, Diane Griffith wrote: > I am running tests again across different number of client threads and > number of nodes but this time I tweaked some of the timeouts configured for > the nodes in the cluster. I was able to get better performance on the > nodes at 10 clie

Re: horizontal query scaling issues follow on

2014-07-21 Thread Jonathan Lacefield
Hello, Here is the documentation for cfhistograms, whose latency values are reported in microseconds. http://www.datastax.com/documentation/cassandra/2.0/cassandra/tools/toolsCFhisto.html Your question about setting timeouts is subjective, but you have set your timeout limits to 4 mins, which seems excessive. The

Re: horizontal query scaling issues follow on

2014-07-20 Thread Diane Griffith
I am running tests again across different numbers of client threads and nodes, but this time I tweaked some of the timeouts configured for the nodes in the cluster. I was able to get better performance on the nodes at 10 client threads by raising 4 timeout values in cassandra.yaml to 24

Re: horizontal query scaling issues follow on

2014-07-18 Thread Diane Griffith
PM, Diane Griffith wrote: > The column family schema is: CREATE TABLE IF NOT EXISTS foo (key text, col_name text, col_value text, PRIMARY KEY(key, col_name)) where the key is a generated uuid and all keys were inserted in random o

Re: horizontal query scaling issues follow on

2014-07-18 Thread Tyler Hobbs
On Fri, Jul 18, 2014 at 8:01 AM, Diane Griffith wrote: > Partition Size (bytes): 1109 bytes: 1800; Cell Count per Partition: 8 cells: 1800; meaning I can't glean anything about how it partitioned or if it broke a key across partitions from this right? Does it mean for 180

Re: horizontal query scaling issues follow on

2014-07-18 Thread Diane Griffith
stering columns, or does each row have a unique partition key and no clustering columns. -- Jack Krupansky *From:* Diane Griffith *Sent:* Thursday, July 17, 2014 6:21 PM *To:* user *Subjec

Re: horizontal query scaling issues follow on

2014-07-18 Thread Benedict Elliott Smith
your primary key and whether you are using a small number of partition keys and a large number of clustering columns, or does each row have a unique partition key and no clustering columns. -- Jack Krupansky *From:* Diane Griffith

Re: horizontal query scaling issues follow on

2014-07-18 Thread Diane Griffith
g columns. -- Jack Krupansky *From:* Diane Griffith *Sent:* Thursday, July 17, 2014 6:21 PM *To:* user *Subject:* Re: horizontal query scaling issues follow on So do partitions equate to tokens/vnodes? If so we had configured all

Re: horizontal query scaling issues follow on

2014-07-17 Thread Jonathan Haddad
The problem with starting without vnodes is that moving to them later is a bit hairy. In particular, nodetool shuffle has been reported to take an extremely long time (days, weeks). I would start with vnodes if you have any intention of using them. On Thu, Jul 17, 2014 at 6:03 PM, Robert Coli wrote: > On Thu

Re: horizontal query scaling issues follow on

2014-07-17 Thread Jack Krupansky
Krupansky From: Diane Griffith Sent: Thursday, July 17, 2014 1:33 PM To: user Subject: horizontal query scaling issues follow on This is a follow on re-post to clarify what we are trying to do, providing information that was missing or not clear. Goal: Verify horizontal scaling for

Re: horizontal query scaling issues follow on

2014-07-17 Thread Robert Coli
On Thu, Jul 17, 2014 at 5:16 PM, Diane Griffith wrote: > I did tests comparing 1, 2, 10, 20, 50, 100 clients spawned all querying. > Performance on 2 nodes starts to degrade from 10 clients on. I saw > similar behavior on 4 nodes but haven't done the official runs on that yet. > > Ok, if you'v

Re: horizontal query scaling issues follow on

2014-07-17 Thread Diane Griffith
So I stripped out the number of clients experiment path information. It is unclear if I can only show horizontal scaling by also spawning many client requests all working at once. So that is why I stripped that information out to distill what our original attempt was at how to show horizontal sca

Re: horizontal query scaling issues follow on

2014-07-17 Thread Robert Coli
On Thu, Jul 17, 2014 at 3:21 PM, Diane Griffith wrote: > So do partitions equate to tokens/vnodes? > A partition is what used to be called a "row". Each individual token in the token ring can contain a partition, which you request using the token as the key. A "token range" is the space betwee
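
[Editor's sketch] The partition/token relationship described in the reply above can be illustrated roughly as follows. This is only a toy model, not Cassandra code: `token_for` uses an MD5 digest as a stand-in hash (an assumption for illustration; a real cluster uses whatever partitioner is configured), and the `Ring` class and node names are hypothetical. It assumes the usual convention that a node owns the token range ending at each of its tokens, and that a node configured with vnodes simply holds several tokens.

```python
import bisect
import hashlib


def token_for(partition_key: str) -> int:
    """Map a partition key to a token.

    Stand-in hash for illustration only (MD5 digest read as an integer);
    this is NOT the partitioner a given cluster is necessarily using.
    """
    digest = hashlib.md5(partition_key.encode("utf-8")).digest()
    return int.from_bytes(digest, "big")


class Ring:
    """Toy token ring: each node owns the range ending at each of its tokens."""

    def __init__(self, node_tokens):
        # node_tokens: dict of node name -> list of tokens it holds.
        # More than one token per node models vnodes.
        self._entries = sorted(
            (token, node)
            for node, tokens in node_tokens.items()
            for token in tokens
        )
        self._tokens = [token for token, _ in self._entries]

    def owner(self, partition_key: str) -> str:
        """Return the node whose token range contains the key's token."""
        token = token_for(partition_key)
        # First ring token >= the key's token; wrap around past the last one.
        index = bisect.bisect_left(self._tokens, token) % len(self._tokens)
        return self._entries[index][1]


if __name__ == "__main__":
    # Two hypothetical nodes, two tokens each (a tiny vnode setup).
    ring = Ring({
        "node1": [2**126, 3 * 2**126],
        "node2": [2**127, 2**128 - 1],
    })
    for key in ("key-a", "key-b", "key-c"):
        print(key, "->", ring.owner(key))
```

In this model every cell stored under the same partition key hashes to the same token and therefore lands on the same node; adding nodes only helps when requests are spread across many distinct partition keys.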

Re: horizontal query scaling issues follow on

2014-07-17 Thread Diane Griffith
handle more data – more token values or partitions.) -- Jack Krupansky *From:* Diane Griffith *Sent:* Thursday, July 17, 2014 1:33 PM *To:* user *Subject:* horizontal query scaling issues follow on This is a follow on re-post to clarify wha

Re: horizontal query scaling issues follow on

2014-07-17 Thread Jack Krupansky
a single partition would certainly not be a test of “horizontal scaling” (adding nodes to handle more data – more token values or partitions.) -- Jack Krupansky From: Diane Griffith Sent: Thursday, July 17, 2014 1:33 PM To: user Subject: horizontal query scaling issues follow on This is a

horizontal query scaling issues follow on

2014-07-17 Thread Diane Griffith
This is a follow on re-post to clarify what we are trying to do, providing information that was missing or not clear. Goal: Verify horizontal scaling for random non duplicating key reads using the simplest configuration (or minimal configuration) possible. Background: A couple years ago we