that is spread over multiple partitions, and so extra work needs to be done
>>> cross-cluster to service your requests as more nodes are added.
>>>
>>> I would also consider what effect the file cache may be having on your
>>> workload, as it sounds small enough to fit i
ou try different
>> client levels for the smaller cluster you may see improved performance as
>> the data is pulled into file cache across test runs, and then when you
>> build your larger cluster this is lost so performance appears to degrade
>> (for instance).
>>
>
So I appreciate all the help so far. Upfront, it is possible the schema
and data query pattern could be contributing to the problem. The schema
was born out of certain design requirements. If it proves to be part of
what makes the scalability crumble, then I hope it will help shape the
design re
On Sun, Jul 20, 2014 at 6:12 PM, Diane Griffith
wrote:
> I am running tests again across different numbers of client threads and
> number of nodes but this time I tweaked some of the timeouts configured for
> the nodes in the cluster. I was able to get better performance on the
> nodes at 10 clie
Hello,
Here is the documentation for cfhistograms, which reports latencies in microseconds.
http://www.datastax.com/documentation/cassandra/2.0/cassandra/tools/toolsCFhisto.html
Your question about setting timeouts is subjective, but you have set your
timeout limits to 4 mins, which seems excessive.
The
I am running tests again across different numbers of client threads and
number of nodes but this time I tweaked some of the timeouts configured for
the nodes in the cluster. I was able to get better performance on the
nodes at 10 client threads by upping 4 timeout values in cassandra.yaml to
24
PM, Diane Griffith
> wrote:
>
>> The column family schema is:
>>
>> CREATE TABLE IF NOT EXISTS foo (key text, col_name text, col_value text,
>> PRIMARY KEY(key, col_name))
>>
>> where the key is a generated uuid and all keys were inserted in random
>> o
On Fri, Jul 18, 2014 at 8:01 AM, Diane Griffith
wrote:
>
> Partition Size (bytes)
> 1109 bytes: 1800
>
> Cell Count per Partition
> 8 cells: 1800
>
> meaning I can't glean anything about how it partitioned or if it broke a
> key across partitions from this, right? Does it mean for 180
stering
>>> columns, or does each row have a unique partition key and no clustering
>>> columns.
>>>
>>> -- Jack Krupansky
>>>
>>> *From:* Diane Griffith
>>> *Sent:* Thursday, July 17, 2014 6:21 PM
>>> *To:* user
>>> *Subjec
your primary key and whether you
>> are using a small number of partition keys and a large number of clustering
>> columns, or does each row have a unique partition key and no clustering
>> columns.
>>
>> -- Jack Krupansky
>>
>> *From:* Diane Griffith
>
g
> columns.
>
> -- Jack Krupansky
>
> *From:* Diane Griffith
> *Sent:* Thursday, July 17, 2014 6:21 PM
> *To:* user
> *Subject:* Re: horizontal query scaling issues follow on
>
> So do partitions equate to tokens/vnodes?
>
> If so we had configured all
The problem with starting without vnodes is that moving to them is a bit
hairy. In particular, nodetool shuffle has been reported to take an
extremely long time (days, weeks). I would start with vnodes if you
have any intention of using them.
On Thu, Jul 17, 2014 at 6:03 PM, Robert Coli wrote:
> On Thu
Krupansky
From: Diane Griffith
Sent: Thursday, July 17, 2014 1:33 PM
To: user
Subject: horizontal query scaling issues follow on
This is a follow on re-post to clarify what we are trying to do, providing
information that was missing or not clear.
Goal: Verify horizontal scaling for
On Thu, Jul 17, 2014 at 5:16 PM, Diane Griffith
wrote:
> I did tests comparing 1, 2, 10, 20, 50, 100 clients spawned all querying.
> Performance on 2 nodes starts to degrade from 10 clients on. I saw
> similar behavior on 4 nodes but haven't done the official runs on that yet.
>
>
Ok, if you'v
So I stripped out the number of clients experiment path information. It is
unclear if I can only show horizontal scaling by also spawning many client
requests all working at once. So that is why I stripped that information
out to distill what our original attempt was at how to show horizontal
sca
On Thu, Jul 17, 2014 at 3:21 PM, Diane Griffith
wrote:
> So do partitions equate to tokens/vnodes?
>
A partition is what used to be called a "row".
Each individual token in the token ring can contain a partition, which you
request using the token as the key.
A "token range" is the space betwee
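
To make the partition/token relationship above concrete, here is a minimal sketch of a token ring, not Cassandra's actual implementation. It assumes a toy 32-bit token space, uses MD5 as a stand-in for the real partitioner, and the node names and the helper functions (token_for, build_ring, owner_of_token, owner_of) are hypothetical. Each node owns the range from the previous token (exclusive) up to its own token (inclusive), and the ring wraps around past the highest token.

```python
import bisect
import hashlib


def token_for(partition_key: str) -> int:
    """Hash a partition key to a token (toy 32-bit space, MD5 stand-in)."""
    digest = hashlib.md5(partition_key.encode("utf-8")).digest()
    return int.from_bytes(digest[:4], "big")


def build_ring(nodes, tokens_per_node=4):
    """Give each node several tokens (vnode-style) and sort them into a ring."""
    ring = []
    for node in nodes:
        for i in range(tokens_per_node):
            ring.append((token_for(f"{node}-{i}"), node))
    ring.sort()
    return ring


def owner_of_token(ring, token: int) -> str:
    """Return the node owning the range (previous token, this token]."""
    tokens = [t for t, _ in ring]
    idx = bisect.bisect_left(tokens, token)  # first ring token >= token
    return ring[idx % len(ring)][1]          # wrap past the last token


def owner_of(ring, partition_key: str) -> str:
    """All cells of one partition key hash to one token, hence one owner."""
    return owner_of_token(ring, token_for(partition_key))


if __name__ == "__main__":
    ring = build_ring(["node1", "node2", "node3", "node4"])
    for key in ["key-a", "key-b", "key-c"]:
        print(key, "->", owner_of(ring, key))
```

In this sketch, adding nodes adds tokens to the ring and shrinks each node's ranges, but a read of a single partition key still goes to one owner. That is why a test that reads many distinct keys is needed to exercise more than one node.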
handle more data – more token values
> or partitions.)
>
> -- Jack Krupansky
>
> *From:* Diane Griffith
> *Sent:* Thursday, July 17, 2014 1:33 PM
> *To:* user
> *Subject:* horizontal query scaling issues follow on
>
>
> This is a follow on re-post to clarify wha
a single partition would certainly not be a test of
“horizontal scaling” (adding nodes to handle more data – more token values or
partitions.)
-- Jack Krupansky
From: Diane Griffith
Sent: Thursday, July 17, 2014 1:33 PM
To: user
Subject: horizontal query scaling issues follow on
This is a
This is a follow on re-post to clarify what we are trying to do, providing
information that was missing or not clear.
Goal: Verify horizontal scaling for random, non-duplicating key reads using
the simplest configuration (or minimal configuration) possible.
Background:
A couple years ago we