In the future you may find SASI indexes useful for indexing Cassandra data.
Shameless blog post plug:
http://rustyrazorblade.com/2016/02/cassandra-secondary-index-preview-1/
Deep technical dive: http://www.doanduyhai.com/blog/?p=2058
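If you end up on 3.4+ (where SASI ships), creating and querying one looks roughly like this. Untested sketch using the DataStax Python driver; the keyspace, table, and column names are made up:

```python
# Untested sketch: assumes Cassandra 3.4+ (SASI isn't in 2.2) and the
# DataStax Python driver. Keyspace/table/column names are invented.
from cassandra.cluster import Cluster

session = Cluster(['127.0.0.1']).connect('content_ks')

# SASI is registered as a custom index implementation.
session.execute("""
    CREATE CUSTOM INDEX IF NOT EXISTS content_body_sasi
    ON content (body)
    USING 'org.apache.cassandra.index.sasi.SASIIndex'
    WITH OPTIONS = {'mode': 'CONTAINS'}
""")

# CONTAINS mode lets you run LIKE queries against the indexed column.
for row in session.execute("SELECT id FROM content WHERE body LIKE '%cassandra%'"):
    print(row.id)
```

CONTAINS is the heaviest mode; PREFIX is cheaper if you only ever need "starts with" queries.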
On Thu, Aug 4, 2016 at 11:45 AM Kevin Burton wrote:
> BTW. we think we tracked this down to using large partitions to implement
> inverted indexes.
BTW. we think we tracked this down to using large partitions to implement
inverted indexes. C* just doesn't do a reasonable job at all with large
partitions, so we're going to migrate this use case to Elasticsearch.
On Wed, Aug 3, 2016 at 1:54 PM, Ben Slater wrote:
> Yep, that was what I was referring to.
Yep, that was what I was referring to.
On Thu, 4 Aug 2016 2:24 am Reynald Bourtembourg <reynald.bourtembo...@esrf.fr> wrote:
> Hi,
>
> Maybe Ben was referring to this issue which has been mentioned recently on
> this mailing list:
> https://issues.apache.org/jira/browse/CASSANDRA-11887
>
> Cheers,
> Reynald
Have you tried using the G1 garbage collector instead of CMS?
We had the same issue: things were normally fine, but as soon as
something extraordinary happened, a node could go into GC hell and never
recover, and that could then spread to other nodes as they took up the
slack, trapping them in the same GC hell.
We usually use 100 per 5-minute window, but you're right. We might
actually move this use case over to Elasticsearch in the next couple
of weeks.
On Wed, Aug 3, 2016 at 11:09 AM, Jonathan Haddad wrote:
> Kevin,
>
> "Our scheme uses large buckets of content where we write to a
> bucket/partition for 5 minutes, then move to a new one."
Kevin,
"Our scheme uses large buckets of content where we write to a
bucket/partition for 5 minutes, then move to a new one."
Are you writing to a single partition and only that partition for 5
minutes? If so, you should really rethink your data model. This method
does not scale as you add nodes.
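If you do want to keep the 5-minute windows, the usual workaround is to add a synthetic shard to the partition key so each window is spread over many partitions (and therefore many nodes). Rough sketch with the Python driver; all names here are invented, not your schema:

```python
# Rough sketch of the synthetic-shard pattern, not a drop-in fix.
# Assumes the DataStax Python driver; keyspace/table/column names are invented.
import random
import time
import uuid

from cassandra.cluster import Cluster

SHARDS = 16            # each 5-minute window is spread over 16 partitions
BUCKET_SECONDS = 300   # 5-minute windows

session = Cluster(['127.0.0.1']).connect('content_ks')
session.execute("""
    CREATE TABLE IF NOT EXISTS inverted_index (
        bucket bigint,
        shard  int,
        term   text,
        doc_id timeuuid,
        PRIMARY KEY ((bucket, shard), term, doc_id)
    )
""")

insert = session.prepare(
    "INSERT INTO inverted_index (bucket, shard, term, doc_id) VALUES (?, ?, ?, ?)")

def index_term(term):
    bucket = int(time.time() // BUCKET_SECONDS)  # which 5-minute window we're in
    shard = random.randrange(SHARDS)             # spread writes across the shards
    session.execute(insert, (bucket, shard, term, uuid.uuid1()))

# Readers fan out over all SHARDS partitions for a given bucket, but no single
# partition has to absorb an entire window's writes.
```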
Hi,
Maybe Ben was referring to this issue which has been mentioned recently
on this mailing list:
https://issues.apache.org/jira/browse/CASSANDRA-11887
Cheers,
Reynald
On 03/08/2016 18:09, Romain Hardouin wrote:
> Curious why the 2.2 to 3.x upgrade path is risky at best.
I guess that upgrade from 2.2 is less tested by DataStax QA because DSE4
used C* 2.1, not 2.2. I would say the safest upgrade is 2.1 to 3.0.x
Best,
Romain
DuyHai. Yes. We're generally happy with our disk throughput. We're on
all SSD and have about 60 boxes. The amount of data written isn't THAT
much. Maybe 5GB max... but it's over 60 boxes.
On Wed, Aug 3, 2016 at 3:49 AM, DuyHai Doan wrote:
> On a side note, do you monitor your disk I/O to see whether the disk
> bandwidth can catch up with the huge spikes in writes?
Curious why the 2.2 to 3.x upgrade path is risky at best. Do you mean that
this is just for OUR use case since we're having some issues, or that the
upgrade path is risky in general?
On Wed, Aug 3, 2016 at 3:41 AM, Ben Slater wrote:
> Yes, looks like you have at least one 100MB partition which is big
> enough to cause issues.
On a side note, do you monitor your disk I/O to see whether the disk
bandwidth can catch up with the huge spikes in writes? Use dstat during the
insert storm to see if you have big values for CPU wait.
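If dstat isn't installed on those boxes, even a quick psutil loop gives you the same picture. Rough sketch, assuming Linux and the psutil package:

```python
# Quick-and-dirty alternative to dstat: sample disk write throughput and
# CPU iowait once a second. Assumes Linux and the psutil package.
import psutil

prev = psutil.disk_io_counters()
while True:
    cpu = psutil.cpu_times_percent(interval=1.0)  # blocks ~1s, returns percentages
    cur = psutil.disk_io_counters()
    written_mb = (cur.write_bytes - prev.write_bytes) / 1024 / 1024
    print(f"write {written_mb:8.1f} MB/s   iowait {cpu.iowait:5.1f}%")
    prev = cur
```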
On Wed, Aug 3, 2016 at 12:41 PM, Ben Slater wrote:
> Yes, looks like you have at least one 100MB partition which is big
> enough to cause issues.
Yes, looks like you have at least one 100MB partition which is big
enough to cause issues. When you do lots of writes to the large partition
it is likely to end up getting compacted (as per the log), and compactions
often use a lot of memory / cause a lot of GC when they hit large
partitions.
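A quick way to confirm it is to pull the large-partition warnings out of system.log. The log path and the exact message wording below are assumptions (they vary by version and install), so this just greps for the phrase "large partition":

```python
# Rough sketch: find the large-partition warnings that compaction logs.
# The log path and message wording are assumptions -- adjust for your setup.
import re

LOG_PATH = "/var/log/cassandra/system.log"   # typical package default
pattern = re.compile(r"large partition", re.IGNORECASE)

with open(LOG_PATH) as log:
    for line in log:
        if pattern.search(line):
            print(line.rstrip())
```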
I have a theory as to what I think is happening here.
There is a correlation between massive amounts of content arriving all at
once and our outages.
Our scheme uses large buckets of content where we write to a
bucket/partition for 5 minutes, then move to a new one. This way we can
page through buckets.
I think these large bucket partitions are what's causing the problems.
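Roughly, the scheme looks like this on the read side: compute the current bucket from the clock and walk backwards bucket by bucket. Sketch with the Python driver; the table and column names are simplified/invented:

```python
# Sketch of the bucket-paging read path described above: one partition per
# 5-minute window, newest first. Table/column names are invented.
import time

from cassandra.cluster import Cluster

BUCKET_SECONDS = 300

session = Cluster(['127.0.0.1']).connect('content_ks')
select = session.prepare("SELECT doc_id, body FROM content_buckets WHERE bucket = ?")

def page_buckets(max_buckets=12):
    """Walk backwards from the current 5-minute bucket."""
    bucket = int(time.time() // BUCKET_SECONDS)
    for i in range(max_buckets):
        for row in session.execute(select, (bucket - i,)):
            yield row
```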
We have a 60-node C* cluster running 2.2.7 with about 20GB of RAM allocated
to each C* node. We're aware of the recommended 8GB limit to keep GCs low,
but our memory has been creeping up, probably related to this bug.
Here's what we're seeing... if we do a low level of writes we think
everything goes fine.