Thanks for the confirmation that this is NOT the way to go. I will stick with 4
disks in raid 0 with a single data directory.
On Mon, Aug 23, 2010 at 9:24 PM, Rob Coli wrote:
> On 8/22/10 12:00 AM, Wayne wrote:
>
>> Due to compaction being so expensive in terms of disk resources, does it
>> make more sense to have 2 data volumes instead of one?
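For anyone landing on this thread later: the "single data directory on one RAID 0
volume, commit log on its own disk" layout discussed here maps onto two settings in
the 0.6-era storage-conf.xml. A minimal sketch, with illustrative paths that are not
taken from this thread:

    <!-- commit log on its own dedicated spindle -->
    <CommitLogDirectory>/mnt/commitlog/cassandra</CommitLogDirectory>
    <!-- all sstable data on the single 4-disk RAID 0 volume -->
    <DataFileDirectories>
        <DataFileDirectory>/mnt/raid0/cassandra/data</DataFileDirectory>
    </DataFileDirectories>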
On 8/22/10 12:00 AM, Wayne wrote:
Due to compaction being so expensive in terms of disk resources, does it
make more sense to have 2 data volumes instead of one? We have 4 data
disks in raid 0, would this make more sense to be 2 x 2 disks in raid 0?
That way the reader and writer I assume would always be a different set of spindles.
On Sun, Aug 22, 2010 at 2:03 PM, Wayne wrote:
> From the standpoint of testing whether cassandra can take the load long term, I do not see it
> as different. Yes bulk loading can be made faster using very different
Then you need far more IO, whether it comes from faster drives or more
nodes. If you can achieve 1
From the standpoint of testing whether cassandra can take the load long term, I do not see it
as different. Yes bulk loading can be made faster using very different
methods, but my purpose is to test cassandra with a large volume of writes
(and not to bulk load as efficiently as possible). I have scaled back to 5
Wayne,
Bulk loading this much data is a very different prospect from needing
to sustain that rate of updates indefinitely. As was suggested
earlier, you likely need to tune things differently, including
disabling minor compactions during the bulk load, to make this work
efficiently.
b
On Sun,
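A hedged aside on "disabling minor compactions during the bulk load": in the 0.6 era
the usual approach was to set the minimum compaction threshold to 0, which turns
minor compactions off, either through the ColumnFamilyStore JMX attributes or, in
builds whose nodetool supports it, something along the lines of

    nodetool -h <host> setcompactionthreshold 0 0

The exact arguments vary by version, so verify against your build. The thresholds
should be restored to their defaults (4 and 32) and a major compaction run once the
load finishes.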
Has anyone loaded 2+ terabytes of real data in one stretch into a cluster
without bulk loading and without any problems? How long did it take? What
kind of nodes were used? How many writes/sec/node can be sustained for 24+
hours?
On Sun, Aug 22, 2010 at 8:22 PM, Peter Schuller wrote:
> I only sifted recent history of this thread (for time reasons), but:
I only sifted recent history of this thread (for time reasons), but:
> You have started a major compaction which is now competing with those
> near constant minor compactions for far too little I/O (3 SATA drives
> in RAID0, perhaps?). Normally, this would result in a massive
> ballooning of your
Is the need for 10k/sec/node just for bulk loading of data or is it
how your app will operate normally? Those are very different things.
On Sun, Aug 22, 2010 at 4:11 AM, Wayne wrote:
> Currently each node has 4x1TB SATA disks. In MySQL we have 15tb currently
> with no replication. To move this t
On Sun, Aug 22, 2010 at 7:11 AM, Wayne wrote:
> Currently each node has 4x1TB SATA disks. In MySQL we have 15tb currently
> with no replication. To move this to Cassandra replication factor 3 we need
> 45TB assuming the space usage is the same, but it is probably more. We had
> assumed a 30 node c
Currently each node has 4x1TB SATA disks. In MySQL we have 15tb currently
with no replication. To move this to Cassandra replication factor 3 we need
45TB assuming the space usage is the same, but it is probably more. We had
assumed a 30 node cluster with 4tb per node would suffice with head room f
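Working through that arithmetic: 15TB of raw data at replication factor 3 is 45TB of
replicated data, and 30 nodes x 4TB is 120TB of raw disk, roughly 2.7x the replicated
size. That headroom is not optional: a major compaction can temporarily need up to
about twice the size of the data being compacted, so the common rule of thumb from
that era was to keep each node's data volume well under half full. Treat the exact
multiplier as an assumption rather than something measured in this thread.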
I see no reason to make that assumption. Cassandra currently has no
mechanism to alternate in that manner. At the update rate you
require, you just need more disk io (bandwidth and iops).
Alternatively, you could use a bunch more, smaller nodes with the same
SATA RAID setup so they each take many
Due to compaction being so expensive in terms of disk resources, does it
make more sense to have 2 data volumes instead of one? We have 4 data disks
in raid 0, would this make more sense to be 2 x 2 disks in raid 0? That way
the reader and writer I assume would always be a different set of spindles
How much storage do you need? 240G SSDs quite capable of saturating a
3Gbps SATA link are $600. Larger ones are also available with similar
performance. Perhaps you could share a bit more about the storage and
performance requirements. How many SSDs to sustain 10k writes/sec PER NODE
WITH LINEAR SCA
Thank you for the advice, I will try these settings. I am running defaults
right now. The disk subsystem is one SATA disk for commitlog and 4 SATA
disks in raid 0 for the data.
From your email you are implying this hardware can not handle this level of
sustained writes? That kind of breaks down t
My guess is that you have (at least) 2 problems right now:
You are writing 10k ops/sec to each node, but have default memtable
flush settings. This is resulting in memtable flushing every 30
seconds (default ops flush setting is 300k). You thus have a
proliferation of tiny sstables and are seein
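To make that arithmetic explicit: the 0.6 default of ~300,000 operations per memtable
flush, divided by ~10,000 writes/sec, is a flush roughly every 30 seconds, hence the
flood of tiny sstables. The storage-conf.xml knobs involved are sketched below with
the defaults as I remember them from the 0.6 sample config (treat the exact values as
assumptions); raising the first two makes flushes less frequent and the resulting
sstables larger, at the cost of more heap held per memtable:

    <MemtableThroughputInMB>64</MemtableThroughputInMB>
    <MemtableOperationsInMillions>0.3</MemtableOperationsInMillions>
    <MemtableFlushAfterMinutes>60</MemtableFlushAfterMinutes>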
Perhaps I missed it in one of the earlier emails, but what is your
disk subsystem config?
On Sat, Aug 21, 2010 at 2:18 AM, Wayne wrote:
> I am already running with those options. I thought maybe that is why they
> never get completed as they keep getting pushed down in priority? I am
> getting tim
I am already running with those options. I thought maybe that is why they
never get completed as they keep getting pushed down in priority? I am
getting timeouts now and then but for the most part the cluster keeps
running. Is it normal/ok for the repair and compaction to take so long? It
has been o
Yes, the AES is the repair.
If you are running Linux, try adding the options to reduce compaction
priority from
http://wiki.apache.org/cassandra/PerformanceTuning
On Sat, Aug 21, 2010 at 3:17 AM, Wayne wrote:
> I could tell from munin that the disk utilization was getting crazy high,
> but the s
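The options that wiki page describes are JVM flags added to the startup environment
(cassandra.in.sh or equivalent), roughly:

    JVM_OPTS="$JVM_OPTS -XX:+UseThreadPriorities -XX:ThreadPriorityPolicy=42 -Dcassandra.compaction.priority=1"

The two -XX flags let a Linux JVM actually apply thread priorities, and the system
property drops the compaction thread's priority so reads and writes get the disk
first. Flag names are from memory of that page, so verify them against your version
before relying on them.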
On Fri, 2010-08-20 at 19:17 +0200, Wayne wrote:
> WARN [MESSAGE-DESERIALIZER-POOL:1] 2010-08-20 16:57:02,602
> MessageDeserializationTask.java (line 47) dropping message
> (1,078,378ms past timeout)
> WARN [MESSAGE-DESERIALIZER-POOL:1] 2010-08-20 16:57:02,602
> MessageDeserializationTask.java (l
These warnings mean you have more requests queued up than you are able
to handle. That request queue is what is using up most of your heap
memory.
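To put those numbers in perspective: 1,078,378 ms is about 18 minutes, so those
requests had been sitting in the queue roughly 18 minutes past the RPC timeout
(which defaults to 10 seconds in the 0.6 config, if memory serves) before being
dropped. In other words, the node was falling further behind, not catching up.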
On Fri, Aug 20, 2010 at 12:17 PM, Wayne wrote:
> I turned off the creation of the secondary indexes which had the large rows
> and all seemed good. T
I deleted ALL data and reset the nodes from scratch. There are no more large
rows in there. 8-9megs MAX across all nodes. This appears to be a new
problem. I restarted the node in question and it seems to be running fine,
but I had to run repair on it as it appears to be missing a lot of data.
On
On Fri, Aug 20, 2010 at 1:17 PM, Wayne wrote:
> I turned off the creation of the secondary indexes which had the large rows
> and all seemed good. Thank you for the help. I was getting
> 60k+ writes/second on the 6 node cluster.
>
> Unfortunately again three hours later a node went down. I can not
I turned off the creation of the secondary indexes which had the large rows
and all seemed good. Thank you for the help. I was getting
60k+ writes/second on the 6 node cluster.
Unfortunately again three hours later a node went down. I can not even look
at the logs when it started since they are go
The NullPointerException does not crash the node. It only makes it flap/go
down for a short period and then it comes back up. I do not see anything
abnormal in the system log, only that single error in the cassandra.log.
On Thu, Aug 19, 2010 at 11:42 PM, Peter Schuller <
peter.schul...@infidyne.c
> Sorry; that meant the "set of data actually live (i.e., not garbage) in
> the heap". In other words, the amount of memory truly "used".
And to clarify further this is not the same as the 'used' reported by
GC statistics, except as printed after a CMS concurrent mark/sweep has
completed (and even
> What is my "live set"?
Sorry; that meant the "set of data actually live (i.e., not garbage) in
the heap". In other words, the amount of memory truly "used".
> Is the system CPU bound given the few statements
> below? This is from running 4 concurrent processes against the node...do I
> need to t
On Thu, Aug 19, 2010 at 4:49 PM, Wayne wrote:
> What is my "live set"? Is the system CPU bound given the few statements
> below? This is from running 4 concurrent processes against the node...do I
> need to throttle back the concurrent read/writers?
>
> I do all reads/writes as Quorum. (Replication factor of 3).
What is my "live set"? Is the system CPU bound given the few statements
below? This is from running 4 concurrent processes against the node...do I
need to throttle back the concurrent read/writers?
I do all reads/writes as Quorum. (Replication factor of 3).
The memtable threshold is the default o
> of a rogue large row is one I never considered. The largest row on the other
> nodes is as much as 800megs. I can not get a cfstats reading on the bad node
With 0.6 I can definitely see this being a problem if I understand its
behavior correctly (I have not actually used 0.6 even for testing). I
So, these:
> INFO [GC inspection] 2010-08-19 16:34:46,656 GCInspector.java (line 116) GC
> for ConcurrentMarkSweep: 41615 ms, 192522712 reclaimed leaving 8326856720
> used; max is 8700035072
[snip]
> INFO [GC inspection] 2010-08-19 16:36:00,786 GCInspector.java (line 116) GC
> for ConcurrentMark
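Reading those GC lines: the collection took about 42 seconds and reclaimed only
~190MB, leaving roughly 8.3GB of an 8.7GB max heap still in use, i.e. about 96% of
the heap is live data even immediately after a full CMS cycle. That is the "live set
too close to the heap size" situation discussed elsewhere in this thread.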
On Thu, Aug 19, 2010 at 4:13 PM, Wayne wrote:
> We are using the random partitioner. The tokens we defined manually and data
> is almost totally equal among nodes, 15GB per node when the trouble started.
> System vitals look fine. CPU load is ~500% for java, iostats are low,
> everything for all p
On Thu, Aug 19, 2010 at 2:48 PM, Wayne wrote:
> I am having some serious problems keeping a 6 node cluster up and running
> and stable under load. Any help would be greatly appreciated.
>
> Basically it always comes back to OOM errors that never seem to subside.
> After 5 minutes or 3 hours of hea