Re: Long GC due to promotion failures

2014-01-22 Thread John Watson
read cfhistograms, but never really understood fully. Anyone care to explain using OP attached cfhistogram? Taking a wild shot, perhaps trying different build, oracle jdk 1.6u25 perhaps? HTH Jason On Tue, Jan 21, 2014 at 4:02 PM, John Wats
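For anyone else who never fully understood it: 1.x-era `nodetool cfhistograms` prints one table of exponentially sized buckets per column family. An operational sketch (keyspace/table names here are hypothetical, and this assumes a running node):

```shell
# Each row's Offset is a bucket boundary. SSTables = sstables touched per
# read, Write/Read Latency are in microseconds, Row Size is in bytes,
# Column Count is columns per row. A long tail in the SSTables column is
# a common sign of compaction falling behind.
nodetool cfhistograms my_keyspace my_table
```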

Re: Long GC due to promotion failures

2014-01-22 Thread John Watson
DSE that prevents effective old gen collection in some cases. The flag's low overhead, and very effective if that's your problem too. Cheers, Lee On Tue, Jan 21, 2014 at 12:02 AM, John Watson wrote: Pretty reliable, at some point, n

Long GC due to promotion failures

2014-01-21 Thread John Watson
Pretty reliable, at some point, nodes will have super long GCs. Followed by https://issues.apache.org/jira/browse/CASSANDRA-6592 Lovely log messages: 9030.798: [ParNew (0: promotion failure size = 4194306) (2: promotion failure size = 4194306) (4: promotion failure size = 4194306) (promotion
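The `promotion failure size = 4194306` entries mean ParNew could not promote roughly 4 MB objects into a fragmented CMS old generation. A hedged sketch of the kind of cassandra-env.sh tuning people tried for this (the flag values are illustrative assumptions, not something recommended in the thread):

```shell
# Illustrative cassandra-env.sh excerpt: start CMS cycles earlier so the
# old gen is collected before it fills and fragments, and log promotion
# failures when they happen. Values are guesses for discussion only.
JVM_OPTS="$JVM_OPTS -XX:+UseParNewGC -XX:+UseConcMarkSweepGC"
JVM_OPTS="$JVM_OPTS -XX:CMSInitiatingOccupancyFraction=70"
JVM_OPTS="$JVM_OPTS -XX:+UseCMSInitiatingOccupancyOnly"
JVM_OPTS="$JVM_OPTS -XX:+PrintPromotionFailure"
echo "$JVM_OPTS"
```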

hot sstables evicted from page cache on compaction causing high latency

2013-07-12 Thread John Watson
Having a real issue where, at the completion of large compactions, hot sstables are evicted from the kernel page cache, causing huge read latency while the cache is backfilled. https://dl.dropboxusercontent.com/s/149h7ssru0dapkg/Screen%20Shot%202013-07-12%20at%201.46.19%20PM.png Blue line - page cache G
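One way to confirm this behavior directly is to measure page-cache residency of the hot sstables before and after a large compaction. A sketch, assuming the third-party `vmtouch` tool is installed and using made-up keyspace/table paths:

```shell
# Report how much of each sstable is resident in the kernel page cache.
# Run before and after compaction completes; a sharp drop on the hot
# files corroborates the eviction theory.
for f in /var/lib/cassandra/data/my_ks/my_cf/*-Data.db; do
  printf '%s: ' "$f"
  vmtouch "$f" | grep 'Resident Pages'
done
```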

Compaction causing OutOfHeap

2013-05-26 Thread John Watson
Having (2) 1.2.5 nodes constantly crashing due to OutOfHeap errors. It always happens when the same large compaction is about to finish (they re-run the same compaction after restarting.) An indicator is CMS GC time of 3-5s (and the many related problems felt throughout the rest of the cluster)

Re: Adding nodes in 1.2 with vnodes requires huge disks

2013-04-29 Thread John Watson
show the output from nodetool status so we can get a feel for the ring? Can you include the logs from one of the nodes that failed to join? Thanks - Aaron Morton Freelance Cassandra Consultant New Zealand @aaronmorton http:/

Re: setcompactionthroughput and setstreamthroughput have no effect

2013-04-29 Thread John Watson
Same behavior on 1.1.3, 1.1.5 and 1.1.9. Currently: 1.2.3 On Mon, Apr 29, 2013 at 11:43 AM, Robert Coli wrote: On Sun, Apr 28, 2013 at 2:28 PM, John Watson wrote: Running these 2 commands are noop IO wise: nodetool setcompactionthroughput 0 nodetool

Re: Adding nodes in 1.2 with vnodes requires huge disks

2013-04-29 Thread John Watson
will end up with 256/(256+N) of the data (almost all of it). On 28 April 2013 23:01, John Watson wrote: On Sun, Apr 28, 2013 at 2:19 PM, aaron morton wrote: We're going to try running a shuffle before adding a new node again...

Re: cassandra-shuffle time to completion and required disk space

2013-04-29 Thread John Watson
o a rolling bootstrap/decommission. You would set num_tokens on the existing hosts (and restart them) so that they split their ranges, then bootstrap in N new hosts, then decommission the old ones. On 28 April 2013 22:21, John Watson wrote: The
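The rolling bootstrap/decommission alternative described above (avoiding cassandra-shuffle entirely) can be sketched as a runbook. The config path and token count are assumptions for illustration:

```shell
# 1. On each EXISTING host: enable vnodes so its single contiguous range
#    is split locally (no data movement yet), then restart Cassandra.
sed -i 's/^#\{0,1\} *num_tokens:.*/num_tokens: 256/' /etc/cassandra/cassandra.yaml
sudo service cassandra restart

# 2. Bootstrap N brand-new vnode-enabled hosts into the cluster as usual.

# 3. Once the new hosts are serving, on each OLD host stream its data away
#    and remove it from the ring:
nodetool decommission
```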

Re: cassandra-shuffle time to completion and required disk space

2013-04-28 Thread John Watson
rton Freelance Cassandra Consultant New Zealand @aaronmorton http://www.thelastpickle.com On 29/04/2013, at 9:21 AM, John Watson wrote: The amount of time/space cassandra-shuffle requires when upgrading to using vnodes should really be apparent in docume

Re: Adding nodes in 1.2 with vnodes requires huge disks

2013-04-28 Thread John Watson
I believe that "nodetool rebuild" is used to add a new datacenter, not just a new host to an existing cluster. Is that what you ran to add the node? -Bryan On Fri, Apr 26, 2013 at 1:27 PM, John Watson wrote: Small relief we

Re: setcompactionthroughput and setstreamthroughput have no effect

2013-04-28 Thread John Watson
have set streamthroughput higher and seen node join improvements. The features do work, however they are probably not your limiting factor. Remember for stream you are setting megabytes per second but network cards are measured in megabits per second. On Sun, Apr 28, 2013 at
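The units point above is worth making concrete: `setstreamthroughput` takes megabytes per second, while NICs and the ~120 mbit/s figure elsewhere in the thread are megabits. A quick sanity check with a hypothetical throttle value:

```shell
# Convert a hypothetical stream throttle of 25 MB/s to megabits per second
# (8 bits per byte), to compare against NIC-style figures.
throttle_mb_per_s=25
mbit_per_s=$((throttle_mb_per_s * 8))
echo "${throttle_mb_per_s} MB/s = ${mbit_per_s} Mbit/s"   # prints 25 MB/s = 200 Mbit/s
```

So a throttle that looks modest in MB/s can already exceed what a 100 Mbit link, or a per-session software limit, will actually deliver.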

setcompactionthroughput and setstreamthroughput have no effect

2013-04-28 Thread John Watson
Running these 2 commands is a noop IO-wise: nodetool setcompactionthroughput 0 nodetool setstreamthroughput 0 If trying to recover or rebuild nodes, it would be super helpful to get more than ~120mbit/s of streaming throughput (per session, or ~500mbit total) and ~5% IO utilization in (8) 15k di
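For reference, a sketch of the intended unthrottling (assumes a running node reachable by nodetool; the explicit value is illustrative):

```shell
# Both throttles are in MB/s; 0 removes the limit entirely. If unthrottling
# changes nothing observable, the bottleneck is likely elsewhere (disk, CPU,
# or the ~120 mbit/s per-session streaming ceiling reported above).
nodetool setcompactionthroughput 0    # unthrottle compaction
nodetool setstreamthroughput 400      # or raise to an explicit MB/s cap
```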

cassandra-shuffle time to completion and required disk space

2013-04-28 Thread John Watson
The amount of time/space cassandra-shuffle requires when upgrading to using vnodes should really be apparent in documentation (when some is made). The only semi-noticeable remark about the exorbitant amount of time is a bullet point in: http://wiki.apache.org/cassandra/VirtualNodes/Balance "Shuffling

Re: Adding nodes in 1.2 with vnodes requires huge disks

2013-04-26 Thread John Watson
ran "nodetool cleanup" on the older nodes, the situation was the same. The problem only seemed to disappear when "nodetool repair" was applied to all nodes. Regards, Francisco Sobral. On Apr 25, 2013, at 4:57

Adding nodes in 1.2 with vnodes requires huge disks

2013-04-25 Thread John Watson
After finally upgrading to 1.2.3 from 1.1.9, enabling vnodes, and running upgradesstables, I figured it would be safe to start adding nodes to the cluster. Guess not? It seems when new nodes join, they are streamed *all* sstables in the cluster. https://dl.dropbox.com/s/bampemkvlfck2dt/Screen%20S
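While a node is joining, you can watch what it is actually being sent and how much it has accumulated. An operational sketch (assumes nodetool pointed at the joining node):

```shell
# The joining node shows status UJ in nodetool status; if its Load column
# approaches the whole cluster's data rather than one replica's share,
# it is being over-streamed. netstats lists the active inbound streams.
nodetool status
nodetool netstats | grep -i receiving
```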

Re: 1.1.9 to 1.2.3 upgrade issue

2013-04-18 Thread John Watson
ew Zealand @aaronmorton http://www.thelastpickle.com On 15/04/2013, at 9:20 AM, John Watson wrote: Started doing a rolling upgrade of nodes from 1.1.9 to 1.2.3 and nodes on 1.1.9 started flooding this error: Exception in thread Thre

1.1.9 to 1.2.3 upgrade issue

2013-04-14 Thread John Watson
Started doing a rolling upgrade of nodes from 1.1.9 to 1.2.3 and nodes on 1.1.9 started flooding this error: Exception in thread Thread[RequestResponseStage:19496,5,main] java.io.IOError: java.io.EOFException at org.apache.cassandra.service.AbstractRowResolver.preprocess(Abstra