Re: help turning compaction..hours of run to get 0% compaction....

2013-01-08 Thread B. Todd Burruss
i'll second edward's comment. cassandra is designed to scale horizontally, so if disk I/O is slowing you down then you must scale On Tue, Jan 8, 2013 at 7:10 AM, Jim Cistaro wrote: > One metric to watch is pending compactions (via nodetool > compactionstats). This count will give you some id

Re: when are keyspace dirs removed?

2013-01-04 Thread B. Todd Burruss
There should be snapshots in there > https://github.com/apache/cassandra/blob/trunk/conf/cassandra.yaml#L402 > > Cheers > > - > Aaron Morton > Freelance Cassandra Developer > New Zealand > > @aaronmorton > http://www.thelastpickle.com > > On 4/01/

Re: TTL on SecondaryIndex Columns. A bug?

2012-12-19 Thread B. Todd Burruss
i believe we have hit this as well. if you use nodetool to rebuild_index, does it work? On Wed, Dec 19, 2012 at 8:10 PM, aaron morton wrote: > Well that was fun https://issues.apache.org/jira/browse/CASSANDRA-5079 > > Just testing my idea of a fix now. > > Cheers > - > Aaron Mort

Re: entire range of node out of sync -- out of the blue

2012-12-19 Thread B. Todd Burruss
che.org/jira/browse/CASSANDRA-5041 > TBH i think this was a repair without -pr > > thanks, > Andras > > Andras Szerdahelyi* > *Solutions Architect, IgnitionOne | 1831 Diegem E.Mommaertslaan 20A > M: +32 493 05 50 88 | Skype: sandrew84 > > > > > > On 18 Dec 2

Re: Moving data from one datacenter to another

2012-12-19 Thread B. Todd Burruss
to get it "correct", meaning consistent, it seems you will need to do a repair no matter what since the source cluster is taking writes during this time and writing to commit log. so to avoid filename issues just do the first copy and then repair. i am not sure if they can have any filename. to

Re: Does a scrub remove deleted/expired columns?

2012-12-19 Thread B. Todd Burruss
i will add that we have had a good experience with leveled compaction cleaning out tombstoned data faster than size tiered, therefore keeping our total disk usage much more reasonable than size tiered. it is at the cost of I/O ... maybe 2X the I/O?? but that is not bothering us. what is bothering

Re: entire range of node out of sync -- out of the blue

2012-12-18 Thread B. Todd Burruss
in your data directory, for each keyspace there is a solr.json. cassandra stores the SSTABLEs it knows about when using leveled compaction. take a look at that file and see if it looks accurate. if not, this is a bug with cassandra that we are checking into as well On Thu, Dec 6, 2012 at 7:38

Re: Query regarding SSTable timestamps and counts

2012-12-10 Thread B. Todd Burruss
my two cents ... i know this thread is a bit old, but the fact that odd-sized SSTABLEs (usually large ones) will hang around for a while can be very troublesome on disk space and planning. our data is temporal in cassandra, being deleted constantly. we have seen space usage in the 1+ TB range whe

CQL timestamps and timezones

2012-12-07 Thread B. Todd Burruss
trying to figure out if i'm doing something wrong or a bug. i am creating a simple schema, inserting a timestamp using ISO8601 format, but when retrieving the timestamp, the timezone is displayed incorrectly. i'm inserting using GMT, the result is shown with "+", but the time is for my local
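
One way to check whether a displayed timestamp is really the instant that was inserted is to normalize both strings to UTC and compare them. This is a minimal sketch in plain Python, not tied to CQL or any driver; the helper name `to_utc` and the sample values are made up for illustration.

```python
from datetime import datetime, timezone

def to_utc(iso_text: str) -> datetime:
    """Parse an ISO 8601 timestamp with an explicit offset and normalize it to UTC."""
    parsed = datetime.fromisoformat(iso_text)
    if parsed.tzinfo is None:
        # Without an offset the string does not identify a single instant.
        raise ValueError("timestamp has no UTC offset, so the instant is ambiguous")
    return parsed.astimezone(timezone.utc)

# Hypothetical values: what was inserted (GMT) and what a client displayed.
inserted = to_utc("2012-12-07T10:00:00+00:00")
displayed = to_utc("2012-12-07T02:00:00-08:00")

# Equal instants mean only the rendering differs; unequal means the offset label is wrong.
same_instant = inserted == displayed
```

If the two values differ after normalization, the offset shown next to the time does not match the time itself, which is the symptom described above.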

Re: removing SSTABLEs

2012-11-12 Thread B. Todd Burruss
tion. >> >> >> On Mon, Nov 12, 2012 at 12:09 PM, B. Todd Burruss wrote: >>> >>> if i stop a node and remove an SSTABLE, let's call it X, is that safe? >>> >>> ok, more info. i know that the data in SSTABLE X has been tombstoned >>>

Re: Multiple Clusters Keyspacse to one core cluster

2012-11-11 Thread B. Todd Burruss
with NetworkTopologyStrategy it theoretically should work http://www.datastax.com/docs/1.0/cluster_architecture/replication On Thu, Nov 8, 2012 at 5:11 PM, ws wrote: > If I have multiple clusters can I replicate a keyspace from each of those > cluster to separate cluster? > >

Re: Replication factor and performance questions

2012-11-10 Thread B. Todd Burruss
@oleg, to answer your last question a cassandra node should never ask another node for information it doesn't have. it uses the key and the partitioner to determine where the data is located before ever contacting another node. On Mon, Nov 5, 2012 at 9:45 AM, Andrey Ilinykh wrote: > You will hav
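
The idea that a node can work out where a key lives from the key and the partitioner alone can be sketched as a lookup on a sorted token ring. This is a simplified illustration, not Cassandra's code: the MD5-based `key_token` only approximates what a hash partitioner does, and the ring layout and node names are assumptions.

```python
import bisect
import hashlib

RING_SIZE = 2 ** 127  # assumed token space for this sketch

def key_token(key: bytes) -> int:
    """Hash a row key to a token (simplified stand-in for a hash partitioner)."""
    return int.from_bytes(hashlib.md5(key).digest(), "big") % RING_SIZE

def owner_for_token(ring, token: int) -> str:
    """ring is a sorted list of (token, node). The owner is the first node whose
    token is >= the given token, wrapping around to the start of the ring."""
    tokens = [t for t, _ in ring]
    index = bisect.bisect_left(tokens, token)
    return ring[index % len(ring)][1]

def owner(ring, key: bytes) -> str:
    """Locate the owning node purely from the key, with no network round trip."""
    return owner_for_token(ring, key_token(key))

# Hypothetical three-node ring with small tokens to make the ranges easy to read.
example_ring = [(0, "node-a"), (100, "node-b"), (200, "node-c")]
```

Every node that holds the same ring description computes the same owner for a key, which is why no node has to ask another where the data is.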

Re: leveled compaction and tombstoned data

2012-11-08 Thread B. Todd Burruss
, Nov 8, 2012 at 11:53 AM, B. Todd Burruss wrote: > thanks for the links! i had forgotten about live sampling > > On Thu, Nov 8, 2012 at 11:41 AM, Brandon Williams wrote: >> On Thu, Nov 8, 2012 at 1:33 PM, Aaron Turner wrote: >>> There are also ways to bring up a test

Re: leveled compaction and tombstoned data

2012-11-08 Thread B. Todd Burruss
thanks for the links! i had forgotten about live sampling On Thu, Nov 8, 2012 at 11:41 AM, Brandon Williams wrote: > On Thu, Nov 8, 2012 at 1:33 PM, Aaron Turner wrote: >> There are also ways to bring up a test node and just run Level Compaction on >> that. Wish I had a URL handy, but hopefull

Re: leveled compaction and tombstoned data

2012-11-08 Thread B. Todd Burruss
we are running Datastax enterprise and cannot patch it. how bad is "kill performance"? if it is so bad, why is it an option? On Thu, Nov 8, 2012 at 10:17 AM, Radim Kolar wrote: > Dne 8.11.2012 19:12, B. Todd Burruss napsal(a): > >> my question is would leveled compact

leveled compaction and tombstoned data

2012-11-08 Thread B. Todd Burruss
we are having the problem where we have huge SSTABLEs with tombstoned data in them that is not being compacted soon enough (because size tiered compaction requires, by default, 4 like sized SSTABLEs). this is using more disk space than we anticipated. we are very write heavy compared to reads, an
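
Why a huge SSTABLE can sit uncompacted under size tiered compaction can be shown with a rough bucketing sketch. This is a simplification for illustration only: the threshold of 4 comes from the message above, while the 0.5x/1.5x similarity bounds and the function names are assumptions.

```python
def bucket_by_size(sizes, bucket_low=0.5, bucket_high=1.5):
    """Group SSTable sizes into buckets of similar size (simplified sketch)."""
    buckets = []
    for size in sorted(sizes):
        for bucket in buckets:
            average = sum(bucket) / len(bucket)
            if bucket_low * average <= size <= bucket_high * average:
                bucket.append(size)
                break
        else:
            # No existing bucket is close enough in size, so start a new one.
            buckets.append([size])
    return buckets

def compaction_candidates(sizes, min_threshold=4):
    """Only buckets holding at least min_threshold similar-sized tables qualify."""
    return [b for b in bucket_by_size(sizes) if len(b) >= min_threshold]

# Hypothetical sizes in MB: four small tables and one very large one.
candidates = compaction_candidates([10, 11, 12, 9, 1000])
```

In this example the four small tables form a candidate bucket, but the 1000 MB table has no similar-sized peers, so it and any tombstoned data inside it are left alone.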

Re: High bandwidth usage between datacenters for cluster

2012-11-01 Thread B. Todd Burruss
bryce, did you resolve this? i'm interested in the outcome. when you write does it help to use CL = LOCAL_QUORUM? On Mon, Oct 29, 2012 at 12:52 AM, aaron morton wrote: > Outbound messages for other DC's are grouped and a single instance is sent > to a single node in the remote DC. The remote no

Re: constant CMS GC using CPU time

2012-10-23 Thread B. Todd Burruss
Regarding memory usage after a repair ... Are the merkle trees kept around? On Oct 23, 2012 3:00 PM, "Bryan Talbot" wrote: > On Mon, Oct 22, 2012 at 6:05 PM, aaron morton wrote: > >> The GC was on-going even when the nodes were not compacting or running a >> heavy application load -- even when th

Re: nodetool cleanup

2012-10-23 Thread B. Todd Burruss
It is typically used after the token > assignments have been changed. > > Cheers > > - > Aaron Morton > Freelance Developer > @aaronmorton > http://www.thelastpickle.com > > On 23/10/2012, at 6:42 PM, Will @ SOHO wrote: > > On 10/23/2012 01:

nodetool cleanup

2012-10-22 Thread B. Todd Burruss
does "nodetool cleanup" perform a major compaction in the process of removing unwanted data? i seem to remember this to be the case, but can't find anything definitive

Re: tombstones and their data

2012-10-22 Thread B. Todd Burruss
u get >> confirmation. >> >> Dean >> >> On 10/22/12 10:43 AM, "B. Todd Burruss" wrote: >> >>>if a node, X, has a tombstone marking deleted data, when can node X >>>remove the data - not the tombstone, but the data? i understand the >

tombstones and their data

2012-10-22 Thread B. Todd Burruss
if a node, X, has a tombstone marking deleted data, when can node X remove the data - not the tombstone, but the data? i understand the tombstone cannot be removed until GCGraceSeconds has passed, but it seems the data could be compacted away at any time.
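
The distinction between dropping the shadowed data and dropping the tombstone itself can be sketched as a merge rule applied when the versions of one column meet in the same compaction. This is a toy model under stated assumptions: `Cell`, `merge_cells`, and the field names are invented here, and it ignores the case where older data lives in SSTABLEs outside the compaction.

```python
from typing import NamedTuple, Optional

class Cell(NamedTuple):
    timestamp: int          # write timestamp used to pick the winning version
    value: Optional[str]    # None marks a tombstone
    deleted_at: int = 0     # wall-clock seconds of the delete (tombstones only)

def merge_cells(cells, now, gc_grace_seconds):
    """Return the one cell that survives the merge, or None if nothing survives."""
    newest = max(cells, key=lambda c: c.timestamp)
    if newest.value is not None:
        return newest       # live data wins; older versions are dropped
    if now - newest.deleted_at >= gc_grace_seconds:
        return None         # grace period over: the tombstone can go too
    return newest           # keep only the tombstone; shadowed data is dropped now

data = Cell(timestamp=1, value="v")
tombstone = Cell(timestamp=2, value=None, deleted_at=1000)
```

In this model the data disappears as soon as it is merged with its tombstone, while the tombstone stays until the grace period has passed.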

Re: Issue removing rows

2012-10-13 Thread B. Todd Burruss
i have used StorageProxy and was forgetting to rewind (or otherwise setup my ByteBuffer properly) and was getting, i believe, the same error. check your ByteBuffers On Sat, Oct 13, 2012 at 8:49 AM, Nick Morizio wrote: > I'm wondering if anyone has seen this issue before: > > We are running Cassa

Re: Option for ordering columns by timestamp in CF

2012-10-12 Thread B. Todd Burruss
trying to think of a use case where you would want to order by timestamp, and also have unique column names for direct access. not really trying to challenge the use case, but you can get ordering by timestamp and still maintain a "name" for the column using composites. if the first component of t

Re: read performance plumetted

2012-10-12 Thread B. Todd Burruss
did the amount of data finally exceed your per machine RAM capacity? is it the same 20% each time you read? or do your periodic reads eventually work through the entire dataset? if you are essentially table scanning your data set, and the size exceeds available RAM, then a degradation like that i

Re: what is more important (RAM vs Cores)

2012-10-12 Thread B. Todd Burruss
i would not worry as much about the single machine specs. find the sweet spot on price for CPU and RAM and buy that, then scale horizontally to meet your demand. but .. if i was pressed for a general statement - choose RAM over CPU On Fri, Oct 12, 2012 at 4:34 AM, Serge Fonville wrote: > It seems y

Re: Cassandra nodes loaded unequally

2012-10-12 Thread B. Todd Burruss
are you connecting to the same node every time? if so, spread out your connections across the ring On Fri, Oct 12, 2012 at 1:22 AM, Alexey Zotov wrote: > Hi Ben, > > I suggest you to compare amount of queries for each node. May be the problem > is on the client side. > You can do that using JMX:

Re: 1.1.1 is "repair" still needed ?

2012-10-11 Thread B. Todd Burruss
as of 1.0 (CASSANDRA-2034) hints are generated for nodes that timeout. On Thu, Oct 11, 2012 at 3:55 AM, Watanabe Maki wrote: > Even if HH works fine, HH will not be created until the failure detector > marks the node is dead. > HH will not be created for partially timeouted mutation request ( b

Re: unbalanced ring

2012-10-10 Thread B. Todd Burruss
+972 54 8356490 > Fax: +972 2 5612956 > > > > > > On Wed, Oct 10, 2012 at 6:12 PM, B. Todd Burruss wrote: > >> major compaction in production is fine, however it is a heavy operation >> on the node and will take I/O and some CPU. >> >> the only time i

Re: Upgrading hardware on a node in a cluster

2012-10-10 Thread B. Todd Burruss
if you have N nodes in your cluster, add N new nodes using the new hardware, then decommission the old N nodes. (and migrate to VPC like dean said) On Wed, Oct 10, 2012 at 5:23 AM, Hiller, Dean wrote: > Well, you could use amazon VPC in which case you DO pick the IP yourself > ;)….it makes life

Re: unbalanced ring

2012-10-10 Thread B. Todd Burruss
major compaction in production is fine, however it is a heavy operation on the node and will take I/O and some CPU. the only time i have seen this happen is when i have changed the tokens in the ring, like "nodetool movetoken". cassandra does not auto-delete data that it doesn't use anymore just

Re: cassandra 1.2 beta in production

2012-10-10 Thread B. Todd Burruss
https://issues.apache.org/jira/browse/CASSANDRA/fixforversion/12323284 On Wed, Oct 10, 2012 at 1:41 AM, Alexey Zotov wrote: > Hi Guys, > > What known critical bugs are there that couldn't allow to use 1.2 beta 1 in > production? > We don't use cql and secondary indexes. > > > -- > > Best regards

Re: Remove node from cluster and have it run as a single node cluster by itself

2012-10-05 Thread B. Todd Burruss
i believe the system keyspace keeps track of the cluster topology. even though you changed info in yaml, the system keyspace still knows about the other nodes. remove the system keyspace files from data dir and try again On Fri, Oct 5, 2012 at 4:47 AM, Fredrik wrote: > I guess that the other nod

Re: Another EOFException

2011-02-15 Thread B. Todd Burruss
the cache keys?" On Tue, Feb 15, 2011 at 1:10 PM, B. Todd Burruss wrote: the following exception seems to be about loading saved caches, but i don't really care about the cache so maybe isn't a big deal. anyway, this is with patched 0.7.1 (0001-Fix-bad-signed-conversion-from-byt

Another EOFException

2011-02-15 Thread B. Todd Burruss
the following exception seems to be about loading saved caches, but i don't really care about the cache so maybe isn't a big deal. anyway, this is with patched 0.7.1 (0001-Fix-bad-signed-conversion-from-byte-to-int.patch) WARN 11:07:59,800 error reading saved cache /data/cassandra-data/save

Re: ORM over Cassandra

2011-02-10 Thread B. Todd Burruss
wiki page is here ... https://github.com/rantav/hector/wiki/Hector-Object-Mapper-(HOM) it does not handle relationships between objects yet, but does handle inheritance On 02/10/2011 12:21 PM, Jonathan Ellis wrote: An o

Re: Cassandra events next week around Strata

2011-01-28 Thread B. Todd Burruss
web site says sold out, too bad for me ;) On 01/28/2011 07:01 PM, Jonathan Ellis wrote: Next week is the Strata conference and not one, not two, but five Cassandra events! In chronological order: 1. My Strata Cassandra tutorial Tuesday afternoon: http://strataconf.com/strata2011/public/schedul

0.7.1 release

2011-01-28 Thread B. Todd Burruss
any word on when to expect 0.7.1? lots of good fixes we need. trying to decide if i should apply patches or wait. thx!

Re: Secondary Index information

2011-01-28 Thread B. Todd Burruss
batch_mutate doesn't guarantee consistency. each mutation in the batch is guaranteed to be consistent based on your CL, but if it returns an error it means that it couldn't complete all mutations ... but the converse isn't true. it may have successfully completed some mutations. if you get a

Re: repair cause large number of SSTABLEs

2011-01-27 Thread B. Todd Burruss
files are marked as -tmp-? On Jan 27, 2011 9:00 AM, "B. Todd Burruss" wrote: > ok thx. what about the repair creating hundreds of new sstables and > lsof showing cassandra using currently over 800 Data.db files? is this > normal? > > On

Re: repair cause large number of SSTABLEs

2011-01-27 Thread B. Todd Burruss
ok thx. what about the repair creating hundreds of new sstables and lsof showing cassandra using currently over 800 Data.db files? is this normal? On 01/27/2011 08:40 AM, Brandon Williams wrote: On Thu, Jan 27, 2011 at 10:21 AM, Todd Burruss wrote: thx, but i

repair cause large number of SSTABLEs

2011-01-26 Thread B. Todd Burruss
i ran out of file handles on the "repairing node" after doing nodetool repair - strange as i have never had this issue until using 0.7.0 (but i should say that i have not truly tested 0.7.0 until now.) up'ed the number of file handles, removed data, restarted nodes, then restarted my test. wa

Re: monitoring with Zabbix

2011-01-10 Thread B. Todd Burruss
we use zabbix. we run the agent on our linux boxes and also start zapcat using the class that follows. essentially you go into the zabbix console and setup "hosts" for the zapcat port, and "hosts" for the zabbix agent. then setup items for the "zapcat host" that are JMX metrics. info on zap

Re: maven cassandra plugin

2011-01-06 Thread B. Todd Burruss
very useful for automated tasks that needs to run on multiple machines Shiy On 2011 1 6 21:38, "B. Todd Burruss" wrote: has anyone created a maven plugin, like cargo for tomcat, for automating starting/stopping a cassandra instance?

Re: maven cassandra plugin

2011-01-06 Thread B. Todd Burruss
nonsense words and other nonsense are a direct result of using swype to type on the screen On 6 Jan 2011 19:38, "B. Todd Burruss" wrote: > has anyone created a maven plugin, like cargo for tomcat, for automating > starting/stopping a cassandra instance?

maven cassandra plugin

2011-01-06 Thread B. Todd Burruss
has anyone created a maven plugin, like cargo for tomcat, for automating starting/stopping a cassandra instance?

cassandra.yaml customization per node

2010-12-30 Thread B. Todd Burruss
how are folks customizing the cassandra.yaml for each node in the cluster. specifically the token and IP address. with XML i used entities, but i'm not familiar with YAML. does yaml support the same concept? or any sort of textual substitution? thx
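
YAML has anchors and aliases for reuse inside one document, but as far as I know nothing equivalent to XML entities pulled in from outside, so one common workaround is to render the file from a template per node. This is a minimal sketch using Python's standard `string.Template`; the template text, the helper name, and the sample token and IP are placeholders, and only the two per-node settings mentioned above are shown.

```python
from string import Template

# Placeholder template: a real cassandra.yaml has many more settings.
YAML_TEMPLATE = Template(
    "cluster_name: 'Test Cluster'\n"
    "initial_token: $token\n"
    "listen_address: $ip\n"
)

def render_node_config(token: int, ip: str) -> str:
    """Fill in the per-node token and IP address; raises KeyError if a field is missing."""
    return YAML_TEMPLATE.substitute(token=token, ip=ip)

# Hypothetical node: token 0 at 10.0.0.1.
rendered = render_node_config(0, "10.0.0.1")
```

The rendered text can then be written out as each node's cassandra.yaml by whatever deployment tooling is already in use.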

Exceptions in RowMutationVerbHandler

2010-12-15 Thread B. Todd Burruss
i am seeing several different exceptions across my 8 node cluster. running 0.7 RC2. the following are all from one node. is this a known issue? ERROR [MutationStage:35] 2010-12-15 09:25:06,466 RowMutationVerbHandler.java (line 83) Error in row mutation org.apache.cassandra.db.Unserializable

Re: hazelcast

2010-12-10 Thread B. Todd Burruss
.@gmail.com // sites http://twitter.com/germanklf http://ar.linkedin.com/in/germankondolf On Fri, Dec 10, 2010 at 2:50 PM, B. Todd Burruss wrote: http://www.hazelcast.com/product.jsp has anyone tested hazelcast as a distributed locking mechanism for java clients? seems very attractive on the surface.

hazelcast

2010-12-10 Thread B. Todd Burruss
http://www.hazelcast.com/product.jsp has anyone tested hazelcast as a distributed locking mechanism for java clients? seems very attractive on the surface.

Re: using too much RAM

2010-10-14 Thread B. Todd Burruss
thx, it does say that in the log, but that is probably just a reflection of whatever is read from cassandra.yaml. i am wondering if some unix tool can tell me if my process is mmap'ing files. maybe lsof? On 10/14/2010 12:07 PM, Rob Coli wrote: On 10/14/10 10:59 AM, B. Todd Burruss

using too much RAM

2010-10-14 Thread B. Todd Burruss
0.7.0-beta2 top is reporting my cassandra process as using 11g. i have set "disk_access_mode: standard" and Xmx8G (verified via JMX) i have only noticed using more RAM than Xmx when using mmap i/o. this leads me to believe that disk_access_mode was not set properly, even though it is in t

Re: Silent Crash

2010-10-13 Thread B. Todd Burruss
if it is actually corrupted). Do you know if compact or repair would detect bad data and disregard it? I'd like to try something like that if possible before just upgrading the JVM and potentially hiding the real problem. On Wed, Oct 13, 2010 at 9:35 PM, B. Todd Burruss <mailto:bbur

Re: Silent Crash

2010-10-13 Thread B. Todd Burruss
you should upgrade to the latest version of the JVM, 1.6.0_21 there was a bug around 1.6.0_18 (or there abouts) that affected cassandra On 10/13/2010 07:55 PM, Eric Czech wrote: And this is the java version: java version "1.6.0_13" Java(TM) SE Runtime Environment (build 1.6.0_13-b03) Java Hot

Re: getSchemaVersion

2010-10-11 Thread B. Todd Burruss
On 10/11/2010 06:14 PM, Jonathan Ellis wrote: On Mon, Oct 11, 2010 at 7:53 PM, B. Todd Burruss wrote: to determine if my programmatic schema changes have been distributed throughout the cluster, I am supposed to use getSchemaVersionMap, correct? my question is how do I properly use it? I

getSchemaVersion

2010-10-11 Thread B. Todd Burruss
to determine if my programmatic schema changes have been distributed throughout the cluster, I am supposed to use getSchemaVersionMap, correct? my question is how do I properly use it? I have the schema version returned from the thrift method, and I can lookup in the schema map returned getS
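
A check for schema agreement along these lines can be sketched as a function over the returned map. This is an illustration under assumptions, not the documented usage: it assumes the map goes from schema version to the list of hosts reporting that version, and that unreachable hosts are grouped under an "UNREACHABLE" key; `schema_agreed` is a made-up helper name.

```python
def schema_agreed(version_map, my_version, unreachable_key="UNREACHABLE"):
    """True when every reachable host reports the schema version we just received."""
    live_versions = [v for v in version_map if v != unreachable_key]
    return live_versions == [my_version]

# Hypothetical maps: version id -> hosts reporting it.
agreed = schema_agreed({"v2": ["10.0.0.1", "10.0.0.2"]}, "v2")
split = schema_agreed({"v1": ["10.0.0.1"], "v2": ["10.0.0.2"]}, "v2")
```

A client would typically poll until the check passes or a timeout expires before issuing requests that depend on the new schema.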

Re: Advice on settings

2010-10-07 Thread B. Todd Burruss
if you are updating columns quite rapidly, you will scatter the columns over many sstables as you update them over time. this means that a read of a specific column will require looking at more sstables to find the data. performing a compaction (using nodetool) will merge the sstables into on

Re: [RELEASE] 0.7.0 beta2

2010-10-01 Thread B. Todd Burruss
i don't see a beta2 subversion tag. is there one? On 10/01/2010 11:56 AM, Eric Evans wrote: It's like Christmas in October, but without the long lines. First, the obligatory disclaimer. This is beta software. It's like a teenage driver, it seems as though it's up to the task, and it almost i

Re: drop/recreate column family race condition

2010-09-07 Thread B. Todd Burruss
interesting is that "truncate" API doesn't return a schema version nor take a consistency level. does this mean that when it returns the cluster is always consistent? On 09/07/2010 02:50 PM, Jonathan Ellis wrote: On Tue, Sep 7, 2010 at 4:29 PM, B. Todd Burruss wrote:

Re: drop/recreate column family race condition

2010-09-07 Thread B. Todd Burruss
5 secs isn't enough for me, 10 is good. i haven't tried any other values as i can get around this through another manner. On 09/07/2010 02:24 PM, Edward Capriolo wrote: On Tue, Sep 7, 2010 at 5:10 PM, Jonathan Ellis wrote: On Tue, Sep 7, 2010 at 3:55 PM, B. Todd Burr

Re: drop/recreate column family race condition

2010-09-07 Thread B. Todd Burruss
https://issues.apache.org/jira/browse/CASSANDRA-1477 comments below On 09/07/2010 02:10 PM, Jonathan Ellis wrote: On Tue, Sep 7, 2010 at 3:55 PM, B. Todd Burruss wrote: using 0.7 latest from trunk as of few minutes ago. 1 client, 1 node i have the scenario where i want to drop a column

drop/recreate column family race condition

2010-09-07 Thread B. Todd Burruss
using 0.7 latest from trunk as of few minutes ago. 1 client, 1 node i have the scenario where i want to drop a column family and recreate it - unit testing for instance, is a good reason you may want to do this (always start fresh). the problem i observe is that if i do the following: 1 - d

RowMutationVerbHandler.java (line 78) Error in row mutation

2010-08-27 Thread B. Todd Burruss
i got the latest code this morning. i'm testing with 0.7 ERROR [ROW-MUTATION-STAGE:388] 2010-08-27 15:54:58,053 RowMutationVerbHandler.java (line 78) Error in row mutation org.apache.cassandra.db.UnserializableColumnFamilyException: Couldn't find cfId=1002 at org.apache.cassandra.db.Colu

Internal error processing batch_mutate

2010-08-27 Thread B. Todd Burruss
i got the latest code from tip of trunk this morning, but i'm seeing this. i recall a JIRA about this. maybe patch isn't on trunk? ERROR [pool-1-thread-61] 2010-08-27 15:55:36,429 Cassandra.java (line 2980) Internal error processing batch_mutate java.lang.NullPointerException at org.ap

Re: get_slice slow

2010-08-25 Thread B. Todd Burruss
if you can reduce the tombstone volume, say by switching to a new row every 5 minutes, that would help a lot. On Wed, Aug 25, 2010 at 11:43 AM, B. Todd Burruss wrote: i did check sstables, and there are only three. i haven't done any major compacts. do u think it is taking so long

Re: get_slice slow

2010-08-25 Thread B. Todd Burruss
Long rows written over long periods of time are almost certain to give worse read performance, even far worse, than rows written all at once. b On Tue, Aug 24, 2010 at 10:17 PM, B. Todd Burruss wrote: thx artie, i haven't used a super CF because i thought it has more trouble doing slic

Re: get_slice slow

2010-08-24 Thread B. Todd Burruss
el? Artie On Tue, Aug 24, 2010 at 9:14 PM, B. Todd Burruss wrote: i am using get_slice to pull columns from a row to emulate a queue. column names are TimeUUID and the values are small, < 32 bytes. simple ColumnFamily. i am using SliceP

get_slice slow

2010-08-24 Thread B. Todd Burruss
i am using get_slice to pull columns from a row to emulate a queue. column names are TimeUUID and the values are small, < 32 bytes. simple ColumnFamily. i am using SlicePredicate like this to pull the first ("oldest") column in the row: SlicePredicate predicate = new SlicePredicate

Re: KeyRange.token in 0.7.0

2010-08-24 Thread B. Todd Burruss
. :) > > On Tue, Aug 24, 2010 at 1:28 PM, B. Todd Burruss wrote: > > i just came across this and i use tokens in range queries because it is > > an easy straightforward way to divide the keyspace and operate on it > > using multiple threads and throttle the processing. may

Re: KeyRange.token in 0.7.0

2010-08-24 Thread B. Todd Burruss
i just came across this and i use tokens in range queries because it is an easy straightforward way to divide the keyspace and operate on it using multiple threads and throttle the processing. maybe this is what hadoop does, i don't know much about hadoop. so i don't really agree that i'm doing i
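
Dividing the keyspace by token so that several threads can each work through one slice can be sketched as follows. This is a minimal illustration: the ring size of 2**127 assumes a RandomPartitioner-style token space, and `split_token_range` is a made-up helper, not part of any client library.

```python
def split_token_range(parts: int, ring_size: int = 2 ** 127):
    """Return `parts` contiguous (start_token, end_token) pairs covering 0..ring_size."""
    if parts < 1:
        raise ValueError("parts must be at least 1")
    bounds = [i * ring_size // parts for i in range(parts + 1)]
    return list(zip(bounds[:-1], bounds[1:]))

# Four slices, one per worker thread in this hypothetical setup.
ranges = split_token_range(4)
```

Each worker would then issue its range queries with its own start and end token, and throttling is a matter of how many workers run at once.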

Re: linux flavor?

2010-08-24 Thread B. Todd Burruss
CentOS works fine for me. straight out-o-the box. i also use ubuntu 10.04 w/o any troubles. make sure to have jdk 1.6.0_20 or better. there was a bug that affects cassandra somewhere around 1.6.0_18 i think. On Tue, 2010-08-24 at 08:58 -0700, S Ahmed wrote: > Is there a particular linux flavor

RE: TTransportException intermittently in 0.7

2010-08-23 Thread B. Todd Burruss
i am getting this as well. i am calling batch_mutate. i don't see any server logs at INFO level. switched to DEBUG and still no interesting messages. by setting break points i tracked it down to TIOStreamTransport with type TTransportException.END_OF_FILE. seems for some reason the bytes read

Re: 0.7 beta 1 - "error in row mutation" and NPE

2010-08-23 Thread B. Todd Burruss
:03, Gary Dusbabek wrote: > > It looks like you're running into > > https://issues.apache.org/jira/browse/CASSANDRA-1403, which was fixed > > last week and will be included in beta2. > > > > If you are experiencing this on trunk, please do file another ticket, > >

0.7 beta 1 - "error in row mutation" and NPE

2010-08-23 Thread B. Todd Burruss
i see the following in my server logs quite closely while doing a lot of batch_mutations and reads. i create keyspaces and column families using thrift api, not cassandra.yaml. did not migrate anything from 0.6. 4 node cluster, RF = 3, QUORUM read/write. happens immediately on a fresh start of

Re: batch_mutate atomicity

2010-08-06 Thread B. Todd Burruss
ok i just saw the FAQ (http://wiki.apache.org/cassandra/FAQ#batch_mutate_atomic) follow up question ... it states that "As a special case, mutations against a single key are atomic, but more generally no" ... i interpret that to also mean " .. mutations against a single key in the same CF ... "

batch_mutate atomicity

2010-08-06 Thread B. Todd Burruss
if i am using batch_mutate to update/insert two columns in the same CF and same key, is this an atomic operation? i understand that an operation on a single key in a CF is atomic, but not sure if the above scenario boils down to two operations or considered one operation. thx

0.6.4 tag

2010-07-29 Thread B. Todd Burruss
i see a 0.6.4 tag in SVN, but not on cassandra's download page. is this ready for use if building from SVN?

Re: Key Caching

2010-07-27 Thread B. Todd Burruss
AggressiveOpts, if i remember correctly, uses options that are not documented but will probably make into a future release of the JVM. cassandra used it once upon a time. probably should take it out, but things work just fine for me now ;) On Tue, 2010-07-27 at 01:48 -0700, Dathan Pattishall wr

Re: Key Caching

2010-07-26 Thread B. Todd Burruss
i run cassandra with a 30gb heap on machines with 48gb total with good results. i don't use more just because i want to leave some for the OS to cache disk pages, etc. i did have the problem a couple of times with GC doing a full stop on the JVM because it couldn't keep up. my understanding of t

nodetool repair

2010-07-15 Thread B. Todd Burruss
if i have N=3 and run nodetool repair on node X. i assume that merkle trees (at a minimum) are calculated on nodes X, X+1, and X+2 (since N=3). when the repair is finished are nodes X, X+1, and X+2 all in sync with respect to node X's data? or does X have the latest data and X+1 and X+2 still in

Re: node down window

2010-07-14 Thread B. Todd Burruss
thx, but disappointing :) is this just something we have to live with and periodically "repair" the nodes? or is there future work to tighten up the window? thx On Wed, 2010-07-14 at 12:13 -0700, Jonathan Ellis wrote: > On Wed, Jul 14, 2010 at 1:43 PM, B. Todd Burruss wrote: &

node down window

2010-07-14 Thread B. Todd Burruss
there is a window of time from when a node goes down and when the rest of the cluster actually realizes that it is down. what happens to writes during this time frame? does hinted handoff record these writes and then "handoff" when the down node returns? or does hinted handoff not kick in until

Re: GCGraceSeconds per ColumnFamily/Keyspace

2010-07-13 Thread B. Todd Burruss
https://issues.apache.org/jira/browse/CASSANDRA-1276 On Tue, 2010-07-13 at 09:05 -0700, Todd Burruss wrote: > From: Jonathan Ellis [jbel...@gmail.com] > Received: 7/12/10 9:15 PM > To: user@cassandra.apache.org [u...@cassandra.apache.org] > Subject: Re: GCGraceSeconds per ColumnFamily/Keyspace >

Re: Question about CL.ZERO

2010-07-12 Thread B. Todd Burruss
why is there no good reason? if i would like to record informational events, possibly for debugging or something, i don't care if they actually get saved and i want the client's request to be as fast as possible. this sounds like a good reason. are you saying that CL.ONE is equally performan

Re: AVRO client API

2010-06-18 Thread B. Todd Burruss
i'll jump in ... why AVRO over Thrift. can you guys point me at a comparison? (i know next to nothing about both of them) On 06/18/2010 03:41 PM, Paul Brown wrote: On Jun 18, 2010, at 2:12 PM, Eric Evans wrote: On Fri, 2010-06-18 at 11:00 -0700, Paul Brown wrote: At the risk of a

Re: batch mutate + deletion + slice range predicate unsupported

2010-05-13 Thread B. Todd Burruss
thx On 05/13/2010 02:12 PM, Gary Dusbabek wrote: Yes--0.7. I aim to make it part of https://issues.apache.org/jira/browse/CASSANDRA-494 (remove_slice). Gary. On Thu, May 13, 2010 at 16:08, B. Todd Burruss wrote: i just figured out that can't do a batch mutate + deletion that u

batch mutate + deletion + slice range predicate unsupported

2010-05-13 Thread B. Todd Burruss
i just figured out that can't do a batch mutate + deletion that uses a slice range predicate. is adding this functionality targeted for a particular release? what i am trying to do is delete the first X columns in a row. i can get around it by requesting all the columns in question and then

Re: performance tuning - where does the slowness come from?

2010-05-11 Thread B. Todd Burruss
another note on this ... since all my nodes are very well balanced and were started at the same time, i notice that they all do garbage collection at about the same time. this of course causes a performance issue. i also have noticed that with the default JVM options and heavy load, ConcMark

0.6.2

2010-05-11 Thread B. Todd Burruss
i was thinking about doing some testing with 0.6.2 ... do the devs consider the tip of 0.6 branch ok to test with?

Re: Read Latency

2010-05-11 Thread B. Todd Burruss
you can try this benchmarking tool to compare your drive(s) http://freshmeat.net/projects/fio/ ... you can simulate various loads, etc. my RAID0 outperforms single drive (as mentioned below) under heavy concurrent reads. On 05/11/2010 08:15 AM, Peter Schüller wrote: isolated requests, obvio

Re: Tuning Cassandra

2010-05-10 Thread B. Todd Burruss
have you put your commit log on a disk by itself? not a logical partition shared by oracle or cassandra "data". this will make a difference, as you don't want the cassandra commit logs competing with other OS and oracle I/O. look in storage-conf.xml and see if you can move this. also check

Re: performance tuning - where does the slowness come from?

2010-05-06 Thread B. Todd Burruss
i think you will see a slow down because of large values in your columns. make sure you take a look at MemtableThroughputInMB in your config. if you are writing 1MB of data per row, then you'll probably want to increase this quite a bit so you are not constantly creating sstables. can't reca

Re: MESSAGE-STREAMING-POOL exception

2010-04-23 Thread B. Todd Burruss
https://issues.apache.org/jira/browse/CASSANDRA-1019 Jonathan Ellis wrote: Can you create a ticket? On Fri, Apr 23, 2010 at 3:50 PM, B. Todd Burruss wrote: i agree, but it seems to have implications on the streaming service. Jonathan Ellis wrote: java.net.ConnectException

Re: MESSAGE-STREAMING-POOL exception

2010-04-23 Thread B. Todd Burruss
i agree, but it seems to have implications on the streaming service. Jonathan Ellis wrote: java.net.ConnectException: Connection timed out at sun.nio.ch.Net.connect is an os-level connection problem. On Fri, Apr 23, 2010 at 3:34 PM, B. Todd Burruss wrote: i see these exceptions on 4 out

MESSAGE-STREAMING-POOL exception

2010-04-23 Thread B. Todd Burruss
i see these exceptions on 4 out of the 7 nodes in my cluster. in addition those same four nodes all show AE-SERVICE-STAGE with pending work, and been showing this for several hours now. each node in the cluster has less than 2gb, so it should be finished by now. when i do nodetool streams on

Re: cleaning house

2010-04-20 Thread B. Todd Burruss
-obsolete files) and getLiveDiskSpaceUsed (includes everything). On Tue, Apr 20, 2010 at 12:33 PM, B. Todd Burruss wrote: i'm trying to draw some correlation between the size of my data and the space used on disk. i have set 1 so there isn't any reason to keep data around. my approa

Re: cleaning house

2010-04-20 Thread B. Todd Burruss
at 10:33 AM, B. Todd Burruss wrote: i'm trying to draw some correlation between the size of my data and the space used on disk. i have set 1 so there isn't any reason to keep data around. my approach is this: after only doing "puts" to cassandra for a while i stop my cl

cleaning house

2010-04-20 Thread B. Todd Burruss
i'm trying to draw some correlation between the size of my data and the space used on disk. i have set 1 so there isn't any reason to keep data around. my approach is this: after only doing "puts" to cassandra for a while i stop my client and want to perform the proper "cleanup" and/or "comp

Re: Tool for managing cluster nodes?

2010-04-20 Thread B. Todd Burruss
http://sourceforge.net/projects/clusterssh/ Roger Schildmeijer wrote: dancer's shell / distributed shell http://www.netfort.gr.jp/~dancer/software/dsh.html.en On 20 apr 2010, at 17.18em, Joost Ouwerkerk wrote: What are people using to manage Cassandra cluster nodes? i.e. to s

Re: memory question

2010-03-25 Thread B. Todd Burruss
no compaction. Jonathan Ellis wrote: did you check jmx to see if a compaction is going on? On Mon, Mar 22, 2010 at 5:14 PM, Todd Burruss wrote: after running my cluster for a while performance has become unacceptable, 200+ ms for reads. if running well, i see reads <10ms. when i run iost
