CQLSH select time series data, bug?

2012-12-23 Thread Karl Hiramoto
Hi, I get what looks like a Python error "'float' object has no attribute 'replace'" from cqlsh, with the version packaged in 1.1.8 $ ./apache-cassandra-1.1.8/bin/cqlsh Connected to Test Cluster at localhost:9160. [cqlsh 2.2.0 | Cassandra 1.1.2 | CQL spec 2.0.0 | Thrift protocol 19.3
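For context, a minimal sketch of the kind of non-interactive cqlsh query that can hit a formatting error like this; the keyspace "demo", column family "events", and row key are hypothetical, not from the thread:

    cqlsh <<'EOF'
    USE demo;
    SELECT * FROM events WHERE KEY = 'sensor1';
    EOF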

Re: offline compaction

2012-03-08 Thread Karl Hiramoto
On 03/08/12 21:40, Edward Capriolo wrote: On Thu, Mar 8, 2012 at 1:43 PM, Feng Qu wrote: Hello, is there a way to take one node out of the ring and run a major compaction? Feng Qu http://www.jointhegrid.com/highperfcassandra/?p=187 What are the drawbacks of disabling thrift and gossip? So y
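A minimal sketch of the take-a-node-offline approach using standard nodetool subcommands; the host name is hypothetical:

    nodetool -h node1 disablegossip   # drop out of the ring (other nodes mark it down)
    nodetool -h node1 disablethrift   # stop serving client requests
    nodetool -h node1 compact         # run the major compaction while "offline"
    nodetool -h node1 enablethrift
    nodetool -h node1 enablegossip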

Re: Restart cassandra every X days?

2012-01-25 Thread Karl Hiramoto
On 01/25/12 19:18, R. Verlangen wrote: Ok thank you for your feedback. I'll add these tasks to our daily cassandra maintenance cronjob. Hopefully this will keep things under control. I forgot to mention that we found that forcing a GC also cleans up some space. In a cronjob you can do th
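A sketch of what such a maintenance crontab could look like; the schedule, paths, and log locations are assumptions, not from the thread (the GC trigger itself is shown in the later "force a GC" thread below):

    # nightly repair, then a major compaction a few hours later
    0 3 * * * /usr/local/bin/nodetool -h localhost repair  >> /var/log/cassandra/repair.log  2>&1
    0 6 * * * /usr/local/bin/nodetool -h localhost compact >> /var/log/cassandra/compact.log 2>&1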

Re: Restart cassandra every X days?

2012-01-25 Thread Karl Hiramoto
On 01/25/12 16:09, R. Verlangen wrote: Hi there, I'm currently running a 2-node cluster for some small projects that might need to scale up in the future: that's why we chose Cassandra. The actual problem is that one of the nodes' hard drive usage keeps growing. For example: - after a fresh

Re: 99.999% uptime - Operations Best Practices?

2011-06-23 Thread Karl Hiramoto
On 06/23/11 09:43, David Boxenhorn wrote: > I think very high uptime, and very low data loss is achievable in > Cassandra, but, for new users there are TONS of gotchas. You really > have to know what you're doing, and I doubt that many people acquire > that knowledge without making a lot of mistake

Re: repair never completes with "finished successfully"

2011-04-12 Thread Karl Hiramoto
On 12/04/2011 13:31, Jonathan Colby wrote: There are a few other threads related to problems with the nodetool repair in 0.7.4. However I'm not seeing any errors, just never getting a message that the repair completed successfully. In my production and test cluster (with just a few MB data)

Re: Compaction doubles disk space

2011-03-30 Thread Karl Hiramoto
On 3/30/2011 12:39 PM, aaron morton wrote: Checked the code again, got it a bit wrong. When getting a path to flush a memtable (and to write an incoming stream) via cfs.getFlushPath() the code does not invoke GC if there is not enough space. One reason for not doing this could be that when we

Re: Compaction doubles disk space

2011-03-30 Thread Karl Hiramoto
On 30/03/2011 09:08, aaron morton wrote: Also as far as I understand we cannot immediately delete files because other operations (including repair) may be using them. The data in the pre-compacted files is just as correct as the data in the compacted file; it's just more compact. So the easiest

Re: How to determine if repair need to be run

2011-03-30 Thread Karl Hiramoto
On 03/30/11 00:31, Peter Schuller wrote:
> set -e # important
> touch /path/to/flagfile.tmp
> nodetool -h localhost repair
> mv /path/to/flagfile.tmp /path/to/flagfile
Note this script doesn't work if your repair takes hours and in the middle of the repair Cassandra was restarted, node
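A slightly more defensive variant as a sketch, not from the thread; the 12-hour limit and paths are assumptions. Bounding the repair with a timeout keeps a node restart mid-repair from leaving a stale .tmp flag around forever:

    #!/bin/sh
    set -e
    FLAG=/path/to/flagfile
    touch "$FLAG.tmp"
    if timeout 43200 nodetool -h localhost repair; then
        mv "$FLAG.tmp" "$FLAG"            # repair finished: publish the flag
    else
        echo "repair failed or timed out on $(hostname) at $(date)" >&2
        rm -f "$FLAG.tmp"
        exit 1
    fi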

improving speed/space for repair/compact: Big O Notation

2011-03-29 Thread Karl Hiramoto
Can someone roughly advise the Big O() complexity in terms of the number of keys in a CF? Is it advisable to partition data into more Column Families and Keyspaces to improve repair and compact performance? Thanks -- Karl

Re: Compaction doubles disk space

2011-03-29 Thread Karl Hiramoto
Would it be possible to improve the current compaction disk-space issue by compacting only a few SSTables at a time and then immediately deleting the old ones? Looking at the logs, it seems deletions of old SSTables are taking longer than necessary. -- Karl

Re: reducing disk usage advice

2011-03-14 Thread Karl Hiramoto
On 03/14/11 15:33, Sylvain Lebresne wrote: > > CASSANDRA-1537 is probably also a partial but possibly sufficient > solution. That's also probably easier than CASSANDRA-1610 and I'll try > to give it a shot asap, that had been on my todo list way too long. > Thanks, eager to see CASSANDRA-1610 somed

Re: reducing disk usage advice

2011-03-13 Thread Karl Hiramoto
On 3/13/2011 9:27 PM, aaron morton wrote: The CF stats are reporting you have 70GB total space taken up by SSTables, of which 55GB is live. The rest is available for deletion; AFAIK this happens when Cassandra detects free space is running low. I've never dug into how/when this happens though.
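A quick way to eyeball the live-versus-total gap per column family, assuming your nodetool cfstats output uses the "Space used (live)" / "Space used (total)" labels of this era; the host is hypothetical:

    nodetool -h localhost cfstats | grep -E 'Column Family|Space used'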

reducing disk usage advice

2011-03-13 Thread Karl Hiramoto
Hi, I'm looking for advice on reducing disk usage. I've run out of disk space two days in a row while running a nightly scheduled "nodetool repair && nodetool compact" cronjob. I have 6 nodes with RF=3, each with a 300 GB drive at a hosting company. GCGraceSeconds= 26 (3.1 days) Every colu

how to force a GC in cronjob to free up disk space?

2011-03-10 Thread Karl Hiramoto
Reading the FAQ http://wiki.apache.org/cassandra/FAQ "SSTables that are obsoleted by a compaction are deleted asynchronously when the JVM performs a GC. You can force a GC from jconsole if necessary" How can I force the GC from a simple Java command line? Is
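One commonly cited way to do this from a script is to invoke the java.lang:type=Memory gc() operation over JMX with a command-line JMX client such as jmxterm; the jar name, JMX port 7199, and exact flags are assumptions that may differ by jmxterm version:

    echo "run -b java.lang:type=Memory gc" | \
        java -jar jmxterm-uber.jar -l localhost:7199 -n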

Re: 0.7.3 nodetool scrub exceptions

2011-03-08 Thread Karl Hiramoto
On 03/08/11 21:45, Sylvain Lebresne wrote: > Did you run scrub as soon as you updated to 0.7.3 ? > Yes, within a few minutes of starting up 0.7.3 on the node > And did you have problems/exceptions before running scrub ? Not sure. > If yes, did you have problems with only 0.7.3 or also with 0.7.2 ?

Re: 0.7.3 nodetool scrub exceptions

2011-03-08 Thread Karl Hiramoto
away. Any way to fix this without throwing away all the data? Since I only keep data for 24 hours, I could insert into two CFs for the next 24 hours; then, once only valid data is in the new CF, remove the old CF. On Tue, Mar 8, 2011 at 5:34 AM, Karl Hiramoto wrote: I have 1000's of these in the lo

Re: nodetool repair hung in 0.7.3

2011-03-08 Thread Karl Hiramoto
On 08/03/2011 16:34, Sylvain Lebresne wrote: I just saw repair hang here too, it's actually very easy to reproduce. I'm looking at it right now. -- Thanks. Should I bump GCGraceSeconds since I can no longer repair? I tried repair on 3 nodes of a 6 node cluster and they all hang. Woul

nodetool repair hung in 0.7.3

2011-03-08 Thread Karl Hiramoto
I never saw this before upgrading to 0.7.3 but now I do nodetool repair and it sits there for hours. Previously it took about 20 minutes per node (about 10GB of data per node). I had some OOM crashes, but haven't seen them since I increased the heap size and decreased the key cache. In the
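A quick way to tell a hung repair from a slow one is to check whether validation compactions and streams are still moving; a sketch, assuming your nodetool build has these subcommands:

    nodetool -h localhost compactionstats   # validation compactions triggered by repair
    nodetool -h localhost netstats          # streams to/from neighbouring nodes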

0.7.3 nodetool scrub exceptions

2011-03-08 Thread Karl Hiramoto
I have 1000's of these in the log; is this normal? java.io.IOError: java.io.EOFException: bloom filter claims to be longer than entire row size at org.apache.cassandra.io.sstable.SSTableIdentityIterator.(SSTableIdentityIterator.java:117) at org.apache.cassandra.db.CompactionMan

Re: [RELEASE] 0.7.3

2011-03-07 Thread Karl Hiramoto
Just updated, and after doing a "scrub" we see exceptions. ERROR [CompactionExecutor:1] 2011-03-07 15:46:53,811 AbstractCassandraDaemon.java (line 114) Fatal exception in thread Thread[CompactionExecutor:1,1,main] java.io.IOError: java.io.EOFException at org.apache.cassandra.io.sstab

Re: Issues connecting from outside of localhost

2011-03-03 Thread Karl Hiramoto
On 02/03/2011 17:57, David McNelis wrote: In case anyone is interested. Our problem revolved around one machine having the phpcassa thrift patch, and the other did not. It's resolved now. Which patch? I think there is a difference between tag v0.7.a.3 and the current HEAD of master. I

Re: EOFException: attempted to skip x bytes

2011-02-21 Thread Karl Hiramoto
On 21/02/2011 09:01, shimi wrote: I upgraded to 0.7.2 from 0.7.0, which was upgraded from 0.6.8, and I get the following exception. I have a 4-node cluster across 2 data centers (2 nodes in each). I see the error only on 2 nodes in the same data center. I didn't see this error on 0.7.0 ERROR [Hinted

exceptions upgrading from 0.7.0 to 0.7.1

2011-02-16 Thread Karl Hiramoto
Hi, I just started an upgrade on a single node of a live production cluster and ran nodetool repair/compact/cleanup. In the logs I see exceptions; is this normal? ERROR [ReadStage:31] 2011-02-16 08:18:38,094 DebuggableThreadPoolExecutor.java (line 103) Error in ThreadPoolExecutor java.io.IOEr
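For reference, a per-node rolling-upgrade sketch of the drain-first approach; the init script path and the "install the new version" step are placeholders, not from the thread:

    nodetool -h localhost drain      # flush memtables, stop accepting writes
    /etc/init.d/cassandra stop
    # ... install the new Cassandra version here ...
    /etc/init.d/cassandra start
    # then run repair/cleanup once the node has rejoined the ring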

Re: rename column family

2011-02-10 Thread Karl Hiramoto
On 02/10/11 22:19, Aaron Morton wrote: > That should read "Without more information" > > A > On 11 Feb, 2011,at 10:15 AM, Aaron Morton wrote: > >> With more information I'd say this is not a good idea. >> >> I would suggest looking at why you do the table switch in the MySql >> version and conside

rename column family

2011-02-10 Thread Karl Hiramoto
Hi, in MySQL I use this pattern and wonder if I could do something similar with Cassandra. 1. Live/production queries always go to LiveTable 2. Build new data in BuildTable 3. RENAME TABLE LiveTable TO OldTable, BuildTable TO LiveTable 4. DROP TABLE OldTable, go to step #2 build
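One workaround discussed in this era is to keep two fixed CF names, have the application flip which one it reads, and recreate the stale one between cycles rather than renaming. A sketch with cassandra-cli; the keyspace "Prod" and CF "OldData" are hypothetical:

    cat > /tmp/rotate-cf.txt <<'EOF'
    use Prod;
    drop column family OldData;
    create column family OldData with comparator = UTF8Type;
    EOF
    cassandra-cli -h localhost -p 9160 -f /tmp/rotate-cf.txt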

Re: balancing load

2011-01-18 Thread Karl Hiramoto
On 17/01/2011 19:27, Edward Capriolo wrote: cfstats is reporting you have an 8GB row! I think you could be writing all your data to a few keys. You're right, my n00b fault: I was writing everything to one key. The problem was I had Offer['id'][$UID] = value; it made it easy before to do a "c

Re: balancing load

2011-01-17 Thread Karl Hiramoto
On 01/17/11 15:54, Edward Capriolo wrote: > Just to head off the next possible problem. If you run 'nodetool cleanup' > on each node and some of your nodes still have more data than others, > then it probably means you are writing the majority of data to a few > keys. ( you probably do not want to do
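A trivial way to run that cleanup across the cluster from one machine; the host names are hypothetical:

    for h in node1 node2 node3 node4 node5; do
        nodetool -h "$h" cleanup
    done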

Re: balancing load

2011-01-16 Thread Karl Hiramoto
Thanks for the help. I used "nodetool move", so now each node owns 20% of the space, but it seems that the data load is still mostly on 2 nodes. nodetool --host slave4 ring Address Status State Load Owns Token
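For reference, the usual way to pick the move targets with RandomPartitioner is evenly spaced tokens i * 2^127 / N; a sketch for a 5-node ring (the node count is just an example matching the 20% ownership above):

    N=5
    for i in $(seq 0 $((N - 1))); do
        echo "$i * (2^127 / $N)" | bc
    done
    # then run: nodetool -h <node> move <token>  on each node, followed by cleanup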

balancing load

2011-01-16 Thread Karl Hiramoto
Hi, I have a keyspace with Replication Factor: 2 and it seems that most of my data goes to one node. What am I missing to have Cassandra balance more evenly? ./nodetool -h host1 ring Address Status State Load Owns Token