compression

2010-03-31 Thread Ran Tavory
What sort of compression (if any) is performed by cassandra? Does the thrift client compress anything before sending to the server to preserve bandwidth? Does the server compress the values in the columns to preserve disk or memory? ... I assume compaction, performed on the server side, is differe

Re: Cassandra data file corrupt

2010-03-31 Thread Stu Hood
808 depends on 674, which has not been fixed, so no, there is no code for 808. -----Original Message----- From: "JKnight JKnight" Sent: Wednesday, March 31, 2010 10:22pm To: user@cassandra.apache.org Subject: Re: Cassandra data file corrupt Dear Jeremy Dunck, I tried to compact, and get and err

Re: [RELEASE] 0.6.0-rc1 (release candidate)

2010-03-31 Thread Jonathan Ellis
Sounds like the bat file is building the classpath incorrectly? You could try echo-ing it before invoking java. On Wed, Mar 31, 2010 at 11:16 PM, Rao Venugopal wrote: > All > I am trying to get this to work on windows.  While I had no problems on my > x86 laptop (running XP), I am experiencing i

Re: [RELEASE] 0.6.0-rc1 (release candidate)

2010-03-31 Thread Rao Venugopal
All I am trying to get this to work on windows. While I had no problems on my x86 laptop (running XP), I am experiencing issues getting it to work on my x64 desktop (running Vista). Anyone else got it to work on x64 running Vista? This is the error I see when I run cassandra.bat

Re: Cassandra data file corrupt

2010-03-31 Thread JKnight JKnight
Dear Jeremy Dunck, I tried to compact and got an error:
Caused by: java.io.UTFDataFormatException: malformed input around byte 13
    at java.io.DataInputStream.readUTF(DataInputStream.java:617)
    at java.io.RandomAccessFile.readUTF(RandomAccessFile.java:887)
    at org.apache.cassandra.io.It

Re: Read Performance

2010-03-31 Thread Jonathan Ellis
On Wed, Mar 31, 2010 at 6:21 PM, James Golick wrote:
> Keyspace: ActivityFeed
>         Read Count: 699443
>         Read Latency: 16.11017477192566 ms.
>                 Column Family: Events
>                 Read Count: 232378
>                 Read Latency: 0.396 ms.
>                 Row cac

Re: Factor and Cassandra

2010-03-31 Thread Jonathan Ellis
Writing a Factor generator for Thrift is probably your best bet. It should be pretty doable; the C# backend for instance is ~1700 loc. -Jonathan On Wed, Mar 31, 2010 at 3:40 PM, Vladimir Lushnikov wrote: > Hi all, > > Does anyone have any experience using Cassandra as a data store in Factor >

Re: Read Performance

2010-03-31 Thread James Golick
Keyspace: ActivityFeed
        Read Count: 699443
        Read Latency: 16.11017477192566 ms.
        Write Count: 69264920
        Write Latency: 0.020393242755495856 ms.
        Pending Tasks: 0
...snip
                Column Family: Events
                SSTable count: 5
                Sp

Re: Read Performance

2010-03-31 Thread Jonathan Ellis
What does the CFS mbean think read latencies are? Possibly something else is introducing latency after the read. On Wed, Mar 31, 2010 at 5:37 PM, James Golick wrote: > Standard CF. 10 columns per row. Between about 800 bytes and 2k total per > row. > On Wed, Mar 31, 2010 at 3:06 PM, Chris Goffin

Re: Read Performance

2010-03-31 Thread James Golick
Standard CF. 10 columns per row. Between about 800 bytes and 2k total per row. On Wed, Mar 31, 2010 at 3:06 PM, Chris Goffinet wrote: > How many columns in each row? > > -Chris > > On Mar 31, 2010, at 2:54 PM, James Golick wrote: > > I just tried running the same multi_get against cassandra 1000

Re: Read Performance

2010-03-31 Thread Chris Goffinet
How many columns in each row? -Chris On Mar 31, 2010, at 2:54 PM, James Golick wrote: > I just tried running the same multi_get against cassandra 1000 times, > assuming that that'd force it in to cache. > > I'm definitely seeing a 5-10ms improvement, but it's still looking like > 20-30ms on a

Re: Read Performance

2010-03-31 Thread Jonathan Ellis
Yes, I would. How many columns are you reading per row? How large are they? Are they supercolumns? On Wed, Mar 31, 2010 at 4:54 PM, James Golick wrote: > I just tried running the same multi_get against cassandra 1000 times, > assuming that that'd force it in to cache. > I'm definitely seeing

Re: Read Performance

2010-03-31 Thread James Golick
I just tried running the same multi_get against cassandra 1000 times, assuming that that'd force it in to cache. I'm definitely seeing a 5-10ms improvement, but it's still looking like 20-30ms on average. Would you expect it to be faster than that? - James On Wed, Mar 31, 2010 at 11:44 AM, Jonat

Factor and Cassandra

2010-03-31 Thread Vladimir Lushnikov
Hi all, Does anyone have any experience using Cassandra as a data store in Factor [http://factor-language.org] (perhaps using something like thrift)? Kind regards, Vladimir Lushnikov Sent using BlackBerry®

Re: expiring data out of Cassandra/time to live

2010-03-31 Thread Ryan Daum
On that topic, what exactly is keeping this feature out of the official releases? On Wed, Mar 31, 2010 at 3:43 PM, Daniel Kluesing wrote: > We also applied this patch to the 0.6 branch and have been running it for > a bit over a week. Works well, would love to see it get into trunk/0.7 > proper

Factor and Cassandra

2010-03-31 Thread Vladimir Lushnikov
Hi all, Does anyone have any experience using Cassandra as a data store in Factor [http://factor-language.org] (perhaps using something like thrift)? Kind regards, Vladimir Lushnikov Sent using BlackBerry®

Net::Cassandra::Easy Perl interface (with cassidy.pl CLI) 0.08

2010-03-31 Thread Ted Zlatanov
You can find version 0.08 of the Net::Cassandra::Easy Perl module at: http://search.cpan.org/search?query=cassandra+easy&mode=all This version comes with cassidy.pl, a command-line interface that supports tab-completion. It's not finished (no docs yet, that's a TODO) but in its current form it wi

Re: expiring data out of Cassandra/time to live

2010-03-31 Thread Mike Gallamore
Thanks a lot Jonathan and everyone else that replied to my thread. This looks like it will do what I need. I have a colleague that is a Java wizard and will probably have no problem putting this patch into place for our production builds. I'm a C/C++ programmer at heart so the code itself does

[RELEASE] 0.6.0-rc1 (release candidate)

2010-03-31 Thread Eric Evans
Cassandra 0.6.0-rc1 is available from the usual place[1]. Barring any serious regressions, this will become the next stable release, so please test and submit bug reports[2] if you spot problems. [1]: http://cassandra.apache.org/download/ [2]: https://issues.apache.org/jira/browse/CASSANDRA As a

RE: expiring data out of Cassandra/time to live

2010-03-31 Thread Daniel Kluesing
We also applied this patch to the 0.6 branch and have been running it for a bit over a week. Works well, would love to see it get into trunk/0.7 proper. From: Ryan Daum [mailto:r...@thimbleware.com] Sent: Wednesday, March 31, 2010 11:49 AM To: user@cassandra.apache.org Subject: Re: expiring data

Re: Read Performance

2010-03-31 Thread Jonathan Ellis
But then you'd still be caching the same things memcached is, so unless you have a lot more ram you'll presumably miss the same rows too. The only 2-layer approach that makes sense to me would be to have cassandra keys cache at 100% behind memcached for the actual rows, which will actually reduce

Re: expiring data out of Cassandra/time to live

2010-03-31 Thread Ryan Daum
I was able to successfully merge this patch into the 0.6 branch a few weeks ago by doing the following:
- Downloading the patch
- Checking out the trunk of Cassandra from github
- Rolling back (checking out) the git repo to the same date that the patch was submitted to Jira
- Apply

Re: expiring data out of Cassandra/time to live

2010-03-31 Thread Jonathan Ellis
Sounds like you want to follow https://issues.apache.org/jira/browse/CASSANDRA-699. There is a patch there but I wouldn't recommend merging it if Java scares you. :) On Wed, Mar 31, 2010 at 1:39 PM, Mike Gallamore wrote: > Hello everyone, > > I saw a thread on the incubator user chat that starte

expiring data out of Cassandra/time to live

2010-03-31 Thread Mike Gallamore
Hello everyone, I saw a thread on the incubator user chat that started a few months ago: http://www.mail-archive.com/cassandra-u...@incubator.apache.org/msg02047.html . It looks like this is the new official user mailing list so I'll add my thoughts/question here. Is there any way to set a T

Re: Read Performance

2010-03-31 Thread David Strauss
Or, if faking memcached misses is too high a price to pay, queue some proportion of the reads to replay asynchronously against Cassandra. On Wed, 2010-03-31 at 11:04 -0500, Jonathan Ellis wrote: > Can you redirect some of the reads from memcache to cassandra? Sounds > like the cache isn't getting
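The idea of replaying a proportion of reads asynchronously can be sketched as below. This is a minimal illustration under stated assumptions, not anything taken from the thread: the class name `ReadReplayer`, its parameters, and the `read_fn` callable (standing in for whatever performs the Cassandra read) are all hypothetical. The request path only samples and enqueues; a background thread issues the reads and discards the results.

```python
import queue
import random
import threading
from typing import Callable, Optional


class ReadReplayer:
    """Replay a sampled fraction of read keys against a backing store in the background."""

    def __init__(self, read_fn: Callable[[str], object], fraction: float,
                 rng: Optional[random.Random] = None, max_pending: int = 1000):
        self._read_fn = read_fn      # hypothetical: performs the Cassandra read for a key
        self._fraction = fraction    # proportion of reads to replay, 0.0 to 1.0
        self._rng = rng or random.Random()
        self._pending = queue.Queue(maxsize=max_pending)
        self._worker = threading.Thread(target=self._run, daemon=True)
        self._worker.start()

    def observe(self, key: str) -> bool:
        """Call after serving a read from memcached; returns True if the key was queued."""
        if self._rng.random() >= self._fraction:
            return False
        try:
            self._pending.put_nowait(key)
            return True
        except queue.Full:
            # Drop the replay rather than slow down the request path.
            return False

    def _run(self) -> None:
        while True:
            key = self._pending.get()
            if key is None:          # sentinel placed on the queue by close()
                return
            try:
                self._read_fn(key)   # result is discarded; the read only warms the cache
            except Exception:
                pass                 # replay is best-effort

    def close(self) -> None:
        """Let queued replays finish, then stop the worker thread."""
        self._pending.put(None)
        self._worker.join()
```

Bounding the queue and dropping on overflow keeps the replay from ever adding latency to the memcached-served request, at the cost of replaying fewer reads under load.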

Re: Read Performance

2010-03-31 Thread Ryan King
On Wed, Mar 31, 2010 at 9:04 AM, Jonathan Ellis wrote: > Can you redirect some of the reads from memcache to cassandra?  Sounds > like the cache isn't getting warmed up. Yeah, putting a cache in front of a cache can ruin the locality of the second cache. -ryan

Re: Read Performance

2010-03-31 Thread Jonathan Ellis
Can you redirect some of the reads from memcache to cassandra? Sounds like the cache isn't getting warmed up. On Wed, Mar 31, 2010 at 11:01 AM, James Golick wrote: > I'm testing on the live cluster, but most of the production reads are being > served by the cache. It's definitely the right CF. >

Re: Read Performance

2010-03-31 Thread James Golick
I'm testing on the live cluster, but most of the production reads are being served by the cache. It's definitely the right CF. On Wed, Mar 31, 2010 at 8:30 AM, Jonathan Ellis wrote: > On Wed, Mar 31, 2010 at 12:01 AM, James Golick > wrote: > > Okay, so now my row cache hit rate jumps between 1.

Re: Read Performance

2010-03-31 Thread Jonathan Ellis
On Wed, Mar 31, 2010 at 12:01 AM, James Golick wrote: > Okay, so now my row cache hit rate jumps between 1.0, 99.5, 95.6, and NaN. > Seems like that stat is a little broken. Sounds like you aren't getting enough requests for the getRecentHitRate to make sense. Use getHits / getRequests. But if
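The getHits / getRequests suggestion amounts to computing the hit rate from cumulative counters instead of a recent window. A minimal sketch in Python, assuming the caller has already read the two counter values; the function name `cumulative_hit_rate` is hypothetical and the code does not talk to Cassandra itself:

```python
from typing import Optional


def cumulative_hit_rate(hits: int, requests: int) -> Optional[float]:
    """Return hits / requests, or None when no requests have been recorded.

    `hits` and `requests` are assumed to be cumulative counters supplied by
    the caller (for example the values reported by getHits and getRequests).
    """
    if requests <= 0:
        # A rate over zero requests is undefined (0/0); report "no data" instead.
        return None
    return hits / requests
```

Returning `None` for the zero-request case makes the "no data yet" state explicit instead of surfacing it as NaN.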

Re: Cassandra data file corrupt

2010-03-31 Thread Jeremy Dunck
On Wed, Mar 31, 2010 at 7:55 AM, Stu Hood wrote: > Eventually the new file format will make it in with #674, and we'll be able > to implement an option to skip corrupted data: > > https://issues.apache.org/jira/browse/CASSANDRA-808 That ticket seems to indicate that compaction will remove the co

Re: Cassandra data file corrupt

2010-03-31 Thread Stu Hood
Eventually the new file format will make it in with #674, and we'll be able to implement an option to skip corrupted data: https://issues.apache.org/jira/browse/CASSANDRA-808 We're not ignoring this issue. -----Original Message----- From: "David Timothy Strauss" Sent: Wednesday, March 31, 2010

unsubscribe

2010-03-31 Thread Salvador Ausina
-- Salutacions Salvador Ausina sausina_at_quadux_dot_net quadux.net

Re: Cassandra data file corrupt

2010-03-31 Thread David Timothy Strauss
Cassandra has always supported two great ways to prevent data loss:
* Replication
* Backups
I doubt Cassandra will ever focus extensively on single-node recovery when it's so easy to wipe and rebuild any node from the cluster. -----Original Message----- From: JKnight JKnight Date: Wed, 31 Mar

Cassandra data file corrupt

2010-03-31 Thread JKnight JKnight
Dear all, My Cassandra data file had a problem and I cannot get data from it. All rows after the bad row cannot be accessed, so I lost a lot of data. Will the next version of Cassandra implement a way to prevent data loss? Maybe we could use a checkpoint. If a data file is corrupt, we will read from