Re: Build an index to for join query

2010-09-18 Thread Aaron Morton
In the Cassandra world the best approach is to create one CF with the name and address in it. Use a super CF with one super col for the user data and one super col for every address they have. Pull the entire row back every time you want to read the data. No need for joins. Aaron On 18 Sep
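The layout described above can be pictured as nested maps. This is only a rough sketch in plain Python, not Cassandra client code: the row key, the super column names (`user_data`, `address_home`, `address_work`) and the helper `read_user` are all hypothetical and were not part of the original message.

```python
# Hypothetical in-memory picture of one super CF row per user.
# All names and values below are illustrative assumptions.
users_super_cf = {
    "user-42": {  # row key
        "user_data": {"name": "Alice Example"},
        "address_home": {"street": "1 Main St", "city": "Springfield"},
        "address_work": {"street": "9 Office Rd", "city": "Shelbyville"},
    }
}


def read_user(cf, row_key):
    """Pull the entire row back and split it into user data and addresses."""
    row = cf[row_key]
    addresses = {
        name: cols for name, cols in row.items() if name.startswith("address_")
    }
    return row["user_data"], addresses


user, addresses = read_user(users_super_cf, "user-42")
```

One read of the row returns the user together with every address, which is what stands in for the join.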

Re: token distribution question

2010-09-18 Thread Matthias L. Jugel
Okay, I am answering it myself. As confirmed by benblack on the Cassandra IRC channel and some reading, the script he presented is for the RP. My idea is now to write a little script to count the keys in my main CF and divide the key ranges to assign to the nodes. This creates some kind of cu
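A minimal sketch of the key-counting idea, assuming the keys of the main CF are already available as a sorted list (how they are fetched is left out) and that each node owns the range ending at its token. The function name `split_tokens` is hypothetical; this is not Ben Black's script.

```python
def split_tokens(sorted_keys, node_count):
    """Pick one boundary key per node so each range holds about the same key count.

    Assumes sorted_keys is sorted in the partitioner's order and that a node
    owns the keys up to and including its token.
    """
    if node_count < 1:
        raise ValueError("node_count must be at least 1")
    if len(sorted_keys) < node_count:
        raise ValueError("need at least one key per node")
    total = len(sorted_keys)
    tokens = []
    for i in range(1, node_count + 1):
        # Index of the last key that falls into the i-th equal-sized slice.
        idx = (i * total) // node_count - 1
        tokens.append(sorted_keys[idx])
    return tokens


tokens = split_tokens(list("abcdefgh"), 4)  # -> ['b', 'd', 'f', 'h']
```

The result is only as good as the snapshot of keys it was computed from; if the key distribution drifts after the tokens are assigned, the ring becomes unbalanced again.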

Re: Network latency on Cassandra 0.7 (TFramedTransport)

2010-09-18 Thread Michal Augustýn
Thank you very much. This solves my issue. Augi 2010/9/17 Michael Greene > This is the correct cause. Reproducing your test gives 38-45ms in each of > 10 runs. If you run a profiler against it, you can see that the time is > entirely spent blocking on receive in TStreamTransport.Read. > > You

a few generic questions

2010-09-18 Thread Mario Micklisch
Hi there, I am currently in the planning stage of a new web application that should use Cassandra because of its scaling possibilities. I would like to ask a few questions to make sure I fully understood how Cassandra handles certain cases. If there is somewhere I missed to read or where some mor

Re: 0.7 memory usage problem

2010-09-18 Thread vineet daniel
Hi Peter, I actually checked after 15-20 of observation of the monitor and logs, when everything had calmed down, and it was showing this many processes. Shouldn't it be good to reduce the no. of threads once the server is idle or almost idle? As I am not a Java guy the only thing that I can think of is that ma

Re: 0.7 memory usage problem

2010-09-18 Thread Peter Schuller
> Even I would like to add here something and correct me if I am wrong, I > downloaded 0.7 beta and ran it, just by chance I checked 'top' to see how > the new version is doing and there were 64 processes running though > Cassandra was on single node with default configuration options ( ran it as >

Re: 0.7 memory usage problem

2010-09-18 Thread vineet daniel
Hi Even I would like to add here something and correct me if I am wrong, I downloaded 0.7 beta and ran it, just by chance I checked 'top' to see how the new version is doing and there were 64 processes running though Cassandra was on single node with default configuration options ( ran it as is, as

Re: 0.7 memory usage problem

2010-09-18 Thread Peter Schuller
> I see a spike in heap memory usage on Node 2 where it goes from around 1G to > 6GB (max) in less than an hour, and then goes out of memory. > There are some errors in the log file that are reported by other people, but > I don't think that these errors are the reason, because it used to happen > e

token distribution question

2010-09-18 Thread Matthias L. Jugel
Hi, in our setup we have four nodes and we are using the OPP. After starting and writing to the cluster it starts to get unbalanced, as one expects. I would like to do some manual reassignment as we have some information on what the distribution looks like. Ben Black's little script seems to do

Re: Cassandra performance

2010-09-18 Thread Peter Schuller
>  - performance (it should be not as much less than shard of MySQL and > scale linearly, we want to have not more that 10K inserts per second > of writes, and probably not more than 1K/s reads which will be mostly > random) >  - ability to store big amounts of data (now it looks that we will > hav

Re: Cassandra performance

2010-09-18 Thread Kamil Gorlo
Hi, first of all I am not a Cassandra hater :) I do not expect miracles either :) I'm searching for a scalable solution which could be used instead of a sharding solution over MySQL or Tokyo Tyrant. Our system now runs OK on single Tokyo Tyrant DB but we expect a lot of traffic increase i

Re: Cassandra performance

2010-09-18 Thread Peter Schuller
> Disabling row cache in this case makes sense, but disabling key cache > is probably hurting your performance quite a bit.  If you wrote 20GB > of data per node, with narrow rows as you describe, and had default > memtable settings, you now have a huge number of sstables on disk. > You did not ind

Re: Cassandra Cache Mbean values; bytes or number of elements ?

2010-09-18 Thread Jonathan Ellis
elements. it gets the capacity from your configuration setting. cassandra knows how many rows you have, and you told it to cache all of them, so that's what it set capacity to. On Fri, Sep 17, 2010 at 8:58 PM, kannan chandrasekaran wrote: > I am using 0.6.5 and my keycache for the CF is set as "

Re: Network latency on Cassandra 0.7 (TFramedTransport)

2010-09-18 Thread Jonathan Ellis
Created https://issues.apache.org/jira/browse/THRIFT-904 On Fri, Sep 17, 2010 at 12:16 PM, Michael Greene wrote: > This is the correct cause.  Reproducing your test gives 38-45ms in each of > 10 runs.  If you run a profiler against it, you can see that the time is > entirely spent blocking on rec