Re: UnavailableException on QUORUM write

2010-07-27 Thread Per Olesen
On Jul 27, 2010, at 12:23 AM, Jonathan Ellis wrote:
> Can you turn on debug logging and try this patch?
Yes, but I am on vacation now, so it will be about 3 weeks from now.

Re: Failing to create a 2 Node cluster on a Windows machine

2010-07-26 Thread Per Olesen
On Jul 26, 2010, at 8:39 PM, Peter Schuller wrote:
>> Completely off topic from the list, but Jonathan do you (or others) by any
>> means know how to create an alias for 127.0.0.2 on a mac? Cause I used
>> 127.0.0.x on linux without a problem, but on my Mac, it seems to need some
>> config to

Re: Failing to create a 2 Node cluster on a Windows machine

2010-07-26 Thread Per Olesen
On Jul 26, 2010, at 3:25 AM, Jonathan Ellis wrote:
> I know on a mac you need to explicitly create an alias for 127.0.0.2
> before it can be used. Maybe something similar applies to Windows.
Completely off topic from the list, but Jonathan do you (or others) by any means know how to create an

Re: UnavailableException on QUORUM write

2010-07-21 Thread Per Olesen
>> And when one of my non-seed nodes in my 3 node cluster is down, I do NOT get
>> the exception.
>> Anyway, guess I need to try and reproduce it in small scale.
>
> Does it return w/ UE immediately, or does it wait for RPCTimeout first?
It returns with UE immediately.

Re: UnavailableException on QUORUM write

2010-07-20 Thread Per Olesen
:34 AM, Per Olesen wrote:
> Hi,
>
> Think I might have found out the problem.
> I had only one seed node, and when that node is down, they all give
> UnavailableException. Guess at least one seed needs to be up then? Sounds
> fair.
>
> /Per

Re: UnavailableException on QUORUM write

2010-07-20 Thread Per Olesen
Hi, I think I might have found the problem. I had only one seed node, and when that node is down, they all give UnavailableException. Guess at least one seed needs to be up then? Sounds fair. /Per
From: Per Olesen [...@trifork.com] Sent: 9 July 2010

RE: Iterate all keys - doing it as the faq fails for me :(

2010-07-13 Thread Per Olesen
> I'm not entirely sure but I think you can only use get_range_slices
> with start_key/end_key on a cluster using OrderPreservingPartitioner.
> Don't know if that is intentional or buggy like Jonathan suggests but I
> saw the same "duplicates" behaviour when trying to iterate all rows
> using RP and start

Re: Iterate all keys - doing it as the faq fails for me :(

2010-07-12 Thread Per Olesen
> This is a bug. Can you submit a ticket with test data to reproduce?
Uuuh, maybe... :) Right now it is happening on some live user data that I am not sure I can ship. I haven't tried whether I can reproduce it locally. One question: We are running 0.6.2. Could this be fixed in 0.6.3? Not that big a pr

RE: Iterate all keys - doing it as the faq fails for me :(

2010-07-12 Thread Per Olesen
Anyone? - Hi, I was reading http://wiki.apache.org/cassandra/FAQ#iter_world and decided to implement the get_range_slices method for listing all keys of a CF. Only thing is, it doesn't work that well for me :-) I do as it says (I think), and take KeyRanges of size N and use the key of

Re: manual InitialToken assignment

2010-07-09 Thread Per Olesen
To method)
>
> someone please clarify
>
> Thanks
Best regards, Per Olesen, Developer, Trifork A/S, Spotorno Alle 4, DK-2630 Taastrup. Mobile: +45 23389581 - Mail: p...@trifork.com

Iterate all keys - doing it as the faq fails for me :(

2010-07-09 Thread Per Olesen
Hi, I was reading http://wiki.apache.org/cassandra/FAQ#iter_world and decided to implement the get_range_slices method for listing all keys of a CF. Only thing is, it doesn't work that well for me :-) I do as it says (I think), and take KeyRanges of size N and use the key of the last call as s
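The paging scheme described in this post can be sketched roughly as below. This is only an illustration under stated assumptions: `fetch_range` is a hypothetical stand-in for a get_range_slices call that returns up to `count` keys starting at `start_key` (inclusive), and `make_fake_backend` is an in-memory fake that returns keys in sorted order. Whether a real 0.6.2 cluster on RandomPartitioner behaves this way is exactly what the thread is questioning.

```python
from typing import Callable, Iterator, List


def iterate_all_keys(fetch_range: Callable[[str, int], List[str]],
                     page_size: int) -> Iterator[str]:
    """Yield every key once by paging; the last key of a page starts the next."""
    if page_size < 2:
        # With an inclusive start key, a page of 1 would never advance.
        raise ValueError("page_size must be at least 2")
    start = ""          # empty start key: begin at the first key
    first_page = True
    while True:
        page = fetch_range(start, page_size)
        new_keys = page
        # The start key is inclusive, so later pages repeat the previous last key.
        if not first_page and page and page[0] == start:
            new_keys = page[1:]
        for key in new_keys:
            yield key
        if len(page) < page_size:
            break       # a short page means there is nothing left
        start = page[-1]
        first_page = False


def make_fake_backend(keys: List[str]) -> Callable[[str, int], List[str]]:
    """Hypothetical in-memory stand-in for the cluster, ordered by key."""
    ordered = sorted(keys)

    def fetch_range(start_key: str, count: int) -> List[str]:
        return [k for k in ordered if k >= start_key][:count]

    return fetch_range


if __name__ == "__main__":
    backend = make_fake_backend(["d", "a", "g", "b", "e", "c", "f"])
    print(list(iterate_all_keys(backend, page_size=3)))
```

Dropping the first key of every page after the first is what keeps the overlap between pages from showing up as duplicates.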

Re: UnavailableException on QUORUM write

2010-07-09 Thread Per Olesen
On Jul 9, 2010, at 11:11 AM, ChingShen wrote:
> Which client library do you use?
Directly against the Thrift API using thrift.jar, version 917130.

UnavailableException on QUORUM write

2010-07-09 Thread Per Olesen
Hi, I am a bit confused about getting an UnavailableException when doing a QUORUM write. I have a 3 node cluster, with RF=3. When all 3 nodes are up, the QUORUM write succeeds. When 1 of the 3 nodes is down, the QUORUM write fails with UnavailableException. Shouldn't it be enough with 2 nodes
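For reference, the arithmetic behind the question can be sketched as follows, assuming QUORUM means a strict majority of the RF replicas, i.e. floor(RF / 2) + 1. The function names are made up for illustration and are not part of the Thrift API.

```python
def quorum_size(replication_factor: int) -> int:
    """Replicas that must respond for QUORUM: a strict majority of RF."""
    if replication_factor < 1:
        raise ValueError("replication factor must be at least 1")
    return replication_factor // 2 + 1


def quorum_reachable(replication_factor: int, live_replicas: int) -> bool:
    """True if enough replicas are alive to satisfy a QUORUM operation."""
    return live_replicas >= quorum_size(replication_factor)


if __name__ == "__main__":
    print(quorum_size(3))
    print(quorum_reachable(3, 2))
```

Under that assumption, RF=3 needs 2 live replicas, so one node being down should not by itself make a QUORUM write unavailable.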

Re: Best documentation for Java and Cassandra?

2010-06-17 Thread Per Olesen
If you are new to Cassandra and to how it achieves its goals, I would recommend going with the "raw" Thrift API to start with, until you get a good understanding of which API calls there are and how they work. I am sure there are nicer APIs, but they also somewhat hide some of the complexity, that I thin

Re: Data modelling question

2010-06-14 Thread Per Olesen
On Jun 14, 2010, at 6:29 PM, Benjamin Black wrote:
> On Mon, Jun 14, 2010 at 6:09 AM, Per Olesen wrote:
>>
>> So, in my use case, when searching on e.g. company, I can then access the
>> "DashboardCompanyIndex" with a slice on its SC and then grab all the uuids

Data modelling question

2010-06-14 Thread Per Olesen
Hi, I have a question about how best to model data. I have some pretty simple tabular data, which I am to show to a large number of users, and the users need to be able to search some of the columns. Given this tabular data: Company| Amount|...many more columns here --

batch_mutate atomic?

2010-06-14 Thread Per Olesen
Can I expect batch_mutate to work as what I would think of as an atomic operation? That is, are either all the mutations in the batch_mutate call executed or none of them? Or can some of them fail while others succeed?

Re: Running a very small cluster

2010-06-09 Thread Per Olesen
> Why don't you run the benchmark contrib/stress.py to see what performance
> you get?
Didn't know about that one. Will give it a try. Thanks.

Re: Quick help on Cassandra please: cluster access and performance

2010-06-09 Thread Per Olesen
On Jun 9, 2010, at 9:47 PM, li wei wrote:
> Thanks a lot.
> We are set READ one, WRITE ANY. Is this better than QUORUM in performance.
Yes, but it gives weaker consistency guarantees.
> Do you think the cassandra Cluster (with 2 or nodes) should be always
> faster than Single one node in the reality and theo
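The trade-off can be illustrated with the usual overlap rule: a read is guaranteed to see the latest successful write when the replicas read plus the replicas written exceed RF. The sketch below is an assumption-laden illustration, not Cassandra code: the helper names are invented, and ANY is counted as guaranteeing zero replicas on the assumption that such a write may be acknowledged before any replica holds the data.

```python
def replicas_required(level: str, replication_factor: int) -> int:
    """Replicas guaranteed to have taken part at a given consistency level."""
    table = {
        "ANY": 0,  # assumption: the write may not have reached a replica yet
        "ONE": 1,
        "QUORUM": replication_factor // 2 + 1,
        "ALL": replication_factor,
    }
    return table[level]


def reads_see_latest_write(read_level: str, write_level: str,
                           replication_factor: int) -> bool:
    """Overlap rule: R + W > RF means every read touches a fresh replica."""
    r = replicas_required(read_level, replication_factor)
    w = replicas_required(write_level, replication_factor)
    return r + w > replication_factor


if __name__ == "__main__":
    print(reads_see_latest_write("QUORUM", "QUORUM", 3))
    print(reads_see_latest_write("ONE", "ANY", 3))
```

READ ONE with WRITE ANY involves fewer replicas per request, which is why it tends to be faster, but it fails the overlap check, so a read may return stale data.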

Running a very small cluster

2010-06-09 Thread Per Olesen
Short question: Does Cassandra only *really* shine when running a cluster with lots of nodes? Same question in a lengthy version: If what I want to obtain from my Cassandra cluster is given as this: a) protection against data loss if nodes disk-crash b) good uptime, if servers become unavailable o

Re: Seeds and AutoBootstrap

2010-06-09 Thread Per Olesen
Okay. Cool actually. That clears up quite a bit for me :)
On Jun 9, 2010, at 9:26 PM, Jonathan Ellis wrote:
> right
>
> On Wed, Jun 9, 2010 at 11:29 AM, Per Olesen wrote:
>>
>> On Jun 9, 2010, at 1:00 PM, Ben Browning wrote:
>>
>>> There really aren't

Re: Quick help on Cassandra please: cluster access and performance

2010-06-09 Thread Per Olesen
> How to set "write and read with QUORUM"?
You set this through each thrift api call you are making through your java code. See http://wiki.apache.org/cassandra/API
> They are run physically separate hw (But same since they are VM)
So they share disk. I think this can have an influence. As I

Re: Seeds and AutoBootstrap

2010-06-09 Thread Per Olesen
On Jun 9, 2010, at 1:00 PM, Ben Browning wrote:
> There really aren't "seed nodes" in a Cassandra cluster. When you
> specify a seed in a node's configuration it's just a way to let it
> know how to find the other nodes in the cluster. A node functions the
> same whether it is another node's seed

Re: Quick help on Cassandra please: cluster access and performance

2010-06-09 Thread Per Olesen
Hi Wei,
> 1) I found this: the 2 node is slower (30%) than one node in both of write
> and select. Is this normal? (In theory, 2 nodes should be faster than one?).
> I am monitoring the 2 nodes and found they are working and flush often (so, 2
> nodes works)
Which consistency level are you using

Seeds and AutoBootstrap

2010-06-08 Thread Per Olesen
Hi, Just a quick question on seed nodes and auto bootstrap. Am I correct in that a seed node won't be able to AutoBootstrap? And if so, will a seed node newly added to an existing cluster then not take a long time before it actually starts getting any work? I mean, if it doesn't start with

Re: Is ReplicationFactor values number of replicas or number of copies of data?

2010-06-07 Thread Per Olesen
On Jun 7, 2010, at 6:05 PM, Benjamin Black wrote:
> There is no 'master' so all copies are replicas. RF=1 means 1 node
> has the data, RF=2 means 2 do, etc.
>
Okay, thanks (and thanks to Ran also). I guess 0 doesn't make sense then, and that RF=1 is a bad idea if I want some protection again
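A small sketch of what "RF nodes have the data" means in practice, assuming a simplified rack-unaware placement in which the copies of a row live on RF consecutive nodes of the ring. The function name, node names and the placement rule itself are illustrative assumptions, not taken from the thread.

```python
from typing import List


def replica_nodes(ring: List[str], first_replica_index: int,
                  replication_factor: int) -> List[str]:
    """Return the nodes holding a row: RF consecutive nodes, wrapping around."""
    if not 1 <= replication_factor <= len(ring):
        raise ValueError("RF must be between 1 and the number of nodes")
    return [ring[(first_replica_index + i) % len(ring)]
            for i in range(replication_factor)]


if __name__ == "__main__":
    ring = ["node1", "node2", "node3"]
    # RF=1: exactly one node holds the row, so losing that node loses the data.
    print(replica_nodes(ring, 2, 1))
    # RF=2: two nodes hold the row, wrapping past the end of the ring.
    print(replica_nodes(ring, 2, 2))
```

RF counts all copies, so the list returned for RF=1 has a single entry and there is no separate "original" in addition to it.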

Is ReplicationFactor values number of replicas or number of copies of data?

2010-06-07 Thread Per Olesen
Hi, I am unclear about what the ReplicationFactor value means. Does RF=1 mean that there is only one single node that has the data in the cluster (actually no replication), or, does it mean, that there are two copies of the data - one "actual" and one "replica" (as in "replicated one time")? I

Re: Are 6..8 seconds to read 23.000 small rows - as it should be?

2010-06-07 Thread Per Olesen
> Ben Browning wrote...
>
> [snip/]
> ...
> I've been able to read columns out of Cassandra at
> an order of magnitude higher than what you're seeing here but there
> are too many variables to directly compare.
I've been reading with ConsistencyLevel QUORUM in my timings. If I change to Consis

Re: Are 6..8 seconds to read 23.000 small rows - as it should be?

2010-06-04 Thread Per Olesen
On Jun 4, 2010, at 5:19 PM, Ben Browning wrote:
> How many subcolumns are in each supercolumn and how large are the
> values? Your example shows 8 subcolumns, but I didn't know if that was
> the actual number. I've been able to read columns out of Cassandra at
> an order of magnitude higher than

Re: Are 6..8 seconds to read 23.000 small rows - as it should be?

2010-06-04 Thread Per Olesen
On Jun 4, 2010, at 4:46 PM, Jonathan Ellis wrote:
> get_slice reads a single row. do you mean there are 23,000 columns,
> or are you running get_slice in a loop 23000 times?
Hi Jonathan, thanks for answering! No, I do only one get_slice call. There are 23.000 SUPER columns, which I read using

Are 6..8 seconds to read 23.000 small rows - as it should be?

2010-06-04 Thread Per Olesen
I have a quick question on what I think is bad read performance for this simple setup:
SCF:Dashboard
key:username1 -> {
  SC:uniqStr1 -> { col1:val1, col2: val2, ... col8:val8 },
  SC:uniqStr2 -> { col1:val1, col2: v