date:20100714

Re: key types and grouping related rows together

2010-07-14 Thread Thorvaldsson Justus

Don't forget you can make your own sorting algorithm. Here is a nice tutorial for that. http://www.sodeso.nl/?p=421 Justus From: Schubert Zhang [mailto:zson...@gmail.com] Sent: 15 July 2010 04:20 To: user@cassandra.apache.org Subject: Re: key types and grouping related rows together for you

Re: Seed and nodetool

2010-07-14 Thread Aaron Morton

Can you do an insert with CL ALL? Are there any ERRORs in the log file? Try turning the logging up to TRACE and see what's happening. Check that B can see A by ssh'ing into B and using nodetool from there to connect to A. Do you have any switches / firewalls between the nodes? Could this be happenin

Data in Cassandra

2010-07-14 Thread Hendro Kaskus

Hi everyone, I'm a newbie to Cassandra :D. I am trying to insert data from MySQL into Cassandra. The data dump from MySQL is about 11 MB (64716 records). But when I insert it into Cassandra, I think the data becomes bigger than in MySQL. Is that true? Thanks

Re: Seed and nodetool

2010-07-14 Thread Claire Chang

BTW, A is 192.168.11.29, B is 192.168.11.28, C is 192.168.11.27. From the result of nodetool ring, does it mean that B thinks A and C are down and C thinks B is down? I tried to restart B and for a brief moment, I didn't get this problem (all the nodes are all from nodetool) but after a while, this

Seed and nodetool

2010-07-14 Thread Claire Chang

I have 3 nodes A, B, C with RF=3. When I configure the cluster, and before it starts taking any read/write requests, I first start A with A itself as the seed (following the instructions on the wiki), then start B (with A as the seed), and then start C (also with A as the seed). B and C seem to be joining the

Re: key types and grouping related rows together

2010-07-14 Thread Schubert Zhang

for your apps, how about this schema: key: website1123 columnName: UserID ... On Thu, Jul 15, 2010 at 6:13 AM, Aaron Morton wrote: > The key structure you have should group the keys based on the website There > are some differences between range queries with RP and OPP this article may > help >

Bootstrap Token collision

2010-07-14 Thread Mubarak Seyed

The cluster nodes were running fine. When I restarted them to modify the JVM heap settings, two of the nodes did not join the cluster and threw a Bootstrap Token collision error. Any idea how to fix this error? ERROR [GMFD:1] 2010-07-15 01:23:13,756 DebuggableThreadPoolExecutor.java (line 101) Error in Thr

Re: Bootstrap question

2010-07-14 Thread Jonathan Ellis

Each node logs what token it is going to bootstrap to. Who owns the ranges that contain those tokens? On Wed, Jul 14, 2010 at 5:58 PM, Anthony Molinaro wrote: > Hi, > > I have a 0.6.3 cluster which contains 6 nodes. I added 6 new nodes > by setting AutoBootstrap to true and setting an InitialT

Re: timestamps and batch_mutation

2010-07-14 Thread Jonathan Ellis

It is good style but may not be necessary. On Wed, Jul 14, 2010 at 4:54 PM, Aaron Morton wrote: > Is it OK or recommended to use the same timestamp value for all Column and > Deletion records sent in a batch mutation? > > Am thinking of cases where there is a potential for multiple clients to > u

Re: node down window

2010-07-14 Thread Jonathan Ellis

Coordination in a distributed system is difficult. I don't think we can fix HH's existing edge cases without introducing other, more complicated edge cases. So weekly-or-so repair will remain a common maintenance task for the foreseeable future. On Wed, Jul 14, 2010 at 4:17 PM, B. Todd Burruss w

Bootstrap question

2010-07-14 Thread Anthony Molinaro

Hi, I have a 0.6.3 cluster which contains 6 nodes. I added 6 new nodes by setting AutoBootstrap to true and setting an InitialToken on each new node, then waiting for the "Bootstrapping" message in the log before starting another. Then I've been watching the logs on the old boxes waiting to se

Re: key types and grouping related rows together

2010-07-14 Thread Aaron Morton

The key structure you have should group the keys based on the website. There are some differences between range queries with RP and OPP; this article may help: http://ria101.wordpress.com/2010/02/22/cassandra-randompartitioner-vs-orderpreservingpartitioner/ Aaron On 15 Jul, 2010, at 08:44 AM, S Ahmed wr

timestamps and batch_mutation

2010-07-14 Thread Aaron Morton

Is it OK or recommended to use the same timestamp value for all Column and Deletion records sent in a batch mutation? Am thinking of cases where there is a potential for multiple clients to update the same key (with multiple columns) at the same time. In the use case it's acceptable, as the client
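A minimal sketch of what a shared timestamp buys in the question above, assuming the usual rule that the column version with the highest timestamp wins. The `apply_batch` helper, the in-memory `row` dict, and the sample values are hypothetical and only model the behaviour; they are not the Thrift API, and timestamp ties are not modeled.

```python
def apply_batch(row, mutations, timestamp):
    """Apply every mutation in a batch using one shared timestamp.

    `row` maps column name -> (value, timestamp). A value of None stands in
    for a deletion marker. A mutation is kept only if its timestamp is newer
    than what the row already holds for that column.
    """
    for name, value in mutations.items():
        current = row.get(name)
        if current is None or timestamp > current[1]:
            row[name] = (value, timestamp)
    return row


row = {}

# Client A writes both columns with timestamp 2000.
apply_batch(row, {"name": "alice", "email": "a@example.com"}, 2000)

# Client B's older batch (timestamp 1000) arrives late: an update plus a deletion.
apply_batch(row, {"name": "bob", "email": None}, 1000)
```

Because every record in a batch carries the same timestamp, the whole batch wins or loses against a competing batch on the same columns, so the row does not end up as a mix of the two clients' writes.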

Re: node down window

2010-07-14 Thread B. Todd Burruss

thx, but disappointing :) is this just something we have to live with and periodically "repair" the nodes? or is there future work to tighten up the window? thx On Wed, 2010-07-14 at 12:13 -0700, Jonathan Ellis wrote: > On Wed, Jul 14, 2010 at 1:43 PM, B. Todd Burruss wrote: > > there is a wi

Re: RE: Using Pelops with Cassandra 0.7.X

2010-07-14 Thread Ran Tavory

I can't commit. But we accept contributions :-) On Jul 14, 2010 3:16 PM, "Dop Sun" wrote: Will Hector release one along with 0.7, or will there be any beta or alpha before the official release of 0.7? I’m planning to update my client to work with Cassandra 0.7 trunk now, and I have a dependency on

key types and grouping related rows together

2010-07-14 Thread S Ahmed

Where is the link that describes the various key types and their impact on sorting? (I believe I read it before, but can't seem to find it now.) My application is multi-tenant, so I need the keys to represent things like: website1123 + contentID or website3454 + userID And for range quer
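A small sketch of the composite-key idea from this message, assuming keys are compared as plain strings, as they would be under an order-preserving partitioner. The `:` separator, the helper names, and the sample IDs are assumptions for illustration; the sorted list stands in for the key order on disk.

```python
from bisect import bisect_left

SEP = ":"  # hypothetical separator between the tenant prefix and the item id


def make_key(tenant, item_id):
    """Build a row key such as 'website1123:content42'."""
    return f"{tenant}{SEP}{item_id}"


def tenant_range(tenant):
    """Return (start, end) bounds that cover every key of one tenant."""
    start = tenant + SEP
    end = tenant + chr(ord(SEP) + 1)  # first string that sorts after all 'tenant:' keys
    return start, end


def range_scan(sorted_keys, start, end):
    """Return the keys k with start <= k < end from an already sorted list."""
    lo = bisect_left(sorted_keys, start)
    hi = bisect_left(sorted_keys, end)
    return sorted_keys[lo:hi]


keys = sorted([
    make_key("website1123", "content42"),
    make_key("website3454", "user7"),
    make_key("website1123", "user9"),
    make_key("website11230", "content1"),  # a different tenant with a similar prefix
])

start, end = tenant_range("website1123")
rows = range_scan(keys, start, end)
```

The separator matters: without it, a range starting at `website1123` would also pick up the keys of `website11230`.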

Re: node down window

2010-07-14 Thread Jonathan Ellis

On Wed, Jul 14, 2010 at 1:43 PM, B. Todd Burruss wrote: > there is a window of time from when a node goes down and when the rest > of the cluster actually realizes that it is down. > > what happens to writes during this time frame? does hinted handoff > record these writes and then "handoff" when

Re: Too many open files [was Re: Minimizing the impact of compaction on latency and throughput]

2010-07-14 Thread Jorge Barrios

Each of my top-level functions was allocating a Hector client connection at the top, and releasing it when returning. The problem arose when a top-level function had to call another top-level function, which led to the same thread allocating two connections. Hector was not releasing one of them eve
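One way to avoid the double allocation described above is to make the borrow re-entrant per thread. The sketch below is not Hector code: `FakePool`, `connection`, `inner`, and `outer` are hypothetical names, and a single shared pool is assumed.

```python
import threading
from contextlib import contextmanager


class FakePool:
    """Stand-in for a client connection pool; it only counts borrowed connections."""

    def __init__(self):
        self.borrowed = 0
        self.max_borrowed = 0
        self._lock = threading.Lock()

    def borrow(self):
        with self._lock:
            self.borrowed += 1
            self.max_borrowed = max(self.max_borrowed, self.borrowed)
        return object()

    def release(self, conn):
        with self._lock:
            self.borrowed -= 1


_local = threading.local()


@contextmanager
def connection(pool):
    """Borrow at most one connection per thread, however deeply calls nest."""
    depth = getattr(_local, "depth", 0)
    if depth == 0:
        _local.conn = pool.borrow()  # only the outermost caller borrows
    _local.depth = depth + 1
    try:
        yield _local.conn
    finally:
        _local.depth -= 1
        if _local.depth == 0:
            pool.release(_local.conn)  # only the outermost caller releases
            _local.conn = None


def inner(pool):
    with connection(pool) as conn:
        return conn


def outer(pool):
    # A top-level function calling another top-level function reuses the connection.
    with connection(pool) as conn:
        return conn is inner(pool)


pool = FakePool()
same = outer(pool)
```

With this shape, nested top-level calls on one thread share a single connection, and the release happens exactly once, in the outermost call.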

node down window

2010-07-14 Thread B. Todd Burruss

there is a window of time from when a node goes down and when the rest of the cluster actually realizes that it is down. what happens to writes during this time frame? does hinted handoff record these writes and then "handoff" when the down node returns? or does hinted handoff not kick in until

Re: Too many open files [was Re: Minimizing the impact of compaction on latency and throughput]

2010-07-14 Thread shimi

do you mean that you don't release the connection back to the pool? On 2010 7 14 20:51, "Jorge Barrios" wrote: Thomas, I had a similar problem a few weeks back. I changed my code to make sure that each thread only creates and uses one Hector connection. It seems that client sockets are not being

Re: Too many open files [was Re: Minimizing the impact of compaction on latency and throughput]

2010-07-14 Thread Jorge Barrios

Thomas, I had a similar problem a few weeks back. I changed my code to make sure that each thread only creates and uses one Hector connection. It seems that client sockets are not being released properly, but I didn't have the time to dig into it. Jorge On Wed, Jul 14, 2010 at 8:28 AM, Peter Schu

Re: get_range_slices return the same rows

2010-07-14 Thread Jonathan Ellis

This is a bug. If you can give us data to reproduce with we can fix it faster. On Wed, Jul 14, 2010 at 10:29 AM, shimi wrote: > I wrote a code that iterate on all the rows by using get_range_slices. > for the first call I use KeyRange from "" to "". > for all the others I use from the last key of the previous iteration to

My First Cassandra

2010-07-14 Thread Geoffry Roberts

All, Can anyone help? I followed the instructions for a single node installation of Cassandra. I tried to start it and got: ERROR 08:13:53,499 Exception encountered during startup. java.io.StreamCorruptedException: invalid stream header: 61696E5D at java.io.ObjectInputStream.readStreamH

Re: NYC Cassandra training

2010-07-14 Thread Jonathan Ellis

I bring a USB drive for every attendee. The VM runs Debian. On Wed, Jul 14, 2010 at 10:20 AM, S Ahmed wrote: > How will we load the VM on our machines? Do we download it ? > Is it running Ubuntu? > > On Wed, Jul 14, 2010 at 11:11 AM, Jonathan Ellis wrote: >> >> Turns out we can get a list from

get_range_slices return the same rows

2010-07-14 Thread shimi

I wrote code that iterates over all the rows using get_range_slices. For the first call I use a KeyRange from "" to "". For all the others I use from the last key of the previous iteration to "". I always get the same rows that I got in the previous iteration. I tried changing the batch size but I still get the same results. I tried i
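The paging pattern described in this message can be sketched as below. `fake_range_slices` is an in-memory stand-in, not the real get_range_slices call, and it assumes rows come back in plain key order with an inclusive start key. The `iterate_all` helper and the sample data are hypothetical.

```python
def fake_range_slices(store, start_key, count):
    """Stand-in for a range call: up to `count` rows with key >= start_key, in key order."""
    keys = sorted(k for k in store if k >= start_key)
    return [(k, store[k]) for k in keys[:count]]


def iterate_all(store, batch_size=3):
    """Yield every row once, restarting each batch at the last key seen."""
    if batch_size < 2:
        raise ValueError("batch_size must be at least 2 to make progress")
    start = ""
    first = True
    while True:
        batch = fake_range_slices(store, start, batch_size)
        if not first:
            batch = batch[1:]  # the start key is inclusive, so drop the row already seen
        if not batch:
            return
        for key, value in batch:
            yield key, value
        start = batch[-1][0]
        first = False


store = {name: name.upper() for name in ["a", "b", "c", "d", "e", "f", "g"]}
seen = [key for key, _ in iterate_all(store)]
```

Each batch after the first overlaps the previous one by exactly one row, which is why the batch size has to be at least 2.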

Re: Too many open files [was Re: Minimizing the impact of compaction on latency and throughput]

2010-07-14 Thread Peter Schuller

> [snip] > I'm not sure that is the case. > > When the server gets into the unrecoverable state, the repeating exceptions > are indeed "SocketException: Too many open files". [snip] > Although this is unquestionably a network error, I don't think it is > actually a > network problem per se, as the

Re: NYC Cassandra training

2010-07-14 Thread S Ahmed

How will we load the VM on our machines? Do we download it ? Is it running Ubuntu? On Wed, Jul 14, 2010 at 11:11 AM, Jonathan Ellis wrote: > Turns out we can get a list from Eventbrite: > http://www.eventbrite.com/org/474011012?s=1926097 > > On Tue, Jul 13, 2010 at 3:09 PM, Jonathan Ellis wr

Denver and Seattle Cassandra training

2010-07-14 Thread Jonathan Ellis

Denver on Sept 10 Seattle on Oct 8 http://www.eventbrite.com/org/474011012?s=1926097 -- Jonathan Ellis Project Chair, Apache Cassandra co-founder of Riptano, the source for professional Cassandra support http://riptano.com

Re: NYC Cassandra training

2010-07-14 Thread Jonathan Ellis

Turns out we can get a list from Eventbrite: http://www.eventbrite.com/org/474011012?s=1926097 On Tue, Jul 13, 2010 at 3:09 PM, Jonathan Ellis wrote: > On Fri, Jul 9, 2010 at 9:36 AM, Jeremy Dunck wrote: >> On Fri, Jul 2, 2010 at 1:08 PM, Jonathan Ellis wrote: >>> Riptano's one day Cassandra tr

Re: Too many open files [was Re: Minimizing the impact of compaction on latency and throughput]

2010-07-14 Thread Jonathan Ellis

SocketException means this is coming from the network, not the sstables. Knowing the full error message would be nice, but just about any problem on that end should be fixed by adding connection pooling to your client. (moving to user@) On Wed, Jul 14, 2010 at 5:09 AM, Thomas Downing wrote: > On

Re: Authentication

2010-07-14 Thread Jonathan Ellis

Sounds good to me. On Wed, Jul 14, 2010 at 12:25 AM, Mike Malone wrote: > Yep, as Ben said, we're not asking for anyone to write this for us. > We've been playing with some ideas around encryption between EC2 > data-centers/regions (intra-region is already secure enough for us -- it's > all switc

RE: Using Pelops with Cassandra 0.7.X

2010-07-14 Thread Dop Sun

Will Hector release one along with 0.7, or will there be any beta or alpha before the official release of 0.7? I’m planning to update my client to work with Cassandra 0.7 trunk now, and I have a dependency on your library. :-) Dop From: Ran Tavory [mailto:ran...@gmail.com] Sent: Wednesday, Ju

Re: How to stop Cassandra running in embeded mode

2010-07-14 Thread Ran Tavory

look at my pom. it has forkMode always: http://github.com/rantav/hector/blob/master/pom.xml#L95 On Wed, Jul 14, 2010 at 3:02 PM, Andriy Kopachevsky wrote: > Ran, I do know to run jest in own thread with maven surefire plugin, but > don't sure how can I do this with own JVM for each test. How are you doing >

Re: How to stop Cassandra running in embeded mode

2010-07-14 Thread Andriy Kopachevsky

Ran, I do know how to run tests in their own thread with the Maven Surefire plugin, but I'm not sure how to do this with a separate JVM for each test. How are you doing this? Thanks. On Fri, Jul 9, 2010 at 10:33 PM, Ran Tavory wrote: > The workaround I do is fork always. Each test pulls up its own jvm. > > On Jul 9,

Re: Frequent crashes

2010-07-14 Thread Peter Schuller

> How much memory does cassandra need to manage 300G of data load? How much > extra memory will be needed when doing compaction? For one thing it depends on the data. One thing that scales linearly (but with a low constant) with the amount of data is the bloom filters. If those 300 GB correspond
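To make the "linear with a low constant" point concrete, here is a rough estimate using the textbook bloom filter sizing formula, bits per key = -ln(p) / (ln 2)^2. The false-positive rate and key counts are assumptions for illustration; the thread does not state the parameters Cassandra actually uses.

```python
import math


def bloom_bits_per_key(false_positive_rate):
    """Bits per key for an optimally sized bloom filter (textbook formula)."""
    return -math.log(false_positive_rate) / (math.log(2) ** 2)


def bloom_bytes(num_keys, false_positive_rate):
    """Approximate filter size in bytes for `num_keys` keys."""
    return math.ceil(num_keys * bloom_bits_per_key(false_positive_rate) / 8)


bits = bloom_bits_per_key(0.01)       # assumed 1% false-positive rate
small = bloom_bytes(1_000_000, 0.01)  # assumed key counts
large = bloom_bytes(2_000_000, 0.01)
```

Doubling the number of keys doubles the filter size, while the per-key cost stays at roughly ten bits for a 1% false-positive rate, so the total depends on how many rows the data holds rather than on its size in bytes.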

Frequent crashes

2010-07-14 Thread 王一锋

Hi, Has anybody done any memory usage analysis for cassandra? How much memory does cassandra need to manage 300G of data load? How much extra memory will be needed when doing compaction? Regarding mmap, memory usage will be determined by the OS so it has nothing to do with the heap size of JV

Best Practices

2010-07-14 Thread Mubarak Seyed

Are there any best practices for Storage configurations, MemTable thresholds and Linux performance tuning to tune Cassandra nodes? -- Thanks, Mubarak Seyed.
