Re: advice, is cassandra suitable for a multi-tanency vBulletin type application?

2010-07-12 Thread Benjamin Black
On Mon, Jul 12, 2010 at 11:35 PM, Michael Dürgner wrote: > The thing about slow on joins is true (we experience that ourselves) but > still I wonder myself, why you use cassandra for the indices. Can't you just > store them in MySQL although? > ...and then shard and shard and shard to deal with

Re: advice, is cassandra suitable for a multi-tanency vBulletin type application?

2010-07-12 Thread Benjamin Black
We use Cassandra (multidimensional metrics) *and* redis (counters and alerts) *and* MySQL (supporting Rails). Right tool for each job. The idea that it is a good thing to cram everything into a single database (and data model), beaten into everyone by years of relational database marketing, is no

Re: advice, is cassandra suitable for a multi-tanency vBulletin type application?

2010-07-12 Thread Michael Dürgner
The thing about slow on joins is true (we experience that ourselves) but still I wonder myself, why you use cassandra for the indices. Can't you just store them in MySQL although? On 13.07.2010 at 08:26, Sandeep Kalidindi at PaGaLGuY.com wrote: > @paul - cassandra is really good for storing in

Re: advice, is cassandra suitable for a multi-tanency vBulletin type application?

2010-07-12 Thread Sandeep Kalidindi at PaGaLGuY.com
@paul - cassandra is really good for storing indices. But i like redis because it provides us with some of the really good data-structures like sorted sets and all. So we use both to their strengths. For example in a forum - all the posts and replies in a thread + which user is following which thre

Re: Authentication

2010-07-12 Thread Michael Pearson
Hey Stu, I've been using 0.6.3's SimpleAuthenticator without a hitch (just had to figure out the daemon args -Dpasswd.properties=conf/passwd.properties -Daccess.properties=conf/access.properties) - why do you ask? -michael -- http://www.github.com/mjpearson http://www.linkedin.com/in/mjpearso

Re: advice, is cassandra suitable for a multi-tanency vBulletin type application?

2010-07-12 Thread Michael Dürgner
Are your PVs mostly reads or writes? If they are reads, I'd think you wouldn't need Cassandra-like storage, which is tuned towards writes. On 12.07.2010 at 23:40, Sandeep Kalidindi at PaGaLGuY.com wrote: > well we were going down constantly with VB running on 3-4 dedicated servers > due to h

Re: concurrent reads

2010-07-12 Thread Peter Schuller
> Has anyone experimented with different settings for concurrent reads?  I > have set our servers to 4 ( 2 per processor core ).  I have noticed that > occasionally, our pending reads will get backed up and our servers don't > appear to be under too much load.  In fact, most of the load appears to

Re: server needs thrift to run also?

2010-07-12 Thread Eric Evans
On Mon, 2010-07-12 at 23:13 -0500, Jonathan Ellis wrote: > would it be hard to make "easy_install pycassa" install thrift > automagically? I think it would do that already, assuming that pycassa itself was installable from the cheeseshop (not sure why it isn't). Twissandra is actually using an emb

Re: GCGraceSeconds per ColumnFamily/Keyspace

2010-07-12 Thread Jonathan Ellis
Probably. Can you open a ticket? On Mon, Jul 12, 2010 at 10:41 PM, Todd Burruss wrote: > Is it possible to get this feature in 0.7? > > > > -Original Message- > From: Jonathan Ellis [jbel...@gmail.com] > Received: 7/12/10 5:06 PM > To: user@cassandra.apache.org [u...@cassandra.apache.org

Re: server needs thrift to run also?

2010-07-12 Thread Jonathan Ellis
would it be hard to make "easy_install pycassa" install thrift automagically? On Mon, Jul 12, 2010 at 10:36 PM, Eric Evans wrote: > On Mon, 2010-07-12 at 17:16 -0400, S Ahmed wrote: >> Ok I guess I have to read up on exactly what is going on here. >> >> I figured I could download twissandra, fire

Re: concurrent reads

2010-07-12 Thread Jonathan Ellis
if you're not sure where your bottleneck is, you aren't hitting it hard enough :) On Mon, Jul 12, 2010 at 9:00 PM, Lee Parker wrote: > Has anyone experimented with different settings for concurrent reads?  I > have set our servers to 4 ( 2 per processor core ).  I have noticed that > occasionally

Re: GCGraceSeconds per ColumnFamily/Keyspace

2010-07-12 Thread Todd Burruss
Is it possible to get this feature in 0.7? -Original Message- From: Jonathan Ellis [jbel...@gmail.com] Received: 7/12/10 5:06 PM To: user@cassandra.apache.org [u...@cassandra.apache.org] Subject: Re: GCGraceSeconds per ColumnFamily/Keyspace GCGS per CF sounds totally reasonable to me.

Re: server needs thrift to run also?

2010-07-12 Thread Eric Evans
On Mon, 2010-07-12 at 17:16 -0400, S Ahmed wrote: > Ok I guess I have to read up on exactly what is going on here. > > I figured I could download twissandra, fire up cassandra and run the > app! Pretty much, but you do need to install the Thrift python module (which the README does say). Try: ea

CassandraBulkLoader

2010-07-12 Thread Mubarak Seyed
Where can I find the documentation for BinaryMemTable (bmt_example in contrib) to use CassandraBulkLoader? Do I need HDFS to store my storage-conf.xml? What input must be supplied to CassandraBulkLoader? How do I form the input data, and what is its format? -- Thanks, Mub

concurrent reads

2010-07-12 Thread Lee Parker
Has anyone experimented with different settings for concurrent reads? I have set our servers to 4 ( 2 per processor core ). I have noticed that occasionally, our pending reads will get backed up and our servers don't appear to be under too much load. In fact, most of the load appears to be from

Re: TechCrunch article on Twitter and Cassandra

2010-07-12 Thread Colin Clark
Glad to hear it; and I'm thrilled to see innovation occurring in this space. But one consulting company that's been in business for a couple of months now wouldn't help me with tier 1 deployments. I'm looking forward to having an ecosystem around Cassandra - I think that Cassandra, and other

Re: TechCrunch article on Twitter and Cassandra

2010-07-12 Thread Jonathan Ellis
On Sat, Jul 10, 2010 at 2:22 PM, Colin Clark wrote: > Although I'm a fan of Cassandra, there's no way I'd use it today for my tier > 1 deployments, because I don't have the resources of Facebook, and even > though Cassandra is open source, that doesn't mean I can fix it when it goes > down.  And,

Re: GCGraceSeconds per ColumnFamily/Keyspace

2010-07-12 Thread Jonathan Ellis
GCGS per CF sounds totally reasonable to me. On Mon, Jul 12, 2010 at 6:33 PM, Todd Burruss wrote: > I have two CFs in my keyspace.  one i care about allowing a good amount of > time for tombstones to propagate (GCGraceSeconds large) ... but the other i > couldn't care and in fact i want them go

GCGraceSeconds per ColumnFamily/Keyspace

2010-07-12 Thread Todd Burruss
I have two CFs in my keyspace. one i care about allowing a good amount of time for tombstones to propagate (GCGraceSeconds large) ... but the other i couldn't care and in fact i want them gone ASAP so i don't iterate over them. has any thought been given to making this setting per Keyspace or

RE: Question about CL.ZERO

2010-07-12 Thread Todd Burruss
the goal i am reaching for with ZERO is to return control to the "user" ASAP, with super fast response times. the load isn't high at all, but persisting does take time even under light load. we are not actually using ZERO at the moment but were considering it for "fire and forget" type of even

Re: advice, is cassandra suitable for a multi-tanency vBulletin type application?

2010-07-12 Thread Paul Prescod
Why Cassandra *and* Redis? What do you perceive as the strengths or weaknesses of the two? On Mon, Jul 12, 2010 at 2:40 PM, Sandeep Kalidindi at PaGaLGuY.com wrote: > well we were going down constantly with VB running on 3-4 dedicated servers > due to huge traffic(couple of tens of millions of pa

Re: server needs thrift to run also?

2010-07-12 Thread Miguel Verde
I'll take a guess. S Ahmed, the Thrift compiler takes a .thrift file and can generate client and server code for it in your language of choice. This code depends on the Thrift runtime library in that language. For instance, the Thrift Java runtime library is bundled with Cassandra as a jar. Whe

Re: Is anyone using version 0.7 schema update API

2010-07-12 Thread Benjamin Black
I guess I don't understand what is so complicated about the schema management calls that numerous examples are needed. On Mon, Jul 12, 2010 at 4:43 AM, GH wrote: > Hi, > My problem is that I cannot locate Java equivalents to the api calls you > present in the ruby files you have presented. They a

Re: server needs thrift to run also?

2010-07-12 Thread Benjamin Black
You were just told it is packaged with what it needs. The API is not changed from 0.6.1 to 0.6.3. Why do you think you need to generate client code? On Mon, Jul 12, 2010 at 2:16 PM, S Ahmed wrote: > Ok I guess I have to read up on exactly what is going on here. > I figured I could download twis

Re: Question about CL.ZERO

2010-07-12 Thread Benjamin Black
CL.ONE represents the fastest you can sustain. CL.ZERO represents writing to memory on the coordinator, regardless of what the nodes can sustain for durable writes. That is a bad situation, regardless of your durability goals. So, there is no good reason. What you are describing is a non-existe

Re: Iterate all keys - doing it as the faq fails for me :(

2010-07-12 Thread Jonathan Ellis
I can't picture how you could be reading data that sorts *before* the start key with a range slice. So, probably not fixed in 0.6.3. On Mon, Jul 12, 2010 at 7:56 AM, Per Olesen wrote: > >>This is a bug.  Can you submit a ticket with test data to reproduce? > > Uuuh, maybe...:) > > Right now it i

Re: Question about CL.ZERO

2010-07-12 Thread Aaron Morton
My understanding is that the coordinator will acknowledge the writes faster than they can actually be written. Eventually it will run out of buffer space. See http://wiki.apache.org/cassandra/FAQ#slows_down_after_lotso_inserts Using CL.ONE makes it harder for the clients to flood the cluster with mo
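
A rough way to picture the buffer argument above: if the coordinator acknowledges writes faster than the nodes can persist them, the backlog grows linearly until the buffer is exhausted. The sketch below is plain arithmetic, not Cassandra code; the function name, the rates, and the buffer size are all illustrative assumptions.

```python
def seconds_until_buffer_full(ack_rate, durable_rate, buffer_capacity):
    """Estimate how long an in-memory write buffer lasts.

    ack_rate        -- writes acknowledged per second (hypothetical figure)
    durable_rate    -- writes actually persisted per second (hypothetical figure)
    buffer_capacity -- number of pending writes the buffer can hold (hypothetical)

    Returns None when the backlog never grows, i.e. the nodes keep up.
    """
    growth_per_second = ack_rate - durable_rate
    if growth_per_second <= 0:
        return None
    return buffer_capacity / growth_per_second


if __name__ == "__main__":
    # Acknowledging 12,000 writes/s while persisting 10,000 writes/s
    # leaves 2,000 writes/s piling up in memory.
    print(seconds_until_buffer_full(12_000, 10_000, 1_000_000))
```

With a consistency level that waits for at least one node, the client's acknowledged rate is tied to what the nodes can sustain, so the growth term stays near zero.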

Re: Question regarding consistency and deletion

2010-07-12 Thread Aaron Morton
The tombstones are removed after GCGraceSeconds (in storage-conf.xml), at the next major compaction: http://wiki.apache.org/cassandra/MemtableSSTable?highlight=%28tombstones%29 Take a look at http://wiki.apache.org/cassandra/DistributedDeletes and Handling Failure on http://wiki.apache.org/
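
The rule described above (a tombstone becomes removable once GCGraceSeconds have passed, and is actually dropped at the next compaction) can be sketched as follows. This is a simplified model, not Cassandra's implementation: the column representation, the function names, and the 864000-second grace period used in the example are assumptions for illustration.

```python
def tombstone_is_collectable(deleted_at, now, gc_grace_seconds):
    """True once more than gc_grace_seconds have passed since the delete."""
    return now - deleted_at > gc_grace_seconds


def compact(columns, now, gc_grace_seconds):
    """Model of a compaction pass over (name, deleted_at) pairs.

    deleted_at is None for live columns and a timestamp (in seconds) for
    tombstones. Live columns and young tombstones are kept; tombstones
    older than the grace period are dropped.
    """
    return [
        (name, deleted_at)
        for name, deleted_at in columns
        if deleted_at is None
        or not tombstone_is_collectable(deleted_at, now, gc_grace_seconds)
    ]


if __name__ == "__main__":
    grace = 864_000  # ten days, an illustrative value
    now = 2_000_000
    columns = [
        ("live", None),
        ("old_tombstone", now - grace - 1),
        ("young_tombstone", now - 60),
    ]
    print(compact(columns, now, grace))
```

Until a compaction drops them, readers still iterate over the tombstones, which is why a short grace period is attractive for data that does not need the longer propagation window.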

Re: advice, is cassandra suitable for a multi-tanency vBulletin type application?

2010-07-12 Thread Sandeep Kalidindi at PaGaLGuY.com
well we were going down constantly with VB running on 3-4 dedicated servers due to huge traffic (a couple of tens of millions of page views). We are also planning on some new major features, hence the shift to cassandra with future in mind. Well roughly the architecture is like this (in order of how t

Authentication

2010-07-12 Thread Stu Hood
Hello out there, If you are running Cassandra 0.6.*, and are using Cassandra's authentication (IAuthenticator/SimpleAuthenticator), I'd love to hear about it! Thanks, Stu Hood @stuhood Architecture Software Developer Rackspace Hosting

Re: advice, is cassandra suitable for a multi-tanency vBulletin type application?

2010-07-12 Thread S Ahmed
What sort of traffic levels made you port the application to Cassandra? Very interested in seeing this go live. What sort of server setup are you looking at using? On Mon, Jul 12, 2010 at 4:39 PM, Sandeep Kalidindi at PaGaLGuY.com < sandeep.kalidi...@pagalguy.com> wrote: > No we re-coded from s

Re: server needs thrift to run also?

2010-07-12 Thread S Ahmed
Ok I guess I have to read up on exactly what is going on here. I figured I could download twissandra, fire up cassandra and run the app! I thought all you needed was the python driver which comes with twissandra. Let me read more about Thrift and generating client code etc. thanks! On Mon, Jul

Re: server needs thrift to run also?

2010-07-12 Thread Michael Pearson
Twissandra is packaged with pycassa + correct generated thrift transports under /deps already, so really just need the thrift binary to build from a cassandra.thrift API newer than what's currently supported by the bundled pycassa. -michael On Mon, Jul 12, 2010 at 1:55 PM, Stu Hood wrote: > You'

Re: server needs thrift to run also?

2010-07-12 Thread Stu Hood
You'll need Thrift installed to generate the _client_ code: the server code is embedded within Cassandra. -Original Message- From: "S Ahmed" Sent: Monday, July 12, 2010 3:49pm To: user@cassandra.apache.org Subject: Re: server needs thrift to run also? confused, why does the installation

Re: server needs thrift to run also?

2010-07-12 Thread S Ahmed
confused, why does the installation guide say to build and make it then? http://github.com/ericflo/twissandra twissandra is for 0.6.1 is that why? i.e. it was embedded in a later version? On Mon, Jul 12, 2010 at 4:46 PM, Stu Hood wrote: > The Thrift server

RE: server needs thrift to run also?

2010-07-12 Thread Stu Hood
The Thrift server is embedded in Cassandra, and starts by default. Look for references to Thrift on: http://wiki.apache.org/cassandra/GettingStarted Thanks, Stu -Original Message- From: "S Ahmed" Sent: Monday, July 12, 2010 3:43pm To: user@cassandra.apache.org Subject: server needs thri

Re: TechCrunch article on Twitter and Cassandra

2010-07-12 Thread Eric Evans
On Sun, 2010-07-11 at 01:06 +0530, Sumit Datta wrote: > What I do not see are details as to why Cassandra is not being used to > store tweets. Or the details of the implementation that does have > Cassandra. I wouldn't let that stop you. You should consider doing what so many others are: treat al

server needs thrift to run also?

2010-07-12 Thread S Ahmed
I'm trying to follow along the twissandra installation instructions. So to get it running I have to install Thrift. So thrift runs as another service? So communication is done via thrift, which then communicates to Cassandra on another port?

Re: advice, is cassandra suitable for a multi-tanency vBulletin type application?

2010-07-12 Thread Sandeep Kalidindi at PaGaLGuY.com
No we re-coded from scratch with most of the needed functionality. Cheers, Deepu. On Mon, Jul 12, 2010 at 7:49 PM, S Ahmed wrote: > Very interesting! > > What kind of integration do you have between vB and Cassandra? its not a > port then? > > > On Mon, Jul 12, 2010 at 3:34 AM, Sandeep Kalidind

Re: Question about CL.ZERO

2010-07-12 Thread B. Todd Burruss
why is there no good reason? if i would like to record informational events, possibly for debugging or something, i don't care if they actually get saved and i want the client's request to be as fast as possible. this sounds like a good reason. are you saying that CL.ONE is equally performan

Question regarding consistency and deletion

2010-07-12 Thread Samuru Jackson
Hi, I'm fairly new to Cassandra and started to set up a small cluster for playing around and evaluating it for my potential purposes. As far as I understand I can't remove whole rows - instead the columns of a deleted row are removed and a client can decide based on the row's column count if it

Re: High CPU usage on all nodes without any read or write

2010-07-12 Thread Olivier Rosello
> > But in Cassandra output log : > > r...@cassandra-2:~#  tail -f /var/log/cassandra/output.log > >  INFO 15:32:05,390 GC for ConcurrentMarkSweep: 1359 ms, 4295787600 > reclaimed leaving 1684169392 used; max is 6563430400 > >  INFO 15:32:09,875 GC for ConcurrentMarkSweep: 1363 ms, 4296991416 > rec

Re: advice, is cassandra suitable for a multi-tanency vBulletin type application?

2010-07-12 Thread S Ahmed
Very interesting! What kind of integration do you have between vB and Cassandra? its not a port then? On Mon, Jul 12, 2010 at 3:34 AM, Sandeep Kalidindi at PaGaLGuY.com < sandeep.kalidi...@pagalguy.com> wrote: > we were one of the vbulletin customers and our forums has been facing some > bad sca

SV: Iterate all keys - doing it as the faq fails for me :(

2010-07-12 Thread Per Olesen
>This is a bug. Can you submit a ticket with test data to reproduce? Uuuh, maybe...:) Right now it is happening on some live user data, that I am not sure I can ship. Haven't tried if I can reproduce locally. One question: We are running 0.6.2. Could this be fixed in 0.6.3? Not that big a pr

Re: Iterate all keys - doing it as the faq fails for me :(

2010-07-12 Thread Jonathan Ellis
This is a bug. Can you submit a ticket with test data to reproduce? On Fri, Jul 9, 2010 at 6:40 AM, Per Olesen wrote: > Hi, > > I was reading http://wiki.apache.org/cassandra/FAQ#iter_world and decided to > implement the get_range_slices method for listing all keys of a CF. Only > thing is, it

Re: Is anyone using version 0.7 schema update API

2010-07-12 Thread GH
Hi, My problem is that I cannot locate Java equivalents to the api calls you present in the ruby files you have presented. They are not visible in the java client packages I have (My code is not that old of trunk). I located the code below from some of the unit test code files This code will h

Re: advice, is cassandra suitable for a multi-tanency vBulletin type application?

2010-07-12 Thread Sandeep Kalidindi at PaGaLGuY.com
we were one of the vbulletin customers and our forums have been facing some bad scaling issues. we coded our forum software to work with cassandra. we are still testing for bugs and might go live in couple of weeks. You can ask any specific questions about vbulletin and cassandra and i will answer

RE: Iterate all keys - doing it as the faq fails for me :(

2010-07-12 Thread Per Olesen
Anyone? - Hi, I was reading http://wiki.apache.org/cassandra/FAQ#iter_world and decided to implement the get_range_slices method for listing all keys of a CF. Only thing is, it doesn't work that well for me :-) I do as it says (I think), and take KeyRanges of size N and use the key of
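
For readers following this thread, the paging pattern from the FAQ entry can be sketched roughly as below: fetch N keys, then start the next fetch at the last key returned, skipping that one repeated key. This is a sketch under assumptions, not client code. `fetch_range` is a hypothetical in-memory stand-in for a `get_range_slices` call, and it assumes keys come back in sorted order with an inclusive start key.

```python
from bisect import bisect_left


def fetch_range(sorted_keys, start_key, count):
    """Hypothetical stand-in for get_range_slices.

    Returns up to `count` keys that are >= start_key, in sorted order.
    The start key is inclusive, which is why the caller must skip it
    on every page after the first.
    """
    i = bisect_left(sorted_keys, start_key)
    return sorted_keys[i:i + count]


def iterate_all_keys(sorted_keys, page_size):
    """Yield every key exactly once, one page at a time."""
    if page_size < 2:
        # With a page size of 1, each page would contain only the
        # repeated start key and the loop would never advance.
        raise ValueError("page_size must be at least 2")
    start = ""          # empty start key means "from the beginning"
    first_page = True
    while True:
        page = fetch_range(sorted_keys, start, page_size)
        # Skip the first entry on later pages: it is the previous
        # page's last key, already yielded.
        for key in (page if first_page else page[1:]):
            yield key
        if len(page) < page_size:
            break       # a short page means the range is exhausted
        start = page[-1]
        first_page = False


if __name__ == "__main__":
    print(list(iterate_all_keys(["a", "b", "c", "d", "e"], 2)))
```

The loop only terminates correctly if every page is ordered the same way as the start key comparison; a page that contains keys sorting before the start key, as reported in this thread, breaks that assumption.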