Re: Do supercolumns have a purpose?

2011-02-11 Thread Stu Hood
I would like to continue to support super columns, but to slowly convert them into "compound column names", since that is all they really are. On Thu, Feb 10, 2011 at 10:16 AM, Frank LoVecchio wrote: > I've found super column families quite useful when using > RandomOrderedPartioner on a l

Re: Multiple inequality filters

2011-02-11 Thread Stu Hood
_But_, vote for https://issues.apache.org/jira/browse/CASSANDRA-1472 if you'd like to be able to perform this type of query easily*. Binned bitmap indexes can perform compound range queries extremely quickly. * Assuming that your data isn't extremely volatile, in which case those indexes are not t

Indexes and hard disk

2011-02-11 Thread mcasandra
Are indexes supported in Cassandra? If yes, then what kind? Also, if they are supported, please point me to a place that gives more information about them. Is there any kind of hard disk in particular that is recommended for Cassandra? We generally get only 500GB hard disks on our virtual machines. But I r

Re: Calculating the size of rows in KBs

2011-02-11 Thread Aklin_81
I think it does not deserialize the entire list of columns in the row (though that is the case with subcolumns in a supercolumn). In the case of standard columns, only the blocks on disk containing the values of the requested columns are read and deserialized to get the values. On Sat,

Re: Calculating the size of rows in KBs

2011-02-11 Thread Stu Hood
> Does it also mean that the whole row will be deserialized when a query comes > just for one column? No, it does not mean that: at most column_index_size_in_kb will be read to read a single column, independent of where that column is in the row. On the other hand, with the row cache enabled, it i

RE: ORM over Cassandra

2011-02-11 Thread Vivek Mishra
Compiled one post on this: http://mevivs.wordpress.com/2011/02/12/hector-kundera/ The reason to put this up for discussion is to see what can be seen/further developed as a better option. From: Vivek Mishra [vivek.mis...@impetus.co.in] Sent: 11 February 2011 20:01

Re: Calculating the size of rows in KBs

2011-02-11 Thread buddhasystem
Does it also mean that the whole row will be deserialized when a query comes just for one column?

Re: cassandra solaris x64 support

2011-02-11 Thread Xiaobo Gu
On Fri, Feb 11, 2011 at 11:54 PM, Sylvain Lebresne wrote: > > > On Fri, Feb 11, 2011 at 4:27 PM, Xiaobo Gu wrote: >> >> On Fri, Feb 11, 2011 at 11:21 PM, Roland Gude >> wrote: >> > This is a problem with the start scripts, not with Cassandra itself (or >> > any of its configuration) >> > The she

Re: Calculating the size of rows in KBs

2011-02-11 Thread Robert Coli
On Thu, Feb 10, 2011 at 12:24 PM, Aaron Morton wrote: > In general the entire row only exists in memory when it is contained in the > first Memtable it's written to. Or, somewhat importantly considering the serialization penalty paid to load it there, when it is in the Row Cache. As a simple ex

Re: Out of control memory consumption

2011-02-11 Thread Robert Coli
On Thu, Feb 10, 2011 at 9:17 AM, Huy Le wrote: > Yes, we had setting at 75 but JVM did not have enough time to do GC, so it > abort GC'ing.   We lowered it to 50, but still had issue, so we lowered it > again to 35. If you lower CMSInitiatingOccupancyFraction below the size of your typical workin

unsubscribe

2011-02-11 Thread Shu Zhang
unsubscribe

Re: Pig not reading all cassandra data

2011-02-11 Thread Matt Kennedy
Sorry it has taken me a while to get back to this. I'm still trying to get to the bottom of this to find where the disconnect is between the column family input format code and the Pig optimizer. I suspected that the problem was line 365 of: http://svn.apache.org/viewvc/pig/tags/release-0.8.0/src

RE: Basic Cassandra Architecture questions

2011-02-11 Thread mcasandra
What's the best practice in terms of consistency? I am assuming R+W > N should be the best practice. I thought that even if R+W=N there is some version-level reconciliation that kicks in when an older version of the key/value is read. But come to think of it, that may not be possible. But then if R+W <=

Latest Hector release (0.7.0-26) includes experimental virtual keyspace support

2011-02-11 Thread Ed Anuff
The latest version of the Hector Java client has experimental support for a "virtual keyspaces" feature that transparently adds a prefix to, and removes it from, all row keys sent between Hector and Cassandra. There's a small write-up of it here: https://github.com/rantav/hector/wiki/Virtual-Keyspaces The
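
A toy sketch of the key-prefixing idea described in this announcement. This is not Hector's API (Hector is a Java client; Python is used here only for brevity), and the class PrefixedStore and its methods are hypothetical names made up for illustration. A plain dict stands in for the shared keyspace.

```python
class PrefixedStore:
    """Hypothetical wrapper: adds a prefix on the way in, strips it on the way out."""

    def __init__(self, backing: dict, prefix: bytes):
        self._backing = backing  # shared storage, standing in for one real keyspace
        self._prefix = prefix    # identifies this "virtual keyspace"

    def put(self, key: bytes, value: bytes) -> None:
        # The caller never sees the prefix; it is added transparently.
        self._backing[self._prefix + key] = value

    def get(self, key: bytes) -> bytes:
        return self._backing[self._prefix + key]

    def keys(self) -> list:
        # Return only this tenant's keys, with the prefix removed again.
        n = len(self._prefix)
        return [k[n:] for k in self._backing if k.startswith(self._prefix)]


shared = {}
app_a = PrefixedStore(shared, b"a:")
app_b = PrefixedStore(shared, b"b:")
app_a.put(b"row1", b"from A")
app_b.put(b"row1", b"from B")
print(app_a.get(b"row1"), app_b.get(b"row1"), sorted(shared))
```

Both applications use the same row key without colliding, because the stored keys differ by prefix. Note that the prefixes must be chosen so that none is a prefix of another.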

RE: Basic Cassandra Architecture questions

2011-02-11 Thread Shu Zhang
"So if Key A is supposed to go to Node, 1,2,3 then the commit log for Key A will be on each of these nodes?" There isn't a commit log per key, just one for each node tracking what's been written to that node. If node1 determines that node2 or node3 should handle a request it received, it'll route

Re: Basic Cassandra Architecture questions

2011-02-11 Thread Ryan King
On Fri, Feb 11, 2011 at 9:37 AM, mcasandra wrote: > > Is commit log file maintained on every node that's responsible to keep key > ranges? So if Key A is supposed to go to Node, 1,2,3 then the commit log for > Key A will be on each of these nodes? Is this commit log like redo log of > oracle, whic

Re: Basic Cassandra Architecture questions

2011-02-11 Thread Anthony John
>I am trying to think why R + W > N is said to be consistent and not R + W = N? E.g., with an RF of 4, a write goes to nodes 1/2 and, in the R+W=N case, reads could happen from nodes 3/4. So your write could be missed! HTH, -JA On Fri, Feb 11, 2011 at 11:37 AM, mcasandra wrote: > > Is commit log file maintai
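
The overlap argument in this reply can be checked by brute force. The sketch below is an illustration only, not Cassandra code; the function name overlap_guaranteed is made up here. It enumerates every possible set of W replicas that acknowledged a write and every set of R replicas that answers a read, and reports whether they always share at least one replica.

```python
from itertools import combinations


def overlap_guaranteed(n: int, r: int, w: int) -> bool:
    """True if every read set of size r intersects every write set of size w
    among n replicas (checked exhaustively, so keep n small)."""
    replicas = range(n)
    return all(
        set(read_set) & set(write_set)  # an empty intersection is falsy
        for write_set in combinations(replicas, w)
        for read_set in combinations(replicas, r)
    )


print(overlap_guaranteed(4, 2, 2))  # R + W = N: prints False
print(overlap_guaranteed(4, 3, 2))  # R + W > N: prints True
```

With N=4 and R=W=2, the write set {0, 1} and the read set {2, 3} are disjoint, which is exactly the missed-write case described in the reply.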

RE: Basic Cassandra Architecture questions

2011-02-11 Thread mcasandra
Is a commit log file maintained on every node that's responsible for a key range? So if Key A is supposed to go to nodes 1, 2, 3, then the commit log for Key A will be on each of these nodes? Is this commit log like Oracle's redo log, which is used in case of failure to roll forward/back the writes

Re: Column name size

2011-02-11 Thread Chris Burroughs
On 02/11/2011 05:06 AM, Patrik Modesto wrote: > Hi all! > > I'm thinking if size of a column name could matter for a large dataset > in Cassandra (I mean lots of rows). For example what if I have a row > with 10 columns each has 10 bytes value and 10 bytes name. Do I have > half the row size just

Re: Column name size

2011-02-11 Thread Ryan King
On Fri, Feb 11, 2011 at 2:06 AM, Patrik Modesto wrote: > Hi all! > > I'm thinking if size of a column name could matter for a large dataset > in Cassandra  (I mean lots of rows). For example what if I have a row > with 10 columns each has 10 bytes value and 10 bytes name. Do I have > half the row

Re: cassandra solaris x64 support

2011-02-11 Thread Sylvain Lebresne
On Fri, Feb 11, 2011 at 4:27 PM, Xiaobo Gu wrote: > On Fri, Feb 11, 2011 at 11:21 PM, Roland Gude > wrote: > > This is a problem with the start scripts, not with Cassandra itself (or > any of its configuration) > > The shell you are using cannot start the cassandra shell script. > > > > Try > >

unsubscribe

2011-02-11 Thread Peter Halliday

Re: cassandra solaris x64 support

2011-02-11 Thread Xiaobo Gu
On Fri, Feb 11, 2011 at 11:21 PM, Roland Gude wrote: > This is a problem with the start scripts, not with Cassandra itself (or any > of its configuration) > The shell you are using cannot start the cassandra shell script. > > Try > #bash bin/cassandra -f You are right, but there are other problem

Re: Limit on amount of CFs

2011-02-11 Thread buddhasystem
I asked a similar question (but didn't receive an answer). I'm trying to see if a large number of CFs might be beneficial. One thing I can think about is the size of extra storage needed for compaction -- obviously it will be smaller in case of many smaller CFs.

Re: Column name size

2011-02-11 Thread buddhasystem
I've been thinking about this as well. I'm migrating data from a large Oracle database, and the RDBMS column names are descriptive (good) and long (bad). For now I just keep them when populating Cassandra, but I can shave off about 30% of storage by hashing names. I don't need any automation and

AW: cassandra solaris x64 support

2011-02-11 Thread Roland Gude
This is a problem with the start scripts, not with Cassandra itself (or any of its configuration) The shell you are using cannot start the cassandra shell script. Try #bash bin/cassandra -f As far as I know, it should work fine. Actually it should work with sh as well... -Ursprüngliche N

Re: How to store news lists in optimal way?

2011-02-11 Thread Bill Speirs
I don't know enough about Lucene to comment, but option #2 seems like a bad idea. You shouldn't grow your database by the number of Column Families as there are bad implications to doing this. Option #1 or #3 seems plausible depending upon how much data you have. Hope this helps... Bill- On Fri,

Re: cassandra solaris x64 support

2011-02-11 Thread Xiaobo Gu
On Fri, Feb 11, 2011 at 10:51 PM, Jonathan Ellis wrote: > The vast majority run on Linux, but there are a few people running > Cassandra on Solaris, FreeBSD, and Windows. But I failed to start the one node test cluster, # sh bin/cassandra -f bin/cassandra: syntax error at line 22: `MAX_HEAP_SIZE=$

Re: cassandra solaris x64 support

2011-02-11 Thread Jonathan Ellis
The vast majority run on Linux, but there are a few people running Cassandra on Solaris, FreeBSD, and Windows. On Fri, Feb 11, 2011 at 4:40 AM, Xiaobo Gu wrote: > Hi, > Because I can't access the archives of the mailing list, so my > apologies if someone have asked this before. > > Does any have

RE: ORM over Cassandra

2011-02-11 Thread Vivek Mishra
Nope, I guess. But Kundera does. I am compiling a document to compare the features of HOM/GORA/KUNDERA and look into further details. From: Dodong Juan [dodongj...@gmail.com] Sent: 11 February 2011 18:00 To: user@cassandra.apache.org Cc: user@cassandra.apache.o

Re: ORM over Cassandra

2011-02-11 Thread Dodong Juan
Does HOM support finders on any POJO attributes? Sent from my iPhone On Feb 10, 2011, at 5:43 PM, "B. Todd Burruss" wrote: > wiki page is here ... > > https://github.com/rantav/hector/wiki/Hector-Object-Mapper-(HOM) > > >

Re: Super Slow Multi-gets

2011-02-11 Thread Bill Speirs
Sorry, I was setting the file on my client not the server. I will make this change and get back to you. Thanks again for the help... Bill- On Feb 10, 2011 4:45 PM, "Bill Speirs" wrote: > Doesn't seem to help, I just get a bunch of messages that look like this: > > DEBUG - Transport open status

Re: Column name size

2011-02-11 Thread Aklin_81
Would be interested in your findings, Patrik! I too was searching for something similar a few days back, for column names that contained userIds of users on my application. UUIDs, which seemed to be the most widely recognized (perhaps!) solution, are 16 bytes, but those definitely seem like a too muc

AW: Data ends up in wrong Columnfamily

2011-02-11 Thread Roland Gude
Yes, this could very well be the issue. As I see, it's already fixed for 0.7.1. Hopefully it will pass a vote soon. Thanks, Roland -Original Message- From: sc...@scode.org [mailto:sc...@scode.org] On Behalf Of Peter Schuller Sent: Friday, 11 February 2011 09:11 To: user@cassand

AW: Why is it when I removed a row the RowKey is still there?

2011-02-11 Thread Roland Gude
It has something to do with the way data is deleted in Cassandra. You are not doing anything wrong. For some more detail, see http://wiki.apache.org/cassandra/FAQ#range_ghosts or http://wiki.apache.org/cassandra/DistributedDeletes -Original Message- From: Joshua Partogi [

cassandra solaris x64 support

2011-02-11 Thread Xiaobo Gu
Hi, I can't access the archives of the mailing list, so my apologies if someone has asked this before. Has anyone successfully run Cassandra on Solaris 10 x64 clusters? Regards, Xiaobo Gu

Why is it when I removed a row the RowKey is still there?

2011-02-11 Thread Joshua Partogi
Hi, I am very puzzled with this. So I removed a row from the client, but when I query the data from CLI, the rowkey is still there: RowKey: 3 --- RowKey: 2 => (column=6e616d65, value=42696c6c, timestamp=1297338131027004) --- RowKey: 1 => (column=6e616d65, value=4a6f

Column name size

2011-02-11 Thread Patrik Modesto
Hi all! I'm wondering whether the size of a column name could matter for a large dataset in Cassandra (I mean lots of rows). For example, what if I have a row with 10 columns, each with a 10-byte value and a 10-byte name? Do I have half the row size taken up just by the column names and the other half by the data (not c
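
The arithmetic in this question can be written out as a small sketch. It counts only name and value bytes, as the question does; the per_column_overhead parameter is a hypothetical knob for whatever extra bytes the storage format adds per column, and no real value for it is claimed here.

```python
def name_fraction(num_columns: int, name_bytes: int, value_bytes: int,
                  per_column_overhead: int = 0) -> float:
    """Fraction of a row's bytes spent on column names.

    per_column_overhead is a hypothetical placeholder for any extra
    per-column bytes; it defaults to 0 to match the question's example.
    """
    names = num_columns * name_bytes
    total = num_columns * (name_bytes + value_bytes + per_column_overhead)
    return names / total


# 10 columns, 10-byte names, 10-byte values: 100 of 200 bytes are names.
print(name_fraction(10, 10, 10))  # prints 0.5
```

The fraction depends only on the per-column sizes, not on the number of columns or rows, so in this example shortening names from 10 bytes to 2 would shrink the names' share from one half to one sixth.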

Cassandra cluster on aws

2011-02-11 Thread Sooraj S
Hi, I'm using chef for automating some installation tasks on aws. I used infochimp's "cluster_chef/cookbooks/cassandra" cookbook. But there were some problems with the installatio

How to store news lists in optimal way?

2011-02-11 Thread Nikolay Fominyh
Hello. We are having trouble finding a way to store a user's news list. Our task: we have 2 groups of filters: 1) News Types (photo, video, audio, etc.) 2) User Lists (user, friends, friends of friends, etc.) Our candidate solutions: 1) Store matrix of filters for each user. UserNews: user_id1: :

AW: Data ends up in wrong Columnfamily

2011-02-11 Thread Roland Gude
Hi, machine A has absolutely no knowledge of anything about the other application, not even the columnfamily name. I dug into this further: since the data I find in the wrong space has a timestamp in its row key, it was quite easy to find out that the data was relatively old. Unfo

Re: Zurich user group

2011-02-11 Thread Samuel Benz
Hi Sasha Nice idea! On 02/10/2011 12:02 PM, Sasha Dolgy wrote: > hi there, > > Are there any cassandra users in and around the zurich area interested in a > get together. Sometimes its good to discuss usage and concepts over > beverages... > Yes, we are using cassandra for the whois server fo

Re: Data ends up in wrong Columnfamily

2011-02-11 Thread Peter Schuller
> So far so good, but it regularly happens, that data from one application > ends up in columnfamilies reserved for the other application as well as the > intended columnfamily. Maybe https://issues.apache.org/jira/browse/CASSANDRA-1992 -- / Peter Schuller