Re: Staging website at cassandra.staged.apache.org

2020-04-22 Thread Aaron Morton
Thanks Mick, is there documentation somewhere on how we update the website?

A

-
Aaron Morton
New Zealand
@aaronmorton

CEO
Apache Cassandra Consulting
http://www.thelastpickle.com


On Tue, 21 Apr 2020 at 18:40, Mick Semb Wever  wrote:

> For our cassandra-website repository, any changes to our website can now
> first be staged at https://cassandra.staged.apache.org/
>
> The staged website comes from the content/ directory on the `asf-staging`
> branch.
>
> regards,
> Mick
>


Re: [VOTE] Project governance wiki doc (take 2)

2020-06-25 Thread Aaron Morton
+1

-
Aaron Morton
New Zealand
@aaronmorton

CEO
Apache Cassandra Consulting
http://www.thelastpickle.com


On Thu, 25 Jun 2020 at 19:46, Benedict Elliott Smith 
wrote:

> The purpose of this document is to define only how the project makes
> decisions, and it lists "tenets" of conduct only as a preamble for
> interpreting the rules on decision-making.  The authors' intent was to lean
> on this to minimise the rigidity and prescriptiveness in the formulation of
> the rules (so that we could e.g. use "reasonable" repeatedly, instead of
> specifying precise expectations), in part because this is our first attempt
> to codify such rules, and in part because rigidity can cause unnecessary
> friction to a project that mostly runs smoothly.
>
> The document provides an avenue for resolving disputes in decision-making
> when these assumptions on behaviour break down. However, its scope definitely
> isn't, at least in my opinion, addressing misbehaviour by individuals (i.e.
> one of the serious breaches listed in part 5 of the Apache CoC), which it
> seems to me you are addressing here?
>
> Since we reference the ASF CoC, and the ASF provides its own guide for
> handling CoC complaints (including within projects), that applies to that
> very CoC (and which you referenced), it's unclear to me what you're looking
> for.  Are you looking for a more project-specific CoC with different
> guidelines for reporting?  This is something you would be welcome to
> undertake, and seek consensus for.
>
>
>
>
> On 25/06/2020, 02:38, "Dinesh Joshi"  wrote:
>
> > On Jun 24, 2020, at 6:01 PM, Brandon Williams 
> wrote:
> >
> > On Wed, Jun 24, 2020 at 5:43 PM Dinesh Joshi 
> wrote:
> >> 1. How/Who/Where are we planning to deal with Code of Conduct
> violations? I assume this should be private@ but the document does not
> call it out as such. We should call it out explicitly as part of the PMC
> responsibilities. We should also clarify how and where are CoC violations
> against PMC members reported and handled? Should they go to ASF?
> >
> > I think if we assume good intent, this will be a non-issue.  People
> > may make mistakes, but I try to have faith they will realize them and
> > act accordingly when told so without any need to escalate.
>
> We need to spell out in the document how and where the CoC violations
> are reported irrespective of the role of the person in the community. This
> is a critical point to address. ASF spells this out very clearly[1]. We
> should have a similar statement in the Project Governance document,
> otherwise it feels incomplete to me.
>
> Dinesh
>
> [1]
> http://www.apache.org/foundation/policies/conduct.html#reporting-guidelines
> -
> To unsubscribe, e-mail: dev-unsubscr...@cassandra.apache.org
> For additional commands, e-mail: dev-h...@cassandra.apache.org
>


Re: [VOTE] Release Apache Cassandra 2.1.8

2015-07-06 Thread Aaron Morton
> 2.1.8 release vote right on top of 2.1.6 and 2.1.7.
I haven't dug into the specific issues, but given the small list of changes
and release velocity, those two older releases should probably be
considered an "upgrade now" trigger with clients.

Thanks for the heads up.

Guess we should keep a list of this sort of thing somewhere.

A


-----
Aaron Morton
New Zealand
@aaronmorton

Co-Founder & Principal Consultant
Apache Cassandra Consulting
http://www.thelastpickle.com

On Mon, Jul 6, 2015 at 12:52 PM, Gary Dusbabek  wrote:

> +1
>
> On Mon, Jul 6, 2015 at 12:04 PM, Jake Luciani  wrote:
>
> > I propose the following artifacts for release as 2.1.8.
> >
> > sha1: db39257c34152f6ccf8d53784cea580dbfe1edad
> > Git:
> >
> >
> http://git-wip-us.apache.org/repos/asf?p=cassandra.git;a=shortlog;h=refs/tags/2.1.8-tentative
> > Artifacts:
> >
> >
> https://repository.apache.org/content/repositories/orgapachecassandra-1063/org/apache/cassandra/apache-cassandra/2.1.8/
> > Staging repository:
> >
> https://repository.apache.org/content/repositories/orgapachecassandra-1063/
> >
> > The artifacts as well as the debian package are also available here:
> > http://people.apache.org/~jake
> >
> > The vote will be open for 72 hours (longer if needed).
> >
> > [1]: http://goo.gl/BFYiEO (CHANGES.txt)
> > [2]: http://goo.gl/24XaPp (NEWS.txt)
> >
>


Re: Problem while configuring key and row cache?

2012-08-23 Thread aaron morton
Use info….

$ bin/nodetool -h localhost info
…
Key Cache: size 672 (bytes), capacity 52428768 (bytes), 12 hits, 17 
requests, 0.706 recent hit rate, 14400 save period in seconds
Row Cache: size 0 (bytes), capacity 0 (bytes), 0 hits, 0 requests, NaN 
recent hit rate, 0 save period in seconds
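
If you want to check the capacity from a script rather than by eye, something like the following may help. It is only a sketch: it assumes the info output keeps the "size N (bytes), capacity N (bytes)" wording shown above, and the function name cache_capacity_bytes is made up for illustration.

```python
import re

# Sample text copied from the nodetool info output above, including the mail line wraps.
SAMPLE = """Key Cache: size 672 (bytes), capacity 52428768 (bytes), 12 hits, 17
requests, 0.706 recent hit rate, 14400 save period in seconds
Row Cache: size 0 (bytes), capacity 0 (bytes), 0 hits, 0 requests, NaN
recent hit rate, 0 save period in seconds"""


def cache_capacity_bytes(info_output, cache_name):
    """Return the capacity in bytes reported for cache_name, or None if it is not listed.

    Hypothetical helper: assumes the output format shown in SAMPLE.
    """
    # Collapse line wraps and repeated spaces so the pattern can match across lines.
    flat = " ".join(info_output.split())
    pattern = re.escape(cache_name) + r": size \d+ \(bytes\), capacity (\d+) \(bytes\)"
    match = re.search(pattern, flat)
    return int(match.group(1)) if match else None
```

Calling cache_capacity_bytes(SAMPLE, "Row Cache") on the output above gives 0, which is how you would spot that the row cache is not enabled.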

Cheers

-
Aaron Morton
Freelance Developer
@aaronmorton
http://www.thelastpickle.com

On 22/08/2012, at 5:18 PM, Amit Handa  wrote:

> Hi,
> 
> Thanks Jonathan for your reply.
> I modified the key_cache_size_in_mb and row_cache_size_in_mb values inside
> cassandra.yaml, but I am not able to see their effect using the command " *./nodetool
> -h 107.108.189.212 cfstats*". Can you let me know how to verify that the
> settings for key_cache_size and row_cache_size have taken effect?
> 
> With Regards,
> Amit
> 
> 
> On Tue, Aug 21, 2012 at 8:19 PM, Jonathan Ellis  wrote:
> 
>> setcachecapacity is obsolete in 1.1+.  Looks like we missed removing
>> it from nodetool.  See
>> http://www.datastax.com/dev/blog/caching-in-cassandra-1-1 for
>> background.
>> 
>> (Moving to users@.)
>> 
>> On Tue, Aug 21, 2012 at 8:19 AM, Amit Handa  wrote:
>>> I started exploring Apache Cassandra 1.1.3. I am facing a problem with how to
>>> improve the performance of Cassandra using caching configurations.
>>> I tried setting following configurations:
>>> 
>>> ./nodetool -h 107.108.189.204 setcachecapacity DemoUser Users 25 0
>>> ./nodetool -h 107.108.189.204 setcachecapacity DemoUser Users 0 25
>>> ./nodetool -h 107.108.189.204 setcachecapacity DemoUser Users 25 25
>>> ./nodetool -h 107.108.189.204 setcachecapacity DemoUser Users 444 444
>>> 
>>> 
>>> But when I check whether this particular configuration has really been
>>> applied using the command:
>>> ./nodetool -h 107.108.189.212 cfstats
>>> 
>>> it's showing following results for keySpace DemoUser and column Family
>>> Users:
>>> *Keyspace: DemoUser
>>>Read Count: 21914
>>>Read Latency: 0.08268495026010769 ms.
>>>Write Count: 87656
>>>Write Latency: 0.06009481381765082 ms.
>>>Pending Tasks: 0
>>>Column Family: Users
>>>SSTable count: 1
>>>Space used (live): 1573335
>>>Space used (total): 1573335
>>>Number of Keys (estimate): 22016
>>>Memtable Columns Count: 0
>>>Memtable Data Size: 0
>>>Memtable Switch Count: 1
>>>Read Count: 21914
>>>Read Latency: 0.083 ms.
>>>Write Count: 87656
>>>Write Latency: 0.060 ms.
>>>Pending Tasks: 0
>>>Bloom Filter False Postives: 0
>>>Bloom Filter False Ratio: 0.0
>>>Bloom Filter Space Used: 41104
>>>Compacted row minimum size: 150
>>>Compacted row maximum size: 179
>>>Compacted row mean size: 179 *
>>> 
>>> I am unable to see the effect of the above setcachecapacity commands. Let me
>>> know how I can configure the cache capacity, and check its effect.
>>> 
>>> With Regards,
>>> Amit
>> 
>> 
>> 
>> --
>> Jonathan Ellis
>> Project Chair, Apache Cassandra
>> co-founder of DataStax, the source for professional Cassandra support
>> http://www.datastax.com
>> 



Re: Batch Truncate using Hector 1.0-5

2012-11-25 Thread aaron morton
The hector user list is the best place for this question:
https://groups.google.com/forum/?fromgroups#!forum/hector-users

Cheers

-
Aaron Morton
Freelance Cassandra Developer
New Zealand

@aaronmorton
http://www.thelastpickle.com

On 22/11/2012, at 8:53 AM, Amitabha Karmakar  
wrote:

> Hi,
> 
> Is there any way I could do a batch truncate using hector 1.0-5 ?
> 
> Thanks !



Re: Proposal: require Java7 for Cassandra 2.0

2013-02-11 Thread aaron morton
+1
-
Aaron Morton
Freelance Cassandra Developer
New Zealand

@aaronmorton
http://www.thelastpickle.com

On 7/02/2013, at 11:21 AM, Jonathan Ellis  wrote:

> Java 6 EOL is this month.  Java 7 will be two years old when C* 2.0
> comes out (July).  Anecdotally, a bunch of people are running C* on
> Java7 with no issues, except for the Snappy-on-OS-X problem (which
> will be moot if LZ4 becomes our default, as looks likely).
> 
> Upgrading to Java7 lets us take advantage of new (two year old)
> features as well as simplifying interoperability with other
> dependencies, e.g., Jetty's BlockingArrayQueue requires java7.
> 
> Thoughts?
> 
> -- 
> Jonathan Ellis
> Project Chair, Apache Cassandra
> co-founder, http://www.datastax.com
> @spyced



Re: ApacheCon North America

2013-02-11 Thread aaron morton
I'll be there from the evening on the Wednesday 27th to Friday 1st midday.

Talking on Thursday afternoon about C* internals. 
  
Cheers

-
Aaron Morton
Freelance Cassandra Developer
New Zealand

@aaronmorton
http://www.thelastpickle.com

On 12/02/2013, at 4:26 AM, Eric Evans  wrote:

> Hi All
> 
> It's now about 2 weeks until ApacheCon North America, which is taking
> place Sunday 24th Feb - Thursday 28th in Portland. Quite a few people
> from our project will be there, and we'd love to see you all!
> 
> If you haven't already registered for the conference, then we've some
> good news - we've managed to snag a 20% discount for you! To register
> with the 20% off, use code PMC or the link
> http://acna13.eventbrite.com/?discount=PMC
> 
> To see what the talks are, including the ones relating to Cassandra,
> please see the schedule -http://na.apachecon.com/schedule/
> 
> Would you like to get more involved in the project?  A number of
> people will be at the (Free!) Hackathon on the Monday. Ours will focus
> on CQL drivers, but if you would like to learn more about
> contributing, get some mentoring on a patch, or help collaborate on
> some fixes, then by all means come join us.  If you'd like to come,
> whether you can make it to the main conference or not, the details are
> on the ApacheCon wiki: http://wiki.apache.org/apachecon/HackathonNA13
> 
> Also talking of free, there will be a BarCamp on the Sunday. This is
> open to everyone, Portland natives and conference-goers alike, and
> should be a great chance to share new ideas and learn about existing +
> upcoming projects. To sign up to come to that, or learn more, it's
> http://wiki.apache.org/apachecon/BarCampApachePortland
> 
> Hopefully see some of you in Portland in a few weeks!
> ---
> 
> Thanks
> 
> --
> Eric Evans
> Acunu | http://www.acunu.com | @acunu



Re: Rename failed while cassandra is starting up

2013-04-14 Thread aaron morton
Replying on the user group.

-
Aaron Morton
Freelance Cassandra Consultant
New Zealand

@aaronmorton
http://www.thelastpickle.com

On 14/04/2013, at 3:50 PM, Boris Yen  wrote:

> Hi All,
> 
> Recently, we encountered an error on 1.0.12 that prevented cassandra from
> starting up. From the log messages, it looked like the table/keyspace was
> opened before the scrubDataDirectories was executed. This created a race
> condition between two threads. One was trying to rename files while the
> other was trying to remove tmp files. I was wondering if anyone could
> provide us some information or workaround for this.
> 
> INFO [MemoryMeter:1] 2013-04-09 02:49:39,868 Memtable.java (line 186)
> CFS(Keyspace='fmzd', ColumnFamily='alarm.fmzd_alarm_category') liveRatio is
> 3.7553409423470883 (just-counted was 3.1413828689370487).  calculation took
> 2ms for 265 columns
> INFO [SSTableBatchOpen:1] 2013-04-09 02:49:39,868 SSTableReader.java (line
> 153) Opening /test/db/data/fmzd/ap.fmzd_ap_meshRole-hd-2 (83 bytes)
> INFO [SSTableBatchOpen:2] 2013-04-09 02:49:39,868 SSTableReader.java (line
> 153) Opening /test/db/data/fmzd/ap.fmzd_ap_meshRole-hd-1 (123 bytes)
> INFO [Creating index: alarm.fmzd_alarm_category] 2013-04-09 02:49:39,874
> ColumnFamilyStore.java (line 705) Enqueuing flush of
> Memtable-alarm.fmzd_alarm_category@413535513(14025/65835 serialized/live
> bytes, 275 ops)
> INFO [OptionalTasks:1] 2013-04-09 02:49:39,877 SecondaryIndexManager.java
> (line 184) Creating new index : ColumnDefinition{name=6d65736853534944,
> validator=org.apache.cassandra.db.marshal.UTF8Type, index_type=KEYS,
> index_name='fmzd_ap_meshSSID'}
> INFO [SSTableBatchOpen:1] 2013-04-09 02:49:39,895 SSTableReader.java (line
> 153) Opening /test/db/data/fmzd/ap.fmzd_ap_meshSSID-hd-1 (122 bytes)
> INFO [SSTableBatchOpen:2] 2013-04-09 02:49:39,896 SSTableReader.java (line
> 153) Opening /test/db/data/fmzd/ap.fmzd_ap_meshSSID-hd-2 (82 bytes)
> INFO [OptionalTasks:1] 2013-04-09 02:49:39,900 SecondaryIndexManager.java
> (line 184) Creating new index :
> ColumnDefinition{name=6d6f62696c6974795a6f6e654944,
> validator=org.apache.cassandra.db.marshal.UTF8Type, index_type=KEYS,
> index_name='fmzd_ap_mobilityZoneUUID'}
> ERROR [FlushWriter:1] 2013-04-09 02:49:39,916 AbstractCassandraDaemon.java
> (line 139) Fatal exception in thread Thread[FlushWriter:1,5,main]
> java.io.IOError: java.io.IOException: rename failed of
> /test/db/data/fmzd/alarm.fmzd_alarm_alarmCode-hd-21-Data.db
> at
> org.apache.cassandra.io.sstable.SSTableWriter.rename(SSTableWriter.java:375)
> at
> org.apache.cassandra.io.sstable.SSTableWriter.closeAndOpenReader(SSTableWriter.java:319)
> at
> org.apache.cassandra.io.sstable.SSTableWriter.closeAndOpenReader(SSTableWriter.java:302)
> at org.apache.cassandra.db.Memtable.writeSortedContents(Memtable.java:276)
> at org.apache.cassandra.db.Memtable.access$400(Memtable.java:49)
> at org.apache.cassandra.db.Memtable$4.runMayThrow(Memtable.java:299)
> at org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:30)
> at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(Unknown Source)
> at java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)
> at java.lang.Thread.run(Unknown Source)
> Caused by: java.io.IOException: rename failed of
> /test/db/data/fmzd/alarm.fmzd_alarm_alarmCode-hd-21-Data.db
> at
> org.apache.cassandra.utils.FBUtilities.renameWithConfirm(FBUtilities.java:355)
> at
> org.apache.cassandra.io.sstable.SSTableWriter.rename(SSTableWriter.java:371)
> ... 9 more
> INFO [SSTableBatchOpen:1] 2013-04-09 02:49:39,917 SSTableReader.java (line
> 153) Opening /test/db/data/fmzd/ap.fmzd_ap_mobilityZoneUUID-hd-1 (312 bytes)
> INFO [FlushWriter:2] 2013-04-09 02:49:39,916 Memtable.java (line 246)
> Writing Memtable-alarm.fmzd_alarm_alarmCode@402202831(2958/22542
> serialized/live bytes, 58 ops)
> ERROR [main] 2013-04-09 02:49:39,916 AbstractCassandraDaemon.java (line
> 373) Exception encountered during startup
> java.io.IOError: java.io.IOException: Failed to delete
> /test/db/data/fmzd/alarm.fmzd_alarm_alarmCode-tmp-hd-21-Statistics.db
> at
> org.apache.cassandra.db.ColumnFamilyStore.scrubDataDirectories(ColumnFamilyStore.java:372)
> at
> org.apache.cassandra.db.ColumnFamilyStore.scrubDataDirectories(ColumnFamilyStore.java:415)
> at
> org.apache.cassandra.service.AbstractCassandraDaemon.setup(AbstractCassandraDaemon.java:193)
> at
> org.apache.cassandra.service.AbstractCassandraDaemon.activate(AbstractCassandraDaemon.java:356)
> at
> org.apache.cassandra.thrift.CassandraDaemon.main(CassandraDaemon.java:107)
> Caused by: java.io.IOException: Failed to delete
> /test/db/data/fmzd/alarm.fmzd_ala

wiki access

2014-05-29 Thread Aaron Morton
Hi my wiki access has somehow died, my user name is aaronmorton. 

Could you please reset my password or generate a new account.

Thanks
Aaron

-
Aaron Morton
New Zealand
@aaronmorton

Co-Founder & Principal Consultant
Apache Cassandra Consulting
http://www.thelastpickle.com



Re: wiki access

2014-06-01 Thread Aaron Morton
It was the case sensitivity. 

Weird because I was in 1Password. 

In now, thanks. 

Cheers
Aaron

-
Aaron Morton
New Zealand
@aaronmorton

Co-Founder & Principal Consultant
Apache Cassandra Consulting
http://www.thelastpickle.com

On 30/05/2014, at 6:58 pm, Jonathan Ellis  wrote:

> Is it case sensitive?  We have you as AaronMorton.
> 
> We can whitelist a new account if you create one.
> 
> On Fri, May 30, 2014 at 5:25 AM, Aaron Morton  wrote:
>> Hi my wiki access has somehow died, my user name is aaronmorton.
>> 
>> Could you please reset my password or generate a new account.
>> 
>> Thanks
>> Aaron
>> 
>> -
>> Aaron Morton
>> New Zealand
>> @aaronmorton
>> 
>> Co-Founder & Principal Consultant
>> Apache Cassandra Consulting
>> http://www.thelastpickle.com
>> 
> 
> 
> 
> -- 
> Jonathan Ellis
> Project Chair, Apache Cassandra
> co-founder, http://www.datastax.com
> @spyced



Re: Hadoop package exposed through thrift

2010-06-09 Thread aaron morton
I'm not up to speed with Hadoop in Cassandra, but regular Hadoop provides an I/O 
streaming interface so it can be used with non-Java languages. 

http://hadoop.apache.org/common/docs/r0.15.2/streaming.html

That may be of help. 
Aaron

On 9 Jun 2010, at 09:53, Jeremy Hanna wrote:

> I just didn't know if there were any way to make it easier for the non-java 
> crowd to take advantage of it.  I'll give it some more thought.
> 
> On Jun 8, 2010, at 4:05 PM, Jonathan Ellis wrote:
> 
>> exposing it through thrift would mean the path would be
>> 
>> client
>> to cassandra [processing thrift command]
>> to hadoop [giving it a job]
>> to cassandra [fetching the data]
>> to hadoop [m/r]
>> to cassandra [handing result back]
>> to client
>> 
>> it just doesn't seem like a good design to me.
>> 
>> additionally, thrift is meant more for "stuff your app is doing
>> constantly" while hadoop handles analytics queries.  this separation
>> of duties makes a lot of sense to me.
>> 
>> On Tue, Jun 8, 2010 at 1:45 PM, Jeremy Hanna  
>> wrote:
>>> When I gave a presentation on cassandra+hadoop, some ruby folks were 
>>> wondering about the possibility of using the MapReduce functionality in a 
>>> language other than Java.
>>> 
>>> I was just wondering if any thought was given to exposing the 
>>> org.apache.cassandra.hadoop functionality through thrift.  That way the 
>>> MapReduce code could be used by several languages and secondarily by client 
>>> authors.
>>> 
>>> I'm just trying to see if there is any reason why it wasn't exposed through 
>>> thrift or if more needs to be done before it could be exposed to languages 
>>> other than Java.
>>> 
>>> Thanks,
>>> 
>>> Jeremy
>> 
>> 
>> 
>> -- 
>> Jonathan Ellis
>> Project Chair, Apache Cassandra
>> co-founder of Riptano, the source for professional Cassandra support
>> http://riptano.com
> 



Re: Secondary indexing and 0.6/0.7 integration with Datanucleus

2010-06-16 Thread aaron morton
I've not read up on the secondary indexes, but am doing something similar. I 
got some inspiration from the Lucandra project. You will probably need to make 
multiple calls to Cassandra for each clause of your query.

The design I used had two CFs. The rough idea was: in the TermDocIndex the key 
is the term (e.g. lastName=Smith) and the column names are the keys of the 
objects / documents the term is from, e.g. key1. The DocTermIndex uses the 
object/doc id as the key and has a column for each term the document contains 
(e.g. "lastName=Smith"). I also maintained some stats on how many 
objects/documents had the term (using redis, will move to cassandra counters in 
0.7 perhaps). 

The query process then becomes:
1.  Determine the most selective term in the query using the stats.
2.  Do a get_slice to get the first X (1000 perhaps) columns from the 
TermDocIndex using the term key.
3.  Use the keys from step 2 in a multi_get_slice against the DocTermIndex, 
passing the list of keys from 2 and listing the remaining terms as the column 
names you want to get back. 
4.  From the result of 3, filter out all keys that returned fewer columns than 
we asked for. 
5.  Repeat from 3 if needed. 

I was hoping the limit in step 2 would bound the queries into the cluster, and 
the multiget in step 3 would be better at distributing most of the work 
around the cluster. E.g. rather than reading 1000 columns from, say, 3 keys, it 
reads 3 columns from 1000 keys.
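
A rough sketch of that query process, in Python with plain dicts standing in for the two CFs. Nothing here talks to Cassandra: the get_slice and multiget_slice functions are local stand-ins, the sample rows are made up from the Person example in this thread, and reading "repeat if needed" as paging through the next batch of candidate keys is my assumption.

```python
# In-memory stand-ins for the two column families (illustrative data only).
# TermDocIndex: term -> {doc key: ""}
term_doc_index = {
    "firstName=John": {"key1": ""},
    "firstName=Jane": {"key2": "", "key3": ""},
    "lastName=Smith": {"key1": "", "key2": ""},
    "lastName=Doe": {"key3": ""},
}
# DocTermIndex: doc key -> {term: ""}
doc_term_index = {
    "key1": {"firstName=John": "", "lastName=Smith": ""},
    "key2": {"firstName=Jane": "", "lastName=Smith": ""},
    "key3": {"firstName=Jane": "", "lastName=Doe": ""},
}
# Stats kept outside the indexes: how many documents contain each term.
term_counts = {term: len(cols) for term, cols in term_doc_index.items()}


def get_slice(term, start, count):
    """Stand-in for a slice read: up to `count` doc keys for `term`, sorted, after `start`."""
    keys = sorted(k for k in term_doc_index.get(term, {}) if k > start)
    return keys[:count]


def multiget_slice(keys, terms):
    """Stand-in for a multiget: for each key, the requested term columns that exist."""
    return {k: [t for t in terms if t in doc_term_index.get(k, {})] for k in keys}


def query(terms, page_size=1000):
    """Return the doc keys that contain every term in `terms`."""
    # Step 1: the most selective term is the one with the lowest document count.
    ordered = sorted(terms, key=lambda t: term_counts.get(t, 0))
    first, rest = ordered[0], ordered[1:]
    matches, start = [], ""
    while True:
        # Step 2: one bounded page of candidate keys from the TermDocIndex.
        page = get_slice(first, start, page_size)
        if not page:
            break
        # Step 3: ask the DocTermIndex for the remaining terms, for all candidates at once.
        found = multiget_slice(page, rest)
        # Step 4: keep only the keys that returned every column we asked for.
        matches.extend(k for k in page if len(found[k]) == len(rest))
        # Step 5 (assumed meaning): carry on with the next page of candidates.
        start = page[-1]
    return matches
```

With the sample rows, query(["firstName=Jane", "lastName=Smith"]) comes back with just key2. The page_size argument plays the role of the X limit in step 2.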

Aaron


On 16 Jun 2010, at 16:57, Todd Nine wrote:

> No problem,
>  I didn't want to implement my own solution if an existing one could
> easily be applied.  Since I'll be creating CF that represent secondary
> indexes, I'll need to perform range scans over the keys of those
> secondary index CFs.  The column names within the CF's are the row keys
> of the primary table.  Is there a way I can get the intersection of all
> of the column names from multiple ranges scans over different column
> families in one result set?  Otherwise I'll need to make multiple trips
> and create the intersection myself in my plugin.  Here is an example of
> what I'm trying to do.
> 
> CF: Person
> 
> key1: {
>   firstName: John
>   lastName: Smith
>   email: smi...@foo.com
> }
> 
> key2: {
>  firstName: Jane
>  lastName: Smith
>  email: smi...@foo.com
> }
> 
> key3: {
>  firstName: Jane
>  lastName: Doe
>  email: smi...@foo.com
> }
> 
> 
> My secondary index tables would be the following
> 
> CF: Person_LastName
> 
> Smith:{
>  key1: 0x00
>  key2: 0x00
> }
> 
> Doe: {
>  key3:0x00
> }
> 
> CF: Person_Email
>  smi...@foo.com:{
>key1:0x00
>key2:0x00 
>key3:0x00
> }
> 
> If my input is something similar to lastName == 'Smith' && email ==
> "smi...@foo.com", I would return all columns from key "Smith" in CF
> Person_LastName, and all columns from key "smi...@foo.com" in CF
> Person_Email.  The intersection of the two sets is key1, and key2, and
> have cassandra only return those rows.
> 
> Thanks,
> Todd
> 
> 
> 
> 
> 
> On Tue, 2010-06-15 at 23:38 -0500, Jonathan Ellis wrote:
> 
>> No chance that 749 can be backported to 0.6, sorry.
>> 
>> On Tue, Jun 15, 2010 at 10:35 PM, Todd Nine  wrote:
>> 
>>> Lets try that again.
>>> 
>>> This is the intended issue.
>>> 
>>> https://issues.apache.org/jira/browse/CASSANDRA-749
>>> 
>>> thanks,
>>> Todd
>>> 
>>> 
>>> 
>>>  On Tue, 2010-06-15 at 20:02 -0500, Jonathan Ellis wrote:
>>> 
>>> What issue were you trying to link? :)
>>> 
>>> On Tue, Jun 15, 2010 at 6:56 PM, Todd Nine  wrote:
 Hi all,
 I'm implementing a Datanucleus plugin for Cassandra.  I'm finished
 with the basic functionality, and everything seems to work pretty well.
 Now my issue is performing secondary indexing on fields within my data.
 I have outlined some of the issues I'm facing in this post.
 
 http://www.datanucleus.org/servlet/forum/viewthread_thread,6087_lastpage,yes#32610
 
 Essentially, for each operand the user specifies, I will need to make a
 trip to Cassandra, load the key columns, then perform an intersection
 with the result from my previous read.  Eventually at the end of all the
 intersections, I will have a list of keys I will then load.  This
 obviously requires several trips to Cassandra, where from my
 understanding of secondary indexing, I would only need to make one trip
 for multiple operands over a column family.  I've read over this
 issue.
 
 http://issues.apache.org/jira/browse/CASSANDRA-32610
 
 And it seems to solve a lot of my woes.  Is it possible/recommended to
 patch the current code base of 0.6.2 to perform this functionality?
 
 Thanks,
 Todd
 
 
>>> 
>>> 
>>> 
>>> 
>>> 
>> 
>> 



Re: Atomic Compare and Swap

2010-06-23 Thread aaron morton
I've been playing with something like CAS; it's not the same, but it
may be of interest.

I write some data into Cassandra with quorum or better consistency;
that allows me to assert what it should look like when read back. If
the assertion holds I can then go ahead.

For example, in a CF with TimeUUID ordering the client writes a
column against the key of the thing we want to update. This write does
not store the value. Then read back the first ordered column; if its
name is my uuid then I can proceed. Otherwise delete the column. If
you know the uuid of the last update you can read back two columns,
then assert that yours is the first and the previous is the second.

Perhaps if you were doing a CAS you could then write the actual value
you want to update and somehow store the uuid from above with it. Say
as a col in another col family with the name as the uuid and the value
as the value. To read, get the first column from both CFs as a multi
get; the col names must match from both CFs for the value to be correct.

(could just use two diff keys in same CF)
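
A minimal single-process sketch of the read-back check, in case it helps. It assumes the columns are read newest first (a reversed slice) and uses (timestamp, client id) tuples as stand-ins for TimeUUID column names. The claims dict and the try_claim function are hypothetical; a real version would do the write and the read at quorum against Cassandra, and this sketch does not model concurrency at all.

```python
# Hypothetical stand-in for the marker CF: row key -> list of column names.
# Each column name is a (timestamp, client id) tuple standing in for a TimeUUID.
claims = {}


def try_claim(row_key, mine, previous=None):
    """Write a marker column, read back the newest columns, and check the order.

    `mine` is this client's token; `previous` is the token of the last update
    the client knows about, or None if it believes there has been none.
    Returns True if the client may proceed with its update.
    """
    columns = claims.setdefault(row_key, [])
    columns.append(mine)  # the write: column name only, no value stored

    # The read-back: newest first, at most two columns.
    newest_first = sorted(columns, reverse=True)[:2]

    # Mine must be first; if a previous update is known it must be second,
    # otherwise mine must be the only column.
    expected = [mine] if previous is None else [mine, previous]
    ok = newest_first == expected

    if not ok:
        columns.remove(mine)  # assertion failed: delete the marker column
    return ok
```

For example, a first client calling try_claim("row", (1, "a")) proceeds, a second calling try_claim("row", (2, "b"), previous=(1, "a")) proceeds, and a third that still believes (1, "a") was the last update fails the check and removes its marker.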

Hope that makes sense.
Aaron






On 23/06/2010, at 4:27 PM, Mike Malone  wrote:

I'd be interested in what the folks who want CAS implementations think
about vector clocks. Can you use them to fulfill your use cases? If not,
why not?

I ask because I have found myself wanting CAS in Cassandra too, but I
think that's only because I'm pretty familiar with HTTP. I think vector
clocks with client merge give you essentially the same functionality,
but in a way that fits much more nicely with the rest of the Cassandra
architecture. CAS really exacerbates Cassandra's weaknesses.

Mike

On Tue, Jun 22, 2010 at 4:52 PM, Rishi Bhardwaj  
wrote:





S>: An *atomic* CAS is another beast and I see at least two difficulties:

S>: 1) making it atomic locally: Cassandra's implementation is very much
multi-threaded. On a given node, while you're
reading-comparing-and-swapping on some column c, no other thread should
be allowed to write c (even 'normal' write). You would probably need to
have specific column families where CAS is allowed and for which all
writes would be slower (since some locking would be involved). Even
then, making such locking efficient and right is not easy. But in the
end, local atomicity is quite probably the easy part.

R: I am curious as to how does Cassandra handle two concurrent writes to
the same column right now? Is there any locking on the write path to
serialize two writes to the same column? If there is any locking then
CAS can build on that. If there is no such locking then we could exclude
normal writes from the synchronization/locking required for CAS. So the
normal write path remains the same, and we let the client know that
atomic CAS wouldn't work if normal writes are also happening on the same
column values. In short a client should not mix normal writes with
Atomic CAS for writing some column value. This will hopefully make
things simpler.

S:>2) making it atomic cluster-wide: data is replicated and an atomic
CAS would need to apply on the exact same column version in every node.
Which, with eventual consistency especially, is pretty hard to
accomplish unless you're locking the cluster (but that's what Cages/ZK
do).

R: For starters it would be great if atomic CAS could work for
consistency level Quorum and ALL and not be supported for other
consistency levels. Even for other consistency levels what would stop
CAS to work? Why would one require cluster wide locking? I might be
mistaken here but the atomic CAS operation would happen individually at
all the replica nodes (either directly or through hinted writes) and
would succeed or fail depending on the timestamp/version of the column
at the replica. If we do Quorum reads and CAS writes then we can also be
sure about consistency.

S:>That being said, if you have a neat solution for efficient and
distributed atomic CAS that doesn't require rewriting 80% of Cassandra,
I'm sure there will be interest in that.

R: That sounds great. I am definitely going to look into this and report
back if I have a good solution.


Thanks,
Rishi





From: Sylvain Lebresne 
To: dev@cassandra.apache.org
Sent: Tue, June 22, 2010 1:21:51 AM
Subject: Re: Atomic Compare and Swap

On Mon, Jun 21, 2010 at 11:19 PM, Rishi Bhardwaj wrote:
I have read the post on cages and it is definitely very interesting. But
cages seems to be too coarse grained compared to an Atomic Compare and
Swap on Cassandra column value. Cages would makes sense when one wants
to do multiple atomic row, column updates. Also, I am not so sure about
the scalability when it comes to using zookeeper for keeping locks on
Cassandra columns... there would also be performance hit with an added
RPC for every write. I feel Cages maybe fine for systems when one has
few locks but I feel an atomic CAS in

Re: Cassandra and Lucene

2010-07-25 Thread Aaron Morton
You may need to provide some more information. What's the cluster
configuration, what version, what's in the logs etc.

Aaron

On 24 Jul, 2010, at 03:40 AM, Michelan Arendse wrote:

Hi

I have recently started working on Cassandra as I need to make a distributed
Lucene index and found that Lucandra was the best for this. Since then I
have configured everything and it's working ok.

Now the problem comes in when I need to write this Lucene index to Cassandra
or convert it so that Cassandra can read it. The test index is 32 gigs and I
find that Cassandra times out a lot.

What happens? Can't Cassandra take that load? Any help would be great.

Kind Regards,


Re: Cassandra and Lucene

2010-07-25 Thread Aaron Morton
Sorry, also moving to User list.

Aaron

On 26 Jul, 2010, at 12:14 PM, Aaron Morton wrote:

You may need to provide some more information. What's the cluster
configuration, what version, what's in the logs etc.

Aaron

On 24 Jul, 2010, at 03:40 AM, Michelan Arendse wrote:

Hi

I have recently started working on Cassandra as I need to make a distributed
Lucene index and found that Lucandra was the best for this. Since then I
have configured everything and it's working ok.

Now the problem comes in when I need to write this Lucene index to Cassandra
or convert it so that Cassandra can read it. The test index is 32 gigs and I
find that Cassandra times out a lot.

What happens? Can't Cassandra take that load? Any help would be great.

Kind Regards,


Re: Having Problems installing Chiton

2010-08-24 Thread Aaron Morton
You need to have the python thrift client and the generated cassandra thrift
library in the python path. To get the thrift library I followed this guide:
http://wiki.apache.org/cassandra/InstallThrift

There may be an easier way though. It looks like the Telephus client includes
the cassandra package but not the thrift package.

Aaron

On 24 Aug, 2010, at 05:42 PM, durga devi wrote:

Sir/Madam,

I am new to Ubuntu.
I am getting the following problem when installing Chiton on Ubuntu
10.4

From this link http://tinyurl.com/24gdgkv
I set the PYTHONPATH as
   export PYTHONPATH=/home/durga/driftx-Telephus-fb32fc7/:/home/durga/driftx-chiton-bd91965/:/usr/bin/python/
   And I ran  /driftx-chiton-bd91965/bin/./chiton-client

   I had the following problem: http://pastebin.com/29T12wef

   I am unable to sort out where this problem is occurring in the
   installation.

Thanks & Regards,
B.Durgadevi


Re: Build an index to join two CFs

2010-09-10 Thread aaron morton
I cannot tell you where in the code to make these changes. But it sounds like 
you want to fork cassandra and turn it into an RDBMS. It would undoubtedly be 
easier to just use an RDBMS.

Rather than have two CFs, address and name, just have one for the person using 
a super CF. Pull back the entire row for the id. Denormalise your data so the 
query is answered by one slice request to one CF, then you do not need joins. 
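
To make the denormalising idea concrete, here is a tiny sketch with a dict standing in for the single CF. Keying the row by the name the application queries with is my assumption about your access pattern, and person_by_name, save_person and address_for are made-up names, not anything in Cassandra.

```python
# Hypothetical in-memory stand-in for one denormalised person CF.
# The row key is the name the application queries by; the columns carry
# everything the query needs, so no second lookup (join) is required.
person_by_name = {}


def save_person(name, person_id, address):
    """Write path: store the id and the address together under the name row."""
    person_by_name[name] = {"id": person_id, "address": address}


def address_for(name):
    """Read path: one row lookup, standing in for one slice request to one CF."""
    row = person_by_name.get(name)
    return row["address"] if row else None
```

The cost moves to the write path: every write stores the address under the name row, so the read never has to go from name to id to address.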

If you want some advice on the data model, move the discussion to the user 
list. 

Aaron



On 11 Sep 2010, at 09:01, Alvin UW wrote:

> Hello,
> 
> I am going to build an index to join two CFs.
> First, we see this index as a CF/SCF. The difference is I don't materialise
> it.
> Assume we have two tables:
> ID_Address(*Id*, address) ,  Name_ID(*name*, id)
> Then,the index is: Name_Address(*name*, address)
> 
> When the application tries to query on Name_Address, the value of "name" is
> given by the application.
> I want to direct the read operation  to Name_ID to get "Id" value, then go
> to ID_Address to
> get the "address" value by the "Id" value. So far, I consider only the read
> operation.
> By this way, the join query is transparent to the user.
> 
> So I think I should find out which methods or classes are in charge of the
> read operation in the above operation.
> For example, the operation in cassandra CLI "get
> Keyspace1.Standard2['jsmith']" calls exactly which methods
> in the server side?
> 
> I noted CassandraServer is used to listen to clients, and there are some
> methods such as get(), get_slice().
> Is it the right place I can modify to implement my idea?
> 
> Thanks.
> 
> Alvin



system tests on osx

2010-10-05 Thread Aaron Morton
Anyone had trouble running the test/system/test_thrift_server.py tests on a MacBook? I was trying last night and they would sometimes work, sometimes not, without me making any changes. They were failing with errors such as connection reset and TSocket read 0 bytes at different times. I've been able to run them at work (Ubuntu 0.4) OK. Just wanted to check if there were any known issues before I spend more time digging into it.

Thanks
Aaron

Re: Help on dynamic creation of CF

2010-10-13 Thread aaron morton
Moving to the User List 
Aaron

On 13 Oct 2010, at 18:44, gagandip Singh wrote:

> I am also new to the Cassandra world but I think that is not possible on the
> 0.6 version. This feature is provided in the 0.7 version, which is in beta
> right now. You can download it from the Cassandra site.
> 
> Thanks,
> Gagan
> 
> On Wed, Oct 13, 2010 at 11:05 AM, Wicked J  wrote:
> 
>> Hi,
>> I'm using Cassandra v0.6.4 and wondering how can my app. dynamically create
>> Column Families?
>> 
>> Thanks!
>> 



/var/tmp in FailureDetector

2010-10-20 Thread aaron morton
I was reading through some code and noticed the following in
FailureDetector.dumpInterArrivalTimes()

FileOutputStream fos = new FileOutputStream("/var/tmp/output-" +
System.currentTimeMillis() + ".dat", true);

If this is meant to be cross platform I'm happy to create a bug and change it
to use File.createTempFile().
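
A minimal sketch of what that change might look like. The class and method names (TempDumpSketch, createDumpFile, dump) are illustrative assumptions, not the actual FailureDetector code; the only real API relied on is java.io.File.createTempFile, which creates a uniquely named file under the platform's java.io.tmpdir.

```java
import java.io.File;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

// Illustrative only: this is not the real FailureDetector code.
class TempDumpSketch
{
    // Creates a uniquely named file in the platform temp directory
    // (java.io.tmpdir) instead of a hard-coded /var/tmp path.
    static File createDumpFile()
    {
        try
        {
            return File.createTempFile("output-" + System.currentTimeMillis() + "-", ".dat");
        }
        catch (IOException e)
        {
            throw new UncheckedIOException(e);
        }
    }

    // Writes the given text to a fresh dump file and returns the file.
    static File dump(String contents)
    {
        File file = createDumpFile();
        try (FileOutputStream fos = new FileOutputStream(file, true))
        {
            fos.write(contents.getBytes(StandardCharsets.UTF_8));
        }
        catch (IOException e)
        {
            throw new UncheckedIOException(e);
        }
        return file;
    }

    public static void main(String[] args)
    {
        File file = dump("example arrival intervals\n");
        System.out.println("wrote " + file.length() + " bytes to " + file.getAbsolutePath());
        file.delete();
    }
}
```

The timestamp is kept in the prefix only to stay close to the existing file name; createTempFile already guarantees a unique name on its own.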

Also I could not find any use of the  dumpInterArrivalTimes(InetAddress ep) 
overload. Anyone know if it should be kept?

thanks
Aaron



Re: /var/tmp in FailureDetector

2010-10-20 Thread aaron morton
I should have mentioned the FailureDetectorMBean only has the parameterless 
dumpInterArrivalTimes(). 

The overload that takes InetAddress is not available through JMX. 

A
On 21 Oct 2010, at 01:55, Gary Dusbabek wrote:

> Yes, we should generate it in the right temp directory.  That method
> is an implementation of an interface method (FailureDetectorMBean),
> meant to be invoked by JMX, which is why no other code calls it.
> 
> Gary.
> 
> On Wed, Oct 20, 2010 at 03:48, aaron morton  wrote:
>> I was reading through some code and noticed the following in 
>> FailureDetector.dumpInterArrivealTimes()
>> 
>>FileOutputStream fos = new FileOutputStream("/var/tmp/output-" + 
>> System.currentTimeMillis() + ".dat", true);
>> 
>> If this is meant to be cross platform I'm happy to create a bug and change 
>> it to use File.createTempFile() .
>> 
>> Also I could not find any use of the  dumpInterArrivalTimes(InetAddress ep) 
>> overload. Anyone know if it should be kept?
>> 
>> thanks
>> Aaron
>> 
>> 



Question about ColumnFamily Id's

2010-10-20 Thread Aaron Morton
I was helping a guy who in the end had a mixed beta1 and beta2 cluster http://www.mail-archive.com/u...@cassandra.apache.org/msg06661.html

I had a look around the code and have a couple of questions, just for my understanding.

When ReadResponseSerializer is called to deserialize the response from a node, it calls the RowSerializer which uses the ColumnFamilySerializer. If the CfId in the row is not known on the node, an UnserializableColumnFamilyException is thrown. It's an IOException subclass and the error is treated as an Internal Error by the thrift generated Cassandra server.

The read message sent to the node contains the Keyspace+CF names, and the node returns its CfId in the response. It looks like if a node somehow has a different/bad schema it can cause reads to fail. Is this correct? Could its response be ignored if the read still meets the CL?

Next question was how nodes could ever get to have a different CfId for the same Keyspace+CF pair? It looks like the CfId is never changed, so it would only happen if two nodes were each given a schema update and could not communicate it to each other.

Am guessing the whole scenario is "unsupported", just trying to understand what's happening.

Thanks
Aaron

Re: /var/tmp in FailureDetector

2010-10-21 Thread Aaron Morton
Too quick for me :)
Aaron


On 21 Oct 2010, at 17:52, Jonathan Ellis  wrote:

> Done in r1025822
> 
> On Wed, Oct 20, 2010 at 12:54 PM, Gary Dusbabek  wrote:
>> You're right!  It looks like dead code that should be removed.
>> 
>> Gary.
>> 
>> On Wed, Oct 20, 2010 at 12:50, aaron morton  wrote:
>>> I should have mentioned the FailureDetectorMBean only has the parameterless 
>>> dumpInterArrivalTimes().
>>> 
>>> The overload that takes InetAddress is not available through JMX.
>>> 
>>> A
>>> On 21 Oct 2010, at 01:55, Gary Dusbabek wrote:
>>> 
>>>> Yes, we should generate it in the right temp directory.  That method
>>>> is an implementation of an interface method (FailureDetectorMBean),
>>>> meant to be invoked by JMX, which is why no other code calls it.
>>>> 
>>>> Gary.
>>>> 
>>>> On Wed, Oct 20, 2010 at 03:48, aaron morton  
>>>> wrote:
>>>>> I was reading through some code and noticed the following in 
>>>>> FailureDetector.dumpInterArrivealTimes()
>>>>> 
>>>>>FileOutputStream fos = new FileOutputStream("/var/tmp/output-" 
>>>>> + System.currentTimeMillis() + ".dat", true);
>>>>> 
>>>>> If this is meant to be cross platform I'm happy to create a bug and 
>>>>> change it to use File.createTempFile() .
>>>>> 
>>>>> Also I could not find any use of the  dumpInterArrivalTimes(InetAddress 
>>>>> ep) overload. Anyone know if it should be kept?
>>>>> 
>>>>> thanks
>>>>> Aaron
>>>>> 
>>>>> 
>>> 
>>> 
>> 
> 
> 
> 
> -- 
> Jonathan Ellis
> Project Chair, Apache Cassandra
> co-founder of Riptano, the source for professional Cassandra support
> http://riptano.com


questions about SSTableExport/Import

2010-11-17 Thread aaron morton
I was trying to help this guy 
http://www.mail-archive.com/u...@cassandra.apache.org/msg07297.html who seemed 
to have troubles loading a json file. And I started taking a look at 
SSTableExport and SSTableImport.

SSTableExport does not encode any information about the Column sub type
(ExpiringColumn or DeletedColumn). It records isMarkedForDelete(), the
timestamp and the localDeletionTime as the col value if it's a DeletedColumn.
SSTableImport then calls either cf.addColumn() or cf.addTombstone() based on
the deleted flag.

First question: is the code in SSTableImport.addToStandardCF() correct to
call cf.addColumn() if, when the column was serialised, it was
isMarkedForDelete()?

Next, is it OK to lose the fact that a column is an ExpiringColumn (and its ttl)
when it's exported to json?
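
To make the second question concrete, here is a toy model of the round trip. None of these classes are Cassandra's: ToyColumn, exportColumn and importColumn are hypothetical stand-ins, and the sketch assumes the exported form carries only the name, value, timestamp and deleted flag, as described above.

```java
import java.util.Arrays;
import java.util.List;

// Toy model only: these are not Cassandra's classes.
class ExportRoundTripSketch
{
    static final class ToyColumn
    {
        final String name;
        final String value;
        final long timestamp;
        final boolean deleted;
        final int ttlSeconds; // 0 means "does not expire"

        ToyColumn(String name, String value, long timestamp, boolean deleted, int ttlSeconds)
        {
            this.name = name;
            this.value = value;
            this.timestamp = timestamp;
            this.deleted = deleted;
            this.ttlSeconds = ttlSeconds;
        }
    }

    // Exports the four fields the JSON form is assumed to carry; the TTL is not among them.
    static List<Object> exportColumn(ToyColumn column)
    {
        return Arrays.<Object>asList(column.name, column.value, column.timestamp, column.deleted);
    }

    // Rebuilds a column from the exported fields. No TTL is available, so it comes back as 0.
    static ToyColumn importColumn(List<Object> fields)
    {
        return new ToyColumn((String) fields.get(0), (String) fields.get(1),
                             (Long) fields.get(2), (Boolean) fields.get(3), 0);
    }

    public static void main(String[] args)
    {
        ToyColumn before = new ToyColumn("ttl", "val", 1L, false, 1);
        ToyColumn after = importColumn(exportColumn(before));
        System.out.println("ttl before=" + before.ttlSeconds + " after=" + after.ttlSeconds);
    }
}
```

In this model a column written with a one second ttl is re-imported as a column that never expires, which is the behaviour being asked about.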

On my local machine I modified the unit test for SSTableExport as below and the 
assertion that the col was not returned failed.

Thanks
Aaron

diff --git a/test/unit/org/apache/cassandra/tools/SSTableExportTest.java 
b/test/unit/org/apache/cassandra/tools/SSTableExportTest.java
index 6f79f62..53d2a9c 100644
--- a/test/unit/org/apache/cassandra/tools/SSTableExportTest.java
+++ b/test/unit/org/apache/cassandra/tools/SSTableExportTest.java
@@ -179,6 +179,7 @@ public class SSTableExportTest extends SchemaLoader
 
 // Add rowA
 cfamily.addColumn(new QueryPath("Standard1", null, 
ByteBufferUtil.bytes("name")), ByteBufferUtil.bytes("val"), 1);
+cfamily.addColumn(new QueryPath("Standard1", null, 
ByteBufferUtil.bytes("ttl")), ByteBufferUtil.bytes("val"), 1, 1);
 writer.append(Util.dk("rowA"), cfamily);
 cfamily.clear();
 
@@ -187,6 +188,15 @@ public class SSTableExportTest extends SchemaLoader
 writer.append(Util.dk("rowExclude"), cfamily);
 cfamily.clear();
 
+//make sure the ttl col has expired
+try
+{
+Thread.sleep(1500);
+}
+catch (InterruptedException e)
+{
+throw new AssertionError(e);
+}
 SSTableReader reader = writer.closeAndOpenReader();
 
 // Export to JSON and verify
@@ -203,6 +213,11 @@ public class SSTableExportTest extends SchemaLoader
 assertTrue(cf != null);
 
assertTrue(cf.getColumn(ByteBufferUtil.bytes("name")).value().equals(ByteBuffer.wrap(hexToBytes("76616c"))));
 
+qf = QueryFilter.getNamesFilter(Util.dk("rowA"), new 
QueryPath("Standard1", null, null), ByteBufferUtil.bytes("ttl"));
+cf = qf.getSSTableColumnIterator(reader).getColumnFamily();
+assertTrue(cf != null);
+assertTrue(cf.getColumn(ByteBufferUtil.bytes("ttl")) == null);
+




Re: Reducing confusion around client libraries

2010-12-06 Thread Aaron Morton
I agree with the importance of the Thrift API. When I started using Cassandra I found the idiomatic APIs hid the true nature of what Cassandra does. It felt like trying to learn how an RDBMS works by learning how something like (java) hibernate or (ms) LINQ works. IMHO Cassandra *is* the thrift/avro API just like any RDBMS *is* the SQL language.

Thanks
Aaron

On 07 Dec, 2010, at 07:15 AM, Hannes Schmidt wrote:

Probably chiming in a little late here, but I liked having the Thrift API
documentation in a prominent place. It is a canonical reference that
describes on a logical level what the system can and can't do. Without that
information it would have been much harder to understand how to use the
Hector client. And without that information I wouldn't have been able to
pinpoint bugs in libcassandra.

Having a language- and platform-independent interface specification is worth
gold in my opinion. Moving the clients under the umbrella of the project
would increase the danger that the vetted client source becomes the de-facto
reference because it would be temptingly easy to modify server and client in
lock-step for changes of the on-the-wire format without bothering to
document the change.

I also like seeing the competition of ideas in the client world. I think it
will take some time for the API to mature and settle and a wider variety of
client architectures needs to be evaluated before a set of vetted clients
should be chosen.

On Sun, Dec 5, 2010 at 6:48 AM, Simon Reavely wrote:

> Maybe there needs to be a "listing criteria" for a client library, that
> includes things like examples for what is considered enough to get folks
> started (connections, reads, writes, etc) in addition to what Ran
> suggested "[maintainer, last release, next release, support
> forum, number of committers, number of users, spring support, jpa support
> etc]." I would also have a "who's using us" column as well.
>
> If the library maintainer does not satisfy the listing criteria they can't
> get listed. Then we just need to decide what the criteria is ;-)
>
> Other than understanding how up to date and frequently maintained a library
> is I think that (full) good examples are essential.
>
> Having said that, I am not actually against some hierarchical organization
> in which there is some form of "tested/verified" client library list, then
> "others". To keep things fair the question would then be how something gets
> to be "tested/verified". In an opensource community I expect the library
> developers could take some of this on themselves even if the
> testing/verification is part of the main builds by way of some form of
> plugin/test suite but my level of thinking on this is shallow.
>
> Just my 2 cents/pennies on this topic!
>
> Cheers,
> Simon
>
> On Fri, Dec 3, 2010 at 4:07 PM, Ran Tavory  wrote:
>
> > As developer of one of the client libraries I can say that competition
> > keeps
> > us the library maintainers healthy and in the long run creates more value
> > to
> > the users so we should keep competition fair.
> > I can certainly see Jonathan's point regarding the level of confusion b/w
> > newcomers and I'm all for reducing it, but only as long as there's a fair
> > chance for all clients to evolve.
> > To the points that the server can provide a better interface (avro or CQL
> > and what have you), I think this can improve overall client development
> but
> > will not eliminate the need for clients, there will always be a higher
> > level
> > and nicer interface a client can provide or plugins to 3rd party (spring
> > and
> > such) so it does not solve the confusion problem, there will always be
> more
> > clients as long as cassandra keeps evolving.
> >
> > I like transparency and I think that if you present users enough data
> they
> > will be able to decide mind, even new comers. It would be correct to say
> > that generally folks who'd been involved with cassandra for a few years
> are
> > better informed than newcomers however it is sometimes hard to make an
> > objective decision and it's also hard to make a one-size-fits-all
> decision,
> > for example some clients implement feature x and not y and for most users
> > it
> > makes a lot of sense only that for some users they need y and not x. We
> > need
> > to be transparent and list the features and tradeoffs and let the users
> > decide.
> > I like Paul's idea of a table with a list of libraries and for each
> library
> > a set of columns such as [maintainer, last release, next release, support
> > forum, number of committers, number of users, spring support, jpa support
> > etc]. There's a challenge of keeping this table up to date but on the
> other
> > hand if a library maintainer does not keep his row up to date then it's a
> > signal. If voting can be made easily then I'm all for it as well as part
> of
> > this table. I don't think the table would be huge, it's probably 2-3 per
> > language.
> >
> >
> > On Fri, Dec 3, 2010 at 10:25 PM, Paul Brown 
> wrote:

Re: Multi-tenancy, and authentication and authorization

2011-01-18 Thread Aaron Morton
Have a read about JVM heap sizing here 
http://wiki.apache.org/cassandra/MemtableThresholds

If you let people create keyspaces with a mouse click you will soon run out of 
memory.

I use Cassandra to provide a self service "storage service" at my organisation. 
All virtual databases operate in the same Cassandra keyspace (which does not 
change), and I use namespaces in the keys to separate things. Take a look at 
how amazon S3 works, it may give you some ideas.
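
A rough sketch of the key namespacing idea, assuming a simple "tenant:key" layout. The TenantKeys class, the ':' separator and the validation rule are assumptions made for illustration; nothing here is provided by Cassandra itself.

```java
// Illustrative only: a hypothetical helper for namespacing row keys by tenant.
class TenantKeys
{
    private static final char SEPARATOR = ':';

    // Builds the stored row key by prefixing the caller's key with the tenant id.
    // Tenant ids containing the separator are rejected so prefixes stay unambiguous.
    static String rowKey(String tenantId, String key)
    {
        if (tenantId.isEmpty() || tenantId.indexOf(SEPARATOR) >= 0)
            throw new IllegalArgumentException("invalid tenant id: " + tenantId);
        return tenantId + SEPARATOR + key;
    }

    // True if the stored row key lives in the given tenant's namespace.
    static boolean belongsTo(String tenantId, String rowKey)
    {
        return rowKey.startsWith(tenantId + SEPARATOR);
    }

    // Strips the tenant prefix and returns the key the caller originally supplied.
    static String userKey(String rowKey)
    {
        return rowKey.substring(rowKey.indexOf(SEPARATOR) + 1);
    }

    public static void main(String[] args)
    {
        String stored = rowKey("acme", "user42");
        System.out.println(stored + " -> " + userKey(stored));
    }
}
```

Because every tenant shares the same keyspace and column families, the number of memtables stays fixed no matter how many virtual databases are created.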

If you want to continue the discussion let's move this to the user list.

A
 

On 17/01/2011, at 7:44 PM, indika kumara  wrote:

> Hi Stu,
> 
> In our app,  we would like to offer cassandra 'as-is' to tenants. It that
> case, each tenant should be able to create Keyspaces as needed. Based on the
> authorization, I expect to implement it. In my view, the implementation
> options are as follows.
> 
> 1) The name of a keyspace would be 'the actual keyspace name' + 'tenant ID'
> 
> 2) The name of a keyspace would not be changed, but the name of a column
> family would be the 'the actual column family name' + 'tenant ID'.  It is
> needed to keep a separate mapping for keyspace vs tenants.
> 
> 3) The name of a keypace or a column family would not be changed, but the
> name of a column would be 'the actual column name' + 'tenant ID'. It is
> needed to keep separate mappings for keyspace vs tenants and column family
> vs tenants
> 
> Could you please give your opinions on the above three options?  if there
> are any issue regarding above approaches and if those issues can be solved,
> I would love to contribute on that.
> 
> Thanks,
> 
> Indika
> 
> 
> On Fri, Jan 7, 2011 at 11:22 AM, Stu Hood  wrote:
> 
>>> (1) has the problem of multiple memtables (a large amount just isn't
>> viable
>> There are some very straightforward solutions to this particular problem: I
>> wouldn't rule out running with a very large number of
>> keyspace/columnfamilies given some minor changes.
>> 
>> As Brandon said, some of the folks that were working on multi-tenancy for
>> Cassandra are no longer focused on it. But the code that was generated
>> during our efforts is very much available, and is unlikely to have gone
>> stale. Would love to talk about this with you.
>> 
>> Thanks,
>> Stu
>> 
>> On Thu, Jan 6, 2011 at 8:08 PM, indika kumara 
>> wrote:
>> 
>>> Thank you very much Brandon!
>>> 
>>> On Fri, Jan 7, 2011 at 12:40 AM, Brandon Williams 
>>> wrote:
>>> 
 On Thu, Jan 6, 2011 at 12:33 PM, indika kumara  wrote:
 
> Hi Brandon,
> 
> I would like you feedback on my two ideas for implementing mufti
>>> tenancy
> with the existing implementation.  Would those be possible to
>>> implement?
> 
> Thanks,
> 
> Indika
> 
>> Two vague ideas: (1) qualified keyspaces (by the tenet domain)
>>> (2)
> multiple Cassandra storage configurations in a single node (one per
> tenant).
> For both options, the resource hierarchy would be /cassandra/
> //keyspaces//
> 
 
 (1) has the problem of multiple memtables (a large amount just isn't
>>> viable
 right now.)  (2) more or less has the same problem, but in JVM
>> instances.
 
 I would suggest a) not trying to offer cassandra itself, and instead
>>> build
 a
 service that uses cassandra under the hood, and b) splitting up tenants
>>> in
 this layer.
 
 -Brandon
 
>>> 
>> 


Looking for Cassandra work.

2011-01-27 Thread aaron morton
I've decided to leave Weta Digital so I can spend more time working on and with 
Cassandra. If you would like to hire me from mid March please contact me 
directly on aa...@thelastpickle.com

I'm an Australian based in New Zealand and have skills in Python, Java, C#,
Cassandra and other NoSQL stores, RDBMS, web and fat client development.


Cheers
Aaron 



Re: [VOTE] 0.7.1 (3 times the charm?)

2011-02-07 Thread aaron morton
I just re-opened CASSANDRA-2081
(https://issues.apache.org/jira/browse/CASSANDRA-2081). There was a bug in
StorageProxy.scan() and the fix may need to be included.

I listed another possible Message problem in the ticket; it may pay to get
someone else to give the StorageProxy a good going over.

Aaron

On 5/02/2011, at 3:06 PM, Jeremy Hanna wrote:

> Just wondering - how does the distributed test framework fit into votes?  
> Does it get run each time a vote happens to check for bugs/regressions?
> 
> On Feb 4, 2011, at 1:40 PM, Eric Evans wrote:
> 
>> 
>> Lather. Rinse. Repeat.  Ya'll know the drill.
>> 
>> SVN:
>> https://svn.apache.org/repos/asf/cassandra/branches/cassandra-0.7@r1067260
>> 0.7.1 artifacts: http://people.apache.org/~eevans
>> 
>> The vote will be open for 72 hours.
>> 
>> 
>> [1]: http://goo.gl/axEK0 (CHANGES.txt)
>> [2]: http://goo.gl/66yGY (NEWS.txt)
>> 
>> -- 
>> Eric Evans
>> eev...@rackspace.com
>> 
> 



Re: Using Cassandra-cli

2011-02-07 Thread Aaron Morton
There is also extensive online help in cassandra-cli:

help;

Aaron

On 08 Feb, 2011, at 07:24 AM, Vishal Gupta wrote:

Hi,

There is a README.txt file in CASSANDRA_HOME which presents clear steps to
use the get and set commands. Also, I guess you need to first use the Keyspace
and then fire the set command.

Regards,
vishal

On Mon, Feb 7, 2011 at 11:43 PM, Eranda Sooriyabandara <0704...@gmail.com> wrote:

> Hi all,
> I tried Cassandra cli option in my machine. Here are my cli commands and
> the
> outputs.
>
> >>./cassandra-cli -host localhost -port 9160 -username eranda -keyspace
> keyspace1 -password eranda
> Keyspace 'keyspace1' not found.
>
> >>./cassandra-cli -host localhost -port 9160
> Connected to: "Test Cluster" on localhost/9160
> Welcome to cassandra CLI.
>
> [default@unknown] set keyspace1.standard['emahesh']['first']='eranda';
> Syntax error at position 13: mismatched input '.' expecting '['
>
> As the output say my commands did not work well. Here I used the commands
> which is in http://wiki.apache.org/cassandra/CassandraCli. I couldn't find
> the error of mine. Can anyone please help me to figure out the error.
>
> thanks
> Eranda
>


Re: How do secondary indices work

2011-02-08 Thread Aaron Morton
Moving to the user group.

On 08 Feb, 2011, at 11:39 PM, alta...@ceid.upatras.gr wrote:

Hello,

I'd like some information about how secondary indices work under the hood.

1) Is data stored in some external data structure, or is it stored in an
actual Cassandra table, as columns within column families?
2) Is data stored sorted or not? How is it partitioned?
3) How can I access index data?

Thanks in advance,

Alexander Altanis


Re: Monitoring Cluster with JMX

2011-02-08 Thread Aaron Morton
Can't you get the length of the list on the monitoring side of things?

Aaron

On 08 Feb, 2011, at 10:25 PM, Roland Gude wrote:

Hello,

we are trying to monitor our cassandra cluster with Nagios JMX checks. While there are JMX attributes which expose the list of reachable/unreachable hosts, it would be very helpful to have additional numeric attributes exposing the size of these lists. This could be used to set thresholds (in Nagios monitoring) i.e. at least 3 hosts must be reachable before Nagios issues a warning.
This is probably not hard to do and we are willing to implement/supply patches if someone could point us in the right direction on where to implement it.
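
Until such attributes exist, the count can be derived on the monitoring side from the existing list attributes, as suggested in the reply at the top of this message. A minimal sketch, assuming the check already holds the list of reachable hosts (for example from a JMX attribute read). NodeCountCheck and its threshold parameters are illustrative assumptions rather than an existing plugin; the 0/1/2 values follow the usual Nagios OK/WARNING/CRITICAL exit codes.

```java
import java.util.Arrays;
import java.util.Collection;

// Illustrative only: a hypothetical threshold check on the size of a host list.
class NodeCountCheck
{
    static final int OK = 0;
    static final int WARNING = 1;
    static final int CRITICAL = 2;

    // Maps the number of reachable hosts to a Nagios-style status code.
    // Fewer than critBelow hosts is CRITICAL, fewer than warnBelow is WARNING.
    static int status(Collection<String> reachableHosts, int warnBelow, int critBelow)
    {
        int count = reachableHosts.size();
        if (count < critBelow)
            return CRITICAL;
        if (count < warnBelow)
            return WARNING;
        return OK;
    }

    public static void main(String[] args)
    {
        Collection<String> reachable = Arrays.asList("192.168.114.63", "192.168.114.65");
        // Warn below 3 reachable hosts, go critical below 2.
        System.out.println("status code: " + status(reachable, 3, 2));
    }
}
```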

Greetings,
roland

--
YOOCHOOSE GmbH

Roland Gude
Software Engineer

Im Mediapark 8, 50670 Köln

+49 221 4544151 (Tel)
+49 221 4544159 (Fax)
+49 171 7894057 (Mobil)


Email: roland.g...@yoochoose.com
WWW: www.yoochoose.com

YOOCHOOSE GmbH
Geschäftsführer: Dr. Uwe Alkemper, Michael Friedmann
Handelsregister: Amtsgericht Köln HRB 65275
Ust-Ident-Nr: DE 264 773 520
Sitz der Gesellschaft: Köln



Gossip messages at DEBUG

2011-02-08 Thread Aaron Morton
I've just put the latest 0.7 build on a node and it's logging gossip messages at DEBUG and making the logs really hard to use. Anyone object to moving these to TRACE level?

e.g. here's 6 in a second for a machine doing nothing.

DEBUG [GossipStage:1] 2011-02-09 15:56:04,259 MessagingService.java (line org.apache.cassandra.net.MessagingService.sendOneWay(MessagingService.java:302)) jb04/192.168.114.63 sending GOSSIP_DIGEST_ACK to 9815@/192.168.114.67
DEBUG [ScheduledTasks:1] 2011-02-09 15:56:04,424 MessagingService.java (line org.apache.cassandra.net.MessagingService.sendOneWay(MessagingService.java:302)) jb04/192.168.114.63 sending GOSSIP_DIGEST_SYN to 9816@/192.168.114.67
DEBUG [ScheduledTasks:1] 2011-02-09 15:56:04,424 MessagingService.java (line org.apache.cassandra.net.MessagingService.sendOneWay(MessagingService.java:302)) jb04/192.168.114.63 sending GOSSIP_DIGEST_SYN to 9817@/192.168.114.65
DEBUG [GossipStage:1] 2011-02-09 15:56:04,424 MessagingService.java (line org.apache.cassandra.net.MessagingService.sendOneWay(MessagingService.java:302)) jb04/192.168.114.63 sending GOSSIP_DIGEST_ACK2 to 9818@/192.168.114.67
DEBUG [GossipStage:1] 2011-02-09 15:56:04,424 MessagingService.java (line org.apache.cassandra.net.MessagingService.sendOneWay(MessagingService.java:302)) jb04/192.168.114.63 sending GOSSIP_DIGEST_ACK2 to 9819@/192.168.114.65
DEBUG [GossipStage:1] 2011-02-09 15:56:04,483 MessagingService.java (line org.apache.cassandra.net.MessagingService.sendOneWay(MessagingService.java:302)) jb04/192.168.114.63 sending GOSSIP_DIGEST_ACK to 9820@/192.168.114.66

Aaron

Re: Gossip messages at DEBUG

2011-02-09 Thread Aaron Morton
thanks.

A

On 10 Feb, 2011, at 08:21 AM, Brandon Williams wrote:

On Tue, Feb 8, 2011 at 9:01 PM, Aaron Morton <aa...@thelastpickle.com> wrote:

> I've just put the latest 0.7 build on a node and it's logging gossip
> messages at DEBUG and making the logs really hard to use. Anyone object to
> moving these to TRACE level ?
>

Moved to TRACE.  I think when this was moved from sendRR to sendOneWay
gossip wasn't considered.

-Brandon


Re: RE: SEVERE Data Corruption Problems

2011-02-10 Thread Aaron Morton
Looks like the bloom filter for the row is corrupted. Does it happen for all reads or just for reads on one row? After the upgrade to 0.7 (assuming a 0.7 nightly build) did you run anything like nodetool repair? Have you tried asking in the #cassandra IRC room to see if there are any committers around?

Aaron

On 11 Feb, 2011, at 01:18 PM, Dan Hendry wrote:

Upgraded one node to 0.7. It's logging exceptions like mad (thousands per
minute). All like below (which is fairly new to me):

ERROR [ReadStage:721] 2011-02-10 18:13:56,190 AbstractCassandraDaemon.java (line 114) Fatal exception in thread Thread[ReadStage:721,5,main]
java.io.IOError: java.io.EOFException
    at org.apache.cassandra.db.columniterator.SSTableNamesIterator.<init>(SSTableNamesIterator.java:75)
    at org.apache.cassandra.db.filter.NamesQueryFilter.getSSTableColumnIterator(NamesQueryFilter.java:59)
    at org.apache.cassandra.db.filter.QueryFilter.getSSTableColumnIterator(QueryFilter.java:80)
    at org.apache.cassandra.db.ColumnFamilyStore.getTopLevelColumns(ColumnFamilyStore.java:1275)
    at org.apache.cassandra.db.ColumnFamilyStore.getColumnFamily(ColumnFamilyStore.java:1167)
    at org.apache.cassandra.db.ColumnFamilyStore.getColumnFamily(ColumnFamilyStore.java:1095)
    at org.apache.cassandra.db.Table.getRow(Table.java:384)
    at org.apache.cassandra.db.SliceByNamesReadCommand.getRow(SliceByNamesReadCommand.java:60)
    at org.apache.cassandra.service.StorageProxy$LocalReadRunnable.runMayThrow(StorageProxy.java:473)
    at org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:30)
    at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
    at java.lang.Thread.run(Thread.java:662)
Caused by: java.io.EOFException
    at java.io.DataInputStream.readInt(DataInputStream.java:375)
    at org.apache.cassandra.utils.BloomFilterSerializer.deserialize(BloomFilterSerializer.java:48)
    at org.apache.cassandra.utils.BloomFilterSerializer.deserialize(BloomFilterSerializer.java:30)
    at org.apache.cassandra.io.sstable.IndexHelper.defreezeBloomFilter(IndexHelper.java:108)
    at org.apache.cassandra.db.columniterator.SSTableNamesIterator.read(SSTableNamesIterator.java:106)
    at org.apache.cassandra.db.columniterator.SSTableNamesIterator.<init>(SSTableNamesIterator.java:71)
    ... 12 more

Dan


-Original Message-
From: Jonathan Ellis [mailto:jbel...@gmail.com] 
Sent: February-09-11 18:14
To: dev
Subject: Re: SEVERE Data Corruption Problems

Hi Dan,

it would be very useful to test with 0.7 branch instead of 0.7.0 so at
least you're not chasing known and fixed bugs like CASSANDRA-1992.

As you say, there's a lot of people who aren't seeing this, so it
would also be useful if you can provide some kind of test harness
where you can say "point this at a cluster and within a few hours

On Wed, Feb 9, 2011 at 4:31 PM, Dan Hendry 
wrote:
> I have been having SEVERE data corruption issues with SSTables in my
> cluster, for one CF it was happening almost daily (I have since shut down
> the service using that CF as it was too much work to manage the Cassandra
> errors). At this point, I can’t see how it is anything but a Cassandra bug
> yet it’s somewhat strange and very scary that I am the only one who seems
to
> be having such serious issues. Most of my data is indexed in two ways so I
> have been able to write a validator which goes through and back fills
> missing data but it’s kind of defeating the whole point of Cassandra. The
> only way I have found to deal with issues when they crop up to prevent
nodes
> crashing from repeated failed compactions is delete the SSTable. My
cluster
> is running a slightly modified 0.7.0 version which logs what files errors
> for so that I can stop the node and delete them.
>
>
>
> The problem:
>
> -  Reads, compactions and hinted handoff fail with various
> exceptions (samples shown at the end of this email) which seem to indicate
> sstable corruption.
>
> -  I have seen failed reads/compactions/hinted handoff on 4 out of
4
> nodes (RF=2) for 3 different super column families and 1 standard column
> family (4 out of 11) and just now, the Hints system CF. (if it matters the
> ring has not changed since one CF which has been giving me trouble was
> created). I have check SMART disk info and run various diagnostics and
there
> does not seem to be any hardware issues, plus what are the chances of all
> four nodes having the same hardware problems at the same time when for all
> other purposes, they appear fine?
>
> -  I have added logging which outputs what sstable are causing
> exceptions to be thrown. The corrupt sstables have been both freshly
flushed
> memtables and the output of compaction (ie, 4 sstables which all seem to
be
> fine get comp

6.12 release?

2011-02-16 Thread Aaron Morton
A guy on the user list has asked about getting a 6.12 release out that includes the fix for CASSANDRA-2081. Without it, get_range_slice where CL > ONE will time out as the message ids are reused. Jonathan has back ported the relevant parts of the second patch (which concerned get_indexed_slices) from the ticket. Can we get this one released?

Aaron

rewriting cli help

2011-02-16 Thread Aaron Morton
I'm working on moving the cli online help into a yaml file for ease of maintenance and am now trying to merge the existing cli help with what's in cassandra.yaml and the wiki. If you have any desires for how it should look please comment on https://issues.apache.org/jira/browse/CASSANDRA-2008

Thanks
Aaron

Re: Data model

2011-03-07 Thread aaron morton
Will answer on the user list. 

Aaron

On 8/03/2011, at 1:11 AM, Baskar wrote:

> Does Cassandra allow nesting of column families?
> 
> Here is the use case
> - we need to store calls made by employees
> - employees are associated with an account
> - accounts have phone numbers
> - many calls are made by employees for a given account and phone 
> 
> If possible, would like to store call related data against employee. 
> 
> Thanks
> Baskar



Fwd: batch inserts in cassandra 0.7

2011-03-15 Thread aaron morton
batch_insert was deprecated in 0.6; you should have been using batch_mutate.
See http://wiki.apache.org/cassandra/API
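
For orientation, this is roughly the shape of the argument batch_mutate takes in 0.7 as I understand the Thrift interface: a map of row key to a map of column family name to a list of mutations. The sketch below only builds that nested structure with plain Java collections. ToyMutation is a hypothetical stand-in for the Thrift-generated Mutation class, and String keys stand in for the binary row keys, so check the API page above for the real types.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Illustrative only: ToyMutation stands in for the Thrift-generated Mutation class.
class MutationMapSketch
{
    static final class ToyMutation
    {
        final String columnName;
        final String value;

        ToyMutation(String columnName, String value)
        {
            this.columnName = columnName;
            this.value = value;
        }
    }

    // Adds one mutation under row key -> column family name, creating the
    // intermediate map and list on first use.
    static void add(Map<String, Map<String, List<ToyMutation>>> mutationMap,
                    String rowKey, String columnFamily, ToyMutation mutation)
    {
        Map<String, List<ToyMutation>> byColumnFamily = mutationMap.get(rowKey);
        if (byColumnFamily == null)
        {
            byColumnFamily = new LinkedHashMap<String, List<ToyMutation>>();
            mutationMap.put(rowKey, byColumnFamily);
        }
        List<ToyMutation> mutations = byColumnFamily.get(columnFamily);
        if (mutations == null)
        {
            mutations = new ArrayList<ToyMutation>();
            byColumnFamily.put(columnFamily, mutations);
        }
        mutations.add(mutation);
    }

    public static void main(String[] args)
    {
        Map<String, Map<String, List<ToyMutation>>> mutationMap =
            new LinkedHashMap<String, Map<String, List<ToyMutation>>>();
        add(mutationMap, "jsmith", "Standard1", new ToyMutation("first", "John"));
        add(mutationMap, "jsmith", "Standard1", new ToyMutation("last", "Smith"));
        add(mutationMap, "jdoe", "Standard1", new ToyMutation("first", "Jane"));
        System.out.println("rows in batch: " + mutationMap.keySet());
    }
}
```

With a real client, one such map covering many rows and column families is sent in a single call, which is what replaces the old per-row batch insert.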

Aaron

Begin forwarded message:

> From: Anurag Gujral 
> Date: 16 March 2011 10:04:56 GMT+13:00
> To: dev@cassandra.apache.org
> Subject: batch inserts in cassandra 0.7
> Reply-To: dev@cassandra.apache.org
> 
> Hi All,
>  I am moving from cassandra 0.6 to 0.7  I was using  function
> send_batch_inserts to do batch inserts in cassandra 0.6 when I moved to 0.7
> I dont see the function send_batch_insert
> is there a way to do batch inserts in cassandra 0.7 using thrift-0.0.5.
> 
> Thanks
> Anurag



Re: Limitations on number of secondary indexes

2011-04-20 Thread aaron morton
Moving to user.
Aaron

On 20 Apr 2011, at 10:45, Jason Kolb wrote:

> I apologize if this has been answered before, I've tried to do some pretty
> exhaustive searching of the archives and haven't been able to see if this
> question has been answered before.
> 
> I was wondering if anyone knows if there is a practical upper limit on the
> number of secondary indexes used, if they're sparsely populated (say, 10,000
> secondary indexes only 2 of which are populated per row).  My understanding
> is that Cassandra creates another column family for each secondary index in
> the background, so the real limitation would appear to be the number of
> column families.
> 
> Is this correct?  And if so (or even if not), does anyone know the answer to
> the question about the upper limit on the number of secondary indexes?
> 
> Thanks!
> Jason



Re: Compacting single file forever

2011-04-21 Thread aaron morton
Moving to the user list. 

Aaron

On 20 Apr 2011, at 21:25, Shotaro Kamio wrote:

> Hi,
> 
> I found that our cluster repeats compacting a single file forever
> (cassandra 0.7.5). We are wondering if compaction logic is wrong. I'd
> like to have comments from you guys.
> 
> Situation:
> - After trying to repair a column family, our cluster's disk usage is
> quite high. Cassandra cannot compact all sstables at once. I think it
> ends up repeatedly compacting a single file. (You can check the attached
> log below.)
> - Our data doesn't have deletes, so compacting a single file
> doesn't free any disk space.
> 
> We are approaching a full disk. I believe that the repair
> operation created a lot of duplicate data on disk, which requires
> compaction. However, most of the nodes are stuck compacting a single file.
> The only thing we can do is restart the nodes.
> 
> My question is why the compaction doesn't stop.
> 
> I looked at the logic in CompactionManager.java:
> -
>         String compactionFileLocation = table.getDataFileLocation(cfs.getExpectedCompactedFileSize(sstables));
>         // If the compaction file path is null that means we have no space left for this compaction.
>         // try again w/o the largest one.
>         List<SSTableReader> smallerSSTables = new ArrayList<SSTableReader>(sstables);
>         while (compactionFileLocation == null && smallerSSTables.size() > 1)
>         {
>             logger.warn("insufficient space to compact all requested files " + StringUtils.join(smallerSSTables, ", "));
>             smallerSSTables.remove(cfs.getMaxSizeFile(smallerSSTables));
>             compactionFileLocation = table.getDataFileLocation(cfs.getExpectedCompactedFileSize(smallerSSTables));
>         }
>         if (compactionFileLocation == null)
>         {
>             logger.error("insufficient space to compact even the two smallest files, aborting");
>             return 0;
>         }
> -
> 
> The while condition: smallerSSTables.size() > 1
> Should this be "smallerSSTables.size() > 2"?
> 
> In my understanding, compacting a single file frees disk space
> only when the sstable has a lot of tombstones and only if the tombstones
> are removed in the compaction. If cassandra knows the sstable has
> tombstones to be removed, it's worth compacting it. Otherwise, it
> might free some space if you are lucky. In the worst case, it leads to
> an infinite loop like in our case.
> 
> What do you think of the code change?
> 
> 
> Best regards,
> Shotaro
> 
> 
> * Cassandra compaction log
> -
> WARN [CompactionExecutor:1] 2011-04-20 01:03:14,446
> CompactionManager.java (line 405) insufficient space to compact all
> requested files SSTableReader(
> path='foobar-f-3020-Data.db'), SSTableReader(path='foobar-f-3034-Data.db')
> INFO [CompactionExecutor:1] 2011-04-20 03:47:29,833
> CompactionManager.java (line 482) Compacted to
> foobar-tmp-f-3035-Data.db.  260,646,760,319 to 260,646,760,319 (~100%
> of original) bytes for 6,893,896 keys.  Time: 9,855,385ms.
> 
> WARN [CompactionExecutor:1] 2011-04-20 03:48:11,308
> CompactionManager.java (line 405) insufficient space to compact all
> requested files SSTableReader(path='foobar-f-3020-Data.db'),
> SSTableReader(path='foobar-f-3035-Data.db')
> INFO [CompactionExecutor:1] 2011-04-20 06:31:41,193
> CompactionManager.java (line 482) Compacted to
> foobar-tmp-f-3036-Data.db.  260,646,760,319 to 260,646,760,319 (~100%
> of original) bytes for 6,893,896 keys.  Time: 9,809,882ms.
> 
> WARN [CompactionExecutor:1] 2011-04-20 06:32:22,476
> CompactionManager.java (line 405) insufficient space to compact all
> requested files SSTableReader(path='foobar-f-3020-Data.db'),
> SSTableReader(path='foobar-f-3036-Data.db')
> INFO [CompactionExecutor:1] 2011-04-20 09:20:29,903
> CompactionManager.java (line 482) Compacted to
> foobar-tmp-f-3037-Data.db.  260,646,760,319 to 260,646,760,319 (~100%
> of original) bytes for 6,893,896 keys.  Time: 10,087,424ms.
> -
> You can see that compacted size is always the same. It repeats
> compacting the same single sstable.
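
The selection loop quoted in this message can be modelled in a few lines. This is only a sketch under stated assumptions: the expected output size is taken to be the plain sum of the input sizes, the sizes and free space are made-up numbers in GB, and `min_files` is a hypothetical parameter standing in for the constant in the while condition (1 as quoted, 2 as proposed).

```python
def pick_sstables(sizes, free_space, min_files=1):
    """Drop the largest sstable until the expected output fits in free_space.

    Returns the sizes that would be compacted, or [] if the compaction aborts.
    Assumption: expected output size is the plain sum of the input sizes.
    """
    candidates = sorted(sizes)
    while sum(candidates) > free_space and len(candidates) > min_files:
        candidates.pop()  # remove the largest remaining file
    if sum(candidates) > free_space:
        return []  # insufficient space, abort
    return candidates


# Made-up numbers: two large sstables and room for only one of them.
as_quoted = pick_sstables([400, 260], free_space=300)                # [260]
as_proposed = pick_sstables([400, 260], free_space=300, min_files=2)  # []
```

With the condition as quoted, the loop settles on a single file and rewrites it; with the proposed change, it gives up instead. Neither variant checks whether the surviving file has tombstones worth purging.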



Fwd: Error trying to move a node - 0.7

2011-06-19 Thread aaron morton
Will answer on the user list. 

-
Aaron Morton
Freelance Cassandra Developer
@aaronmorton
http://www.thelastpickle.com

Begin forwarded message:

> From: Ben Frank 
> Date: 17 June 2011 07:42:07 GMT+12:00
> To: dev@cassandra.apache.org
> Subject: Error trying to move a node - 0.7
> Reply-To: dev@cassandra.apache.org
> 
> Hi All,
>   I'm getting the following error when trying to move a node's token:
> 
> nodetool -h 145.6.92.82 -p 18080 move 56713727820156410577229101238628035242
> cassandra.in.sh executing for environment DEV1
> Exception in thread "main" java.lang.AssertionError
>         at org.apache.cassandra.locator.TokenMetadata.firstTokenIndex(TokenMetadata.java:393)
>         at org.apache.cassandra.locator.TokenMetadata.ringIterator(TokenMetadata.java:418)
>         at org.apache.cassandra.locator.NetworkTopologyStrategy.calculateNaturalEndpoints(NetworkTopologyStrategy.java:94)
>         at org.apache.cassandra.service.StorageService.calculatePendingRanges(StorageService.java:807)
>         at org.apache.cassandra.service.StorageService.calculatePendingRanges(StorageService.java:773)
>         at org.apache.cassandra.service.StorageService.startLeaving(StorageService.java:1468)
>         at org.apache.cassandra.service.StorageService.move(StorageService.java:1605)
>         at org.apache.cassandra.service.StorageService.move(StorageService.java:1580)
> .
> .
> .
> 
> my ring looks like this:
> 
> Address      Status State   Load     Owns    Token
>                                              113427455640312821154458202477256070484
> 145.6.99.80  Up     Normal  1.63 GB  36.05%  4629135223504085509237477504287125589
> 145.6.92.82  Up     Normal  2.86 GB  1.09%   6479163079760931522618457053473150444
> 145.6.99.81  Up     Normal  2.01 GB  62.86%  113427455640312821154458202477256070484
> 
> 
> '80' and '81' are configured to be in the East coast data center and '82' is
> in the West
> 
> Anyone shed any light as to what might be going on here?
> 
> -Ben
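
As a side note on the ring output above, the "Owns" column can be reproduced from the tokens alone. The sketch below assumes a RandomPartitioner token space of 2**127 (an assumption, though the percentages in the message are consistent with it) and that each node owns the range from the previous token up to its own.

```python
RING_SIZE = 2 ** 127  # assumed RandomPartitioner token space


def ownership(tokens):
    """Return {address: percent of the ring owned}.

    Each node owns the range (previous token, own token]; the node with the
    lowest token wraps around and also covers the end of the ring.
    """
    ordered = sorted(tokens.items(), key=lambda item: item[1])
    result = {}
    for i, (address, token) in enumerate(ordered):
        previous = ordered[i - 1][1]  # i == 0 wraps to the highest token
        owned = (token - previous) % RING_SIZE or RING_SIZE  # single node owns all
        result[address] = owned * 100 / RING_SIZE
    return result


# Tokens copied from the ring output in the message above.
tokens = {
    "145.6.99.80": 4629135223504085509237477504287125589,
    "145.6.92.82": 6479163079760931522618457053473150444,
    "145.6.99.81": 113427455640312821154458202477256070484,
}
for address, percent in ownership(tokens).items():
    print(address, round(percent, 2))
```

Rounded to two decimals this gives 36.05, 1.09 and 62.86, matching the ring output. It says nothing about the AssertionError itself, only about how unbalanced the ring is before the move.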



Re: Reoganizing drivers

2011-06-19 Thread aaron morton
I can see the drivers have moved to
http://svn.apache.org/repos/asf/cassandra/drivers/

Just wondering where that path is available on 
git://git.apache.org/cassandra.git

These are the remote branches I can find 

$ git ls-remote  | grep drivers
From git://git.apache.org/cassandra.git
20635cec24389d83b146af51fa902fcf2d21491b	refs/remotes/tags/drivers
dd06878fa6b143dbff1e1e338087041b1b230d48	refs/tags/drivers
20635cec24389d83b146af51fa902fcf2d21491b	refs/tags/drivers^{}

Thanks
A

-
Aaron Morton
Freelance Cassandra Developer
@aaronmorton
http://www.thelastpickle.com

On 8 Jun 2011, at 05:01, Eric Evans wrote:

> On Tue, 2011-06-07 at 18:40 +0200, Sylvain Lebresne wrote:
>> On Tue, Jun 7, 2011 at 3:18 PM, Jonathan Ellis 
>> wrote:
>>> Sounds fine as far as it goes, but don't we want some concept of
>>> branches/tags for driver releases too?
>> 
>> Our idea so far (Eric can correct me if I'm wrong :)) was to consider
>> the drivers directory as the 'trunk' for drivers, and create branches
>> and tags for them alongside the cassandra ones.
> 
> Yup.  In fact, I already tagged the Python and Java drivers as
> tags/drivers// during the last release (neither of those
> driver artifacts corresponded to the same SVN rev, nor did they
> correspond to the rev for 0.8.0).
>> 
>> Truth is, I even think that considering the drivers as a whole is not
>> granular enough. It's unlikely the different drivers will move at the
>> same pace.
> 
> As far as I know, there is no reason that a tag (say
> tags/drivers/py/1.1.1) can't point to a subdirectory of drivers/ (i.e.
> drivers/py).  In fact, that's how the tags mentioned above were done
> (except those pointed to branches/cassandra-0.8.0/drivers/).  I
> think it just boils down to a matter of convention.
>> 
>> *But*, we believe that moving the drivers up one level is at least a
>> first step towards something better than the status quo.
> 
> Yeah, even if we decide to do something different later on, this is an
> improvement over what we have now. 
> 
> -- 
> Eric Evans
> eev...@rackspace.com
> 



Re: Reoganizing drivers

2011-06-28 Thread aaron morton
Asked on #asfinfra and was told the only things mirrored on git are trunk / 
tags / branches . 

git-svn it is. 

Cheers

-
Aaron Morton
Freelance Cassandra Developer
@aaronmorton
http://www.thelastpickle.com

On 20 Jun 2011, at 16:28, Jonathan Ellis wrote:

> Maybe the non-standard path is giving the git mirror fits.
> 
> On Sun, Jun 19, 2011 at 11:26 PM, aaron morton  
> wrote:
>> I can see the drivers have moved to
>> http://svn.apache.org/repos/asf/cassandra/drivers/
>> 
>> Just wondering where that path is available on 
>> git://git.apache.org/cassandra.git
>> 
>> These are the remote branches I can find
>> 
>> $ git ls-remote  | grep drivers
>> From git://git.apache.org/cassandra.git
>> 20635cec24389d83b146af51fa902fcf2d21491b	refs/remotes/tags/drivers
>> dd06878fa6b143dbff1e1e338087041b1b230d48	refs/tags/drivers
>> 20635cec24389d83b146af51fa902fcf2d21491b	refs/tags/drivers^{}
>> 
>> Thanks
>> A
>> 
>> -
>> Aaron Morton
>> Freelance Cassandra Developer
>> @aaronmorton
>> http://www.thelastpickle.com
>> 
>> On 8 Jun 2011, at 05:01, Eric Evans wrote:
>> 
>>> On Tue, 2011-06-07 at 18:40 +0200, Sylvain Lebresne wrote:
>>>> On Tue, Jun 7, 2011 at 3:18 PM, Jonathan Ellis 
>>>> wrote:
>>>>> Sounds fine as far as it goes, but don't we want some concept of
>>>>> branches/tags for driver releases too?
>>>> 
>>>> Our idea so far (Eric can correct me if I'm wrong :)) was to consider
>>>> the drivers directory as the 'trunk' for drivers, and create branches
>>>> and tags for them alongside the cassandra ones.
>>> 
>>> Yup.  In fact, I already tagged the Python and Java drivers as
>>> tags/drivers// during the last release (neither of those
>>> driver artifacts corresponded to the same SVN rev, nor did they
>>> correspond to the rev for 0.8.0).
>>>> 
>>>> Truth is, I even think that considering the drivers as a whole is not
>>>> granular enough. It's unlikely the different drivers will move at the
>>>> same pace.
>>> 
>>> As far as I know, there is no reason that a tag (say
>>> tags/drivers/py/1.1.1) can't point to a subdirectory of drivers/ (i.e.
>>> drivers/py).  In fact, that's how the tags mentioned above were done
>>> (except those pointed to branches/cassandra-0.8.0/drivers/).  I
>>> think it just boils down to a matter of convention.
>>>> 
>>>> *But*, we believe that moving the drivers up one level is at least a
>>>> first step towards something better than the status quo.
>>> 
>>> Yeah, even if we decide to do something different later on, this is an
>>> improvement over what we have now.
>>> 
>>> --
>>> Eric Evans
>>> eev...@rackspace.com
>>> 
>> 
>> 
> 
> 
> 
> -- 
> Jonathan Ellis
> Project Chair, Apache Cassandra
> co-founder of DataStax, the source for professional Cassandra support
> http://www.datastax.com



CASSANDRA-2249 not in CHANGES.txt for 1.0

2011-09-15 Thread aaron morton
It's in NEWS; should it also be in CHANGES?

https://issues.apache.org/jira/browse/CASSANDRA-2449

Cheers


-
Aaron Morton
Freelance Cassandra Developer
@aaronmorton
http://www.thelastpickle.com



wiki updates

2011-10-06 Thread aaron morton
With 1.0 almost here I was going to try and spruce up the wiki a bit to make 
things a little more welcoming for new users. 

I've created a copy of the home page here 
http://wiki.apache.org/cassandra/FrontPage_draft_aaron as a working draft. 

I've rearranged things a little, and added some links to pages that do not yet 
exist. I was going to use it as a planning tool by working through all the 
pages linked there to see if they needed updated examples, or were yet to be 
written, that sort of thing. 

Thoughts ? 

I'll probably ask for some volunteers on the user list. 

Cheers

---------
Aaron Morton
Freelance Cassandra Developer
@aaronmorton
http://www.thelastpickle.com



Re: wiki updates

2011-10-06 Thread aaron morton
Ok, I'll go ahead and work out a way for people to contribute.

@Nick, yes to including CQL examples, and to giving them first, but I would also 
keep the rpc call for now, as the RPC interface is still officially supported. 

@Yang, Happy to re-arrange things once we have some content. For better or 
worse I used the Hive wiki as a guide 
https://cwiki.apache.org/confluence/display/Hive/Home  . Creating new content 
takes time; first I'd like to improve what we have and make sure it is correct. 


Thanks

-----
Aaron Morton
Freelance Cassandra Developer
@aaronmorton
http://www.thelastpickle.com

On 7/10/2011, at 1:24 AM, Yi Yang wrote:

> Thanks Aaron for the hard work.   The new front page gives a much clearer
> picture of Cassandra.
> 
> However I would also like to present some of my thoughts - based on my own
> learning path:
> 
> 1) The first part should present a clear picture of what Cassandra is, and
> what's inside Cassandra - thus we'd better include the Data Model section in
> it - so people can easily see the difference between Cassandra
> and other key based databases like HBase, MongoDB etc., and therefore make wise
> choices.
> 
> 2) The second part could give a glance at how to get Cassandra running for a
> small to moderate level application - aka a small development platform.
> Including how to create a cluster and also running on Windows / Amazon
> EC2. Here StorageConfiguration is not so important because the target of
> this section is to help users build up a usable Cassandra Cluster.
> 
> 3) The third part can help the administrators of large scale applications,
> introducing management strategies like storage configuration as well as
> monitoring and node operations techniques.
> 
> The remaining parts are perfect. I hope we can match the list with a typical
> user's learning experience, so that users at different levels can
> focus on their own section. It's just my own idea - and it might differ
> from others' experiences, but hopefully it helps. Thanks again for the
> great work.
> 
> Best,
> Yi
> 
> On Thu, Oct 6, 2011 at 7:29 PM, aaron morton wrote:
> 
>> With 1.0 almost here I was going to try and spruce up the wiki a bit to
>> make things a little more welcoming for new users.
>> 
>> I've created a copy of the home page here
>> http://wiki.apache.org/cassandra/FrontPage_draft_aaron as a working draft.
>> 
>> I've rearranged things a little, and added some links to pages that do not
>> yet exist. I was going to use it as a planning tool by working through all
>> the pages linked there to see if they needed updated examples, or were yet
>> to be written, that sort of thing.
>> 
>> Thoughts ?
>> 
>> I'll probably ask for some volunteers on the user list.
>> 
>> Cheers
>> 
>> -
>> Aaron Morton
>> Freelance Cassandra Developer
>> @aaronmorton
>> http://www.thelastpickle.com
>> 
>> 



Re: cassandra node is not starting

2012-01-02 Thread aaron morton
What version of cassandra and what OS ? 

It sort of looks like it tried to delete a secondary index CF that was defined in 
the system KS. 

Turn the logging up to DEBUG and see what happens. 

Hope that helps. 
Aaron

-
Aaron Morton
Freelance Developer
@aaronmorton
http://www.thelastpickle.com

On 1/01/2012, at 11:20 PM, Michael Vaknine wrote:

> Hi,
> 
> 
> 
> During restart the cassandra node is failing to start
> 
> The error is
> 
> ERROR [main] 2012-01-01 05:03:42,903 AbstractCassandraDaemon.java (line 354) Exception encountered during startup
> java.lang.AssertionError: attempted to delete non-existing file AttractionUserIdx.AttractionUserIdx_09partition_idx-h-1-Data.db
>         at org.apache.cassandra.io.util.FileUtils.deleteWithConfirm(FileUtils.java:49)
>         at org.apache.cassandra.io.util.FileUtils.deleteWithConfirm(FileUtils.java:44)
>         at org.apache.cassandra.io.sstable.SSTable.delete(SSTable.java:133)
>         at org.apache.cassandra.db.ColumnFamilyStore.scrubDataDirectories(ColumnFamilyStore.java:355)
>         at org.apache.cassandra.db.ColumnFamilyStore.scrubDataDirectories(ColumnFamilyStore.java:402)
>         at org.apache.cassandra.service.AbstractCassandraDaemon.setup(AbstractCassandraDaemon.java:174)
>         at org.apache.cassandra.service.AbstractCassandraDaemon.activate(AbstractCassandraDaemon.java:337)
>         at org.apache.cassandra.thrift.CassandraDaemon.main(CassandraDaemon.java:107)
> 
> 
> 
> can someone tell me how to recover from that
> 
> thanks
> 
> 
> 
> Michael 
> 



Re: Welcome committer Aaron Morton!

2012-01-18 Thread aaron morton
Thanks Jonathan and the other committers.

Cheers :)
-
Aaron Morton
Freelance Developer
@aaronmorton
http://www.thelastpickle.com

On 19/01/2012, at 7:19 AM, Jonathan Ellis wrote:

> The Apache Cassandra PMC has voted to add Aaron as a committer.
> Thanks for helping make Cassandra what it is today!
> 
> -- 
> Jonathan Ellis
> Project Chair, Apache Cassandra
> co-founder of DataStax, the source for professional Cassandra support
> http://www.datastax.com



Re: How to create a table in Cassandra

2012-01-28 Thread aaron morton
This question belongs on the user list, I will answer it there. 

Cheers
-
Aaron Morton
Freelance Developer
@aaronmorton
http://www.thelastpickle.com

On 28/01/2012, at 3:36 AM, anandbab...@polarisft.com wrote:

> 
> Can anyone tell me how to create a table in the Cassandra. I have
> installed it... and I am new to this...
> Thanks,
> Barnabas
> 
> 
> 
> This e-Mail may contain proprietary and confidential information and is sent 
> for the intended recipient(s) only.  If by an addressing or transmission 
> error this mail has been misdirected to you, you are requested to delete this 
> mail immediately. You are also hereby notified that any use, any form of 
> reproduction, dissemination, copying, disclosure, modification, distribution 
> and/or publication of this e-mail message, contents or its attachment other 
> than by its intended recipient/s is strictly prohibited.
> 
> Visit us at http://www.polarisFT.com
> 



Re: Thift vs. CQL

2012-01-28 Thread aaron morton
This question belongs on the user list, I will answer it there. 


-
Aaron Morton
Freelance Developer
@aaronmorton
http://www.thelastpickle.com

On 28/01/2012, at 1:26 AM, bxqdev wrote:

> Hello!
> 
>  Datastax's Cassandra documentation says that the CQL API is the future of 
> the Cassandra API. It also says that eventually the Thrift API will be removed 
> completely. Is it true? Do you have any plans to remove the Thrift API, leaving 
> the CQL API only?
> 
> thanks.



Re: extra diffs showing up in update column family

2012-01-30 Thread aaron morton
Can you raise a ticket at https://issues.apache.org/jira/browse/CASSANDRA with 
steps to reproduce. 

Thanks
p.s. the user list is the appropriate list for emails like this. 


-
Aaron Morton
Freelance Developer
@aaronmorton
http://www.thelastpickle.com

On 31/01/2012, at 9:31 AM, Dave Brosius wrote:

> If a user specifies a comparator in an update column family statement (this came 
> from an IRC user), such as
> 
> update column family report_by_account_content with comparator=UTF8Type and 
> column_metadata = [{ column_name:'meta:account-id', 
> validation_class:UTF8Type,index_type:KEYS},{ column_name:'meta:filter-hash', 
> validation_class:UTF8Type,index_type:KEYS}];
> 
> 
> The comparator value is seen as different because the original comparator was 
> the fully qualified name of the org.apache.cassandra.db.marshal.UTF8Type, and 
> new one is what is passed in UTF8Type. So CFMetaData.diff sees this as a 
> change and does extra work because of it.
> 
> I'm guessing there are other class name values where this holds true as well.
> 
> Is this a big enough concern to address?
> 
> thanks
> dave
> 
> 
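
To make the mismatch described above concrete, here is a small sketch of comparing type names after expanding short names to their fully qualified form. It is illustrative only and is not how CFMetaData.diff works; the helper names are hypothetical, and the default package is the one quoted in the message.

```python
DEFAULT_PACKAGE = "org.apache.cassandra.db.marshal"  # package quoted in the message


def qualify(class_name):
    """Expand a short type name such as 'UTF8Type' to its fully qualified form."""
    if "." in class_name:
        return class_name  # already qualified, e.g. a custom comparator class
    return DEFAULT_PACKAGE + "." + class_name


def same_comparator(old, new):
    """Compare two comparator names after normalizing both."""
    return qualify(old) == qualify(new)


# A naive string comparison reports a change here; the normalized one does not.
naive_differs = "org.apache.cassandra.db.marshal.UTF8Type" != "UTF8Type"
normalized_same = same_comparator("org.apache.cassandra.db.marshal.UTF8Type", "UTF8Type")
```

Normalizing both sides before comparing would avoid treating the short and long spellings of the same class as a schema change.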



Re: understanding cassandra internal

2012-01-30 Thread aaron morton
The code is where it's at, and...

http://www.datastax.com/2011/08/video-cassandra-internals-presentation-from-cassandra-sf-2011

http://wiki.apache.org/cassandra
http://planetcassandra.org/
http://www.datastax.com/docs/1.0/index

Cheers

-
Aaron Morton
Freelance Developer
@aaronmorton
http://www.thelastpickle.com

On 31/01/2012, at 12:07 PM, Thanh Do wrote:

> hi all,
> 
> I would like to study the internal code
> of cassandra. The website (wiki)
> provides limited documentation.
> 
> Is there any way (documents, blogs)
> that mention in details about how
> cassandra internally works?
> Is there a fast way beside
> walking through the code
> and reason about how it works.
> 
> many thanks,
> Thanh



Re: extra diffs showing up in update column family

2012-01-30 Thread aaron morton
Sorry, I thought bug reports went to the user list. 

A

-
Aaron Morton
Freelance Developer
@aaronmorton
http://www.thelastpickle.com

On 31/01/2012, at 1:39 PM, Brandon Williams wrote:

> On Mon, Jan 30, 2012 at 6:36 PM, aaron morton  wrote:
>> p.s. the user list is the appropriate list for emails like this.
> 
> I disagree, this was on-topic for dev@ imho.
> 
> -Brandon



Re: Queries on AuthN and AuthZ for multi tenant Cassandra

2012-02-01 Thread aaron morton
The existing authentication plug-in does not support row level authorization. 

You will need to add authentication to your API layer to ensure that a request 
from client X always has the client X key prefix, or modify cassandra to 
provide row level authorization.

The 1.x Memtable memory management is awesome, but I would still be hesitant 
about creating KS's and CF's at the request of an API client.
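
A minimal sketch of the API-layer check, assuming the tenant id comes from the authenticated session and never from the request body. The separator and function names are hypothetical, and this is not Hector's virtual keyspace implementation.

```python
SEPARATOR = ":"  # hypothetical delimiter between the tenant id and the tenant's own key


def tenant_key(tenant_id, row_key):
    """Build the stored row key; tenant_id must come from the authenticated session."""
    if SEPARATOR in tenant_id:
        raise ValueError("tenant id must not contain the separator")
    return tenant_id + SEPARATOR + row_key


def owns_key(tenant_id, stored_key):
    """Return True only if stored_key carries exactly this tenant's prefix."""
    return stored_key.startswith(tenant_id + SEPARATOR)


stored = tenant_key("acme", "user:42")
```

Including the separator in the prefix check matters: without it, tenant "acme" would also match keys that belong to "acme2".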

Cheers

   
-----
Aaron Morton
Freelance Developer
@aaronmorton
http://www.thelastpickle.com

On 2/02/2012, at 8:52 AM, Subrahmanya Harve wrote:

> We are using Cassandra 0.8.7 and building a multi-tenant cassandra platform
> where we have a common KS and common CFs for all tenants. By using Hector's
> virtual keyspaces, we are able to modify rowkeys to have a tenant
> specific id. (Note that we do not allow tenants to modify/create KS/CF. We
> just allow tenants to write and read data) However we are in the process of
> adding authentication and authorization on top of this platform such that
> no tenant should be able to retrieve data belonging to any other tenant.
> 
> By configuring Cassandra for security using the documentation here -
> http://www.datastax.com/docs/0.8/configuration/authentication , we were
> able to apply the security constraints on the common keyspace and common
> CFs. However this does not prevent a tenant from retrieving data belonging
> to another tenant. For this to happen, we would need to have separate CFs
> and/or keyspaces for each tenant.
> Looking for more information on the topic here
> http://cassandra-user-incubator-apache-org.3065146.n2.nabble.com/Re-Multi-tenancy-and-authentication-and-authorization-td5935230.html and
> other places, it looks like the recommendation is "not" to create
> separate CFs and KSs for every tenant as this would have impacts on
> Memtables and other memory issues. Does this recommendation still hold
> good?
> With jiras like
> https://issues.apache.org/jira/browse/CASSANDRA-2006 resolved, does it
> mean we can now create multiple (but limited) CFs and KSs?
> More generally, how do we prevent a tenant from intentional/accidental data
> manipulation of data owned by another tenant? (given that all tenants will
> provide the right credentials)



Re: [VOTE] Release Apache Cassandra 0.8.10

2012-02-09 Thread aaron morton
+1
-
Aaron Morton
Freelance Developer
@aaronmorton
http://www.thelastpickle.com

On 9/02/2012, at 5:19 AM, Sylvain Lebresne wrote:

> It's been close to 2 months since 0.8.9 and while things are mostly calm on
> the 0.8 branch, we do have a few fixes in there that are worth releasing.
> I thus propose the following artifacts for release as 0.8.10.
> 
> Git sha1: 038b8f212eb37c98ff4f230b722bc9a76daf1658
> Git: 
> http://git-wip-us.apache.org/repos/asf?p=cassandra.git;a=shortlog;h=refs/tags/0.8.10-tentative
> Artifacts: 
> https://repository.apache.org/content/repositories/orgapachecassandra-209/org/apache/cassandra/apache-cassandra/0.8.10/
> Staging repository:
> https://repository.apache.org/content/repositories/orgapachecassandra-209/
> 
> The artifacts as well as the debian package are also available here:
> http://people.apache.org/~slebresne/
> 
> The vote will be open for 72 hours (longer if needed).
> 
> [1]: http://goo.gl/ZOnuf (CHANGES.txt)
> [2]: http://goo.gl/EXtfL (NEWS.txt)



Re: Welcome committer Peter Schuller

2012-02-13 Thread aaron morton
Congratulations. 

-
Aaron Morton
Freelance Developer
@aaronmorton
http://www.thelastpickle.com

On 14/02/2012, at 8:08 AM, Peter Schuller wrote:

>> The Apache Cassandra PMC has voted to add Peter as a committer.  Thank
>> you Peter, and we look forward to continuing to work with you!
> 
> Thank *you*, as do I :)
> 
> -- 
> / Peter Schuller (@scode, http://worldmodscode.wordpress.com)



Re: nosetests

2012-04-14 Thread aaron morton
Looks like it's hanging while talking to the cluster. Ensure cassandra is 
running and on default ports. 

I also run nosetests with -vdx for verbose, detailed errors and stop on first 
fail (http://readthedocs.org/docs/nose/en/latest/usage.html#extended-usage)
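
A quick way to rule out the "not running" case before starting the tests is to probe the Thrift port. This is only a sketch: it assumes the default rpc_port of 9160 on localhost, and the function name is made up.

```python
import socket


def thrift_port_open(host="localhost", port=9160, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds within timeout seconds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers connection refused, unreachable host and timeouts.
        return False
```

A True result only shows that something is listening on the port; the tests can still hang if the server accepts the connection but never answers.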

Cheers


-
Aaron Morton
Freelance Developer
@aaronmorton
http://www.thelastpickle.com

On 15/04/2012, at 6:35 AM, Mark Dewey wrote:

> PS I got the following trace when I aborted.
> 
> Traceback (most recent call last):
>  File "/usr/bin/nosetests", line 9, in <module>
>load_entry_point('nose==0.11.4', 'console_scripts', 'nosetests')()
>  File "/usr/lib/pymodules/python2.7/nose/core.py", line 117, in __init__
>**extra_args)
>  File "/usr/lib/python2.7/unittest/main.py", line 95, in __init__
>self.runTests()
>  File "/usr/lib/pymodules/python2.7/nose/core.py", line 196, in runTests
>result = self.testRunner.run(self.test)
>  File "/usr/lib/pymodules/python2.7/nose/core.py", line 61, in run
>test(result)
>  File "/usr/lib/pymodules/python2.7/nose/suite.py", line 176, in __call__
>return self.run(*arg, **kw)
>  File "/usr/lib/pymodules/python2.7/nose/suite.py", line 223, in run
>test(orig)
>  File "/usr/lib/pymodules/python2.7/nose/suite.py", line 176, in __call__
>return self.run(*arg, **kw)
>  File "/usr/lib/pymodules/python2.7/nose/suite.py", line 223, in run
>test(orig)
>  File "/usr/lib/pymodules/python2.7/nose/suite.py", line 176, in __call__
>return self.run(*arg, **kw)
>  File "/usr/lib/pymodules/python2.7/nose/suite.py", line 223, in run
>test(orig)
>  File "/usr/lib/pymodules/python2.7/nose/suite.py", line 176, in __call__
>return self.run(*arg, **kw)
>  File "/usr/lib/pymodules/python2.7/nose/suite.py", line 223, in run
>test(orig)
>  File "/usr/lib/pymodules/python2.7/nose/case.py", line 44, in __call__
>return self.run(*arg, **kwarg)
>  File "/usr/lib/pymodules/python2.7/nose/case.py", line 132, in run
>self.runTest(result)
>  File "/usr/lib/pymodules/python2.7/nose/case.py", line 150, in runTest
>test(result)
>  File "/usr/lib/python2.7/unittest/case.py", line 385, in __call__
>return self.run(*args, **kwds)
>  File "/usr/lib/python2.7/unittest/case.py", line 312, in run
>self.setUp()
>  File "/usr/lib/pymodules/python2.7/nose/case.py", line 367, in setUp
>try_run(self.inst, ('setup', 'setUp'))
>  File "/usr/lib/pymodules/python2.7/nose/util.py", line 491, in try_run
>return func()
>  File "/home/mildewey/Projects/cassandra/test/system/__init__.py", line
> 113, in setUp
>self.define_schema()
>  File "/home/mildewey/Projects/cassandra/test/system/__init__.py", line
> 180, in define_schema
>self.client.system_add_keyspace(ks)
>  File
> "/home/mildewey/Projects/cassandra/interface/thrift/gen-py/cassandra/Cassandra.py",
> line 1440, in system_add_keyspace
>return self.recv_system_add_keyspace()
>  File
> "/home/mildewey/Projects/cassandra/interface/thrift/gen-py/cassandra/Cassandra.py",
> line 1451, in recv_system_add_keyspace
>(fname, mtype, rseqid) = self._iprot.readMessageBegin()
>  File
> "/usr/local/lib/python2.7/dist-packages/thrift/protocol/TBinaryProtocol.py",
> line 137, in readMessageBegin
>name = self.trans.readAll(sz)
>  File
> "/usr/local/lib/python2.7/dist-packages/thrift/transport/TTransport.py",
> line 58, in readAll
>chunk = self.read(sz-have)
>  File
> "/usr/local/lib/python2.7/dist-packages/thrift/transport/TTransport.py",
> line 272, in read
>self.readFrame()
>  File
> "/usr/local/lib/python2.7/dist-packages/thrift/transport/TTransport.py",
> line 276, in readFrame
>buff = self.__trans.readAll(4)
>  File
> "/usr/local/lib/python2.7/dist-packages/thrift/transport/TTransport.py",
> line 58, in readAll
>chunk = self.read(sz-have)
>  File
> "/usr/local/lib/python2.7/dist-packages/thrift/transport/TSocket.py", line
> 94, in read
>buff = self.handle.recv(sz)
> 
> 
> On Sat, Apr 14, 2012 at 1:34 PM, Mark Dewey  wrote:
> 
>> I thought I followed the instructions to set up the nose tests, but when I
>> run them all they do is (slowly) print out ".E" and then hang. Any clues?
>> 
>> Mark
>> 



Re: Server Side Logic/Script - Triggers / StoreProc

2012-04-24 Thread aaron morton
Out of interest some questions…

When writing through triggers how do you handle the CL guarantee ? Is the CL 
checked once at the start or checked for each embedded code invocation ? 

Do you still guarantee that (non counter) writes are idempotent ?  i.e. do the 
triggers need to be deterministic ? Can clients retry operations that timed out ?
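
To illustrate why I'm asking about determinism, here is a toy model. It assumes last-write-wins on (value, timestamp) pairs and is not how any of the trigger implementations in this thread work. A client that retries a timed-out write sends the same column, value and timestamp again.

```python
def apply_write(row, column, value, timestamp):
    """Last-write-wins: keep the value with the highest timestamp."""
    current = row.get(column)
    if current is None or timestamp >= current[1]:
        row[column] = (value, timestamp)


def deterministic_trigger(row, column, value, timestamp):
    # The derived column depends only on the triggering write.
    apply_write(row, column + "_upper", value.upper(), timestamp)


def non_deterministic_trigger(row, column, value, timestamp):
    # The derived column depends on how many times the trigger has run.
    runs = row.get("runs", (0, 0))[0] + 1
    row["runs"] = (runs, timestamp)


def write_with_trigger(row, trigger, column, value, timestamp):
    apply_write(row, column, value, timestamp)
    trigger(row, column, value, timestamp)


def final_state(trigger, attempts):
    """Apply the same write `attempts` times, as a retrying client would."""
    row = {}
    for _ in range(attempts):
        write_with_trigger(row, trigger, "name", "ann", 1)
    return row
```

In this model, final_state(deterministic_trigger, 1) equals final_state(deterministic_trigger, 2), so a retry is harmless; with non_deterministic_trigger the two states differ, so the retry is no longer safe.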

Cheers

-
Aaron Morton
Freelance Developer
@aaronmorton
http://www.thelastpickle.com

On 23/04/2012, at 5:13 AM, Colin Clark wrote:

> In my opinion, triggers/stored procedures are an absolute requirement for any 
> distributed database.
> 
> We've been using stored procedures in Cassandra now for a while, we've made 
> modifications such that we don't really write directly anymore but pass 
> everything through either a default stored procedure (which is just what was 
> there before) or a dynamically loaded piece of java.
> 
> These stored procedures can call other dynamically loaded pieces of java as 
> well - we don't have any plans to implement any scripting capabilities.  We 
> can also 'select' from procedures.
> 
> The idea of downloading data from a distributed data base for processing 
> flies in the face of what nosql and bigdata is all about - you've got to do 
> it in the db.
> 
> On Apr 22, 2012, at 11:35 AM, Brian O'Neill wrote:
> 
>> Praveen,
>> 
>> We are certainly interested. To get things moving we implemented an add-on 
>> for Cassandra to demonstrate the viability (using AOP):
>> https://github.com/hmsonline/cassandra-triggers
>> 
>> Right now the implementation executes triggers asynchronously, allowing you 
>> to implement a java interface and plugin your own java class that will get 
>> called for every insert.
>> 
>> Per the discussion on 1311, we intend to extend our proof of concept to be 
>> able to invoke scripts as well.  (minimally we'll enable javascript, but 
>> we'll probably allow for ruby and groovy as well)
>> 
>> -brian
>> 
>> On Apr 22, 2012, at 12:23 PM, Praveen Baratam wrote:
>> 
>>> I found that Triggers are coming in Cassandra 1.2 
>>> (https://issues.apache.org/jira/browse/CASSANDRA-1311) but no mention of 
>>> any StoreProc like pattern.
>>> 
>>> I know this has been discussed so many times but never met with any 
>>> initiative. Even Groovy was staged out of the trunk.
>>> 
>>> Cassandra is great for logging and as such will be infinitely more useful 
>>> if some logic can be pushed into the Cassandra cluster nearer to the 
>>> location of Data to generate a materialized view useful for applications.
>>> 
>>> Server Side Scripts/Routines in Distributed Databases could soon prove to 
>>> be the differentiating factor.
>>> 
>>> Let me reiterate things with a use case.
>>> 
>>> In our application we store time series data in wide rows with TTL set on 
>>> each point to prevent data from growing beyond acceptable limits. Still the 
>>> data size can be a limiting factor to move all of it from the cluster node 
>>> to the querying node and then to the application via thrift for processing 
>>> and presentation.
>>> 
>>> Ideally we should process the data on the residing node and pass only the 
>>> materialized view of the data upstream. This should be trivial if Cassandra 
>>> implements some sort of server side scripting and CQL semantics to call it.
>>> 
>>> Is anybody else interested in a similar feature? Is it being worked on? Are 
>>> there any alternative strategies to this problem?
>>> 
>>> Praveen
>>> 
>>> 
>> 
>> -- 
>> Brian ONeill
>> Lead Architect, Health Market Science (http://healthmarketscience.com)
>> mobile:215.588.6024
>> blog: http://weblogs.java.net/blog/boneill42/
>> blog: http://brianoneill.blogspot.com/
>> 
>