Re: Cassandra schema migrator

2014-12-05 Thread Ben Hood
On Tue, Nov 25, 2014 at 12:49 PM, Phil Wise wrote: > https://github.com/advancedtelematic/cql-migrate Great to see these tools out there! Just to add to the list https://github.com/mattes/migrate Might not be as C* specific as the other tools mentioned earlier in this thread, but it does integ

Re: [RELEASE] Apache Cassandra 2.1.1 released

2014-10-24 Thread Ben Hood
On Fri, Oct 24, 2014 at 6:36 PM, Ben Hood <0x6e6...@gmail.com> wrote: > Or does the release require time to propagate itself out? The ccm team inform me that the binaries might take up to 48 hours to propagate their way out.

Re: [RELEASE] Apache Cassandra 2.1.1 released

2014-10-24 Thread Ben Hood
Thanks very much for this maintenance release :-) Are there any known issues with ccm on 2.1.1 (see trace below)? Or does the release require time to propagate itself out? Traceback (most recent call last): File "/usr/local/bin/ccm", line 4, in __import__('pkg_resources').run_script('ccm=

Re: How do you run integration tests for your cassandra code?

2014-10-13 Thread Ben Hood
FWIW we run a 3 node cluster with ccm on Travis to regression test the gocql driver - here's the descriptor: https://github.com/gocql/gocql/blob/master/.travis.yml On Mon, Oct 13, 2014 at 9:04 PM, Philip Thompson < philip.thomp...@datastax.com> wrote: > Kevin, > > Have you looked at the Cassandra

Re: Why is the cassandra documentation such poor quality?

2014-07-23 Thread Ben Hood
On Wed, Jul 23, 2014 at 1:30 PM, Peter Lin wrote: > > I sent a request to add a link to my .Net driver for cassandra to the wiki > over 5 weeks back and no response at all. > TL;DR There is something wrong with Cassandra information sharing, but I am partly to blame. My experience has not been too

Re: Should PREPARE QUERY return metadata for the query result?

2014-07-23 Thread Ben Hood
On Wed, Jul 23, 2014 at 12:07 PM, Ben Hood <0x6e6...@gmail.com> wrote: > Or have I just been looking at the wrong version of the spec all along? So it turns out that this is a case of PEBCAK: v2 of the protocol is formulated thusly: 4.2.5.4. Prepared The result to a PREPARE message.

Re: Should PREPARE QUERY return metadata for the query result?

2014-07-23 Thread Ben Hood
On Wed, Jul 23, 2014 at 11:14 AM, Ben Hood <0x6e6...@gmail.com> wrote: > But I was wondering if we were doing something wrong by not returning > the result meta data from the PREPARE result (if it does indeed > exist). Looking into this a bit further, it looks like the client

Should PREPARE QUERY return metadata for the query result?

2014-07-23 Thread Ben Hood
Hi all, I'm looking at the specification of statement preparation (section 4.2.5.4 of the CQL protocol) and I'm wondering whether the metadata result of the PREPARE query only returns column information for the query arguments, and not for the columns of the actual query result. The background is

Re: DataType protocol ID error for TIMESTAMPs when upgrading from 1.2.11 to 2.0.9

2014-07-23 Thread Ben Hood
On Wed, Jul 23, 2014 at 1:53 AM, Robert Coli wrote: > On Tue, Jul 22, 2014 at 1:53 AM, Ben Hood <0x6e6...@gmail.com> wrote: > Indeed, reading up on the issue (and discussing it with folks) there are a > number of mitigating factors, most significantly driver workarounds use of >

Re: DataType protocol ID error for TIMESTAMPs when upgrading from 1.2.11 to 2.0.9

2014-07-23 Thread Ben Hood
On Wed, Jul 23, 2014 at 1:53 AM, Robert Coli wrote: > On Tue, Jul 22, 2014 at 1:53 AM, Ben Hood <0x6e6...@gmail.com> wrote: > In this particular case, the answer to "why not" involves the idea that one > needs to be able to test with a driver in order to expose

Re: DataType protocol ID error for TIMESTAMPs when upgrading from 1.2.11 to 2.0.9

2014-07-22 Thread Ben Hood
On Tue, Jul 22, 2014 at 1:26 AM, Robert Coli wrote: > I'm pretty sure reversed comparator timestamps are a common type of schema, > given that there are blog posts recommending their use, so I struggle to > understand how this was not detected by unit tests. As Karl has suggested, client driver m

Re: DataType protocol ID error for TIMESTAMPs when upgrading from 1.2.11 to 2.0.9

2014-07-21 Thread Ben Hood
On Sat, Jul 19, 2014 at 7:35 PM, Karl Rieb wrote: > Can now be followed at: > https://issues.apache.org/jira/browse/CASSANDRA-7576. Nice work! Finally we have a proper solution to this issue, so well done to you.

Re: DataType protocol ID error for TIMESTAMPs when upgrading from 1.2.11 to 2.0.9

2014-07-18 Thread Ben Hood
On Fri, Jul 18, 2014 at 3:03 PM, Karl Rieb wrote: > Why is the protocol ID correct for some tables but not others? I have no idea. > Why does it work when I do a clean install on a new 2.0.x cluster? I still have no idea. > The bug seems to be on the Cassandra side and the clients seem to just

Re: DataType protocol ID error for TIMESTAMPs when upgrading from 1.2.11 to 2.0.9

2014-07-18 Thread Ben Hood
On Fri, Jul 18, 2014 at 3:38 AM, Karl Rieb wrote: > Any suggestions on what is going on or how to fix it? I'm not sure how much this will help, but one of the gocql users reported similar symptoms when upgrading to 2.0.6. We ended up applying a client side patch to address the issue, the details

Re: DROP Table put Cassandra in an inconsistent state

2014-07-18 Thread Ben Hood
On Fri, Jul 4, 2014 at 10:31 AM, Simon Chemouil wrote: > Hi, > > I just encountered a bug with 2.1-rc1 (didn't have the chance to update > to rc2 yet), and wondering if it's known or if I should report the issue > on JIRA. FWIW I think this issue might be related to what you are seeing: https://i

AssertionError as a result of a timeout

2014-04-10 Thread Ben Hood
empty subject :-( Cheers, Ben On Wed, Apr 9, 2014 at 11:34 AM, Ben Hood <0x6e6...@gmail.com> wrote: > Hi all, > > I'm getting the following error in a 2.0.6 instance: > > ERROR [Native-Transport-Requests:16633] 2014-04-09 10:11:45,811 > ErrorMessage.java (line 222

[no subject]

2014-04-09 Thread Ben Hood
Hi all, I'm getting the following error in a 2.0.6 instance: ERROR [Native-Transport-Requests:16633] 2014-04-09 10:11:45,811 ErrorMessage.java (line 222) Unexpected exception during request java.lang.AssertionError: localhost/127.0.0.1 at org.apache.cassandra.service.StorageProxy.submitHint(Stora

Re: Data model for boolean attributes

2014-03-21 Thread Ben Hood
Hey Duy Hai, On Fri, Mar 21, 2014 at 7:34 PM, DuyHai Doan wrote: > Your previous "select * from x where flag = true;" translate into: > > SELECT * FROM x WHERE id=... AND flag = true > > Of course, you'll need to provide the id in any case. This is an interesting option, though this app needs

Re: Data model for boolean attributes

2014-03-21 Thread Ben Hood
On Sat, Mar 22, 2014 at 3:32 AM, Ben Hood <0x6e6...@gmail.com> wrote: > Also a very good point. The main query paths the app needs to support are: > > select * from x where flag=true and id = ? and timestamp >= ? and timestamp > <= ? > select * from x where flag=fal

Re: Data model for boolean attributes

2014-03-21 Thread Ben Hood
On Sat, Mar 22, 2014 at 1:31 AM, Laing, Michael wrote: > Whoops now there are only 2 partition keys! Not good if you have any > reasonable number of rows... Yes, this column family will have a large number of rows. > I monitor partition sizes and shard enough to keep them reasonable in this > so

Data model for boolean attributes

2014-03-21 Thread Ben Hood
Hi, I was wondering what the best way is to lay column families out so that you can query by a boolean attribute. For example: create table x( id text, timestamp timeuuid, flag boolean, // other fields primary key (id, timestamp) ) So that you can query select * from x where flag

Re: Planet Cassandra

2014-03-20 Thread Ben Hood
Hey Brady, Thanks for sorting this one out. The URL for gocql has changed to https://github.com/gocql/gocql I'd also like to add a link to cqlc (http://relops.com/cqlc/) which is a CQL compiler that works with gocql for Go. BTW, do you know who I need to bug for the Apache wiki? Many thanks,

Planet Cassandra

2014-03-20 Thread Ben Hood
Hey all, Does anybody know who to contact to update the client tools page on Planet Cassandra and the Apache wiki? Cheers, Ben

Background flushing appears to peg CPU

2014-02-26 Thread Ben Hood
Hi, Using Cassandra 2.0.5 we seem to be running into an issue with a continuous flush of a column family that has no current data ingress. After disconnecting all clients from the node, the Cassandra instance seems to be continuously flushing a specific column family with this line appearing all over

Re: CQL decimal encoding

2014-02-26 Thread Ben Hood
On Thu, Feb 27, 2014 at 12:05 AM, Ben Hood <0x6e6...@gmail.com> wrote: > BTW thanks and kudos go to Theo and Tyler (of the cql-rb and the > datastax python drivers respectively) for publishing encoding test > cases for the decimal type - that was quite helpful :-) Sorry, I forgot

Re: CQL decimal encoding

2014-02-26 Thread Ben Hood
On Thu, Feb 27, 2014 at 12:01 AM, Ben Hood <0x6e6...@gmail.com> wrote: > Hopefully the gocql team can code review this soon and if that's good > to go, we'll have another CQL driver that can deal with decimals. BTW thanks and kudos go to Theo and Tyler (of the cql-rb a

Re: CQL decimal encoding

2014-02-26 Thread Ben Hood
On Wed, Feb 26, 2014 at 12:10 PM, Laing, Michael wrote: > go uses 'zig-zag' encoding, perhaps that is the difference? > > > On Wed, Feb 26, 2014 at 6:52 AM, Peter Lin wrote: >> >> >> You may need to bit shift if that is the case Thanks for everybody's help, I've managed to solve the issue: the u

Re: Flushing after dropping a column family

2014-02-26 Thread Ben Hood
On Wed, Feb 26, 2014 at 3:59 PM, Tupshin Harper wrote: > This is a known issue that is fixed in 2.1beta1. > https://issues.apache.org/jira/browse/CASSANDRA-5202 > > Until 2.1, we do not recommend relying on the recycling of tables through > drop/create or truncate. > > However, on a single node cl

Re: Flushing after dropping a column family

2014-02-26 Thread Ben Hood
On Wed, Feb 26, 2014 at 3:58 PM, DuyHai Doan wrote: > Try truncate foo instead of drop table foo. > > About the nodetool clearsnapshot, I've experienced the same behavior also > before. Snapshots cleaning is not immediate I get the same behavior with truncate as well.

Re: Flushing after dropping a column family

2014-02-26 Thread Ben Hood
On Wed, Feb 26, 2014 at 3:17 PM, DuyHai Doan wrote: > "I'm noticing that using either TRUNCATE or DROP/CREATE in cqlsh appear to > leave the underlying data behind." > > --> What do you mean by "underlying data" ? Are you talking about > "snapshots" ? I was referring to all of the state related

Flushing after dropping a column family

2014-02-26 Thread Ben Hood
Hi, I'm trying to truncate data on a single node 2.0.5 instance and I'm noticing that using either TRUNCATE or DROP/CREATE in cqlsh appears to leave the underlying data behind. So I was wondering what nodetool operation I should use to completely nuke the old data, short of dropping the entire key

Re: CQL decimal encoding

2014-02-25 Thread Ben Hood
Hey Colin, On Tue, Feb 25, 2014 at 10:26 PM, Colin Blower wrote: > It looks like you are trying to implement the Decimal type. You might want > to start with implementing the Integer type. The Decimal type follows pretty > easily from the Integer type. > > For example: > i = unmarchalInteger(data

Re: CQL decimal encoding

2014-02-25 Thread Ben Hood
On Tue, Feb 25, 2014 at 12:50 PM, Peter Lin wrote: > > if I have time this week, I'll try to make a patch for the spec. Can't > promise I can get to it this week, but having come across this issue with > FluentCassandra, I'd like to help others avoid it. So I may be running into an encoding bug w

Re: CQL decimal encoding

2014-02-25 Thread Ben Hood
Sylvain, On Tue, Feb 25, 2014 at 10:38 AM, Sylvain Lebresne wrote: > The honest answer is, no-one took the time to write that down properly and > include it in the spec. My small excuse for initially skipping it in the > spec is that the CQL data type encodings are really not different from what

Re: How should clients handle the user defined types in 2.1?

2014-02-24 Thread Ben Hood
On Mon, Feb 24, 2014 at 7:52 PM, Theo Hultberg wrote: > (I posted this on the client-dev list the other day, but that list seems > dead so I'm cross posting, sorry if it's the wrong thing to do) I didn't even realize there was a list for driver implementors - is this used at all? Is it worth bein

Re: CQL decimal encoding

2014-02-24 Thread Ben Hood
On Mon, Feb 24, 2014 at 7:50 PM, Theo Hultberg wrote: > I don't know if it's by design or if it's by oversight that the data types > aren't part of the binary protocol specification. I had to reverse engineer > how to encode and decode all of them for the Ruby driver. There were > definitely a few

Re: CQL decimal encoding

2014-02-24 Thread Ben Hood
On Mon, Feb 24, 2014 at 7:43 PM, Paul "LeoNerd" Evans wrote: > On Mon, 24 Feb 2014 19:14:48 +0000 > Ben Hood <0x6e6...@gmail.com> wrote: > >> So I have a question about the encoding of 0: \x00\x00\x00\x00\x00. > > The first four octets are the decimal shift

Re: CQL decimal encoding

2014-02-24 Thread Ben Hood
On Mon, Feb 24, 2014 at 7:02 PM, Paul "LeoNerd" Evans wrote: > On Mon, 24 Feb 2014 13:55:07 -0500 > Peter Lin wrote: > >> I did the same thing :) >> >> I inserted lots of bigDecimal in Cqlsh and read it from my C# client. >> Then I did the opposite, inserts BigDecimal from C# and query it from >>

Re: CQL decimal encoding

2014-02-24 Thread Ben Hood
Hey Paul, On Mon, Feb 24, 2014 at 6:40 PM, Paul "LeoNerd" Evans wrote: > And the unit tests live here: > > > https://metacpan.org/source/PEVANS/Protocol-CassandraCQL-0.11/t/02types.t#L111 Very cool - I'll port these examples to the gocql marshaling test suite - kudos to you for reverse engine

Re: CQL decimal encoding

2014-02-24 Thread Ben Hood
On Mon, Feb 24, 2014 at 6:09 PM, Peter Lin wrote: > I took a look at the code. Java uses big endian encoding. I don't know if GO > defaults to big or little. In my port of Hector to C#, I reverse the bytes > due to the fact that .Net uses little endian. Cool - I'll take this as a spec - thanks fo

Re: CQL decimal encoding

2014-02-24 Thread Ben Hood
Hey Peter, On Mon, Feb 24, 2014 at 5:25 PM, Peter Lin wrote: > > Not sure what you mean by the question. > > Are you talking about the structure of BigDecimal in java? If that is your > question, the java's BigDecimal uses the first 4 bytes for scale and > remaining bytes for BigInteger I'm talk
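
The layout described in this thread (a 4-byte scale followed by the bytes of the unscaled integer, all big endian) can be sketched as follows. This is a minimal illustration, not the gocql implementation: the function names are hypothetical, and it assumes the unscaled integer is stored as a two's-complement value, as Java's BigInteger does.

```python
def encode_decimal(unscaled: int, scale: int) -> bytes:
    """Encode unscaled * 10**-scale as: 4-byte big-endian scale + integer bytes."""
    head = scale.to_bytes(4, "big", signed=True)
    # Reserve room for the sign bit; always emit at least one byte.
    # Note: this may use one byte more than strictly necessary for
    # negative powers of two, which still decodes to the same value.
    length = (unscaled.bit_length() + 8) // 8
    body = unscaled.to_bytes(length, "big", signed=True)
    return head + body


def decode_decimal(data: bytes) -> tuple[int, int]:
    """Return (unscaled, scale) from the layout produced by encode_decimal."""
    scale = int.from_bytes(data[:4], "big", signed=True)
    unscaled = int.from_bytes(data[4:], "big", signed=True)
    return unscaled, scale


# Zero with scale 0 is five zero octets, matching the example in this thread.
zero = encode_decimal(0, 0)
# 123.45 is the unscaled integer 12345 with scale 2.
price = encode_decimal(12345, 2)
```

Keeping the value as an (unscaled, scale) pair avoids any floating-point conversion, which is the point of an arbitrary-precision type.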

CQL decimal encoding

2014-02-24 Thread Ben Hood
Hi, I'd like to implement decimal encoding for gocql but I'm wondering what this should be compatible with. Is there some kind of wire format that arbitrary precision numbers should adhere to to ensure interoperability? Cheers, Ben

Re: CQL list command

2014-02-06 Thread Ben Hood
On Thu, Feb 6, 2014 at 9:01 PM, Andrew Cobley wrote: > I often use the CLI command LIST for debugging or when teaching students > showing them what's going on under the hood of CQL. I see that CLI will be > removed in Cassandra 3 and we will lose this ability. It would be nice if > CQL retai

Re: Lots of deletions results in death by GC

2014-02-06 Thread Ben Hood
On Wed, Feb 5, 2014 at 2:52 AM, srmore wrote: > Dropped messages are a sign that Cassandra is under heavy load; that's the load > shedding mechanism. I would love to see some sort of back-pressure > implemented. +1 for back pressure in general with Cassandra

Re: CQL flow control

2014-02-05 Thread Ben Hood
On Wed, Feb 5, 2014 at 7:32 PM, Edward Capriolo wrote: > I agree you can not really ask your database to capacity plan for you. > Cassandra does have backpressure of sorts if requests fail with > TimedOutException or UnavailableException. You might be having a capacity > problem. > > The way I wo

Re: CQL flow control

2014-02-05 Thread Ben Hood
On Wed, Feb 5, 2014 at 6:55 PM, Robert Coli wrote: > I think most deploys of Cassandra deal with this reality by carefully > managing available capacity so that they don't risk getting in this > situation. This is what I have done in my production apps. Basically I have found the system's sweet s

CQL flow control

2014-02-05 Thread Ben Hood
Hi, A discussion has arisen in the gocql team about how to handle saturation when CQL clients are sending in packets at a faster rate than the Cassandra cluster can sustain. What is the general approach to this from a server perspective? Is there any flow control that the server can apply to back
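
One client-side way to avoid saturating a cluster is to cap the number of requests in flight, so that producers block once the cap is reached. The sketch below is only an illustration of that idea under stated assumptions: InFlightLimiter and fake_request are hypothetical names, the request is simulated with a sleep, and no real driver API is used.

```python
import threading
import time
from concurrent.futures import ThreadPoolExecutor


class InFlightLimiter:
    """Blocks submitters once max_in_flight requests are outstanding."""

    def __init__(self, max_in_flight: int):
        self._slots = threading.BoundedSemaphore(max_in_flight)

    def submit(self, executor, fn, *args):
        self._slots.acquire()  # waits here when the cap is reached
        try:
            future = executor.submit(fn, *args)
        except BaseException:
            self._slots.release()
            raise
        # Free the slot when the request finishes, successfully or not.
        future.add_done_callback(lambda _: self._slots.release())
        return future


def fake_request(payload):
    """Stand-in for a driver call; a real client would send a query here."""
    time.sleep(0.01)
    return payload


limiter = InFlightLimiter(max_in_flight=2)
with ThreadPoolExecutor(max_workers=8) as pool:
    results = [f.result() for f in
               [limiter.submit(pool, fake_request, i) for i in range(10)]]
```

The cap would be chosen from load testing, in the spirit of the "sweet spot" approach mentioned in the replies.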

cqlc - type safe CQL statements in Go

2014-01-27 Thread Ben Hood
Hi, I just wanted to share a Cassandra project that I've been working on. cqlc generates Go code from your Cassandra schema so that you can write type safe CQL statements in Go with a natural query syntax. It's aimed at people using CQL in Golang apps who are looking to reduce boilerplate code.

Re: Writes during schema migration

2013-12-23 Thread Ben Hood
ew Zealand > @aaronmorton > > Co-Founder & Principal Consultant > Apache Cassandra Consulting > http://www.thelastpickle.com > > On 19/12/2013, at 3:02 am, Ben Hood <0x6e6...@gmail.com> wrote: > > Hi, > > I was wondering if anybody knows any best practice

Writes during schema migration

2013-12-18 Thread Ben Hood
Hi, I was wondering if anybody knows any best practices of how to apply a schema migration across a cluster. I've been reading this article: http://www.datastax.com/dev/blog/the-schema-management-renaissance to see what is happening under the covers. However the article doesn't seem to talk abo

Re: Consistency level 256

2013-11-19 Thread Ben Hood
me reason, but it's a lot more likely a priori that the driver just > sends something wrong. In any case, since as far as I know no-one has seen > that with any other driver, you'd probably want to track that down with the > gocql authors. > > -- > Sylvain >

Re: Consistency level 256

2013-11-18 Thread Ben Hood
not really sure if that's really the cause of this issue. On Tue, Nov 19, 2013 at 12:56 AM, Ben Hood <0x6e6...@gmail.com> wrote: > Hi, > > Using 2.0.2 with the gocql driver, I'm getting this intermittent error: > > "Unknown code 256 for a consistency level"

Consistency level 256

2013-11-18 Thread Ben Hood
Hi, Using 2.0.2 with the gocql driver, I'm getting this intermittent error: "Unknown code 256 for a consistency level" Is this something that the server could be returning, or is this maybe only a client side issue? Cheers, Ben

Re: Modeling multi-tenanted Cassandra schema

2013-11-14 Thread Ben Hood
t functions at scale, though I'm sure there are others >>> around, just not open sourced or actually running large deployments. >>> >>> Astyanax can do this as well albeit with a little more work required: >>> >>> https://github.com/Netflix/astyanax/

Modeling multi-tenanted Cassandra schema

2013-11-12 Thread Ben Hood
Hi, I've just received a requirement to make a Cassandra app multi-tenanted, where we'll have up to 100 tenants. Most of the tables are timestamped wide row tables with a natural application key for the partition key and a timestamp key as a clustering key. So I was considering the options: (a)

Re: Maintaining counter column consistency

2013-10-02 Thread Ben Hood
, Ben On October 2, 2013 at 10:09:40 AM, Haithem Jarraya (a-hjarr...@expedia.com) wrote: Hi Ben, If you make sure R + W > N you should be fine. Have a read of this http://www.slideshare.net/benjaminblack/introduction-to-cassandra-replication-and-consistency Thanks, H On 1 Oct 2013, at 18:29, Be
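
The R + W > N rule quoted above can be checked with simple arithmetic: if the replicas read plus the replicas written exceed the replication factor, every read set overlaps every write set in at least one replica. A minimal sketch, with hypothetical helper names:

```python
def quorum(replication_factor: int) -> int:
    """Smallest majority of replicas, e.g. 2 of 3."""
    return replication_factor // 2 + 1


def read_sees_write(replication_factor: int, read_replicas: int,
                    write_replicas: int) -> bool:
    """True when R + W > N, so read and write replica sets must overlap."""
    return read_replicas + write_replicas > replication_factor


n = 3
# Quorum reads and quorum writes: 2 + 2 > 3, so the sets overlap.
quorum_both = read_sees_write(n, quorum(n), quorum(n))
# Reading and writing a single replica each: 1 + 1 > 3 is false.
one_and_one = read_sees_write(n, 1, 1)
```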

Maintaining counter column consistency

2013-10-01 Thread Ben Hood
Hi, We're maintaining a bunch of application specific counters that are incremented on a per event basis just after the event has been inserted. Given the fact that they can get out of sync, we were wondering if there are any best practices or just plain real world experience for handling the consist

Re: Cassandra libraries for Golang

2013-02-11 Thread Ben Hood
Hi Boris, I use this one with Cassandra 1.2+ (you'll need to turn the native port on): https://github.com/titanous/gocql HTH, Ben On Friday, 8 February 2013 at 16:40, Boris Solovyov wrote: > Hi, > > I'm developing Go application. I see there is gossie, which doesn't support > the native b

Re: Wide rows in CQL 3

2013-01-09 Thread Ben Hood
I'm currently in the process of porting my app from Thrift to CQL3 and it seems to me that the underlying storage layout hasn't really changed fundamentally. The difference appears to be that CQL3 offers a neater abstraction on top of the wide row format. For example, in CQL3, your query results ar

Re: CQL3 Frame Length

2013-01-08 Thread Ben Hood
rth parallelizing the > message > encoding (which require you encode it in memory first) since it's an > asynchronous protocol and so there will likely be multiple writer > simultaneously. > > -- > Sylvain > > > > On Tue, Jan 8, 2013 at 12:48 PM, Ben Hood <0x6e6.

CQL3 Frame Length

2013-01-08 Thread Ben Hood
Hi, I've read the CQL wire specification and naively, I can't see how the frame length header is used. To me, it looks like on the read side, you know which type of structures to expect based on the opcode and each structure is TLV encoded. On the write side, you need to encode TLV struct
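
One practical use of a length header is that a reader can cut a byte stream into complete frames without decoding any body, and can tell when a frame has not fully arrived yet. The sketch below assumes the 8-byte header of native protocol v1 (version, flags, stream, opcode, then a 4-byte big-endian body length); split_frames is a hypothetical helper, not part of any driver.

```python
import struct

# version, flags, stream (signed), opcode, body length: 8 bytes in total.
HEADER = struct.Struct(">BBbBI")


def split_frames(buffer: bytes):
    """Return ([(opcode, stream, body), ...], leftover_bytes)."""
    frames = []
    offset = 0
    while len(buffer) - offset >= HEADER.size:
        _version, _flags, stream, opcode, length = HEADER.unpack_from(buffer, offset)
        end = offset + HEADER.size + length
        if end > len(buffer):
            break  # body not fully received yet; wait for more bytes
        frames.append((opcode, stream, buffer[offset + HEADER.size:end]))
        offset = end
    return frames, buffer[offset:]


# Two request frames with made-up bodies; the second one arrives truncated.
first = HEADER.pack(0x01, 0, 1, 0x07, 3) + b"abc"
second = HEADER.pack(0x01, 0, 2, 0x07, 2) + b"xy"
complete, leftover = split_frames(first + second[:9])
```

Because the stream id travels in the header, a reader built this way can hand each body to whichever caller is waiting on that stream before parsing the body itself.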

Re: Batch mutation streaming

2012-12-12 Thread Ben Hood
land > > @aaronmorton > http://www.thelastpickle.com > > On 9/12/2012, at 7:18 AM, Ben Hood <0x6e6...@gmail.com> wrote: > >> Thanks for the clarification Andrey. If that is the case, I had better >> ensure that I don't put the entire contents of a v

Re: Batch mutation streaming

2012-12-08 Thread Ben Hood
create such message. Nothing happens until you send > this message. Probably, this is what you call "close the batch". > > Thank you, > Andrey > > > On Fri, Dec 7, 2012 at 5:34 AM, Ben Hood <0x6e6...@gmail.com> wrote: >> Hi, >> >> I'

Batch mutation streaming

2012-12-07 Thread Ben Hood
Hi, I'd like my app to stream a large number of events into Cassandra that originate from the same network input stream. If I create one batch mutation, can I just keep appending events to the Cassandra batch until I'm done, or are there some practical considerations about doing this (e.g. too
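
Since a batch is assembled in memory and nothing is sent until the batch message itself is sent, one way to stream a long input is to cut it into bounded batches and send each one separately. A minimal sketch of the chunking step only; chunked is a hypothetical helper and the batch size is an arbitrary illustration, not a recommended value.

```python
from itertools import islice


def chunked(events, max_batch_size: int):
    """Yield lists of at most max_batch_size events from any iterable."""
    if max_batch_size < 1:
        raise ValueError("max_batch_size must be at least 1")
    iterator = iter(events)
    while True:
        batch = list(islice(iterator, max_batch_size))
        if not batch:
            return
        yield batch


# Seven events in batches of three: two full batches and one partial batch.
batches = list(chunked(range(7), 3))
```

Each yielded list would then be turned into one batch mutation, so memory use is bounded by the batch size rather than by the length of the input stream.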

Re: 1000's of CF's.

2012-10-09 Thread Ben Hood
I'm not a Cassandra dev, so take what I say with a lot of salt, but AFAICT, there is a certain amount of overhead in maintaining a CF, so when you have large numbers of CFs, this adds up. From a layperson's perspective, this observation sounds reasonable, since zero-cost CFs would be tantamount to

Re: 1000's of column families

2012-10-02 Thread Ben Hood
Dean, On Tuesday, October 2, 2012 at 18:52, Hiller, Dean wrote: > Because the data for an index is not all together(ie. Need a multi get to get > the data). It is not contiguous. > > The prefix in a partition they keep the data so all data for a prefix from > what I understand is contiguous.

Re: 1000's of column families

2012-10-02 Thread Ben Hood
Jeremy, On Tuesday, October 2, 2012 at 17:06, Jeremy Hanna wrote: > Another option that may or may not work for you is the support in Cassandra > 1.1+ to use a secondary index as an input to your mapreduce job. What you > might do is add a field to the column family that represents which virt

Re: 1000's of column families

2012-10-02 Thread Ben Hood
On Tue, Oct 2, 2012 at 3:37 PM, Brian O'Neill wrote: > Exactly. So you're back to the deliberation between using multiple CFs (potentially with some known working upper bound*) or feeding your map reduce in some other way (as you decided to do with Storm). In my particular scenario I'd like to be

Re: Getting serialized Rows from CommitLogSegment file

2012-10-02 Thread Ben Hood
Felipe, On Tue, Oct 2, 2012 at 2:56 PM, Felipe Schmidt wrote: > Seems like the information was dropped or, maybe, not existent in this > instance of the Schema. But, as soon as I know, it's just one instance of > the schema in Cassandra, right? If I understand you correctly, you are trying to pr

Re: 1000's of column families

2012-10-02 Thread Ben Hood
Brian, On Tue, Oct 2, 2012 at 2:20 PM, Brian O'Neill wrote: > > Without putting too much thought into it... > > Given the underlying architecture, I think you could/would have to write > your own partitioner, which would partition based on the prefix/virtual > keyspace. I might be barking up the

Re: 1000's of column families

2012-10-02 Thread Ben Hood
Dean, On Tue, Oct 2, 2012 at 1:37 PM, Hiller, Dean wrote: > Ben, > to address your question, read my last post but to summarize, yes, there > is less overhead in memory to prefix keys than manage multiple Cfs EXCEPT > when doing map/reduce. Doing map/reduce, you will now have HUGE overhead > i

Re: 1000's of column families

2012-10-01 Thread Ben Hood
On Mon, Oct 1, 2012 at 9:38 PM, Brian O'Neill wrote: > It's just a convenient way of prefixing: > http://hector-client.github.com/hector/build/html/content/virtual_keyspaces.html So given that it is possible to use a CF per tenant, should we assume that at sufficient scale there is less
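
The prefixing approach discussed here keeps all tenants in one column family and makes the tenant part of every row key. A minimal sketch of that idea, independent of Hector's virtual keyspaces; the helper names and the ":" separator are assumptions made for illustration.

```python
SEPARATOR = ":"


def tenant_key(tenant_id: str, row_key: str) -> str:
    """Build a row key that is scoped to one tenant."""
    if SEPARATOR in tenant_id:
        raise ValueError("tenant_id must not contain the separator")
    return f"{tenant_id}{SEPARATOR}{row_key}"


def split_tenant_key(key: str) -> tuple[str, str]:
    """Recover (tenant_id, row_key); splits on the first separator only."""
    tenant_id, _, row_key = key.partition(SEPARATOR)
    return tenant_id, row_key


key = tenant_key("acme", "sensor-42")
owner, original = split_tenant_key(key)
```

Splitting on the first separator only means the application's own row key may contain the separator, while the tenant id may not.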

Re: 1000's of column families

2012-10-01 Thread Ben Hood
Brian, On Mon, Oct 1, 2012 at 4:22 PM, Brian O'Neill wrote: > We haven't committed either way yet, but given Ed Anuff's presentation > on virtual keyspaces, we were leaning towards a single column family > approach: > http://blog.apigee.com/detail/building_a_mobile_data_platform_with_cassandra_-_

Re: Using the commit log for external synchronization

2012-09-21 Thread Ben Hood
Brian, On Sep 22, 2012, at 1:46, "Brian O'Neill" wrote: >> IMHO it's a better design to multiplex the data stream at the application >> level. > +1, agreed. > > That is where we ended up. (and Storm is proving to be a solid > framework for that) Thanks for the heads up, I'll check it out. Che

Re: Using the commit log for external synchronization

2012-09-21 Thread Ben Hood
Rob, On Sep 22, 2012, at 0:39, Rob Coli wrote: > The above gets you most of the way there, but Aaron's point about the > commitlog not reflecting whether the app met its CL remains true. The > possibility that Cassandra might coalesce to a value that the > application does not know was successfu

Re: Using the commit log for external synchronization

2012-09-21 Thread Ben Hood
Hi Aaron, Thanks for your input. On Fri, Sep 21, 2012 at 9:56 AM, aaron morton wrote: > The commit log is essentially internal implementation. The total size of the > commit log is restricted, and the multiple files used to represent segments > are recycled. So once all the memtables have been f

Using the commit log for external synchronization

2012-09-20 Thread Ben Hood
Hi, I'd like to incrementally synchronize data written to Cassandra into an external store without having to maintain an index to do this, so I was wondering whether anybody is using the commit log to establish what updates have taken place since a given point in time? Cheers, Ben