Dynamic schema modification an anti-pattern?

2014-10-06 Thread Todd Fast
There is a team at my work building an entity-attribute-value (EAV) store using Cassandra. There is a column family, called Entity, where the partition key is the UUID of the entity, and the columns are the attribute names with their values. Each entity will contain hundreds to thousands of attribu

Re: Bitmaps

2014-10-06 Thread DuyHai Doan
Yes this one, not Ooyala sorry. Very inventive usage of C* indeed. Thanks for the links On Mon, Oct 6, 2014 at 11:01 PM, Peter Sanford wrote: > On Mon, Oct 6, 2014 at 1:56 PM, DuyHai Doan wrote: > >> Isn't there a video of Ooyala at some past Cassandra Summit demonstrating >> usage of Cassandra

Re: Bitmaps

2014-10-06 Thread Peter Sanford
On Mon, Oct 6, 2014 at 1:56 PM, DuyHai Doan wrote: > Isn't there a video of Ooyala at some past Cassandra Summit demonstrating > usage of Cassandra for text search using Trigram ? AFAIK they were storing > kind of bitmap to perform OR & AND operations on trigram > That sounds like the talk Matt

Re: Bitmaps

2014-10-06 Thread graham sanderson
You certainly have plenty of freedom to trade off size vs access granularity using multiple blobs. It really depends on how mutable the data is, how you intend to read it, whether it is highly sparse and/or highly dense (in which case you perhaps don’t need to store every bit) etc. On Oct 6, 20
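The size-versus-granularity trade-off mentioned here can be sketched in plain Python. This is only an illustration, not anything proposed in the thread: the 64 KiB chunk size, the MSB-first bit order, and the names `split_bitmap`, `locate_bit` and `get_bit` are all assumptions, and no Cassandra driver is involved. Each chunk would be stored as its own blob, so reading one bit touches only one chunk.

```python
CHUNK_BYTES = 64 * 1024  # assumed chunk size; smaller chunks give finer-grained reads


def split_bitmap(bitmap: bytes, chunk_bytes: int = CHUNK_BYTES) -> dict[int, bytes]:
    """Split a packed bitmap into fixed-size chunks keyed by chunk number."""
    return {
        start // chunk_bytes: bitmap[start:start + chunk_bytes]
        for start in range(0, len(bitmap), chunk_bytes)
    }


def locate_bit(bit_index: int, chunk_bytes: int = CHUNK_BYTES) -> tuple[int, int, int]:
    """Return (chunk number, byte offset within the chunk, bit offset within the byte)."""
    byte_index, bit_offset = divmod(bit_index, 8)
    chunk_id, byte_in_chunk = divmod(byte_index, chunk_bytes)
    return chunk_id, byte_in_chunk, bit_offset


def get_bit(chunks: dict[int, bytes], bit_index: int, chunk_bytes: int = CHUNK_BYTES) -> int:
    """Read one bit (MSB-first within each byte) by touching a single chunk."""
    chunk_id, byte_in_chunk, bit_offset = locate_bit(bit_index, chunk_bytes)
    return (chunks[chunk_id][byte_in_chunk] >> (7 - bit_offset)) & 1
```

Under these assumptions, a 90,000,000-bit bitmap (11,250,000 bytes) splits into 172 chunks, and a single-bit lookup needs only the chunk returned by `locate_bit`.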

Re: Bitmaps

2014-10-06 Thread DuyHai Doan
Isn't there a video of Ooyala at some past Cassandra Summit demonstrating usage of Cassandra for text search using Trigram ? AFAIK they were storing kind of bitmap to perform OR & AND operations on trigram On Mon, Oct 6, 2014 at 10:53 PM, Russell Bradberry wrote: > I highly recommend against sto

Re: Indexes Fragmentation

2014-10-06 Thread Robert Coli
On Fri, Oct 3, 2014 at 6:03 PM, Arthur Zubarev wrote: > I now see I had misspelled the word tall for toll, anyways, if I > understood correctly, your reply implies there is no impact whatsoever and > there is no need to defrug indexes of the frequently changing columns. > "Cases with lots of sec

Re: Bitmaps

2014-10-06 Thread Russell Bradberry
I highly recommend against storing data structures like this in C*. That really isn't its sweet spot. For instance, if you were to use the blob type, which will give you the smallest size, you are still looking at a cell size of (90,000,000/8/1024) = 10,986 KB, or over 10 MB in size, which is prohibiti
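The arithmetic in this message can be checked with a few lines of Python. The helper name `bitmap_size` is hypothetical; the calculation just packs 8 bits per byte and converts using 1024-based units, as the message does.

```python
def bitmap_size(bits: int) -> dict[str, float]:
    """Size of a bitmap packed 8 bits per byte, in bytes, KiB and MiB."""
    size_bytes = (bits + 7) // 8  # round up to whole bytes
    return {
        "bytes": size_bytes,
        "kib": size_bytes / 1024,
        "mib": size_bytes / 1024 ** 2,
    }


size = bitmap_size(90_000_000)
print(size)  # 11,250,000 bytes, about 10,986 KiB, between 10 and 11 MiB
```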

Re: Exploring Simply Queueing

2014-10-06 Thread Robert Coli
On Mon, Oct 6, 2014 at 1:40 PM, Jan Algermissen wrote: > Hmm, I was under the impression that issues with old queue state disappear > after gc_grace_seconds and that the goal primarily is to keep the rows > ‘short’ enough to achieve a tombstones read performance impact that one can > live with in

Bitmaps

2014-10-06 Thread Eduardo Cusa
Hi guys, what data type do you recommend for storing bitmaps? I am planning to store bitmaps of length 90,000,000 and then query them by key. Example: key : 22_ES bitmap : 10101101010111010101011 Thanks Eduardo

Re: Exploring Simply Queueing

2014-10-06 Thread Ranjib Dey
I want to answer the first question, why one might use Cassandra as a queuing solution: - it's the only open-source distributed persistence layer (i.e. no SPOF) that you can run over WAN and that provides LAN/WAN-specific quorum controls. I know it's suboptimal, as the deletion imposes additional compaction/r

Re: Exploring Simply Queueing

2014-10-06 Thread Jan Algermissen
Robert, On 06 Oct 2014, at 17:50, Robert Coli wrote: > In theory they can also be designed such that history is not infinite, which > mitigates the buildup of old queue state. > Hmm, I was under the impression that issues with old queue state disappear after gc_grace_seconds and that the goa

Re: Exploring Simply Queueing

2014-10-06 Thread Jan Algermissen
Shane, On 06 Oct 2014, at 16:34, Shane Hansen wrote: > Sorry if I'm hijacking the conversation, but why in the world would you want > to implement a queue on top of Cassandra? It seems like using a proper > queuing service > would make your life a lot easier. Agreed - however, the use case sim

Re: ConnectionException while trying to connect with Astyanax over Java driver

2014-10-06 Thread Ruchir Jha
That exception is on the cassandra server and not on the client. On Mon, Oct 6, 2014 at 2:10 PM, DuyHai Doan wrote: > java.lang.NoSuchMethodError -> Jar dependency issue probably. Did you try > to create an issue on the Astyanax github repo ? > > On Mon, Oct 6, 2014 at 6:01 PM, Ruchir Jha wrote

RE: Cassandra Data Model design

2014-10-06 Thread Rahul Gupta
You need to rethink your data model for the client_data table. Unlike an RDBMS, Cassandra relies heavily on the primary key for filtering data. In fact, filtering on any column other than the primary key is not recommended in Cassandra. This means that how you design your primary key is critical. There ar

Re: IN versus multiple asynchronous queries

2014-10-06 Thread DuyHai Doan
"Definitely better to not make the coordinator hold on to that memory while it waits for other requests to come back" --> You get it. When loading big documents, you risk starving the heap quickly, triggering long GC cycle on the coordinator etc... On Mon, Oct 6, 2014 at 6:22 PM, Robert Wille wro

Re: ConnectionException while trying to connect with Astyanax over Java driver

2014-10-06 Thread DuyHai Doan
java.lang.NoSuchMethodError -> Jar dependency issue probably. Did you try to create an issue on the Astyanax github repo ? On Mon, Oct 6, 2014 at 6:01 PM, Ruchir Jha wrote: > All, > > I am trying to use the new astyanax over java driver to connect to > cassandra version 1.2.12, > > Following set

assertion error on joining

2014-10-06 Thread Kais Ahmed
Hi all, I'm a bit stuck. I want to expand my C* 2.0.6 cluster, but I encountered an error on the new node. ERROR [FlushWriter:2] 2014-10-06 16:15:35,147 CassandraDaemon.java (line 199) Exception in thread Thread[FlushWriter:2,5,main] java.lang.AssertionError: 394920 at org.apache.cassandr

Re: IN versus multiple asynchronous queries

2014-10-06 Thread Robert Wille
As far as latency is concerned, it seems like it wouldn't matter very much if the coordinator has to wait for all the responses to come back, or the client waits for all the responses to come back. I’ve got the same latency either way. I would assume that 50 coordinations is more expensive than
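The "multiple asynchronous queries" side of this comparison can be sketched as follows. This is an illustration only: `fetch_by_key` is a hypothetical stub standing in for a single-partition query through a driver, and the simulated delay is an assumption. The point is that the client issues one small request per key and collects the responses itself, rather than asking one coordinator to assemble the whole result of an IN query.

```python
import asyncio


async def fetch_by_key(key: str) -> dict:
    """Hypothetical stand-in for one single-key query; a real driver call would go here."""
    await asyncio.sleep(0.01)  # simulated network round trip
    return {"key": key, "value": f"doc-{key}"}


async def fetch_many(keys: list[str]) -> list[dict]:
    """Issue one request per key concurrently and gather the results on the client."""
    return await asyncio.gather(*(fetch_by_key(k) for k in keys))


results = asyncio.run(fetch_many(["a", "b", "c"]))
```

`asyncio.gather` returns results in the order the requests were submitted, so the caller can match them back to the key list.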

ConnectionException while trying to connect with Astyanax over Java driver

2014-10-06 Thread Ruchir Jha
All, I am trying to use the new astyanax over java driver to connect to cassandra version 1.2.12, Following settings are turned on in cassandra.yaml: start_rpc: true native_transport_port: 9042 start_native_transport: true *Code to connect:* final Supplier> hostSupplier = new Supplier>() {

Re: Exploring Simply Queueing

2014-10-06 Thread Robert Coli
On Mon, Oct 6, 2014 at 8:30 AM, Minh Do wrote: > Just let you know if you base your implementation on Netflix's queue > recipe, there are many issues with it. > > In general, we don't advise people to use that recipe so I suggest you to > save your time by not going that same route again. > I +1

Re: Exploring Simply Queueing

2014-10-06 Thread Minh Do
Hi Jan, Both Chris and Shane say what I believe is the correct thinking. Just to let you know, if you base your implementation on Netflix's queue recipe, there are many issues with it. In general, we don't advise people to use that recipe, so I suggest you save your time by not going that same route

Re: CQL query throws TombstoneOverwhelmingException against a LeveledCompactionStrategy table

2014-10-06 Thread dlu66061
BTW, I am using Cassandra 2.0.6. Is this the same as CASSANDRA-6654 (Droppable tombstones are not being removed from LCS table despite being above 20%)? I checked my table in JConsole and the droppable tombstone ratio is over 60%. If it is

Re: Exploring Simply Queueing

2014-10-06 Thread Shane Hansen
Sorry if I'm hijacking the conversation, but why in the world would you want to implement a queue on top of Cassandra? It seems like using a proper queuing service would make your life a lot easier. That being said, there might be a better way to play to the strengths of C*. Ideally everything you

Re: Increasing size of "Batch of prepared statements"

2014-10-06 Thread shahab
Thanks Jens for the comment. Actually I am using the Cassandra Stress Tool, and this is the tool that inserts such large statements. But do you mean that inserting columns with large size (let's say a text with 20-30 K) is potentially problematic in Cassandra? What shall I do if I want columns with l