Re: design that mimics twitter tweet search

2012-03-18 Thread Tharindu Mathew
Sasha, it depends on the way you implement it, I guess... Maybe Twitter uses Solandra, which is very good at indexing these in different ways but has the power of Cassandra underneath... If you're doing your own impl of indexing, be mindful that you can break the sentence into four words and index or you i

Re: design that mimics twitter tweet search

2012-03-18 Thread Andrey V. Panov
Why do you suppose they did search on Cassandra? On 19 March 2012 00:16, Sasha Dolgy wrote: > yes -- but given i have two keywords, and want to find all tweets that > have "cassandra" and "bestest" ... means, retrieving all columns + values > in each row, iterating through both to see if tweet id's

Re: consistency level question

2012-03-18 Thread Watanabe Maki
Yes, reads and writes won't fail with a single node failure. But your read may return old data. maki On 2012/03/19, at 1:08, Caleb Rackliffe wrote: > That sounds right to me :) > > Caleb Rackliffe | Software Developer > M 949.981.0159 | ca...@steelhouse.com > > > From: Tamar Fraenkel > Reply-
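
The point about surviving a node failure but possibly reading old data can be illustrated with replica arithmetic. This is a minimal sketch, not Cassandra or Hector code: the function names are hypothetical, and it assumes the reply refers to consistency level ONE with a replication factor of 2, which the excerpt does not state explicitly.

```python
def replicas_required(consistency_level, replication_factor):
    """Number of replicas that must respond for an operation to succeed."""
    if consistency_level == "ONE":
        return 1
    if consistency_level == "QUORUM":
        return replication_factor // 2 + 1
    if consistency_level == "ALL":
        return replication_factor
    raise ValueError("unsupported consistency level: %r" % consistency_level)


def can_serve(consistency_level, replication_factor, replicas_down):
    """True if enough replicas of a key are still up to satisfy the level."""
    alive = replication_factor - replicas_down
    return alive >= replicas_required(consistency_level, replication_factor)


def read_sees_latest_write(read_level, write_level, replication_factor):
    """True if the read and write replica sets are guaranteed to overlap."""
    needed = (replicas_required(read_level, replication_factor)
              + replicas_required(write_level, replication_factor))
    return needed > replication_factor


# Assumed scenario: RF=2, one replica of the key is down.
print(can_serve("ONE", 2, 1))                 # ONE still has a live replica
print(can_serve("QUORUM", 2, 1))              # QUORUM of 2 needs both replicas
print(read_sees_latest_write("ONE", "ONE", 2))  # no guaranteed overlap
```

With RF=2, reading and writing at ONE keeps working with one replica down, but the read and write sets need not overlap, which is why a read may return old data.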

Re: Link in Wiki broken

2012-03-18 Thread Tharindu Mathew
Appreciate the quick reply. Thanks. On Mon, Mar 19, 2012 at 12:20 AM, Benoit Perroud wrote: > http://blip.tv/datastax/getting-to-know-the-cassandra-codebase-4034648 > > > 2012/3/18 Tharindu Mathew : > > Hi, > > > > It seems that [1] is broken. Wonder if it exists somewhere else? > > > > [1] - >

Re: Link in Wiki broken

2012-03-18 Thread Benoit Perroud
http://blip.tv/datastax/getting-to-know-the-cassandra-codebase-4034648 2012/3/18 Tharindu Mathew : > Hi, > > It seems that [1] is broken. Wonder if it exists somewhere else? > > [1] - > http://www.channels.com/episodes/show/11765800/Getting-to-know-the-Cassandra-Codebase > > -- > Regards, > > Tha

Re: consistency level question

2012-03-18 Thread Caleb Rackliffe
That sounds right to me :) Caleb Rackliffe | Software Developer M 949.981.0159 | ca...@steelhouse.com From: Tamar Fraenkel mailto:ta...@tok-media.com>> Reply-To: "user@cassandra.apache.org" mailto:user@cassandra.apache.

Re: design that mimics twitter tweet search

2012-03-18 Thread Sasha Dolgy
Yes, but given I have two keywords and want to find all tweets that have "cassandra" and "bestest" ... that means retrieving all columns + values in each row, iterating through both to see if tweet IDs in one exist in the other, and finishing up with a consolidated list of tweet IDs that only exis
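
The client-side step described here, keeping only tweet IDs present under every keyword, can be sketched as a set intersection. This is an illustration under assumptions: the `index` dict stands in for rows already fetched from the store, and the function name and sample IDs are hypothetical.

```python
def tweets_with_all_keywords(index, keywords):
    """Return the tweet IDs that appear under every given keyword.

    `index` maps a keyword to an iterable of tweet IDs (an in-memory
    stand-in for the per-keyword rows fetched from the database).
    """
    id_sets = [set(index.get(keyword, ())) for keyword in keywords]
    if not id_sets:
        return set()
    # A tweet qualifies only if its ID is present in every keyword's set.
    return set.intersection(*id_sets)


# Hypothetical sample data.
index = {
    "cassandra": ["t1", "t2", "t3"],
    "bestest": ["t2", "t3", "t4"],
}
print(sorted(tweets_with_all_keywords(index, ["cassandra", "bestest"])))
```

The cost is dominated by fetching the rows, which is the concern raised in the message: every column of each keyword row has to be read before the intersection can be computed.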

Re: design that mimics twitter tweet search

2012-03-18 Thread Benoit Perroud
The simplest modeling you could have is using the keyword as key, a timestamp/time UUID as column name and the tweetid as value -> cf['keyword']['timestamp'] = tweetid. Then you do a range query to get all tweetids sorted by time (you may want them in reverse order) and you can limit to the number
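
The layout cf['keyword']['timestamp'] = tweetid can be mimicked in memory to show how the time-ordered, limited read behaves. This is a sketch only: the dict-of-sorted-lists is a stand-in for a column family, not a Cassandra client, and the function names are hypothetical.

```python
import bisect
from collections import defaultdict


def new_column_family():
    """Row key (keyword) -> list of (timestamp, tweet_id) kept in time order."""
    return defaultdict(list)


def index_tweet(cf, keyword, timestamp, tweet_id):
    """Store the tweet ID under the keyword row, ordered by timestamp."""
    bisect.insort(cf[keyword], (timestamp, tweet_id))


def latest_tweet_ids(cf, keyword, limit):
    """Return up to `limit` tweet IDs for the keyword, newest first."""
    if limit <= 0:
        return []
    columns = cf.get(keyword, [])
    return [tweet_id for _, tweet_id in reversed(columns[-limit:])]


cf = new_column_family()
index_tweet(cf, "cassandra", 1332028800, "tweet-a")
index_tweet(cf, "cassandra", 1332115200, "tweet-c")
index_tweet(cf, "cassandra", 1332072000, "tweet-b")
print(latest_tweet_ids(cf, "cassandra", 2))
```

Because columns within a row are kept sorted by name, the newest-first read is a slice from the end of the row rather than a scan of the whole row.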

design that mimics twitter tweet search

2012-03-18 Thread Sasha Dolgy
Hi All, With Twitter, when I search for words like: "cassandra is the bestest", 4 tweets will appear, including one I just did. My understanding is that the internals of Twitter work in that each word in a tweet is allocated, irrespective of the presence of a # hash tag, and the tweet id is assigned

RE: Secondary Index Validation Type Parse Error

2012-03-18 Thread Sam Hodgson
Hi, me again - sorry, I've just read that BytesType will expect hex input, so my question now is how to create a column that will accept non-validated text as input? I think I can maybe get round this by forcing UTF8Encoding regardless of whether the string is already identified as UTF8 or not however

Secondary Index Validation Type Parse Error

2012-03-18 Thread Sam Hodgson
Hi All, Getting the following parse error when trying to create a CF with a secondary index using the BytesType attribute; the index is for a column called 'subject': java.lang.RuntimeException: org.apache.cassandra.db.marshal.MarshalException: cannot parse 'subject' as hex bytes I'm doing all
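
The error message says 'subject' was parsed as hex bytes. As a small illustration of what a hex form of that name looks like, the sketch below encodes a UTF-8 string to hex and back. The helper names are hypothetical, and whether supplying hex or choosing a text type is the appropriate fix for this schema is not something the excerpt settles.

```python
def to_hex_bytes(text):
    """Encode text as UTF-8 and render the bytes as a hex string."""
    return text.encode("utf-8").hex()


def from_hex_bytes(hex_text):
    """Reverse of to_hex_bytes: parse hex and decode the bytes as UTF-8."""
    return bytes.fromhex(hex_text).decode("utf-8")


print(to_hex_bytes("subject"))
print(from_hex_bytes(to_hex_bytes("subject")))
```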

Re: consistency level question

2012-03-18 Thread Tamar Fraenkel
Thanks! I updated the replication factor to 2, and now when I took one node down all continued running (I did see Hector complaining about the node being down), but things were saved to db and read from it. Just so I understand: now, having a replication factor of 2, if I have 2 out of 3 nodes running all

Re: consistency level question

2012-03-18 Thread Watanabe Maki
Because your RF is 1, you need all nodes up. maki On 2012/03/18, at 16:15, Tamar Fraenkel wrote: > Hi! > I have a 3 node cassandra cluster. > I use Hector API. > > I give Hector one of the node's IP address > I call setAutoDiscoverHosts(true) and setRunAutoDiscoveryAtStartup(true). > > Th

Re: consistency level question

2012-03-18 Thread Tamar Fraenkel
Hi! Thanks for the prompt answer. That is true, I intend to have it at two. What you are saying is that if I change that, then even when the node is down, my application will be able to read/write from the other node where the data is replicated? Forgot to mention that I have ConfigurableConsist

Re: Token Ring Gaps in a 2 DC Setup

2012-03-18 Thread Caleb Rackliffe
More detail… I'm running 1.0.7 on these boxes, and the keyspace readout from the CLI looks like this: create keyspace Users with placement_strategy = 'NetworkTopologyStrategy' and strategy_options = {DC2 : 1, DC1 : 2} and durable_writes = true; Thanks! Caleb Rackliffe | Software Develope

Re: consistency level question

2012-03-18 Thread Caleb Rackliffe
If your replication factor is set to one, your cluster is obviously in a bad state following any node failure. At best, I think it would make sense that about a third of your operations fail, but I'm not sure why all of them would. I don't know if Hector just refuses to work with a compromised

consistency level question

2012-03-18 Thread Tamar Fraenkel
Hi! I have a 3 node cassandra cluster. I use Hector API. I give Hector one of the node's IP address I call setAutoDiscoverHosts(true) and setRunAutoDiscoveryAtStartup(true). The describe on one node returns: Replication Strategy: org.apache.cassandra.locator.SimpleStrategy Durable Writes: true

Token Ring Gaps in a 2 DC Setup

2012-03-18 Thread Caleb Rackliffe
Hi Everyone, I have a cluster using NetworkTopologyStrategy that looks like this: 10.41.116.22 DC1 RAC1 Up Normal 13.21 GB 10.00% 0 10.54.149.202 DC2 RAC1 Up Normal 6.98 GB 0.00% 1 10.41.116.20 DC1 RAC2 Up