Re: Usage Pattern : "unique" value of a key.

2011-01-12 Thread Oleg Anastasyev
Benoit Perroud writes: > > My idea to solve such a use case is to have both threads write the > username, but with a column like "lock-", and then read > the row, and find out if the first lock column appearing belongs to the > thread. If this is the case, it can continue the process,

Old data not indexed

2011-01-12 Thread Tan Yeh Zheng
I tried to run the example on http://www.riptano.com/blog/whats-new-cassandra-07-secondary-indexes programmatically. After I indexed the column "state", I tried get_indexed_slices (where state = 'UT') but it returned an empty list. But if I index first, then query, it'll return the correct result

RE: Advice wanted on modeling

2011-01-12 Thread Steven Mac
> Date: Thu, 13 Jan 2011 01:29:33 +0100 > Subject: Re: Advice wanted on modeling > From: peter.schul...@infidyne.com > To: user@cassandra.apache.org > > > The application will have a large number of records, with the records > > consisting of a fixed part and a number (n) of periodic parts. > > *

RE: about the data directory

2011-01-12 Thread Viktor Jevdokimov
>I have 4 nodes, then I create one keyspace (such as FOO) with replica >factor =1 and insert some data, > why can I see the directory /var/lib/Cassandra/data/FOO on every node? As > I know, I just have one replica So why did you install 4 nodes, not 1? They're for your data to be dis
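The point of the reply can be illustrated with a small sketch: with a replication factor of 1, each row lives on exactly one node, but different rows hash to different nodes, so every node ends up holding part of the keyspace. This is an in-memory model, not Cassandra code; the MD5-based token, the evenly spaced node tokens and the names `token_for`, `owner_of` and `node1`..`node4` are assumptions made for illustration.

```python
import hashlib
from collections import Counter

RING_SIZE = 2 ** 127  # assumed token space for this sketch


def token_for(key: str) -> int:
    """Hash a row key to a position on the ring (illustrative MD5 token)."""
    digest = hashlib.md5(key.encode("utf-8")).hexdigest()
    return int(digest, 16) % RING_SIZE


def owner_of(key: str, node_tokens):
    """Return the single node (RF=1) that owns the key.

    node_tokens is a list of (token, name) pairs sorted by token. The owner
    is the first node whose token is >= the key's token, wrapping around to
    the first node when the key's token is past the last node.
    """
    t = token_for(key)
    for node_token, name in node_tokens:
        if t <= node_token:
            return name
    return node_tokens[0][1]


# Four hypothetical nodes with evenly spaced tokens.
nodes = [(i * RING_SIZE // 4, f"node{i + 1}") for i in range(4)]

# Each key has exactly one owner, yet the keys spread over all four nodes.
counts = Counter(owner_of(f"key{i}", nodes) for i in range(1000))
print(counts)
```

Each key appears once in the tally, which is what "one replica" means here, while the tally itself has an entry for every node.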

about the data directory

2011-01-12 Thread raoyixuan (Shandy)
I have 4 nodes. I created one keyspace (such as FOO) with replication factor 1 and inserted some data. Why can I see the directory /var/lib/Cassandra/data/FOO on every node? As far as I know, I have just one replica.

RE: about the insert data

2011-01-12 Thread raoyixuan (Shandy)
Thanks, I totally get it. From: Tyler Hobbs [mailto:ty...@riptano.com] Sent: Thursday, January 13, 2011 2:19 PM To: user@cassandra.apache.org Subject: Re: about the insert data The coordinator node routes the request in parallel to all of the replicas and waits for responses. One of those repl

Re: Should nodetool ring give equal load ?

2011-01-12 Thread mck
On Wed, 2011-01-12 at 14:21 -0800, Ryan King wrote: > What consistency level did you use to write the > data? R=1,W=1 (reads happen a long time afterwards). ~mck -- "It is now quite lawful for a Catholic woman to avoid pregnancy by a resort to mathematics, though she is still forbidden to reso

Re: about the insert data

2011-01-12 Thread Tyler Hobbs
The coordinator node routes the request in parallel to all of the replicas and waits for responses. One of those replicas might happen to be the coordinator itself. Only replicas read/write data they are responsible for, not the coordinator (unless the coordinator is also a replica for that data)
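A minimal sketch of the behaviour described above, as an in-memory model rather than anything from Cassandra itself: the coordinator sends the write to every replica in parallel and waits for the responses, and it stores the data only when it is one of the replicas. The `Node` class and its method names are hypothetical, and waiting for all replicas is a simplification (consistency levels are not modelled).

```python
from concurrent.futures import ThreadPoolExecutor


class Node:
    """Hypothetical node that can act as a replica and as a coordinator."""

    def __init__(self, name):
        self.name = name
        self.data = {}

    def apply_write(self, key, value):
        # Replica role: actually store the data.
        self.data[key] = value
        return self.name

    def coordinate_write(self, replicas, key, value):
        # Coordinator role: route the write to all replicas in parallel
        # and wait for every response. Nothing is stored here unless this
        # node is itself in the replica list.
        with ThreadPoolExecutor(max_workers=len(replicas)) as pool:
            futures = [pool.submit(r.apply_write, key, value) for r in replicas]
            return [f.result() for f in futures]


a, b, c = Node("a"), Node("b"), Node("c")

# "a" coordinates but is not a replica: it keeps no copy.
acks = a.coordinate_write([b, c], "user1", "alice")

# "b" coordinates and is also a replica: it keeps a copy like any replica.
b.coordinate_write([b, c], "user2", "bob")
```

After these two writes, `a.data` is still empty while `b.data` and `c.data` both contain `user1` and `user2`.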

RE: about the insert data

2011-01-12 Thread raoyixuan (Shandy)
I mean: do both the coordinator node and the replica node keep the inserted data, or does only the replica node keep it while the coordinator node just routes the inserted data to the replica? Do you see what I mean? -----Original Message----- From: Jonathan Ellis [mailto:jbel...@gmail.com] Sent: T

Re: about the insert data

2011-01-12 Thread Jonathan Ellis
On Wed, Jan 12, 2011 at 5:46 PM, raoyixuan (Shandy) wrote: > So you mean the coordinator node is just responsible for routing the request. Right. Of course, if the coordinator node happens to also be a replica, it can be a little more efficient by performing that operation directly rather than g

Re: about the write consistency

2011-01-12 Thread Brandon Williams
2011/1/12 raoyixuan (Shandy) > if I have 20 nodes, and replica factor is 3, whether all the node have > the replica finally or just have 3 replica? > 3. -Brandon
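To make the answer concrete, here is a rough sketch of ring-based placement, assuming the simplest scheme: start at the first node whose token is at or after the row's token and take the next nodes clockwise until the replication factor is reached. The function name `replicas_for`, the integer tokens and the node names are made up for illustration; this is not Cassandra's actual placement code.

```python
import bisect


def replicas_for(token, ring, rf):
    """Return the names of the nodes holding a row with the given token.

    ring is a list of (node_token, node_name) pairs sorted by token.
    """
    tokens = [t for t, _ in ring]
    # First node at or after the row's token, wrapping to the ring start.
    start = bisect.bisect_left(tokens, token) % len(ring)
    count = min(rf, len(ring))
    return [ring[(start + i) % len(ring)][1] for i in range(count)]


# Twenty hypothetical nodes with tokens 0, 100, ..., 1900.
ring = [(i * 100, f"node{i}") for i in range(20)]

print(replicas_for(250, ring, 3))   # three nodes, not twenty
print(replicas_for(1950, ring, 3))  # wraps around the end of the ring
```

Whatever the cluster size, the list returned for a row never has more than `rf` entries.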

about the write consistency

2011-01-12 Thread raoyixuan (Shandy)
If I have 20 nodes and the replication factor is 3, will all the nodes eventually hold a replica, or will there be just 3 replicas?

RE: about the insert data

2011-01-12 Thread raoyixuan (Shandy)
So you mean the coordinator node is just responsible for routing the request. Where will the request be routed? Does the coordinator node route the request to the first replica to insert the data? whether -----Original Message----- From: sc...@scode.org [mailto:sc...@scode.org] On Behalf Of

Re: Advice wanted on modeling

2011-01-12 Thread Peter Schuller
> The application will have a large number of records, with the records > consisting of a fixed part and a number (n) of periodic parts. > * The fixed part is updated occasionally. > * The periodic parts are never updated, but a new one is added every 5 to 10 > minutes. Only the last n periodic par

Re: Node Inconsistency

2011-01-12 Thread Peter Schuller
> We will follow your suggestion and we will run Node Repair tool more > often in the future. However, what happens to data inserted/deleted > after Node Repair tool runs (i.e., between Node Repair and Major > Compaction). It is handled as you would expect; deletions are propagated across the clus

Re: Should nodetool ring give equal load ?

2011-01-12 Thread Brandon Williams
On Wed, Jan 12, 2011 at 4:08 PM, mck wrote: > > > You're using an ordered partitioner and your nodes are evenly spread > > around the ring, but your data probably isn't evenly distributed. > > This load number seems equal to `du -hs ` and > since I've got N == RF shouldn't the data size always b

Re: Should nodetool ring give equal load ?

2011-01-12 Thread Ryan King
On Wed, Jan 12, 2011 at 2:08 PM, mck wrote: > >> You're using an ordered partitioner and your nodes are evenly spread >> around the ring, but your data probably isn't evenly distributed. > > This load number seems equal to `du -hs ` and > since I've got N == RF shouldn't the data size always be t

Re: Timeout Errors while running Hadoop over Cassandra

2011-01-12 Thread mck
On Wed, 2011-01-12 at 23:04 +0100, mck wrote: > > Caused by: TimedOutException() > > What is the exception in the cassandra logs? Or have you tried increasing rpc_timeout_in_ms? ~mck -- "When there is no enemy within, the enemies outside can't hurt you." African proverb | www.semb.wever.org | www.sesa

Re: Should nodetool ring give equal load ?

2011-01-12 Thread mck
> You're using an ordered partitioner and your nodes are evenly spread > around the ring, but your data probably isn't evenly distributed. This load number seems equal to `du -hs ` and since I've got N == RF shouldn't the data size always be the same on every node? ~mck -- "Traveller, there

Re: Timeout Errors while running Hadoop over Cassandra

2011-01-12 Thread mck
On Wed, 2011-01-12 at 18:40 +, Jairam Chandar wrote: > Caused by: TimedOutException() What is the exception in the cassandra logs? ~mck -- "Don't use Outlook. Outlook is really just a security hole with a small e-mail client attached to it." Brian Trosko | www.semb.wever.org | www.sesat.no

Re: Timeout Errors while running Hadoop over Cassandra

2011-01-12 Thread Aaron Morton
What's happening in the cassandra server logs when you get these errors? Reading through the hadoop 0.6.6 code it looks like it creates a thrift client with an infinite timeout. So it may be an internode timeout, which is set in storage-conf.xml. Aaron On 13 Jan, 2011, at 07:40 AM, Jairam Chandar wrot

Re: Should nodetool ring give equal load ?

2011-01-12 Thread Ryan King
On Wed, Jan 12, 2011 at 2:00 PM, mck wrote: > I'm using 0.7.0-rc3, 3 nodes, RF=3, and ByteOrderedPartitioner. > > When i run "nodetool ring" it reports > >> Address         Status State   Load            Owns    Token >>                                                         >> Token(bytes[ff0343

Should nodetool ring give equal load ?

2011-01-12 Thread mck
I'm using 0.7.0-rc3, 3 nodes, RF=3, and ByteOrderedPartitioner. When I run "nodetool ring" it reports > Address Status State Load Owns Token > > > Token(bytes[ff034355152567a5b2

Re: unsubscribe

2011-01-12 Thread Robert Coli
On Tue, Jan 11, 2011 at 10:29 PM, Nichole Kulobone wrote: > http://wiki.apache.org/cassandra/FAQ#unsubscribe =Rob

Re: best way to do a count

2011-01-12 Thread Aaron Morton
There is a get_count() API function http://wiki.apache.org/cassandra/API , it's going to count the columns in a row or row+super column. This function is available in me.prettyprint.cassandra.service.KeyspaceService. There are distributed counters submitted to the trunk http://wiki.apache.org/cassan

best way to do a count

2011-01-12 Thread Michael Fortin
I was working on a schema that looks something like this:
HitFamily [UUID 1] ['user-agent'] = '…'
HitFamily [UUID 1] ['referer'] = '…'
HitFamily [UUID 1] ['client_id'] = Long
…
HitCountFamily [client_id as Long] [Current Date as Long] = UUID1
What I'd like to do is count the columns between a d
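As a rough illustration of the counting problem, assuming the goal is to count the columns of one HitCountFamily row whose names (dates stored as longs) fall within a range: because the columns of a row are kept sorted by name, the count is the distance between two positions in that sorted order. The sketch below models a row as a sorted Python list; `count_columns_in_range` is a made-up helper, not a Cassandra or Hector call.

```python
import bisect


def count_columns_in_range(column_names, start, finish):
    """Count column names n with start <= n <= finish.

    column_names must already be sorted ascending, as the columns of a
    row would be when their names are longs.
    """
    lo = bisect.bisect_left(column_names, start)
    hi = bisect.bisect_right(column_names, finish)
    return hi - lo


# Hypothetical row: column names are timestamps, values would be hit UUIDs.
hit_times = [10, 20, 30, 40, 50]

in_range = count_columns_in_range(hit_times, 20, 40)
none_in_range = count_columns_in_range(hit_times, 21, 29)
```

Here `in_range` covers the columns named 20, 30 and 40, and `none_in_range` is zero because no column name lies between 21 and 29.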

Timeout Errors while running Hadoop over Cassandra

2011-01-12 Thread Jairam Chandar
Hi folks, We have a Cassandra 0.6.6 cluster running in production. We want to run Hadoop (version 0.20.2) jobs over this cluster in order to generate reports. I modified the word_count example in the contrib folder of the cassandra distribution. While the program is running fine for small datasets

Re: Why my posts are marked as spam?

2011-01-12 Thread Oleg Tsvinev
Created: https://issues.apache.org/jira/browse/INFRA-3356 On Wed, Jan 12, 2011 at 9:25 AM, zGreenfelder wrote: > On Wed, Jan 12, 2011 at 11:39 AM, Oleg Tsvinev > wrote: > > I'm sending it from my GMail account. I'm opening a new topic, which > rules > > out top-posting. > > The message had mixed

Re: about the insert data

2011-01-12 Thread Peter Schuller
> Firstly, the data will be inserted by the coordinator node. > > Secondly, it will find the first replica node based on the partitioner, such as > RandomPartitioner, > > Thirdly, it will replicate the data based on the replication factor Replica placement is entirely independent of which node you talk

Re: Why my posts are marked as spam?

2011-01-12 Thread zGreenfelder
On Wed, Jan 12, 2011 at 11:39 AM, Oleg Tsvinev wrote: > I'm sending it from my GMail account. I'm opening a new topic, which rules > out top-posting. > The message had mixed fonts in it, that might be a problem. > Here's what I'm getting from GMail while sending the message in question: > Technica

Re: Why my posts are marked as spam?

2011-01-12 Thread Eric Evans
On Wed, 2011-01-12 at 09:09 -0800, Oleg Tsvinev wrote: > Which component? Mail Archives or Mail (qmail)? Mail would be my guess. -- Eric Evans eev...@rackspace.com

Re: Why my posts are marked as spam?

2011-01-12 Thread Oleg Tsvinev
Which component? Mail Archives or Mail (qmail)? On Wed, Jan 12, 2011 at 9:06 AM, Eric Evans wrote: > On Wed, 2011-01-12 at 08:39 -0800, Oleg Tsvinev wrote: > > > > And I be damned if I spam. Time to tweak some filters, eh? > > Maybe so. We don't have any control over that though I'm afraid. Ca

Re: Why my posts are marked as spam?

2011-01-12 Thread Eric Evans
On Wed, 2011-01-12 at 08:39 -0800, Oleg Tsvinev wrote: > > And I be damned if I spam. Time to tweak some filters, eh? Maybe so. We don't have any control over that though I'm afraid. Can you submit a ticket to INFRA? https://issues.apache.org/jira/browse/INFRA > On Wed, Jan 12, 2011 at 8:17 A

Re: Why my posts are marked as spam?

2011-01-12 Thread Oleg Tsvinev
I'm sending it from my GMail account. I'm opening a new topic, which rules out top-posting. The message had mixed fonts in it, that might be a problem. Here's what I'm getting from GMail while sending the message in question: Technical details of permanent failure: Google tried to deliver your mes

Re: Why my posts are marked as spam?

2011-01-12 Thread Eric Evans
On Wed, 2011-01-12 at 16:46 +0200, David Boxenhorn wrote: > What's wrong with topposting? > > This email is non-plain and topposted... Because a little piece of me dies every time you do. -- Eric Evans eev...@rackspace.com

Re: Why my posts are marked as spam?

2011-01-12 Thread zGreenfelder
On Wed, Jan 12, 2011 at 9:46 AM, David Boxenhorn wrote: > What's wrong with topposting? > > This email is non-plain and topposted... > I suspect your origin domain (lookin2.com) gets tagged less often by spam assassin (or whatever the moral equivalent being used for this list may be) and the limi

Usage Pattern : "unique" value of a key.

2011-01-12 Thread Benoit Perroud
Hi ML, I wonder if someone has already experimented with some kind of unique index on a column family key. Let's go for a short example: the key is the username. What happens if 2 users want to sign up at the same time with the same username? So has someone already addressed this "pattern" in Cassandr

Re: Why my posts are marked as spam?

2011-01-12 Thread Sven Johansson
On Wed, Jan 12, 2011 at 3:46 PM, David Boxenhorn wrote: > What's wrong with topposting? > > "A: Because it's counterintuitive to the way we read. Q: Why is top-posting bad?" ...and because it disregards context and makes a thread harder to follow. -- Sven Johansson Twitter: @svjson

Re: Why my posts are marked as spam?

2011-01-12 Thread David Boxenhorn
What's wrong with topposting? This email is non-plain and topposted... On Wed, Jan 12, 2011 at 4:32 PM, zGreenfelder wrote: > > > > On 12 January 2011 05:28, Oleg Tsvinev wrote: > > > Whatever I do, it happens :( > >On Wed, Jan 12, 2011 at 1:53 AM, Arijit Mukherjee > wrote: > > > > I think thi

Re: Why my posts are marked as spam?

2011-01-12 Thread zGreenfelder
> > On 12 January 2011 05:28, Oleg Tsvinev wrote: > > Whatever I do, it happens :( >On Wed, Jan 12, 2011 at 1:53 AM, Arijit Mukherjee wrote: > > I think this happens for RTF. Some of the mails in the post are RTF, > and the reply button creates an RTF reply - that's when it happens. > Wonder how

Re: Reclaim deleted rows space

2011-01-12 Thread David Boxenhorn
I think that if SSTs are partitioned within the node using RP, so that each partition is small and can be compacted independently of all other partitions, you can implement an algorithm that will spread out the work of compaction over time so that it never takes a node out of commission, as it does

Re: how to do a get_range_slices where all keys start with same string

2011-01-12 Thread Stephen Connolly
or set the end key to "com.googlf" On 12 January 2011 02:49, Aaron Morton wrote: > If you were using OPP and get_range_slices then set the start_key to be > "com.google" and the end_key to be "". Get it in slices of, say, 1,000 (use the > last key read as the next start_key) and when you see the firs
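The "com.googlf" trick works because incrementing the last character of the prefix gives the smallest key that sorts after every key starting with that prefix. A small sketch of the idea, using a sorted Python list to stand in for keys under an order-preserving partitioner; `prefix_end_key` and `keys_with_prefix` are hypothetical helpers, and the sketch assumes keys compare the same way Python strings do. Whether the end key is inclusive in a real range query is something to check against the client in use; here it is treated as exclusive.

```python
import bisect


def prefix_end_key(prefix: str) -> str:
    """Return the smallest string that sorts after every key with this prefix."""
    if not prefix:
        raise ValueError("prefix must be non-empty")
    last = ord(prefix[-1])
    if last == 0x10FFFF:
        # The last character cannot be incremented; not handled in this sketch.
        raise ValueError("last character of prefix has no successor")
    return prefix[:-1] + chr(last + 1)


def keys_with_prefix(sorted_keys, prefix):
    """Return the keys in [prefix, prefix_end_key(prefix)) from a sorted list."""
    lo = bisect.bisect_left(sorted_keys, prefix)
    hi = bisect.bisect_left(sorted_keys, prefix_end_key(prefix))
    return sorted_keys[lo:hi]


keys = sorted(["com.apple", "com.google", "com.google.mail", "com.googlf", "org.x"])
matches = keys_with_prefix(keys, "com.google")
```

With these keys, `prefix_end_key("com.google")` is `"com.googlf"` and `matches` holds only `"com.google"` and `"com.google.mail"`.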