Re: Limitations on secondary indexes

2013-03-07 Thread Sylvain Lebresne
Oh, and btw, that's not 2 billion indexed rows per-node, that's 2 billion rows that have the *same* index value per-node (each index value is a separate row in the index). On Fri, Mar 8, 2013 at 8:29 AM, Sylvain Lebresne wrote: > Not exactly. Each replica only indexes the rows that it is a repli

Re: Limitations on secondary indexes

2013-03-07 Thread Sylvain Lebresne
Not exactly. Each replica only indexes the rows that it is a replica for. So there is indeed a limitation currently, but it is 2 billion indexed rows per-node. On Thu, Mar 7, 2013 at 8:05 PM, Edward Sargisson wrote: > Hi, > Please correct me if this statement is wrong. > > Secondary indexes

Re: Size Tiered -> Leveled Compaction

2013-03-07 Thread Al Tobey
We saw exactly the same thing as Wei Zhu: > 100k tables in a directory causing all kinds of issues. We're running 128MiB SSTables with LCS and have disabled compaction throttling. 128MiB was chosen to get file counts under control and reduce the number of files C* has to manage & search. I ju
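A minimal sketch of the settings being described, assuming a column family named 'data' (the name is just a placeholder). In cassandra-cli the SSTable target size is part of the LCS options:

    update column family data
      with compaction_strategy = 'LeveledCompactionStrategy'
      and compaction_strategy_options = {sstable_size_in_mb: 128};

and compaction throttling is disabled in cassandra.yaml by setting the throughput to zero:

    compaction_throughput_mb_per_sec: 0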

Re: Cassandra automatic setup

2013-03-07 Thread Jason Wee
Hi, you can use cassandra-cli / cqlsh with the --file option to load the DDL, or a Cassandra client: http://wiki.apache.org/cassandra/ClientOptions Jason On Fri, Mar 8, 2013 at 12:43 AM, vck wrote: > Hi, so we are just in the process of setting up DSE Cassandra to be used > for our services. > At t
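For example (the file names are placeholders):

    cqlsh --file schema.cql                    # run the CQL statements in schema.cql, then exit
    cassandra-cli -h localhost -f schema.txt   # same idea with the old cli syntax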

VNodes and nodetool repair

2013-03-07 Thread Kanwar Sangha
Hi Guys - I have a question on vnodes and nodetool repair. Suppose I have configured the nodes with vnodes, say for example 2 nodes with RF=2. Questions - * There are some columns set with a TTL of X. After X, Cassandra will mark them as tombstones. Is there still a probability of running into

Re: Bloom filters and LCS

2013-03-07 Thread Edward Capriolo
I read that the change was made because Cassandra does not work well when they are off. This makes sense because Cassandra uses bloom filters to decide if a row can be deleted without a major compaction. However, since LCS never does a major compaction, without bloom filters you can end up in cases where rows

Re: Bloom filters and LCS

2013-03-07 Thread Wei Zhu
Where did you read that bloom filters are off for LCS on 1.1.9? These are the two issues I can find regarding this matter: https://issues.apache.org/jira/browse/CASSANDRA-4876 https://issues.apache.org/jira/browse/CASSANDRA-5029 Looks like in 1.2 it defaults to 0.1; not sure about 1.1.X. -Wei

Re: Bloom filters and LCS

2013-03-07 Thread Edward Capriolo
It was found that having no bloom filter is a bad idea because it causes issues where deleted rows are never removed from disk. Newer versions have fixed this. You should adjust your bloom filter settings to a non-zero size. On Thu, Mar 7, 2013 at 4:18 PM, Michael Theroux wrote: > Hello, >
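A sketch of what that adjustment could look like in cassandra-cli (the column family name 'data' is a placeholder; 0.1 matches the 1.2 default mentioned elsewhere in the thread):

    update column family data with bloom_filter_fp_chance = 0.1;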

Bloom filters and LCS

2013-03-07 Thread Michael Theroux
Hello, (hopefully) quick question. We are running Cassandra 1.1.9. I recently converted some tables from Size Tiered to Leveled Compaction. The amount of space for bloom filters on these tables went down tremendously (which is expected; LCS in 1.1.9 does not use bloom filters). However, alt

Re: Write latency spikes

2013-03-07 Thread Wei Zhu
If you are tight on your SLA, try setting the socketTimeout from Hector to a small number so that it can retry faster, given the assumption that your writes are idempotent. Regarding your write latency, I don't have much insight. We see spikes on reads due to GC/compaction etc., but not on write latency.

Can't replace dead node

2013-03-07 Thread Andrey Ilinykh
Hello everybody! I used to run Cassandra 1.1.5 with Priam. To replace a dead node, Priam launches Cassandra with the cassandra.replace_token property. It works smoothly with 1.1.5. A couple of days ago I moved to 1.1.10 and now have a problem. The new Cassandra node successfully starts and joins the ring, but it doesn't s
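For reference, outside of Priam that property is typically passed as a JVM option when the replacement node starts, e.g. by appending it in cassandra-env.sh (the token value is a placeholder):

    JVM_OPTS="$JVM_OPTS -Dcassandra.replace_token=<token_of_dead_node>"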

Limitations on secondary indexes

2013-03-07 Thread Edward Sargisson
Hi, Please correct me if this statement is wrong. Secondary indexes are limited to indexing 2 billion rows - because they turn a row into a column and C* has a limit of 2 billion columns. Cheers, Edward

Re: should I file a bug report on this or is this normal?

2013-03-07 Thread Wei Zhu
It seems to be normal for the data size to explode during repair. In our case, we have a node around 200G with RF=3; during repair, it goes as high as 300G. We are using LCS; it creates more than 5000 compaction tasks and takes more than a day to finish. We are on 1.1.6. There is a parallel LCS featu

Re: pycassa : composite key and UTF8Type ,DateType

2013-03-07 Thread Tyler Hobbs
FYI, there's a mailing list that's dedicated to pycassa: https://groups.google.com/forum/?fromgroups#!forum/pycassa-discuss On Thu, Mar 7, 2013 at 8:13 AM, Sloot, Hans-Peter wrote: > I have a tab separated file with a number of columns. > > Columns 5 and 6 are the date as yyyy-mm-dd and hh24:mi:

pycassa : composite key and UTF8Type ,DateType

2013-03-07 Thread Sloot, Hans-Peter
Hi, I have a tab separated file with a number of columns. Columns 5 and 6 are the date as yyyy-mm-dd and hh24:mi:ss. I want to add rows with a composite key that consists of a UTF8Type and the DateType (which are fields 5 and 6). The column family is: CREATE COLUMN FAMILY traffic WITH com
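A minimal pycassa sketch of one way to do this, assuming the column family's key validator is CompositeType(UTF8Type, DateType); the keyspace name, file name, the field supplying the text component, and the column written are all placeholders, and only the date/time parsing follows the description above:

    from datetime import datetime
    from pycassa.pool import ConnectionPool
    from pycassa.columnfamily import ColumnFamily

    pool = ConnectionPool('ks', ['localhost:9160'])   # 'ks' is a placeholder keyspace
    traffic = ColumnFamily(pool, 'traffic')

    with open('data.tsv') as f:
        for line in f:
            fields = line.rstrip('\n').split('\t')
            # columns 5 and 6 (0-based indexes 4 and 5) hold the date and time
            ts = datetime.strptime(fields[4] + ' ' + fields[5], '%Y-%m-%d %H:%M:%S')
            # pycassa packs a tuple into a composite row key when the key
            # validation class is a CompositeType
            traffic.insert((fields[0], ts), {'raw_line': line.rstrip('\n')})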

Re: Hinted handoff

2013-03-07 Thread Michal Michalski
I think it's still true, not because of network-related issues, but because of the maintenance problems it will cause during per-node operations. For example, in my case running 'upgradesstables' on a ~300GB node takes 30+ hours. The other IO-intensive operations will probably be a pain

RE: Hinted handoff

2013-03-07 Thread Kanwar Sangha
In the normal case it's best to have around 300 to 500GB per node. With that much data it will take a week to run repair or replace a failed node. Hi Aaron - This was true pre-1.2, but with 1.2 and virtual nodes, does this still hold? 1 TB at 1Gb/s will take roughly 2.2 hrs, assuming we stream
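(For the arithmetic behind that figure: 1 TB is roughly 8,000 Gb, so at 1 Gb/s it takes about 8,000 seconds, i.e. roughly 2.2 hours, assuming the link is saturated and ignoring protocol overhead.)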

RE: data model to store large volume syslog

2013-03-07 Thread moshe.kranc
A row key based on the hour will create hot spots for writes - for an entire hour, all the writes will be going to the same node, i.e., the node where the row resides. You need to come up with a row key that distributes writes evenly across all your C* nodes, e.g., the time concatenated with a sequence cou
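A tiny sketch of that idea (the bucket count and key format are arbitrary choices): appending a rotating counter to the hour spreads one hour's writes across several row keys, and therefore across nodes:

    from itertools import count

    BUCKETS = 20          # arbitrary; something on the order of the cluster size or larger
    _seq = count()

    def make_row_key(hour_str):
        # hour_str e.g. '2013030712'; the suffix rotates through BUCKETS values
        return '%s_%02d' % (hour_str, next(_seq) % BUCKETS)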

Re: Consistency level for system_auth keyspace

2013-03-07 Thread Jean-Armel Luce
Hi Vitali, >From my point of view, I think that what you propose is the right solution. With READ ONE and WRITE ALL, we shall still have a strong consistency. I am going to add a comment in the ticket 5310. Thanks. Jean Armel 2013/3/7 Vitalii Tymchyshyn > Why not WRITE.ALL READ.ONE? I don'

Re: Consistency level for system_auth keyspace

2013-03-07 Thread Vitalii Tymchyshyn
Why not WRITE.ALL READ.ONE? I don't think permissions are updated often and READ.ONE provides maximum availability. 2013/3/4 aaron morton > In this case, it means that if there is a network split between the 2 > datacenters, it is impossible to get the quorum, and all connections will > be reje