Re: One node misbehaving (lot's of GC), ideas?

2015-04-15 Thread Michal Michalski
Hi Erik, Forgetting for a while that it's only a single row: does this node store any super-long rows? The first things that come to my mind after reading your e-mail are unthrottled compaction (sounds like a possible issue, but it would affect other nodes too) or very large rows. Or a mix of both?

Re: Cassandra vs OS x

2015-04-07 Thread Michal Michalski
Out of curiosity - could you elaborate on that or drop a link? Kind regards, Michał Michalski, michal.michal...@boxever.com On 7 April 2015 at 12:41, Serega Sheypak wrote: > It's single-threaded for writing :) > > 2015-04-07 13:13 GMT+02:00 Jean Tremblay < > jean.tremb...@zen-innovations.com>:

Re: Cluster status instability

2015-04-02 Thread Michal Michalski
Hey Marcin, Are they actually going up and down repeatedly (flapping) or just down and they never come back? There might be different reasons for flapping nodes, but to list what I have off the top of my head right now: 1. Network issues. I don't think it's your case, but you can read about the is

Re: Reasonable range for the max number of tables?

2014-08-05 Thread Michal Michalski
>> - Use a keyspace per customer > These effectively amount to the same thing and they both fall foul to the > limit in the number of column families so do not scale. But then you can scale by moving some of the customers to a new cluster easily. If you keep everything in a single keyspace or - wo

Re: Cannot query secondary index

2014-06-09 Thread Michal Michalski
Secondary indexes internally are just CFs that map the indexed value to the row key that value belongs to, so you can only query these indexes using "=", not ">", ">=" etc. However, your query does not require an index *IF* you provide a row key - you can use "<" or ">" like you did for the date
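To illustrate the point about equality-only lookups, here is a minimal Python sketch. It is a toy model, not Cassandra code: the class name ToySecondaryIndex and its methods are hypothetical, and it only assumes what the message says, namely that the index maps an indexed value to the row keys holding it.

```python
from collections import defaultdict


class ToySecondaryIndex:
    """Hypothetical toy model of an index CF: indexed value -> row keys."""

    def __init__(self):
        self._by_value = defaultdict(set)

    def add(self, row_key, value):
        # Each indexed value points at every row key that holds it.
        self._by_value[value].add(row_key)

    def lookup_eq(self, value):
        # Equality is a direct lookup by value; nothing in this structure
        # is ordered by value, so a range predicate has nothing to walk.
        return sorted(self._by_value.get(value, set()))


index = ToySecondaryIndex()
index.add("user1", "a@example.com")
index.add("user2", "b@example.com")
index.add("user3", "a@example.com")
matches = index.lookup_eq("a@example.com")
print(matches)
```

Because the lookup is keyed by the exact value, only "=" maps naturally onto this structure, which is the restriction described above.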

Re: bloom filter + suddenly smaller CF

2014-04-14 Thread Michal Michalski
Sorry, I misread the question - I thought you'd also changed the FP chance value, not only removed the data. Kind regards, Michał Michalski, michal.michal...@boxever.com On 14 April 2014 15:07, Michal Michalski wrote: > Did you set Bloom Filter's FP chance before or after the step

Re: bloom filter + suddenly smaller CF

2014-04-14 Thread Michal Michalski
> RAM (I didn't explicitly write down the before numbers, but they seem about > the same) . So, compaction didn't change the BF's (unless cassandra needs > a 2nd compaction to see all of the data cleared by the 1st compaction). > > will > > > On Mon, Apr 14, 2014

Re: bloom filter + suddenly smaller CF

2014-04-14 Thread Michal Michalski
Bloom filters are built on creation / rebuild of SSTable. If you removed the data, but the old SSTables weren't compacted or you didn't rebuild them manually, bloom filters will stay the same size. M. Kind regards, Michał Michalski, michal.michal...@boxever.com On 14 April 2014 14:44, William O
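A rough Python sketch of why the filter size does not shrink on its own. It assumes the textbook Bloom filter sizing formula m = -n * ln(p) / (ln 2)^2, which is not necessarily the exact sizing Cassandra uses; ToySSTable and bloom_bits are hypothetical names used only for illustration.

```python
import math


def bloom_bits(expected_keys, fp_chance):
    """Textbook Bloom filter size in bits: m = -n * ln(p) / (ln 2)^2."""
    return math.ceil(-expected_keys * math.log(fp_chance) / (math.log(2) ** 2))


class ToySSTable:
    """Hypothetical immutable table: its filter is sized once, when it is built."""

    def __init__(self, keys, fp_chance):
        self.keys = frozenset(keys)
        self.filter_bits = bloom_bits(len(self.keys), fp_chance)


# An old table built while the CF still held 1,000,000 keys.
old = ToySSTable(range(1_000_000), fp_chance=0.01)

# Removing data does not rewrite 'old'; only a rebuild (for example via
# compaction) produces a new table with a filter sized for what is left.
rebuilt = ToySSTable(range(100_000), fp_chance=0.01)

print(old.filter_bits, rebuilt.filter_bits)
```

In this model the memory used by the filter only drops once the old table is replaced by a rebuilt one.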

Re: Cassandra disk usage

2014-04-13 Thread Michal Michalski
> Each columns have name of 15 chars ( digits ) and same 15 chars in value ( also digits ). > Each column should have 30 bytes. Remember Cassandra's standard per-column overhead, which is, as far as I remember, 15 bytes, so it's 45 bytes in total - 50% more than you estimated, which kind of m
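The arithmetic from the message above, written out in Python. The 15-byte overhead is the figure recalled in the message ("as far as I remember") and is taken here as an assumption, not a verified constant.

```python
NAME_BYTES = 15       # column name: 15 digits
VALUE_BYTES = 15      # column value: 15 digits
OVERHEAD_BYTES = 15   # per-column overhead as recalled in the message (assumption)

estimated = NAME_BYTES + VALUE_BYTES          # what the original poster counted
actual = estimated + OVERHEAD_BYTES           # name + value + overhead
extra = (actual - estimated) / estimated      # relative underestimate

print(f"{estimated} bytes estimated, {actual} bytes with overhead, {extra:.0%} more")
```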

Re: nodetool ring showing different 'Load' size

2013-06-19 Thread Michal Michalski
You can start compaction via JMX if you need it and you know what you're doing: Find the org.apache.cassandra.db:type=CompactionManager MBean and the forceUserDefinedCompaction operation in it. The first argument is the keyspace name, the second one is a comma-separated list of SSTables to compact (filenames). You

Re: Coprosessors/Triggers in C*

2013-06-12 Thread Michal Michalski
I understood it as a "run trigger when column gets deleted due to TTL", so - as you said - it doesn't sound like something that can be done. Gareth, TTL'd columns in Cassandra are not really removed after TTL - they are just ignored from that time (so they're not returned by queries), but they
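A minimal Python sketch of the behaviour described above: an expired column stays in storage but reads skip it. ToyColumn and ToyRow are hypothetical names, and the model deliberately says nothing about when the data is physically removed.

```python
class ToyColumn:
    """Hypothetical column with a write time and a TTL, both in seconds."""

    def __init__(self, value, written_at, ttl):
        self.value = value
        self.written_at = written_at
        self.ttl = ttl

    def is_expired(self, now):
        return now >= self.written_at + self.ttl


class ToyRow:
    def __init__(self):
        self.columns = {}

    def read(self, now):
        # Expired columns are skipped at read time; nothing is deleted here.
        return {
            name: col.value
            for name, col in self.columns.items()
            if not col.is_expired(now)
        }


row = ToyRow()
row.columns["session"] = ToyColumn("abc", written_at=0, ttl=10)

before = row.read(now=5)    # still visible
after = row.read(now=10)    # ignored by the read
still_stored = "session" in row.columns
print(before, after, still_stored)
```

Since expiry in this model is only noticed when something reads the column, there is no moment at which a "column expired" event fires, which matches the conclusion in the message.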

Re: [Cassandra] Expanding a Cassandra cluster

2013-06-11 Thread Michal Michalski
What will happen if I add nodetool cleanup to run periodically (similar to nodetool repair) ? Will node tool cleanup consume lot of IO and CPU even though there is nothing to clean ? Why would you need to do so? M. Thank you Emalayan From: Robert Coli

Re: Repair does not fix inconsistency

2013-06-11 Thread Michal Michalski
This looks to me more like a secondary index issue. If you say the access via rowkey is always correct, then the repair works fine. I think there might be something wrong with your secondary index then. Just a follow up in case someone will have the same case: This problem was solved by runni

Re: Removing authentification

2013-06-04 Thread Michal Michalski
How about authorizer? Is it set to org.apache.cassandra.auth.AllowAllAuthorizer? Authenticator is responsible for handling logging in as a specific user. Authorizer checks if the logged-in user has the appropriate permissions. M. On 04.06.2013 11:54, cscetbon@orange.com wrote: Hi, What's the

Re: Unable to drop secondary index

2013-05-27 Thread Michal Michalski
-- Aaron Morton Freelance Cassandra Consultant New Zealand @aaronmorton http://www.thelastpickle.com On 20/04/2013, at 1:42 AM, Michal Michalski wrote: It seems we can't update schemas at all. I tried to change read_repair_chance and it looks the same. However, in this case I'm 99% sur

Re: Problem with streaming data from Hadoop: DecoratedKey(-1, )

2013-05-27 Thread Michal Michalski
For now I'm giving up, but I'll have to "refresh" this thread in future ;-) The last thing I found out is that entry that I marked in previous mail as "LAST VALID KEY/VALUE PAIR" is the problem - it is fine itself, but it "breaks" the stream somehow. Removing it fixes the problem, but I still

Re: Problem with streaming data from Hadoop: DecoratedKey(-1, )

2013-05-24 Thread Michal Michalski
tream(IncomingTcpConnection.java:166) at org.apache.cassandra.net.IncomingTcpConnection.run(IncomingTcpConnection.java:66) It gets more interesting... ;-) M. On 24.05.2013 10:46, Michal Michalski wrote: Sounds like a nasty heisenbug, can you replace or rebuild the machine? Heisenbug :D (never hear

Re: Problem with streaming data from Hadoop: DecoratedKey(-1, )

2013-05-24 Thread Michal Michalski
New Zealand @aaronmorton http://www.thelastpickle.com On 21/05/2013, at 9:36 PM, Michal Michalski wrote: I've finally had some time to experiment a bit with this problem (it occurred twice again) and here's what I found: 1. So far (three occurrences in total), *when* it happened, it happened only for s

Re: Problem with streaming data from Hadoop: DecoratedKey(-1, )

2013-05-21 Thread Michal Michalski
reelance Cassandra Consultant New Zealand @aaronmorton http://www.thelastpickle.com On 28/03/2013, at 2:26 PM, Michal Michalski wrote: We're streaming data to Cassandra directly from MapReduce job using BulkOutputFormat. It's been working for more than a year without any problems, bu

Re: Unable to drop secondary index

2013-04-26 Thread Michal Michalski
On 26.04.2013 03:45, aaron morton wrote: You can drop the hints via JMX and stopping the node and deleting the SSTables. Thanks for the advice :-) It's +/- what I did. I've paused hints delivery first and then I upgraded the whole cluster to C* with the CASSANDRA-5179 patch applied, removing the SSTa

Re: Unable to drop secondary index

2013-04-24 Thread Michal Michalski
lete within rpc_timeout. root@cssa02-13:~# ls -lahS /cassandra/system/hints/ | grep "Data\." | wc -l 83 I don't even want to think what's inside ;-) M. On 24.04.2013 08:54, Michal Michalski wrote: The log messages seem fine to me. It's handling eventually

Re: Unable to drop secondary index

2013-04-22 Thread Michal Michalski
A little update: OK, after ~8 hours of GC madness and compacting, node B (the one on which the keyspace had disappeared) works fine. No issues noticed so far. Node A was started with a larger heap and after I turned debugging on I can see it does this: DEBUG [MutationStage:110] 2013-04-23 06:23:44

Re: Unable to drop secondary index

2013-04-22 Thread Michal Michalski
Missing keyspace has reappeared on node B after restart, but it seems to behave like node A. It starts, GC goes wild, I can see this: INFO [ScheduledTasks:1] 2013-04-22 15:20:52,701 GCInspector.java (line 119) GC for ParNew: 646 ms for 1 collections, 6314013024 used; max is 8506048512 INFO [

Re: Unable to drop secondary index

2013-04-22 Thread Michal Michalski
On 21.04.2013 22:17, aaron morton wrote: This is a tricky one to diagnose remotely. I could try using nodetool resetlocalschema on each node, it's just a wild guess in case there is something odd on one node. I've run it on one node (let's call it A) and it finished without any problems. T

Re: Unable to drop secondary index

2013-04-19 Thread Michal Michalski
st cluster (probably with even more CLI & CQL mixing) and it works there. M. On 19.04.2013 11:03, Michal Michalski wrote: Hi Aaron, Was the schema created with CQL or the CLI ? It was created using Pycassa and - as far as I know - it was managed only by CLI. > (It's not

Re: Unable to drop secondary index

2013-04-19 Thread Michal Michalski
Hi Aaron, Was the schema created with CQL or the CLI ? It was created using Pycassa and - as far as I know - it was managed only by CLI. > (It's not a good idea to manage one with the other) Yes, I know - I only tried using CQL after I realized that CLI is not working, as I had to make it

Unable to drop secondary index

2013-04-18 Thread Michal Michalski
As stated in the topic, I'm unable to drop a secondary index either by using cli or cqlsh. In both cases it looks like the command is processed properly (some uuid shows up in cli, no output in cqlsh), I can see in logs that schema is going to be updated (index name and type are set to null) and then.

Re: differences between DataStax Community Edition and Cassandra package

2013-04-18 Thread Michal Michalski
Probably Robert meant CFS: http://www.datastax.com/wp-content/uploads/2012/09/WP-DataStax-HDFSvsCFS.pdf :-) On 18.04.2013 14:10, Nikolay Mihaylov wrote: whats CDFS ? I am sure you are not referring iso9660, e.g. CD-ROM filesystem? :) On Wed, Apr 17, 2013 at 10:42 PM, Robert Coli wrote:

Re: Does Memtable resides in Heap?

2013-04-15 Thread Michal Michalski
How about the bloom filter and index samples, are they part of off-heap? Starting from C* 1.2 bloom filters are stored off-heap. Index samples are stored on heap. M.

Re: Repair does not fix inconsistency

2013-04-05 Thread Michal Michalski
i and cqlsh - when testing it on my workstation. D'oh! ;-) M. On 04.04.2013 16:07, Michal Michalski wrote: On 04.04.2013 15:38, horschi wrote: I'm glad to hear that. I feared my ticket might be responsible for your data loss. I could not live the guilt ;-) Seriously: I'm glad

Re: Repair does not fix inconsistency

2013-04-04 Thread Michal Michalski
On 04.04.2013 15:38, horschi wrote: I'm glad to hear that. I feared my ticket might be responsible for your data loss. I could not live the guilt ;-) Seriously: I'm glad we can rule out the repair change. Haha, I didn't notice before that it was your ticket! ;-) Yes, if it works with CL=o

Re: Repair does not fix inconsistency

2013-04-04 Thread Michal Michalski
Yes, maybe there are two issues here: repair not running and maybe really some index-thing. Repair is fine - all the data seem to be in SSTables. I've checked it and while index tells me that I have 1 tombstone and 0 live cells for a key, I can _see_, thanks to sstable2json, that I have 3 "l

Re: Repair does not fix inconsistency

2013-04-04 Thread Michal Michalski
Does CQL not allow CL=ONE queries? Why does it ask two nodes for the key, when you say that you are using CL=default=1? I'm a bit confused here (I'm a thrift user). Yup, that's another thing I'm curious about too (default CL is ONE for sure), but as for now it helps me to investigate my probl

Re: Repair does not fix inconsistency

2013-04-04 Thread Michal Michalski
Well... Strange. We have such problem with 6 users, but there's only ONE tombstone (created 8 days ago, so it's not gcable yet) in all the SSTables on 2:1 node - checked using sstable2json. Moreover, this tombstone DOES NOT belong to the row key I'm using for tests, because this user was NOT eve

Re: Repair does not fix inconsistency

2013-04-04 Thread Michal Michalski
Hi Sylvain, Thanks for the explanation :-) However, in this case, I still do not get why this (probably) gcable tombstone on 2:1 could cause this mess. As AE ignores only the tombstone itself (which means that there are no data for this key on 2:1 node from repair's point of view), it should resu

Re: Repair does not fix inconsistency

2013-04-04 Thread Michal Michalski
Hi Christian, About CASSANDRA-4905 - thanks for explaining this :-) This looks to me more like a secondary index issue. If you say the access via rowkey is always correct, then the repair works fine. I think there might be something wrong with your secondary index then. This was my first thou

Re: Repair does not fix inconsistency

2013-04-04 Thread Michal Michalski
Hi Aaron, First, before I go into a lot of logs: I'm considering a problem related to this issue: https://issues.apache.org/jira/browse/CASSANDRA-4905 Let's say the tombstone on one of the nodes (X) is gcable and was not compacted (purged) so far. After it was created we re-created this r

Re: Problem with streaming data from Hadoop: DecoratedKey(-1, )

2013-04-03 Thread Michal Michalski
We tried to reproduce it in a few ways on 3 different environments, but we were unable to do it. We have to leave this problem for now. Thanks for the help anyway :-) M. On 02.04.2013 10:02, Michal Michalski wrote: Thanks for the reply, Aaron. Unfortunately, I think it's not the case - we did som

Repair does not fix inconsistency

2013-04-03 Thread Michal Michalski
Hi, TL;DR: I have inconsistent data (1 live row on node A & 1 tombstoned row on node B) that do not get fixed by repair. What could be the problem? Long version: I have a CF containing Users' info, which I sometimes query by key, and sometimes by indexed columns like email. I'm using RF=2. I wri

Re: Problem with streaming data from Hadoop: DecoratedKey(-1, )

2013-04-02 Thread Michal Michalski
atch the zero length key in your map job ? Cheers - Aaron Morton Freelance Cassandra Consultant New Zealand @aaronmorton http://www.thelastpickle.com On 28/03/2013, at 2:26 PM, Michal Michalski wrote: We're streaming data to Cassandra directly from MapReduce job using BulkOutputFormat. It'

Problem with streaming data from Hadoop: DecoratedKey(-1, )

2013-03-28 Thread Michal Michalski
We're streaming data to Cassandra directly from MapReduce job using BulkOutputFormat. It's been working for more than a year without any problems, but yesterday one of 600 mappers failed and we got a strange-looking exception on one of the C* nodes. IMPORTANT: It happens on one node and on one

Re: index_interval file size is the same after modifying 128 to 512?

2013-03-26 Thread Michal Michalski
have not checked the any further because the high level use cases look great. Dean On 3/26/13 2:35 AM, "Michal Michalski" wrote: Dean, as I can see you are satisfied with the result of increasing ii from 128 to 512, didn't you observe any drawbacks of this change? I remem

Re: index_interval file size is the same after modifying 128 to 512?

2013-03-26 Thread Michal Michalski
Dean, as I can see you are satisfied with the result of increasing ii from 128 to 512, didn't you observe any drawbacks of this change? I remember you mentioned no change in Read Latency and a significant drop of heap size, but did you check any other metrics? I did the opposite (512 -> 128;

Re: Hinted Handoff

2013-03-26 Thread Michal Michalski
It contains the mutation (data) that is to be sent to the proper endpoint. M. On 25.03.2013 20:15, Kanwar Sangha wrote: Hi - Quick question. Do hints contain the actual data or the data is read from the SStables and then sent to the other node when it comes up ? Thanks, Kanwar

Re: Cassandra freezes

2013-03-21 Thread Michal Michalski
OK, I took a look at the source code and for now it seems to me that we both are partially right ( ;-) ), but changing index_interval does NOT require rebuilding SSTables: Yes, index sample file can be persisted (see io/sstable/IndexSummary.java, serialize/deserialize methods + io/sstable/SST

Re: index_interval memory savings in our case(if you are curious) (and performance result)...

2013-03-21 Thread Michal Michalski
; need to get a second coffee ;-) M. On 21.03.2013 09:29, Michal Michalski wrote: Dean, what is your row size approximately? We've been using ii = 512 for a long time because of memory issues, but now - as bloom filter is kept off-heap and memory is not an issue anymore - I've reve

Re: index_interval memory savings in our case(if you are curious) (and performance result)...

2013-03-21 Thread Michal Michalski
Dean, what is your row size approximately? We've been using ii = 512 for a long time because of memory issues, but now - as bloom filter is kept off-heap and memory is not an issue anymore - I've reverted it to 128 to see if this improves anything. It seems it doesn't (except that I have less

Re: Cassandra freezes

2013-03-21 Thread Michal Michalski
About index_interval: 1) you have to rebuild SSTables ( not an issue if you are evaluating, doing test writes.. Etc, not so much in production ) Are you sure of this? As I understand indexes, it's not required because this parameter defines an interval of in-memory index sample, which is crea
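A small Python sketch of the idea that index_interval only controls how densely an in-memory sample is taken from the full index. This is a toy model under that assumption; build_sample and find are hypothetical helpers, and the real on-disk formats are not modelled.

```python
import bisect


def build_sample(sorted_keys, index_interval):
    """Keep every index_interval-th key of the full index in memory."""
    return [sorted_keys[i] for i in range(0, len(sorted_keys), index_interval)]


def find(sorted_keys, sample, index_interval, key):
    """Use the sample to pick a block, then scan at most index_interval entries."""
    block = bisect.bisect_right(sample, key) - 1
    if block < 0:
        return None
    start = block * index_interval
    for pos in range(start, min(start + index_interval, len(sorted_keys))):
        if sorted_keys[pos] == key:
            return pos
    return None


full_index = list(range(0, 10_000, 2))      # 5,000 keys; stands in for the on-disk index
sample_128 = build_sample(full_index, 128)  # denser sample: more memory, shorter scans
sample_512 = build_sample(full_index, 512)  # sparser sample: less memory, longer scans

print(len(sample_128), len(sample_512))
print(find(full_index, sample_128, 128, 4242))
```

In this model, changing the interval means re-sampling the same full index; the full index itself is untouched, which is the intuition behind the question in the message.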

Re: Failed migration from 1.1.6 to 1.2.2

2013-03-14 Thread Michal Michalski
"on my workstation with a < 0.01% sample of production" Is there a simple way of getting that ? On "Cassandra level"? Nope. I just had to prepare these data "manualy" using software we develop on very small input. I understand that it might not be so easy in all the use cases, as it was in

Re: Failed migration from 1.1.6 to 1.2.2

2013-03-14 Thread Michal Michalski
We have no such environment. It is expensive, we can't afford this for now. We do have QA cluster, but before even trying the 1.1.0 -> 1.1.9 / 1.2.1 upgrade on it (we were a bit undecided about the version ;-) ), I did some experiments using ccm ( https://github.com/pcmanus/ccm ) on my workst

Re: Failed migration from 1.1.6 to 1.2.2

2013-03-14 Thread Michal Michalski
It will happen if your rpc_address is set to 0.0.0.0. Oops, that's not what I meant ;-) It will happen if your rpc_address is set to an IP that is not defined in your cluster's config (e.g. in cassandra-topology.properties for PropertyFileSnitch). M. M. On 14.03.2013 13:03, Alain RODRIGU

Re: Failed migration from 1.1.6 to 1.2.2

2013-03-14 Thread Michal Michalski
Just to make it clear: This bug will occur on single-DC configuration too. In our case it resulted in Exception like this at the very end of node startup: ERROR [WRITE-/] 2013-02-27 12:14:55,433 CassandraDaemon.java (line 133) Exception in thread Thread[WRITE-/,5,main] java.lang.RuntimeExcep

Re: Hinted handoff

2013-03-07 Thread Michal Michalski
I think it's still true, but not because of network-related issues, but because of the maintenance problems it will cause during per-node operations. For example in my case running 'upgradesstables' on ~300GB node takes about 30+ hours. The other IO-intensive operations will probably be a pain

Re: old data / tombstones are not deleted after ttl

2013-03-05 Thread Michal Michalski
thias.zeilin...@bwinparty.com bwin.party services (Austria) GmbH Marxergasse 1B A-1030 Vienna www.bwinparty.com -Original Message- From: Michal Michalski [mailto:mich...@opera.com] Sent: Tuesday, 05 March 2013 07:47 To: user@cassandra.apache.org Subject: Re: old data / tombstones are

Re: old data / tombstones are not deleted after ttl

2013-03-04 Thread Michal Michalski
Was it a major compaction? I ask because it's definitely a solution that had to work, but it's also a solution that - in general - probably no-one here would suggest using. M. On 05.03.2013 07:08, Matthias Zeilinger wrote: Hi, I have done a manually compaction over the nodetool and

Re: old data / tombstones are not deleted after ttl

2013-03-03 Thread Michal Michalski
Did you try checking (using nodetool getsstables) how many SSTables your row's data are spread into? All the "parts" of the row have to be in one SSTable to remove it (data & tombstone). Remember that even if you do not update your data, you still may have two SSTables containing row's data (o
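A simplified Python sketch of the condition described above: a row's data and tombstone can only be dropped when no SSTable outside the compaction still holds a part of that row. The function can_purge_row and the table names are hypothetical, and the model ignores everything else that matters in practice (such as gc_grace).

```python
def can_purge_row(row_key, compacting, sstables):
    """True if no SSTable outside the compaction still holds part of the row.

    sstables maps a hypothetical SSTable name to the set of row keys it contains;
    compacting is the set of SSTable names taking part in the compaction.
    """
    return all(
        row_key not in keys
        for name, keys in sstables.items()
        if name not in compacting
    )


sstables = {
    "data-1": {"k1"},          # holds the tombstone for k1
    "data-2": {"k1", "k2"},    # holds older data for k1
    "data-3": {"k3"},
}

alone = can_purge_row("k1", {"data-1"}, sstables)               # data-2 still has k1
together = can_purge_row("k1", {"data-1", "data-2"}, sstables)  # all parts are included
print(alone, together)
```

This is why checking how many SSTables a row is spread across is a useful first step.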

Re: Other nodes are seen down with rpc_address 0.0.0.0 in version 1.2.2

2013-03-01 Thread Michal Michalski
Yes, it's caused by the bug described in CASSANDRA-5299. It's because PropertyFileSnitch is using the RPC address to obtain the DC name from the cassandra-topology.properties file. As you use 0.0.0.0 for RPC and this IP cannot be found in this config file, C* crashes. M. On 01.03.2013 10:43, Jean-Armel L

Re: data model advice needed

2013-02-28 Thread Michal Michalski
I can't suggest any book, but you might be interested in this: http://www.ebaytechblog.com/2012/07/16/cassandra-data-modeling-best-practices-part-1/ http://www.ebaytechblog.com/2012/08/14/cassandra-data-modeling-best-practices-part-2/ M. On 28.02.2013 08:44, Sloot, Hans-Peter wrote: Wh

Re: is upgradesstables required for 1.1.4 to 1.2.2? (I don't think it is)

2013-02-27 Thread Michal Michalski
I'm currently migrating 1.1.0 to 1.2.1 and on our small CI cluster, that I was testing some stuff on, it seems that it's not required to run upgradesstables (this doc doesn't mention it either: http://www.datastax.com/docs/1.2/install/upgrading but the previous versions did). Of course I'd l

Re: NULL values

2013-02-27 Thread Michal Michalski
On 27.02.2013 10:57, Marco Matarazzo wrote: > You may also be interested in this: > > https://issues.apache.org/jira/browse/CASSANDRA-3783 CASSANDRA-3783 might not be the case here. The question is about using null in SELECT statements, which will require modifications in secondary indexes

Re: Cassandra 1.1.2 -> 1.1.8 upgrade

2013-02-11 Thread Michal Michalski
OK, thanks Aaron. I ask because NEWS.txt is not a big help in case of > 1.1.5 versions because there's no info on them in it (especially on 1.1.7 which seems to be the most important one in this case, according to the DataStax' upgrade instructions) ;-) https://github.com/apache/cassandra/blob

Re: Cassandra 1.1.2 -> 1.1.8 upgrade

2013-02-10 Thread Michal Michalski
2) Upgrade one node at a time, running the clustered in a mixed 1.1.2->1.1.9 configuration for a number of days. I'm about to upgrade my 1.1.0 cluster and http://www.datastax.com/docs/1.1/install/upgrading#info says: "If you are upgrading to Cassandra 1.1.9 from a version earlier than 1.1.7,

Re: Question on TTLs and Tombstones

2013-01-02 Thread Michal Michalski
Thanks for your answer. Moreover, the issue you mentioned in the end was the answer to the question I was going to ask next ;-) Regards, Michał On 02.01.2013 15:42, Sylvain Lebresne wrote: WHEN does Cassandra remove expired (because of TTL) data? When a compaction reads an expired column

Re: Question on TTLs and Tombstones

2013-01-02 Thread Michal Michalski
data? Which operations cause Cassandra to check for TTL and create Tombstones for them if needed? It happens during compaction, for sure. How about scrub, repair? Others? Regards, Michał On 28.12.2012 09:08, Michal Michalski wrote: Hi, I have a question regarding TTLs and Tombstones with a p

Question on TTLs and Tombstones

2012-12-28 Thread Michal Michalski
Hi, I have a question regarding TTLs and Tombstones with a pretty long scenario + solution question. My first, general question is - when does Cassandra check for the TTL (if it expired) and create the Tombstone if needed? I know it happens during compaction, but is this the only situation? How

Re: monitor cassandra 1.1.6 with MX4J

2012-11-12 Thread Michal Michalski
Hmm... It looks like it wasn't merged at some time (why?), because I can see that appropriate lines were present in a few branches. I didn't check if it works, but looking at git history tells me that you could try modifying cassandra-env.sh like this: Add this somewhere & configure: # To use

Re: RE where is cassandra debian packages?

2012-08-24 Thread Michal Michalski
Well, "Works for me". On 24.08.2012 11:43, ruslan usifov wrote: no, i got 404 error. 2012/8/24 Romain HARDOUIN : Hi, The url you mentioned is OK: e.g. http://www.apache.org/dist/cassandra/debian/dists/11x/ ruslan usifov wrote on 24/08/2012 11:26:11 : Hello looks like http://www.a

Re: User authorized for cannot create CFs

2012-04-20 Thread Michal Michalski
Thanks for your reply, problem is solved. First, I misunderstood the modify-keyspace param and then I just missed the fact that I can do simply: test.=operator without any wildcards or so. I even tried this solution before and - after looking into the source code - I was sure it just has to w

User authorized for cannot create CFs

2012-04-17 Thread Michal Michalski
Hi, I'm facing a problem, which maybe is a feature ( ;) ), but for me it's rather an annoying problem. I use SimpleAuthenticator and I have a user who should be a kind of Cassandra's keyspace "root" - he should be allowed to do everything. So I set: =master Unluckily, when I try to create