Pagination and timeouts

2017-03-27 Thread Tom van den Berge
I have a table with some 1M rows, and I would like to get the partition key of each row. Using the java driver (2.1.9), I'm executing the query select distinct key from table; The result set is paginated automatically. My C* cluster has two datacenters, and when I run this query using consistency

Re: Modeling Audit Trail on Cassandra

2016-03-19 Thread Tom van den Berge
> > Is text the most appropriate data type to store JSON that contains a couple > of dozen lines? > It sure is the simplest way to store JSON. The query requirement is "where executedby = ?". > Since executedby is a timeuuid, I guess you don't want to query a single record, since that would requ

Re: Unexplainably large reported partition sizes

2016-03-10 Thread Tom van den Berge
>> find it. >>> I'm using 2.1.9. >>> >> >> https://issues.apache.org/jira/browse/CASSANDRA-7953 >> >> Rob may have a different one, but I've something similar from this issue. >> Fixed in 2.1.12. >> > > Nate is correct,

Re: Unexplainably large reported partition sizes

2016-03-07 Thread Tom van den Berge
Hi Bryan, > Do you use any collections on this column family? We've had issues in the > past with unexpectedly large partitions reported on data models with > collections, which can also generate tons of tombstones on UPDATE ( > https://issues.apache.org/jira/browse/CASSANDRA-10547) > I've been

Re: Unexplainably large reported partition sizes

2016-03-07 Thread Tom van den Berge
Hi Rob, The reason I didn't dump the table with sstable2json is that I didn't think of it ;) I just used it, and it looks very much like the "avalanche of tombstones" bug you are describing! I took one of the three sstables containing the key, and it resulted in a 4.75 million-line json file, of

Re: Unexplainably large reported partition sizes

2016-03-06 Thread Tom van den Berge
values ? > > On Sat, Mar 5, 2016 at 7:16 PM, Tom van den Berge > wrote: > >> I don't think compression can be the cause of the difference, because of >> two reasons: >> >> 1) The partition size I calculated myself (3 MB) is the uncompressed >> size,

Re: Unexplainably large reported partition sizes

2016-03-05 Thread Tom van den Berge
Fri, Mar 4, 2016 at 5:56 AM, Tom van den Berge > wrote: > >> Compacting large partition >> drillster/subscriberstats:rqtPewK-1chi0JSO595u-Q (1,470,058,292 bytes) >> >> This means that this single partition is about 1.4 GB in size. This is much >> larger than it can

Unexplainably large reported partition sizes

2016-03-04 Thread Tom van den Berge
Hi, I'm seeing warnings in my logs about compacting large partitions, e.g.: Compacting large partition drillster/subscriberstats:rqtPewK-1chi0JSO595u-Q (1,470,058,292 bytes) This means that this single partition is about 1.4 GB in size. This is much larger than it can possibly be, because of two r
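The size arithmetic in the post above can be checked with a short sketch. It assumes only the single log line quoted in the message; the regular expression and the helper name parse_large_partition are hypothetical and are modelled on that one line, not on any documented Cassandra log format.

```python
import re

# Log line exactly as quoted in the post above.
LOG_LINE = ("Compacting large partition "
            "drillster/subscriberstats:rqtPewK-1chi0JSO595u-Q (1,470,058,292 bytes)")

# Assumption: keyspace/table:key followed by a comma-grouped byte count.
PATTERN = re.compile(
    r"Compacting large partition ([^/\s]+)/([^:\s]+):(\S+) \(([\d,]+) bytes\)"
)


def parse_large_partition(line):
    """Return (keyspace, table, partition_key, size_in_bytes), or None if no match."""
    match = PATTERN.search(line)
    if match is None:
        return None
    keyspace, table, key, raw_size = match.groups()
    return keyspace, table, key, int(raw_size.replace(",", ""))


keyspace, table, key, size_bytes = parse_large_partition(LOG_LINE)
size_gb = size_bytes / 10**9    # decimal gigabytes
size_gib = size_bytes / 2**30   # binary gibibytes
print(f"{keyspace}.{table} key={key}: {size_gb:.2f} GB / {size_gib:.2f} GiB")
```

Whether the figure is read as decimal or binary units, it stays far above the roughly 3 MB the poster calculated by hand elsewhere in the thread, so the unit choice does not explain the discrepancy.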

Re: Removed node is not completely removed

2015-10-14 Thread Tom van den Berge
Thanks Sebastian, a restart solved the problem! On Wed, Oct 14, 2015 at 3:46 PM, Sebastian Estevez < sebastian.este...@datastax.com> wrote: > We still keep endpoints in memory. Not sure how you got to this state but > try a rolling restart. > On Oct 14, 2015 9:43 AM, "

Re: Removed node is not completely removed

2015-10-14 Thread Tom van den Berge
has its > own set of system tables. -ml > > On Wed, Oct 14, 2015 at 9:17 AM, Tom van den Berge < > tom.vandenbe...@gmail.com> wrote: > >> Hi Carlos, >> >> I'm using 2.1.6. The mysterious node is not in the peers table. Any other >> ideas? >>

Re: Removed node is not completely removed

2015-10-14 Thread Tom van den Berge
r data > > rolo@pythian | Twitter: cjrolo | Linkedin: *linkedin.com/in/carlosjuzarterolo > <http://linkedin.com/in/carlosjuzarterolo>* > Mobile: +31 6 159 61 814 | Tel: +1 613 565 8696 x1649 > www.pythian.com > > On Wed, Oct 14, 2015 at 12:26 PM, Tom van den Berge < >

Removed node is not completely removed

2015-10-14 Thread Tom van den Berge
I have removed a node with nodetool removenode, which completed ok. Nodetool status does not list the node anymore. But since then, I'm seeing messages in my other nodes' log files referring to the removed node: INFO [GossipStage:38] 2015-10-14 11:18:26,322 Gossiper.java (line 968) InetAddress /10

Re: Do vnodes need more memory?

2015-09-24 Thread Tom van den Berge
On Thu, Sep 24, 2015 at 12:45 AM, Robert Coli wrote: > On Wed, Sep 23, 2015 at 7:09 AM, Tom van den Berge < > tom.vandenbe...@gmail.com> wrote: > >> So it seems that Cassandra simply doesn't have enough memory. I'm trying >> to understand if this can be cau

Re: Do vnodes need more memory?

2015-09-23 Thread Tom van den Berge
the default? How much RAM? > > Also, can you run this tool and send a minute's worth of thread info: > > wget > https://bintray.com/artifact/download/aragozin/generic/sjk-plus-0.3.6.jar > java -jar sjk-plus-0.3.6.jar ttop -s localhost:7199 -n 30 -o CPU > On Sep 23, 2015 7:09 AM

Do vnodes need more memory?

2015-09-23 Thread Tom van den Berge
I have two data centers, each with the same number of nodes, same hardware (CPUs, memory), Cassandra version (2.1.6), replication factor, etc. The only difference is that one data center uses vnodes, and the other doesn't. The non-vnode DC works fine (and has been for a long time) under productio

Secondary index is causing high CPU load

2015-09-15 Thread Tom van den Berge
count" in the cfstats for the index go up by almost 20! When doing the same query on one of my "good" nodes, it only increases by a small number, as I would expect. Could it be that the use of vnodes is causing these problems? Regards, Tom On Mon, Sep 14, 2015 at 8:09

Extremely high CPU load in new data center

2015-09-14 Thread Tom van den Berge
I have a DC of 4 nodes that must be expanded to accommodate an expected growth in data. Since the DC is not using vnodes, we have decided to set up a new DC with vnodes enabled, start using the new DC, and decommission the old DC. Both DCs have 4 nodes. The idea is to add additional nodes to the n

Re: Is it possible to bootstrap the 1st node of a new DC?

2015-09-09 Thread Tom van den Berge
> > > I've learned from experience that the node immediately joins the cluster, >> and starts accepting reads (from other DCs) for the range it owns. > > > This seems to be the incorrect assumption at the heart of the confusion. > You "should" be able to prevent this behavior entirely via correct u

Re: Trace evidence for LOCAL_QUORUM ending up in remote DC

2015-09-08 Thread Tom van den Berge
Nate, I've disabled it, and it's been running for about an hour now without problems, while before, the problem occurred roughly every few minutes. I guess it's safe to say that this proves that CASSANDRA-9753 is the cause of the problem. I'm

Re: Is it possible to bootstrap the 1st node of a new DC?

2015-09-08 Thread Tom van den Berge
> Running nodetool rebuild on a node that was started with join_ring=false >> does not work, unfortunately. The nodetool command returns immediately, >> after a message appears in the log that the streaming of data has started. >> After that, nothing happens. > > > Per driftx, the author of CASSAND

Re: Trace evidence for LOCAL_QUORUM ending up in remote DC

2015-09-08 Thread Tom van den Berge
Just to be sure: can this bug result in a 0-row result while it should be > 0? On 8 Sep 2015 6:29 PM, "Tyler Hobbs" wrote: > See https://issues.apache.org/jira/browse/CASSANDRA-9753 > > On Tue, Sep 8, 2015 at 10:22 AM, Tom van den Berge < > tom.vandenbe...@gmai

Trace evidence for LOCAL_QUORUM ending up in remote DC

2015-09-08 Thread Tom van den Berge
I've been bugging you a few times, but now I've got trace data for a query with LOCAL_QUORUM that is being sent to a remote data center. The setup is as follows: NetworkTopologyStrategy: {"DC1":"1","DC2":"2"} Both DC1 and DC2 have 2 nodes. In DC2, one node is currently being rebuilt, and therefore

Re: Is it possible to bootstrap the 1st node of a new DC?

2015-09-08 Thread Tom van den Berge
t to it. Streaming data across the Atlantic takes a lot more time :( > > kind regards, > Christian > > PS: I would love to see the results, if you perform any tests on the > write-survey. Please share it here on the mailing list :-) > > > > On Mon, Sep 7, 2015 at 11:10

Re: How to prevent queries being routed to new DC?

2015-09-08 Thread Tom van den Berge
> > Thanks > Anuj > > Sent from Yahoo Mail on Android > <https://overview.mail.yahoo.com/mobile/?.src=Android> > -- > *From*:"Tom van den Berge" > *Date*:Tue, 8 Sep, 2015 at 1:31 am > *Subject*:Re: How to prevent queries being routed to n

Re: Is it possible to bootstrap the 1st node of a new DC?

2015-09-07 Thread Tom van den Berge
rites, but not serving reads. I have not tested it yet, but I > think it should work. > > Also the manual join mentioned in CASSANDRA-9667 sounds very interesting. > > kind regards, > Christian > > On Mon, Sep 7, 2015 at 10:11 PM, Tom van den Berge > wrote: > >>

Re: Is it possible to bootstrap the 1st node of a new DC?

2015-09-07 Thread Tom van den Berge
Coli wrote: > On Fri, Sep 12, 2014 at 6:57 AM, Tom van den Berge > wrote: > >> Wouldn't it be far more efficient if a node that is rebuilding itself is >> responsible for not accepting reads until the rebuild is complete? E.g. by >> marking it as "Join

Re: How to prevent queries being routed to new DC?

2015-09-07 Thread Tom van den Berge
NetworkTopologyStrategy On Mon, Sep 7, 2015 at 4:39 PM, Ryan Svihla wrote: > What's your keyspace replication strategy? > > On Thu, Sep 3, 2015 at 3:16 PM Tom van den Berge < > tom.vandenbe...@gmail.com> wrote: > >> Thanks for your help so far! >> >>

Re: How to prevent queries being routed to new DC?

2015-09-03 Thread Tom van den Berge
Thanks for your help so far! I have some problems trying to understand the jira mentioned by Rob :( I'm currently trying to set up the first node in the new DC with auto_bootstrap = true. The node then becomes visible with status "joining", which (hopefully) prevents other DCs from sending querie

Re: How to prevent queries being routed to new DC?

2015-09-03 Thread Tom van den Berge
hu, Sep 3, 2015 at 11:53 AM, Tom van den Berge < > tom.vandenbe...@gmail.com> wrote: > >> Hi Bryan, >> >> I'm using the PropertyFileSnitch, and it contains entries for all nodes >> in the old DC, and all nodes in the new DC. The replication factor for both

Re: How to prevent queries being routed to new DC?

2015-09-03 Thread Tom van den Berge
y that they show up under a new DC and not as part of > the old? > > --Bryan > > On Thu, Sep 3, 2015 at 11:27 AM, Tom van den Berge < > tom.vandenbe...@gmail.com> wrote: > >> I want to start using vnodes in my cluster. To do so, I've set up a new >> data

How to prevent queries being routed to new DC?

2015-09-03 Thread Tom van den Berge
I want to start using vnodes in my cluster. To do so, I've set up a new data center with the same number of nodes as the existing one, as described in http://docs.datastax.com/en/cassandra/2.0/cassandra/configuration/configVnodesProduction_t.html. The new DC is in the same physical location as the

Re: MarshalException after upgrading to 2.1.6

2015-06-11 Thread Tom van den Berge
5 at 9:23 AM, Tom van den Berge > wrote: > >> I've upgraded a node from 2.0.10 to 2.1.6. Before taking down the node, >> I've run nodetool upgradesstables and nodetool scrub. >> >> When starting up the node with 2.1.6, I'm getting a MarshalException >>

Fwd: MarshalException after upgrading to 2.1.6

2015-06-11 Thread Tom van den Berge
I've upgraded a node from 2.0.10 to 2.1.6. Before taking down the node, I've run nodetool upgradesstables and nodetool scrub. When starting up the node with 2.1.6, I'm getting a MarshalException (stacktrace included below). For some reason, it seems that C* is trying to convert a text value from t

Re: Is it possible to bootstrap the 1st node of a new DC?

2014-09-12 Thread Tom van den Berge
g", similar to a node that is being bootstrapped? Tom On Thu, Sep 11, 2014 at 11:10 PM, Tom van den Berge wrote: > Thanks, Rob. > I actually tried using LOCAL_ONE instead of ONE, but I still saw this > problem. Maybe I missed some queries when updating to LOCAL_ONE. Anyway, > it

Re: Is it possible to bootstrap the 1st node of a new DC?

2014-09-11 Thread Tom van den Berge
at 1:18 PM, Tom van den Berge > wrote: > >> When setting up a new (additional) data center, the documentation tells >> us to use "nodetool rebuild -- " to fill up the node(s) in the new >> dc, and to disable auto_bootstrap. >> >> I'm wondering if i

Is it possible to bootstrap the 1st node of a new DC?

2014-09-11 Thread Tom van den Berge
When setting up a new (additional) data center, the documentation tells us to use "nodetool rebuild -- " to fill up the node(s) in the new dc, and to disable auto_bootstrap. I'm wondering if it is possible to fill the node with "auto_bootstrap=true" instead of a nodetool rebuild command. If so, ho

Node being rebuilt receives read requests

2014-09-10 Thread Tom van den Berge
I have a datacenter with a single node, and I want to start using vnodes. I have followed the instructions ( http://www.datastax.com/documentation/cassandra/2.0/cassandra/operations/ops_add_dc_to_cluster_t.html), and set up a new node in a new datacenter (auto_bootstrap=false, seed=node in old dc,

Re: Migration 1.2.14 to 2.0.8 causes "Tried to create duplicate hard link" at startup

2014-06-19 Thread Tom van den Berge
It turns out this is caused by an earlier, failed attempt to upgrade. Removing all pre-sstablemetamigration snapshot directories solved the issue. Credits to Markus Eriksson. On Wed, Jun 11, 2014 at 9:42 AM, Tom van den Berge wrote: > No, unfortunately I haven't. > > >

Are writes to indexes performed asynchronously?

2014-06-19 Thread Tom van den Berge
Hi, I have a column family with a secondary index on one of its columns. I noticed that when I write a row to the column family, and immediately query that row through the secondary index, every now and then it won't give any results. Could it be that Cassandra performs the write to the internal

Re: Migration 1.2.14 to 2.0.8 causes "Tried to create duplicate hard link" at startup

2014-06-11 Thread Tom van den Berge
No, unfortunately I haven't. On Tue, Jun 10, 2014 at 5:35 PM, Chris Burroughs wrote: > Were you able to solve or work around this problem? > > > On 06/05/2014 11:47 AM, Tom van den Berge wrote: > >> Hi, >> >> I'm trying to migrate a development clu

Migration 1.2.14 to 2.0.8 causes "Tried to create duplicate hard link" at startup

2014-06-05 Thread Tom van den Berge
Hi, I'm trying to migrate a development cluster from 1.2.14 to 2.0.8. When starting up 2.0.8, I'm seeing the following error in the logs: INFO 17:40:25,405 Snapshotting drillster, Account to pre-sstablemetamigration ERROR 17:40:25,407 Exception encountered during startup java.lang.RuntimeExcept

StatusLogger output help

2014-03-28 Thread Tom van den Berge
Hi, In my cassandra logs, I see a lot of "StatusLogger" output lines. I'm trying to understand why this is logged, and how to interpret the output. Maybe someone can point me to some documentation on this particular logging aspect? I would like to know what is triggering the StatusLogger.java to

Help on StatusLogger output?

2014-03-20 Thread Tom van den Berge
Hi, In my cassandra logs, I see a lot of "StatusLogger" output lines. I'm trying to understand why this is logged, and how to interpret the output. Maybe someone can point me to some documentation on this particular logging aspect? I would like to know what is triggering the StatusLogger.java to

Re: How to monitor the progress of a HintedHandoff task?

2013-12-07 Thread Tom van den Berge
the entire node because of an OOM does not make sense, > could you please post the C* version that you are using & the heap size you > have configured? > > Thanks > Rahul > > > On Tue, Dec 3, 2013 at 7:41 PM, Tom van den Berge wrote: > >> Rahul, >> >>

Re: How to measure data transfer between data centers?

2013-12-04 Thread Tom van den Berge
router/switch/fancy-network-gear level. > > > On 12/03/2013 06:25 AM, Tom van den Berge wrote: > >> Is there a way to know how much data is transferred between two nodes, or >> more specifically, between two data centers? >> >> I'm especially interested in how

Re: OutOfMemory Java Heap Space error on startup...

2013-12-04 Thread Tom van den Berge
To start up your node again, you could delete the stored key caches (/var/lib/cassandra/saved_caches/*). Regards, Tom On Wed, Dec 4, 2013 at 7:46 PM, Krishna Chaitanya wrote: > Hey Nate, > Thanks for the reply. The link was really good...!!! Looking > forward to making the necessary c
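The workaround in the reply above (deleting the stored key caches before starting the node) could be scripted roughly as follows. This is a minimal sketch, not an official procedure: the function name clear_saved_caches is hypothetical, the directory /var/lib/cassandra/saved_caches is the one given in the message and may differ per installation, and it assumes the Cassandra process is stopped when the function runs.

```python
from pathlib import Path


def clear_saved_caches(cache_dir):
    """Delete every regular file directly inside cache_dir.

    Assumption: Cassandra is not running. Subdirectories are left alone.
    Returns the sorted names of the files that were removed.
    """
    removed = []
    for entry in sorted(Path(cache_dir).iterdir()):
        if entry.is_file():
            entry.unlink()
            removed.append(entry.name)
    return removed


# Example call, using the path from the message (adjust to your installation):
# clear_saved_caches("/var/lib/cassandra/saved_caches")
```

Returning the removed file names makes it easy to log what was deleted before the node is started again.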

Re: How to monitor the progress of a HintedHandoff task?

2013-12-03 Thread Tom van den Berge
hints > cf? > > Thanks > Rahul > > > On Tue, Dec 3, 2013 at 6:41 PM, Tom van den Berge wrote: > >> Hi Rahul, >> >> Thanks for your reply. >> >> I have never seen message like "Timed out replaying hints to...", which >> is a good thin

Re: How to monitor the progress of a HintedHandoff task?

2013-12-03 Thread Tom van den Berge
aying hints to {}; aborting ({} delivered > > > > OR > > Finished hinted handoff of {} rows to endpoint {} > > > > Thanks > Rahul > > > On Tue, Dec 3, 2013 at 2:36 PM, Tom van den Berge wrote: > >> Hi, >> >> Is there a way to monitor the

How to measure data transfer between data centers?

2013-12-03 Thread Tom van den Berge
Is there a way to know how much data is transferred between two nodes, or more specifically, between two data centers? I'm especially interested in how much data is being replicated from one data center to another, to know how much of the available bandwidth is used. Thanks, Tom

How to monitor the progress of a HintedHandoff task?

2013-12-03 Thread Tom van den Berge
Hi, Is there a way to monitor the progress of a hinted handoff task? I found the following two mbeans providing some info: org.apache.cassandra.internal:type=HintedHandoff, which tells me that there is 1 active task, and org.apache.cassandra.db:type=HintedHandoffManager#countPendingHints(), whic

What is listEndpointsPendingHints?

2013-11-26 Thread Tom van den Berge
When I run the operation "listEndpointsPendingHints" on the mbean org.apache.cassandra.db:type=HintedHandoffManager, I'm getting ( 126879603237190600081737151857243914981 ) It suggests that there are pending hints, but the org.apache.cassandra.internal:type=HintedHandoff mbean provides these figu

Re: OOM while reading key cache

2013-11-13 Thread Tom van den Berge
I'm having the same problem, after upgrading from 1.2.3 to 1.2.10. I remember this being a bug that was fixed in version 1.0 or 1.1 some time ago, but apparently it is back. A workaround is to delete the contents of the saved_caches directory before starting up. Tom On Tue, Nov 12, 201

Re: filter using timeuuid column type

2013-11-05 Thread Tom van den Berge
This is because time2 is not part of the primary key. Only the primary key column(s) can be queried with > and <. Secondary indexes (like your timeuuid_test2_idx) can only be queried with the = operator. Maybe you can make time2 also part of your primary key? Good luck, Tom On Mon, Nov 4, 201

Re: Managing index tables

2013-11-05 Thread Tom van den Berge
Hi Thomas, I understand your concerns about ensuring the integrity of your data when having to maintain the indexes yourself. In some situations, using Cassandra's built-in secondary indexes is more efficient -- when many rows contain the indexed value. Maybe your permissions fall in this categ

Re: Check out if Cassandra ready

2013-11-01 Thread Tom van den Berge
I recommend using CassandraUnit (https://github.com/jsevellec/cassandra-unit). It makes using Cassandra in unit tests quite easy. It allows you to start an embedded Cassandra synchronously with a single simple method call, optionally load your schema and initial data, and you're ready to start tes

Re: Disappearing index data.

2013-10-09 Thread Tom van den Berge
e, which is > responsible for storing index data. > > MBean you should look for looks like this: > > > org.apache.cassandra.db:type=IndexColumnFamilies,keyspace=,columnfamily=. > > M. > > On 07.10.2013 15:22, Tom van den Berge wrote: > > On a 2-node cluster with replic

Re: Disappearing index data.

2013-10-07 Thread Tom van den Berge
the internal Cassandra's one, which is > responsible for storing index data. > > MBean you should look for looks like this: > > org.apache.cassandra.db:type=IndexColumnFamilies,keyspace=<KS>,columnfamily=. > > M. > > On 07.10.2013 15:22, Tom va

Disappearing index data.

2013-10-07 Thread Tom van den Berge
On a 2-node cluster with replication factor 2, I have a column family with an index on one of the columns. Every now and then, I notice that a lookup of the record through the index on node 1 produces the record, but the same lookup on node 2 does not! If I do a lookup by row key, the record is fo

HintedHandoff process does not finish

2013-09-27 Thread Tom van den Berge
Hi, On one of my nodes, the (storage) load increased dramatically (doubled), within one or two hours. The hints column family was causing the growth. I noticed one HintedHandoff process that was started some two hours ago, but hadn't finished. Normally, these processes take only a few seconds, 15

Re: is there a "no disk storage" mode ?

2011-12-01 Thread Tom van den Berge
Hi Dominique, I don't think there is a way to run cassandra without disk storage. But running it embedded can be very useful for unit testing. I'm using cassandra-unit (https://github.com/jsevellec/cassandra-unit) to integrate it in my tests. You don't need to configure any file paths; it wor