Re: Restarting nodes and reported load

2017-05-30 Thread tommaso barbugli
-4TB per node, and by load rises, I'm talking about load as
>> reported by nodetool status.
>>
>> On May 30 2017, at 10:25 am, daemeon reiydelle wrote:
>>
>> When you say "the load rises ...", could you clarify what you mean b

Re: Restarting nodes and reported load

2017-05-30 Thread tommaso barbugli
ood >> marker to decide on whether to increase disk space vs provisioning a new
>> node?
>>
>> On May 29 2017, at 9:35 am, tommaso barbugli wrote:
>>
>> Hi Daniel,
>>
>> This is not normal. Possibly a capacity problem. Whats the RF, how much

Re: Restarting nodes and reported load

2017-05-29 Thread tommaso barbugli
Hi Daniel, this is not normal; possibly a capacity problem. What's the RF, how much data do you store per node, and what kind of servers do you use (core count, RAM, disk, ...)? Cheers, Tommaso
On Mon, May 29, 2017 at 6:22 PM, Daniel Steuernol wrote:
> I am running a 6 node cluster, and I have

Re: Splitting Cassandra Cluster between AWS availability zones

2017-03-07 Thread tommaso barbugli
Hi Richard, it depends on the snitch and the replication strategy in use. Here's a link to a blog post about how we deployed C* across 3 AZs: http://highscalability.com/blog/2016/8/1/how-to-setup-a-highly-available-multi-az-cassandra-cluster-o.html Best, Tommaso
On Mar 7, 2017 18:05, "Ney, Richard"

Re: Practical limit on number of column families

2016-03-01 Thread tommaso barbugli
Hi Fernando, I used to have a cluster with ~300 tables (1 keyspace) on C* 2.0, and it was a real pain in terms of operations. Repairs were terribly slow, C* startup slowed down, and in general tracking table metrics becomes a bit more work. Why do you need such a high number of tables? Tommaso
On Tue, Ma

scylladb

2015-11-05 Thread tommaso barbugli
Hi guys, has anyone already tried ScyllaDB (yet another "fastest NoSQL database in town") and have some thoughts / hands-on experience to share? Cheers, Tommaso

Re: Datastax EC2 Ami

2015-04-27 Thread tommaso barbugli
Hi, I would remove the node and start a new one. You can pick a specific Cassandra release using user data (e.g. --release 2.0.11). Cheers, Tommaso
On Mon, Apr 27, 2015 at 8:53 PM, Eduardo Cusa <eduardo.c...@usmediaconsulting.com> wrote:
> Hi Guys, we start our cassandra cluster with the followi

Re: alter table issues on 2.0.10

2014-10-12 Thread tommaso barbugli
Sun, Oct 12, 2014 at 10:53 PM, tommaso barbugli wrote:
>> Hi,
>> it actually seems to be worse than what I thought; I get an exception in
>> cassandra logs every time I try to create a new table.
>>
>> Cql query:
>> CREATE TABLE shard12 ("feed_id&

Re: alter table issues on 2.0.10

2014-10-12 Thread tommaso barbugli
'sstable_size_in_mb': 64, 'class': 'LeveledCompactionStrategy'};
This is the error:
java.lang.RuntimeException: java.util.concurrent.ExecutionException: java.lang.RuntimeException: org.apache.cassandra.exceptions.ConfigurationException: comparators do not match or are not compatib

alter table issues on 2.0.10

2014-10-12 Thread tommaso barbugli
Hi, I am seeing errors every time I make a schema migration of this kind on Cassandra 2.0.10:
ALTER TABLE notifications add "unread_ids" set static
Oddly enough, DESCRIBE COLUMNFAMILY notifications; shows that the column unread_ids is created despite the error. Any idea if this is an actual bug or
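
A minimal sketch of the migration being described, for reference; the set's element type was swallowed by the archive, so set<ascii> below is only an assumption:

  ALTER TABLE notifications ADD "unread_ids" set<ascii> static;   -- assumed element type
  DESCRIBE COLUMNFAMILY notifications;                            -- cqlsh command used to check whether the column exists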

unreadable partitions

2014-09-28 Thread tommaso barbugli
Hi, I see some data stored in Cassandra (2.0.7) that is not readable from CQL; this affects entire partitions, and querying these partitions raises a Java exception:
ERROR [ReadStage:540638] 2014-09-28 12:40:38,992 CassandraDaemon.java (line 198) Exception in thread Thread[ReadStage:540638,5,main] java.l

update static column using partition key

2014-09-07 Thread tommaso barbugli
Hi, I am trying to use a couple of static columns; I am using Cassandra 2.0.7, and when I try to set a value using the partition key only, I get a "primary key incomplete" error. Here is the schema and the query with the error I get from cqlsh:
CREATE TABLE shard75 ( group_id ascii, event_id time
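
For reference, a hedged sketch of what a static-column write keyed only by the partition key is expected to look like; the original schema is cut off above, so the columns and types below are hypothetical:

  CREATE TABLE shard75 (
      group_id   ascii,
      event_id   timeuuid,
      group_name text static,                  -- hypothetical static column
      payload    text,
      PRIMARY KEY (group_id, event_id)
  );

  -- Static columns can be written with only the partition key in the WHERE clause:
  UPDATE shard75 SET group_name = 'example' WHERE group_id = 'g1';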

Re: disk space and tombstones

2014-08-18 Thread tommaso barbugli
2014-08-18 13:25 GMT+02:00 clslrns:
> That scheme assumes we have to read counter value before write something to
> the timeline. This is what we try to avoid as an anti-pattern.
You can work around reading the counter before writing, but I agree that it would be much better if disk space was reclaim

Re: disk space and tombstones

2014-08-18 Thread tommaso barbugli
What about timeline versioning? Every time a timeline has more than X columns, you bump its version (which should be part of its row key) and start writing to that one (though this will make your app substantially more complex). AFAIK reclaiming disk space for deleted rows is far easier than rec

Re: disk space and tombstones

2014-08-18 Thread tommaso barbugli
I was in exactly your situation; I could only reclaim disk space for trimmed data this way: very low gc_grace + size-tiered compaction + slice timestamp deletes + major compaction.
2014-08-18 12:06 GMT+02:00 Rahul Neelakantan:
> Is that GC_grace 300 days?
>
> Rahul Neelakantan
>
> On Aug
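
A sketch of how the per-table parts of that recipe can be expressed in CQL; the table name and values are hypothetical, and the final major compaction is a nodetool operation run on each node, not shown here:

  ALTER TABLE timeline
    WITH gc_grace_seconds = 3600                                  -- very low gc_grace (example value)
    AND compaction = {'class': 'SizeTieredCompactionStrategy'};   -- size-tiered compaction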

aggregating data in cassandra

2014-08-09 Thread tommaso barbugli
Hi everyone, I am a bit stuck with my data model on Cassandra. What I am trying to do is to be able to retrieve rows in groups, something similar to SQL's GROUP BY but that only needs to work on one attribute. I am keeping data grouped together in a different CF (e.g. GROUP BY x has its own CF groupby_x),
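
A rough sketch of the kind of per-attribute grouping table being described; the names and types below are hypothetical:

  -- "GROUP BY x" gets its own table, with the grouping attribute as the partition key:
  CREATE TABLE groupby_x (
      x       ascii,
      item_id timeuuid,
      payload text,
      PRIMARY KEY (x, item_id)
  );

  -- All rows for a given group can then be read from a single partition:
  SELECT * FROM groupby_x WHERE x = 'some-value';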

Re: estimated row count for a pk range

2014-07-21 Thread tommaso barbugli
you to manage the counting manually
> 2) SELECT DISTINCT partitionKey FROM  Normally this query is
> optimized and is much faster than a SELECT *. However if you have a very
> big number of distinct partitions it can be slow
>
> On Sun, Jul 20, 2014 at 6:48 PM, tommaso barbu
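
For reference, a sketch of the second option mentioned above; the table name was swallowed by the archive, so the table and column names used here are assumptions:

  -- Returns one row per partition key; cheaper than SELECT *, but still a
  -- cluster-wide scan, so it can be slow with very many distinct partitions:
  SELECT DISTINCT group_id FROM shard_events;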

estimated row count for a pk range

2014-07-20 Thread tommaso barbugli
Hello, lately I collapsed several (around 1k) column families into a bunch (100) of column families. To keep data separated I have added an extra column (family) which is part of the PK. While the previous approach allowed me to always have a clear picture of every column family's size, now I have no ot

order by on different columns

2014-07-15 Thread tommaso barbugli
Hi, we need to retrieve the data stored in Cassandra in something other than its "natural" order; we are looking for possible ways to sort results from a column family based on columns that are not part of the primary key. Is denormalizing the data into another column family the only option
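
A sketch of the usual denormalization answer, using hypothetical tables: the same data is written to a second table whose clustering key is the column to sort by.

  -- Original table, ordered by its "natural" clustering key:
  CREATE TABLE items_by_id (
      bucket  ascii,
      item_id timeuuid,
      score   int,
      PRIMARY KEY (bucket, item_id)
  );

  -- Denormalized copy, clustered (and therefore ordered) by score instead:
  CREATE TABLE items_by_score (
      bucket  ascii,
      score   int,
      item_id timeuuid,
      PRIMARY KEY (bucket, score, item_id)
  ) WITH CLUSTERING ORDER BY (score DESC, item_id ASC);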

Re: keyspace with hundreds of columnfamilies

2014-07-12 Thread tommaso barbugli
st scared that I will get into some bad situation when 1k CFs grow to 5 or 10k
>
> -- Jack Krupansky
>
> *From:* tommaso barbugli
> *Sent:* Saturday, July 12, 2014 7:58 AM
> *To:* user@cassandra.apache.org
> *Subject:* Re: keyspace with hundreds

Re: keyspace with hundreds of columnfamilies

2014-07-12 Thread tommaso barbugli
n Sat, Jul 5, 2014 at 12:32 PM, tommaso barbugli wrote:
>> Yes, my question was about CQL-style columns.
>>
>> 2014-07-04 12:40 GMT+02:00 Jens Rantil:
>> Just so you guys aren't misunderstanding each other; Tommaso, you were

Re: keyspace with hundreds of columnfamilies

2014-07-05 Thread tommaso barbugli
; > romain.hardo...@urssaf.fr> wrote:
>> Cassandra can handle many more columns (e.g. time series).
>> So 100 columns is OK.
>>
>> Best,
>> Romain
>>
>> tommaso barbugli wrote on 03/07/2014 21:55:18:

Re: keyspace with hundreds of columnfamilies

2014-07-03 Thread tommaso barbugli
ly straightforward to allow disabling
> the SlabAllocator.” Emphasis on “almost certainly a Bad Idea.”
>
> See:
> https://issues.apache.org/jira/browse/CASSANDRA-5935
> “Allow disabling slab allocation”
>
> IOW, this is considered an anti-pattern, but...
>
> -- Jack Krupan

Re: keyspace with hundreds of columnfamilies

2014-07-02 Thread tommaso barbugli
ot intended to be tweaked so it might not be a good idea to
> change it.
>
> Best,
> Romain
>
> tommaso barbugli wrote on 02/07/2014 17:40:18:
> > From: tommaso barbugli
> > To: user@cassandra.apache.org
> > Date: 02/07/2014 17:40
> > Subject

Re: keyspace with hundreds of columnfamilies

2014-07-02 Thread tommaso barbugli
usands of CF it means
> thousands of mega bytes...
> Up to 1,000 CF I think it could be doable, but not 10,000.
>
> Best,
>
> Romain
>
> tommaso barbugli wrote on 02/07/2014 10:13:41:
> > From: tommaso barbugli
> > To: user@cassandra.apache.org

Re: keyspace with hundreds of columnfamilies

2014-07-02 Thread tommaso barbugli
ap issues. What is driving the
> requirement for so many CFs?
>
> On Jul 2, 2014, at 4:14 AM, tommaso barbugli wrote:
> > Hi,
> > Are there any known issues, shortcomings about organising data in
> > hundreds of column families?
> > At this present

keyspace with hundreds of columnfamilies

2014-07-02 Thread tommaso barbugli
Hi, are there any known issues or shortcomings with organising data into hundreds of column families? At present I am running with 300 column families, but I expect that to grow to a couple of thousand. Is this something discouraged / unsupported? (I am using Cassandra 2.0.) Thanks, Tommaso

Re: Questions about timestamp set at writetime

2014-06-17 Thread tommaso barbugli
t; -->
It is possible:
http://www.datastax.com/documentation/cql/3.1/cql/cql_reference/batch_r.html

On Tue, Jun 17, 2014 at 2:17 PM, tommaso barbugli wrote:
>> when inserting with a batch every row have the same timestamp; I also
>> think (not 100%)
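
Per the linked documentation, timestamps can be set on individual statements inside a batch; a minimal sketch, with a hypothetical table and values:

  CREATE TABLE logs (id int PRIMARY KEY, payload text);

  BEGIN BATCH
      INSERT INTO logs (id, payload) VALUES (1, 'first')  USING TIMESTAMP 1402990000000000;
      INSERT INTO logs (id, payload) VALUES (2, 'second') USING TIMESTAMP 1402990000000001;
  APPLY BATCH;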

Re: Questions about timestamp set at writetime

2014-06-17 Thread tommaso barbugli
When inserting with a batch every row has the same timestamp; I also think (not 100% sure) that it is not possible to define different timestamps within a batch. Tommaso
2014-06-17 14:10 GMT+02:00 DuyHai Doan:
> Hello all
>
> I know that at write time a timestamp is automatically generated by the

CQL IN query with 2i index

2014-06-14 Thread tommaso barbugli
Hi there, I was wondering if there is a good reason for SELECT queries on secondary indexes not to support any WHERE operator other than equality, or if it's just a missing feature in CQL. Thanks, Tommaso
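
For context, a sketch of the restriction being asked about, using a hypothetical table; equality on an indexed column is accepted, while other operators such as IN are rejected:

  CREATE TABLE users (
      id    uuid PRIMARY KEY,
      email text
  );
  CREATE INDEX ON users (email);

  SELECT * FROM users WHERE email = 'a@example.com';                       -- supported
  SELECT * FROM users WHERE email IN ('a@example.com', 'b@example.com');   -- rejected for 2i columns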

Re: Backup procedure

2014-05-02 Thread tommaso barbugli
it is done per single cell so unless one stores)
2014-05-02 19:01 GMT+02:00 Robert Coli:
> On Fri, May 2, 2014 at 2:07 AM, tommaso barbugli wrote:
>> If you are thinking about using Amazon S3 storage I wrote a tool that
>> performs snapshots and backups on multiple nodes.

Re: Backup procedure

2014-05-02 Thread tommaso barbugli
If you are thinking about using Amazon S3 storage, I wrote a tool that performs snapshots and backups across multiple nodes. Backups are stored compressed on S3: https://github.com/tbarbugli/cassandra_snapshotter Cheers, Tommaso
2014-05-02 10:42 GMT+02:00 Artur Kronenberg:
> Hi,
>
> we are running

Re: Cassandra data retention policy

2014-04-28 Thread tommaso barbugli
TTL is good for this, but I have no idea how you would ever be able to restore data removed from disk like that. Perhaps one could take a snapshot, then delete everything with a timestamp older than a given date, and then run compaction on every node to reclaim the disk.
2014-04-28 21:57 GMT+02:00 Donald
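
A sketch of the TTL approach being referred to; the table and the retention window below are hypothetical:

  -- Per-table default expiry:
  CREATE TABLE events (
      id      timeuuid PRIMARY KEY,
      payload text
  ) WITH default_time_to_live = 7776000;   -- 90 days, example retention window

  -- Or per write:
  INSERT INTO events (id, payload) VALUES (now(), 'example') USING TTL 7776000;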

Re: clearing tombstones?

2014-04-11 Thread tommaso barbugli
st checked,
>> and this CF has GCGraceSeconds of 10 days).
>>
>> On Fri, Apr 11, 2014 at 10:10 AM, tommaso barbugli wrote:
>>> compaction should take care of it; for me it never worked so I run
>>> nodetool compaction on every node

Re: clearing tombstones?

2014-04-11 Thread tommaso barbugli
Compaction should take care of it; for me it never worked, so I run nodetool compact on every node; that does it.
2014-04-11 16:05 GMT+02:00 William Oberman:
> I'm wondering what will clear tombstoned rows? nodetool cleanup, nodetool
> repair, or time (as in just wait)?
>
> I had a CF that w