Since I was just fiddling around with sst2json: if you have row-level
deletes you might run into problems, since row-level deletion info is not
exported, at least in 1.0.
But if you're not using those you might be fine.
Віталій Тимчишин wrote:
I suppose the way is to convert all SST to json, then i
Hi all
we are running c* 1.0.8 and found some strange row-level tombstone problems.
Some rows (~50 in around 2B keys) have markedForDeleteAt timestamps in
the future (so they 'drop' all writes) and 0 values as localDeletionTime.
A non-thorough check didn't bring up any code paths that could lead to this.
Not sure if hints are written when requests time out but CL is reached.
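For sstable2json versions that do export row-level deletion info (unlike the 1.0 output discussed above), a quick sanity scan for such broken tombstones could look like the sketch below; the exact JSON layout ("metadata"/"deletionInfo") is an assumption about the export format:

```python
import json
import time

def find_future_tombstones(json_path, now_micros=None):
    """Scan sstable2json output for rows whose row-level tombstone
    (markedForDeleteAt) lies in the future, i.e. would 'drop' all writes.
    Assumed layout: {"<key>": {"metadata": {"deletionInfo":
    {"markedForDeleteAt": <micros>, "localDeletionTime": <secs>}}, ...}}"""
    now = now_micros if now_micros is not None else int(time.time() * 1_000_000)
    suspicious = []
    with open(json_path) as f:
        rows = json.load(f)
    for key, row in rows.items():
        info = row.get("metadata", {}).get("deletionInfo", {})
        if info.get("markedForDeleteAt", 0) > now:
            suspicious.append(key)
    return suspicious
```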
On Jan 23, 2012, at 6:47 PM, Daniel Doubleday wrote:
Your first thought was pretty much correct:
1. The node which is called by the client is the coordinator
2. The coordinator determines the nodes in the ring which can handle the
request, ordered by expected latency (via the snitch). The coordinator may or
may not be part of these nodes
3. Given the c
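Step 2 above can be sketched as a simple sort by snitch score (illustrative names only, not Cassandra's actual classes):

```python
def order_replicas_by_latency(replicas, scores):
    """Order candidate replica endpoints by expected latency, the way a
    dynamic-snitch-style coordinator would; lower score = expected faster.
    Endpoints without a score sort last."""
    return sorted(replicas, key=lambda node: scores.get(node, float("inf")))

# Hypothetical endpoints and latency scores:
order_replicas_by_latency(["10.0.0.1", "10.0.0.2", "10.0.0.3"],
                          {"10.0.0.2": 0.7, "10.0.0.1": 1.3})
```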
was heavier than the others, the only choice was to "scale up"
> the hardware.
>
> My understanding of Cassandra's current sharding is consistent and
> random. Does the new feature sit some where in-between? Are you
> thinking of a pluggable API so that you can provide
Allow for deterministic / manual sharding of rows.
Right now it seems that there is no way to force rows with different row keys
to be stored on the same nodes in the ring.
This is our number one reason why we get data inconsistencies when nodes fail.
Sometimes a logical transaction requires w
With leveled compaction this should work pretty nicely.
If you need fast access and want to use the row cache you will need to do some
further patching though.
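The deterministic-sharding idea above could be sketched client-side like this (a hypothetical scheme for illustration, not an existing Cassandra API): derive the placement token from a group prefix of the row key instead of the whole key, so all rows of one logical transaction hash to the same replicas.

```python
import hashlib

def placement_token(row_key):
    """Hash only the 'group' prefix of the key (here: everything up to the
    second colon), so e.g. 'order:42:item1' and 'order:42:item2' map to the
    same ring position. The key format is an assumption for illustration."""
    group = ":".join(row_key.split(":")[:2])
    return int(hashlib.md5(group.encode()).hexdigest(), 16)
```

Conceptually this is what composite partition keys later provide out of the box: rows sharing the partition part of the key are co-located on the same replicas.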
This is the early brainstorming phase, so any comments would be welcome
Cheers,
Daniel Doubleday
smeet.com
On Oct 31, 2011, at 7:08 PM, Ed
Sounds like this one:
http://comments.gmane.org/gmane.comp.db.cassandra.user/15828
or
http://comments.gmane.org/gmane.comp.db.cassandra.user/15936
Hope you have a backup. That would make your life much easier ...
On Jul 21, 2011, at 4:54 PM, cbert...@libero.it wrote:
> Hi all,
> I can't get
http://permalink.gmane.org/gmane.comp.db.cassandra.user/14225
but given
https://issues.apache.org/jira/browse/CASSANDRA-2868
and me thinking two seconds longer: I guess it was the leaked native memory from
the GC inspector that had been swapped out.
(I didn't believe that mlockall is broken but at that
When using JNA, the mlockall call will result in all pages locked in RSS and
thus reported there, so you have either configured -Xms650M or you are running
on a small box and the start script calculated it for you.
Also, our experience shows that the JNA call does not prevent swapping, so the
gener
res, but the OS can page it out
> instead of killing the process.
>
> On Mon, Jul 4, 2011 at 5:52 AM, Daniel Doubleday
> wrote:
>> Hi all,
>> we have a mem problem with cassandra. res goes up without bounds (well until
>> the os kills the process because we dont have
've switched it to Sun and this part of the issue stabilized. The other
> issues we had were Heap going through the roof and then OOM under load.
>
>
> On Mon, Jul 4, 2011 at 11:01 AM, Daniel Doubleday
> wrote:
> Just to make sure:
> You were seeing that res mem was mor
score 13723 or a child
On Jul 4, 2011, at 2:42 PM, Jonathan Ellis wrote:
> mmap'd data will be attributed to res, but the OS can page it out
> instead of killing the process.
>
> On Mon, Jul 4, 2011 at 5:52 AM, Daniel Doubleday
> wrote:
>> Hi all,
>> we have a
stabilized. The other
> issues we had were Heap going through the roof and then OOM under load.
>
>
> On Mon, Jul 4, 2011 at 11:01 AM, Daniel Doubleday
> wrote:
> Just to make sure:
> You were seeing that res mem was more than twice of max java heap and that
> did c
eads, writes?
>
> SC
>
> On Mon, Jul 4, 2011 at 6:52 AM, Daniel Doubleday
> wrote:
> Hi all,
>
> we have a mem problem with cassandra. res goes up without bounds (well until
> the os kills the process because we dont have swap)
>
> I found a thread that
Just to make sure:
The yaml doesn't matter. The cache config is stored in the system tables. It's
the "CREATE ... WITH ..." stuff you did via cassandra-cli to create the CF.
In JConsole, do you see that the cache capacity is > 0?
On Jul 4, 2011, at 11:18 AM, Shay Assulin wrote:
> Hi,
>
> The row c
Hi all,
we have a mem problem with cassandra. res goes up without bounds (well, until
the os kills the process because we don't have swap)
I found a thread that's about the same problem but on OpenJDK:
http://cassandra-user-incubator-apache-org.3065146.n2.nabble.com/Very-high-memory-utilization-n
Hi all - or rather devs
we have been working on an alternative implementation to the existing row
cache(s)
We have 2 main goals:
- Decrease memory -> get more rows in the cache without suffering a huge
performance penalty
- Reduce gc pressure
This sounds a lot like we should be using the new
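The first goal can be illustrated with a toy sketch (the actual implementation is not shown in this thread): storing rows as serialized byte blobs means the GC sees one opaque object per entry instead of a whole column graph, at the cost of deserializing on every hit.

```python
import pickle

class SerializingRowCache:
    """Illustrative only: cache rows as serialized byte blobs to cut the
    per-entry object count (and thus GC pressure), trading CPU on access."""
    def __init__(self):
        self._blobs = {}

    def put(self, key, row):
        # One bytes object per entry, regardless of how complex the row is.
        self._blobs[key] = pickle.dumps(row)

    def get(self, key):
        blob = self._blobs.get(key)
        return pickle.loads(blob) if blob is not None else None
```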
Here's my understanding of things ... (this applies only for the regular heap
implementation of row cache)
> Why does Cassandra not cache a row that was requested a few times?
What does the cache capacity read? Is it > 0?
> What does the ReadCount attribute in ColumnFamilies indicate, and why it rema
Hi all
now that JRockit is available for free, and given the claims that it has
better performance and GC, I wanted to know if anybody out here has done any
testing / benchmarking yet.
Also interested in deterministic GC ... maybe it's worth the 300 bucks?
Cheers,
Daniel
I can't remember. Easiest way is to configure it to
listen only on localhost and restart.
Thirdly, does anyone know if the problem is contagious, i.e. should I
consider decommissioning the whole node and try to rebuild from replicas?
No. That should not be necessary
Good luck
Thank
We are having problems with repair too.
It sounds like yours are the same. From today:
http://permalink.gmane.org/gmane.comp.db.cassandra.user/16619
On May 25, 2011, at 4:52 PM, Dominic Williams wrote:
> Hi,
>
> I've got a strange problem, where the database on a node has inflated 10X
> after
ing.
So I guess my next repair will be scheduled in 0.8.1.
But I don't understand why this did not hit others hard enough for it to be
considered more critical.
We seem to use cassandra in unusual ways.
Thanks again.
Daniel
On May 24, 2011, at 9:05 PM, Daniel Doubleday wrote:
> Ok th
this book
Daniel
If you're interested, here's the log: http://dl.dropbox.com/u/5096376/system.log.gz
I also lied about total size of one node. It wasn't 320 but 280. All nodes
On May 24, 2011, at 3:41 PM, Sylvain Lebresne wrote:
> On Tue, May 24, 2011 at 12:40 AM, Daniel Doubled
We are performing the repair on one node only. Other nodes receive reasonable
amounts of data (~500MB). It's only the repairing node itself which
'explodes'.
I must admit that I'm a noob when it comes to AES/repair. It's just strange that
a cluster that is up and running with no probs is doing
ds of data for that CF from the other nodes.
Sigh...
On May 23, 2011, at 7:48 PM, Sylvain Lebresne wrote:
> On Mon, May 23, 2011 at 7:17 PM, Daniel Doubleday
> wrote:
Hi all
I'm a bit lost: I tried a repair yesterday with only one CF and that didn't
really work the way I expected but I thought that would be a bug which only
affects that special case.
So I tried again for all CFs.
I started with a nicely compacted machine with around 320GB of load. Total dis
Hi all
I was wondering if there might be some way to better communicate known issues.
We do try to track jira issues but at times some slip through or we miss
implications.
Things like the broken repair of specific CFs.
(https://issues.apache.org/jira/browse/CASSANDRA-2670). I know that this
Hi all
was wondering if there's anybody here planning to go to the Berlin Buzzwords
and attend the cassandra hackathon.
I'm still indecisive but it might be good to have the chance to talk about
experiences in more detail.
Cheers,
Daniel
Hi all
after upgrading to 0.7 we have a small problem with dynamic snitch:
we have rf=3, quorum read/write and read repair prop set to 0. Thus cassandra
always shortcuts reads to only 2 hosts.
Problem is that one of our nodes gets ignored unless we use a little patch and
initialize the scores.
Thanks - yes I agree. Didn't want to judge solely based on this figure.
It should just add to the picture. But since we know access patterns and other
stats like key and row cache hit ratios, we hope to be able to make a more
educated guess about what's going on.
On May 13, 2011, at 9:08 AM, Peter Sch
Hi all
got a question for folks with some code insight again.
To be able to better understand where our IO load is coming from we want to
monitor the number of bytes read from disc per cf. (we love stats)
What I have done is wrapping the FileDataInput in SSTableReader to sum the
bytes read in
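The wrapping approach described here can be sketched like so (illustrative names, not Cassandra's actual FileDataInput API):

```python
import io

class CountingReader:
    """Wrap a file-like object and add every chunk read to a per-CF byte
    counter: the same idea as wrapping the sstable reader's input stream."""
    def __init__(self, raw, stats, cf_name):
        self._raw = raw
        self._stats = stats
        self._cf = cf_name

    def read(self, n=-1):
        data = self._raw.read(n)
        self._stats[self._cf] = self._stats.get(self._cf, 0) + len(data)
        return data

# Usage with an in-memory stand-in for an sstable file:
stats = {}
reader = CountingReader(io.BytesIO(b"abcdef"), stats, "MyCF")
reader.read(4)
reader.read()
# stats["MyCF"] is now 6
```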
it's the root cause of my
> problems, something something encoding error, but that doesn't really help
> me. :-)
>
> However, I've done all my tests with 0.7.5, I'm gonna try them again with
> 0.7.4, just to see how that version reacts.
>
>
> /Henrik
>
ssandra Developer
>> @aaronmorton
>> http://www.thelastpickle.com
>>
>> On 5 May 2011, at 22:36, aaron morton wrote:
>>
>>> Interesting but as we are dealing with keys it should not matter as they
>>> are treated as byte buffers.
>>>
>&
This is a bit of a wild guess but Windows and encoding and 0.7.5 sounds like
https://issues.apache.org/jira/browse/CASSANDRA-2367
On May 3, 2011, at 5:15 PM, Henrik Schröder wrote:
> Hey everyone,
>
> We did some tests before upgrading our Cassandra cluster from 0.6 to 0.7,
> just to make su
apparently happened during compaction was:
1. read the sstable and generate rows in string-based order
2. write the new file based on that order
3. read the compacted file assuming raw-bytes order -> crash
That bug never made it to production so we are fine.
On Apr 29, 2011, at 10:32 AM, Daniel Doubleday wr
What we are about to set up is a time machine like backup. This is more like an
add on to the s3 backup.
Our boxes have an additional larger drive for local backup. We create a new
backup snaphot every x hours which hardlinks the files in the previous snapshot
(bit like cassandras incremental_b
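A minimal sketch of that hardlink snapshot scheme (flat directory, files only; our actual script and naming convention are not shown here):

```python
import os

def hardlink_snapshot(src_dir, snap_dir):
    """Create snap_dir and hardlink every regular file from src_dir into it.
    Hardlinked snapshots cost almost no extra disk space: sstables are
    immutable, so unchanged files share their blocks with every snapshot."""
    os.makedirs(snap_dir)
    for name in os.listdir(src_dir):
        src = os.path.join(src_dir, name)
        if os.path.isfile(src):
            os.link(src, os.path.join(snap_dir, name))
```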
Bad == Broken
That means you cannot rely on 1 == 1. In such a scenario everything can happen
including data loss.
That's why you want ECC mem on production servers. Our cheapo dev boxes don't.
On Apr 28, 2011, at 7:46 PM, mcasandra wrote:
> What do you mean by Bad memory? Is it less heap size,
Hi all
on one of our dev machines we ran into this:
INFO [CompactionExecutor:1] 2011-04-28 15:07:35,174 SSTableWriter.java (line
108) Last written key : DecoratedKey(12707736894140473154801792860916528374,
74657374)
INFO [CompactionExecutor:1] 2011-04-28 15:07:35,174 SSTableWriter.java (line
FWIW: For whatever reason jna memlockall does not work for us. jna call is
successful but cassandra process swaps anyway.
see: http://www.mail-archive.com/user@cassandra.apache.org/msg11235.html
We disabled swap entirely.
On Mar 22, 2011, at 8:56 PM, Chris Goffinet wrote:
> The easiest way to
On Mar 22, 2011, at 5:09 AM, aaron morton wrote:
> 1) You should use nodes with the same capacity (CPU, RAM, HDD), cassandra
> assumes they are all equal.
Care to elaborate? While equal nodes will certainly make life easier, I would
have thought that the dynamic snitch would take care of performan
At least if you are using RackUnawareStrategy
Cheers,
Daniel
On Mar 15, 2011, at 6:44 PM, Huy Le wrote:
> Hi,
>
> We have a cluster with 12 servers and use RF=3. When running nodetool
> repair, do we have to run it on all nodes on the cluster or can we run on
> every 3rd node? Thanks!
>
>
Hi all
strange things here: we are using jna. Log file says mlockall was successful.
We start with -Xms2000M -Xmx2000M and run cassandra as root process so
RLIMIT_MEMLOCK limit should have no relevance. Still cassandra is swapping ...
Used swap varies between 100MB - 800MB
We removed the swap
Hi all,
on 0.6:
we are facing increased write latencies every now and then when an unfortunate
write command thread becomes the flush writer for a mem table because of an
already running mem table flush.
I was thinking of setting the work queue in CFS.flushWriterPool to
new LinkedBlockingQu
Hi all
we are still on 0.6.9 and plan to upgrade to 0.6.12 but are a little concerned
about:
https://issues.apache.org/jira/browse/CASSANDRA-2170
I thought of upgrading only one node (of 5) to .12 and monitor for a couple of
days.
Is this a bad idea?
Thanks,
Daniel
cluster load is lower.
For the time being I guess that's good enough, and we hope that 0.7 works a
little smoother when doing repairs.
Cheers,
Daniel
On Mar 7, 2011, at 7:22 PM, Jonathan Ellis wrote:
> On Mon, Mar 7, 2011 at 11:18 AM, Daniel Doubleday
> wrote:
>> Since we alre
Hi all
we're still on 0.6 and are facing problems with repairs.
I.e. a repair for one CF takes around 60h and we have to do that twice (RF=3, 5
nodes). During that time the cluster is under pretty heavy IO load. It kinda
works, but during peak times we see lots of dropped messages (including wr
It depends a little on your write pattern:
- Wide rows tend to get distributed over more sstables, so more disk reads are
necessary. This will become noticeable when you have high IO load and reads
actually hit the discs.
- If you delete a lot, slice query performance might suffer: extreme example
; brendan.po...@new-law.co.uk
> 029 2078 4283
> www.new-law.co.uk
>
>
> From: Daniel Doubleday [mailto:daniel.double...@gmx.net]
> Sent: 03 February 2011 17:21
> To: user@cassandra.apache.org
> Subject: Re: Using Cassandra to store files
&g
Hundreds of thousands doesn't sound too bad. Good old NFS would do with an ok
directory structure.
We are doing this. Our documents are pretty small though (a few kb). We have
around 40M right now with around 300GB total.
Generally the problem is that much data usually means that cassandra beco
We use zabbix and cassandra like so:
http://www.mail-archive.com/user@cassandra.apache.org/msg08100.html
Daniel Doubleday,
smeet.com
On Jan 9, 2011, at 1:09 AM, ruslan usifov wrote:
> Zapcat is a simple bridge between JMX and zabbix protocol, and it imho
> doesn't allow collec
/ tried something in that direction.
Cheers,
Daniel Doubleday
smeet.com, Berlin
Hi all
wanted to share a cassandra usage pattern you might want to avoid (if you can).
The combinations of
- heavy rows,
- large volume and
- many updates (overwriting columns)
will lead to a higher count of live ssts (at least if you're not starting major
compactions a lot) with many ssts ac
You will lose part of the retry / fallback functionality offered by hector.
The job of the client lib is not only load-balancing. I.e. if a node is
bootstrapping it will accept TCP connections but throw an exception which will
be communicated via thrift. The client lib is supposed to handle tha
On 19.12.10 03:05, Wayne wrote:
Rereading through everything again I am starting to wonder if the page
cache is being affected by compaction.
Oh yes ...
http://chbits.blogspot.com/2010/06/lucene-and-fadvisemadvise.html
https://issues.apache.org/jira/browse/CASSANDRA-1470
We have been heavily
serve up 15+TB of data. Based on what we have seen we need 100 Cassandra
> > nodes with rf=3 to give us good read latency (by keeping the node data sizes
> > down). The cost/value equation just does not add up.
> >
> > Thanks in advance for any advice/experience you can provi
> the purpose of your thread is: How far are you away from being I/O
> bound (say in terms of % utilization - last column of iostat -x 1 -
> assuming you don't have a massive RAID underneath the block device)
No, my cheap boss didn't want to buy me a stack of these
http://www.ocztechnology.com/prod
How / what are you monitoring? Best practices, anyone?
Cheers,
Daniel Doubleday,
smeet.com, Berlin
On Dec 16, 2010, at 11:35 PM, Wayne wrote:
> I have read that read latency goes up with the total data size, but to what
> degree should we expect a degradation in performance? What is the "normal"
> read latency range if there is such a thing for a small slice of scol/cols?
> Can we really pu
...
Thanks,
Daniel
smeet.com, Berlin
> On Tue, Dec 14, 2010 at 1:55 PM, Daniel Doubleday
> wrote:
Hi
I'm sorry - don't want to be a pain in the neck with source questions. So
please just ignore me if this is stupid:
Isn't org.apache.cassandra.service.ReadResponseResolver supposed to throw a
DigestMismatchException if it receives a digest which does not match the digest
of a read message?
If
On Dec 14, 2010, at 2:29 AM, Brandon Williams wrote:
> On Mon, Dec 13, 2010 at 6:43 PM, Daniel Doubleday
> wrote:
> Oh - well but I see that the coordinator is actually using its own score for
> ordering. I was only concerned that dropped messages are ignored when
> calculatin
On 13.12.10 21:15, Brandon Williams wrote:
On Sun, Dec 12, 2010 at 10:49 AM, Daniel Doubleday wrote:
Hi again.
It would be great if someone could comment whether the following
is true or not.
I tried to understand the consequences of
Hi Peter
I should have started with the why instead of what ...
Background Info (I try to be brief ...)
We have a very small production cluster (started with 3 nodes, now we have 5).
Most of our data is currently in mysql but we want to slowly move the larger
tables which are killing our mysql
Hi again.
It would be great if someone could comment whether the following is true
or not.
I tried to understand the consequences of using
-Dcassandra.dynamic_snitch=true for the read path, and that's what I
came up with:
1) If using CL > 1 then using the dynamic snitch will result in a dat
Thanks for your help Peter.
We gave up and rolled back to our mysql implementation (we did all writes to
our old store in parallel so we did not lose anything).
Problem was that every solution we came up with would require at least one major
compaction before the new nodes could join, and our clus
Hi good people.
I underestimated load during peak times and now I'm stuck with our production
cluster.
Right now its 3 nodes, rf 3 so everything is everywhere. We have ~300GB data
load. ~10MB/sec incoming traffic and ~50 (peak) reads/sec to the cluster
The problem derives from our quorum read
What version of Cassandra
is this?
On Fri, Dec 3, 2010 at 7:27 PM, Daniel Doubleday
wrote:
Yes.
I thought that would make sense, no? I guessed that the quorum read forces
the slowest of the 3 nodes to keep the pace of the faster ones. But it can't.
No matter how small the performance diff is. So it will
rs shooting their own foot as I did.
On 03.12.10 23:36, Jonathan Ellis wrote:
Am I understanding correctly that you had all connections going to one
cassandra node, which caused one of the *other* nodes to die, and
spreading the connections around the cluster fixed it?
On Fri, Dec 3, 2010 at 4:00 AM, D
imary node or the node after the primary
(when the primary was located in the switched off dc)
Daniel Doubleday
smeet.com, Berlin
On Dec 2, 2010, at 6:11 PM, Jonathan Ellis wrote:
> On Thu, Dec 2, 2010 at 4:08 AM, Jake Maizel wrote:
>> Hello,
>>
>> We have a ring of 1
f all rows on
one node is enough. But the same thing will probably happen if you scan by
continuous tokens (meaning that you will read from the same node for a long time).
Cheers,
Daniel Doubleday
smeet.com, Berlin
exist this ratio will always
show 1.0.
Meaning it is rather a measure of how many of your queries ask for non-existing
values.
Cheers,
Daniel
On Oct 28, 2010, at 1:10 PM, Daniel Doubleday wrote:
> Hi Ryan
>
> I took a sample of one sstable (just flushed, not compacted).
>
&g
file size: 110730565 bytes
rows: 47432
FILTER FILE
file size: 96565 bytes
bloom filter bitset size: 771904
bloom filter bitset cardinality: 354610
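From figures like these, the expected false-positive ratio can be estimated as (fill ratio)^k, where k is the number of hash functions; k is not shown in the output above, so the value used below is an assumption.

```python
def estimated_fp_rate(bits_set, bitset_size, num_hashes):
    """Approximate Bloom filter false-positive probability: the chance that
    all num_hashes probed bits happen to be set already."""
    fill_ratio = bits_set / bitset_size
    return fill_ratio ** num_hashes

# With the numbers above and an assumed k of 7:
estimated_fp_rate(354610, 771904, 7)  # roughly 0.004
```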
On Oct 27, 2010, at 6:41 PM, Ryan King wrote:
> On Wed, Oct 27, 2010 at 3:24 AM, Daniel Doubleday
> wrote:
>> Hi people
>>
>>
l data model it's not unlikely that this sort of
skew exists since you'd tend to query for items towards the root of
the hierarchy more frequently.
Mike
On Wed, Oct 27, 2010 at 2:14 PM, Daniel Doubleday wrote:
Hm -
not sure if I u
gt; couple outlier rows causing the false positives that are being queried
> over and over then that could just be the luck of the draw.
>
> On Wed, Oct 27, 2010 at 5:24 AM, Daniel Doubleday
> wrote:
>> Hi people
>>
>> We are currently moving our second use case from mysq
Hi people
We are currently moving our second use case from mysql to cassandra. While
importing the data (ongoing) I noticed that the BloomFilterFalseRatio seems to
be pretty high compared to another CF which is in use in production right now.
It's a hierarchical data model and I cannot avoid t
Hi all,
just wanted to make sure that I get this right:
Does this mean that I only have to schedule repairs on every RF-th node?
So with 4 nodes and RF=2 I would repair nodes 1 and 3
and with 6 nodes and RF=3 I would repair nodes 1 and 4
and that would lead to a synched cluster?
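The arithmetic in those examples can be sketched as follows; this assumes contiguous replica placement, SimpleStrategy-style, where each range is replicated onto the next RF-1 nodes:

```python
def nodes_to_repair(num_nodes, rf):
    """Pick every RF-th node (1-based). A repair on a node covers all ranges
    that node replicates, so together these nodes cover the whole ring."""
    return [n for n in range(1, num_nodes + 1) if (n - 1) % rf == 0]

nodes_to_repair(4, 2)  # [1, 3]
nodes_to_repair(6, 3)  # [1, 4]
```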
> On Thu, Jul 15
Hi people
I was wondering if anyone already benchmarked such a situation:
I have:
day of year (row key) -> SomeId (column key) -> byte[0]
I need to make sure that I write SomeId, but in around 80% of the cases it will
be already present (so I would essentially replace it with itself). RF will