You define the max size of your heap (-Xmx), but you do not define the max
size of your off-heap (MaxMetaspaceSize for JDK 8, MaxPermSize for JDK 7), so
you could occupy all of the memory on the instance. Your system killed the
process to preserve itself. You should also take into account that the
mem
Looks like you're connecting to a service listening on SSL, but you don't
have the CA it uses in your truststore.
On Thu, May 24, 2018 at 1:58 PM, Surbhi Gupta wrote:
> Getting below error:
>
> Caused by: sun.security.validator.ValidatorException: PKIX path building
> failed: sun.security.provider.ce
https://github.com/TargetHolding/pyspark-cassandra
On Mon, Jun 20, 2016 at 1:47 PM, Joaquin Alzola wrote:
> Hi List
>
> Is there a Spark Cassandra connector in python? Of course there is the one
> for scala ...
>
> BR
>
> Joaquin
A snapshot would flush your memtable to disk and you could stream your
sstables out. Incremental backups would be the differences that have
occurred since your last snapshot, as far as I'm aware. Since it's
fairly infeasible to constantly stream out full snapshots (depending on
the density of yo
Periodic snapshots + incremental backups are, I think, pretty good in terms
of restoring to a point in time. But you must manage cleaning up your
snapshots + incremental backups on your own. I believe that tablesnap
(https://github.com/JeremyGrosser/tablesnap) is a pretty decent approach in
terms of
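As a rough illustration of the periodic-snapshot-plus-cleanup idea (not tablesnap itself; the keyspace name is hypothetical, and it assumes nodetool is on the PATH and supports the -t tag flag):

    import datetime
    import subprocess

    KEYSPACE = "my_keyspace"   # hypothetical keyspace name

    def take_snapshot():
        # flushes memtables and hard-links the current sstables under a date tag
        tag = datetime.date.today().isoformat()
        subprocess.check_call(["nodetool", "snapshot", "-t", tag, KEYSPACE])
        return tag

    def clear_snapshot(tag):
        # drop an old snapshot once it has been shipped off the node
        subprocess.check_call(["nodetool", "clearsnapshot", "-t", tag, KEYSPACE])

    if __name__ == "__main__":
        print("created snapshot", take_snapshot())

Something like tablesnap then watches the data directories and uploads new files as they appear, which is what makes the incremental part practical.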
-archive.com/user@spark.apache.org/msg44793.html
More info on tuning shuffle behavior:
https://spark.apache.org/docs/1.5.1/configuration.html#shuffle-behavior
On Thu, Jun 16, 2016 at 1:57 PM, Cassa L wrote:
> Hi Dennis,
>
> On Wed, Jun 15, 2016 at 11:39 PM, Dennis Lovely wrote:
>
>
You could try tuning spark.shuffle.memoryFraction and
spark.storage.memoryFraction (both of which have been deprecated in 1.6),
but ultimately you need to find out where you are bottlenecked and address
that, as adjusting memoryFraction will only be a stopgap. Both shuffle and
storage memoryFractio
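For what it's worth, a minimal PySpark sketch of setting those knobs while you track down the real bottleneck (the values here are made up, and both settings are deprecated from 1.6 on):

    from pyspark import SparkConf, SparkContext

    conf = (SparkConf()
            .setAppName("shuffle-tuning-example")        # hypothetical app name
            .set("spark.shuffle.memoryFraction", "0.4")  # deprecated in 1.6+
            .set("spark.storage.memoryFraction", "0.4")) # deprecated in 1.6+
    sc = SparkContext(conf=conf)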
wait a few more days before I am
sure of that.
Kind regards,
Dennis
On 19.01.2016 at 19:39, Femi Anthony wrote:
So is the logging to Cassandra being done via Spark?
On Wed, Jan 13, 2016 at 7:17 AM, Dennis Birkholz <birkh...@pubgrade.com> wrote:
Hi all,
we Cassan
ld really appreciate it if someone could give me a hint on how to fix
this problem, thanks!
Greets,
Dennis
P.S.:
some information about our setup:
Cassandra 2.1.12 in a two Node configuration with replication factor=2
Spark 1.5.1
Cassandra Java Driver 2.2.0-rc3
Spark Cassandra Java Connector 2.10-1.5.0-M2
- "^org.apache.cassandra.metrics.ClientRequest.+"
- "^org.apache.cassandra.metrics.Storage.+"
- "^org.apache.cassandra.metrics.ThreadPools.+"
prefix: "servers.<%= hostname %>"
HTH,
-Dennis
On Wed, Dec 17, 2014 at 9:16 AM, Karl Rieb wrote:
described but I think there should be a little
more automation in it.
Thanks all,
Dennis
On 11.04.2014 at 21:11, Robert Coli wrote:
On Fri, Apr 11, 2014 at 1:21 AM, Dennis Schwan <dennis.sch...@1und1.de> wrote:
The archived commitlogs are copied to the restore directory and afte
we only see the data from the
snapshot, not the commitlogs.
Regards,
Dennis
P.S.: Cassandra 2.0.6
On 10.04.2014 at 23:17, Robert Coli wrote:
On Thu, Apr 10, 2014 at 1:19 AM, Dennis Schwan <dennis.sch...@1und1.de> wrote:
do you know any description how to perform a point-in-time re
Hey there,
do you know of any description of how to perform a point-in-time recovery
using the archived commitlogs?
We have already tried several things but it just did not work.
We have a 20-node cluster (10 in each DC).
Thanks in advance,
Dennis
--
Dennis Schwan
Oracle DBA
Mail Core
1&1
Hi Yuki,
thanks for your answer. I still do not know whether it is expected behaviour
that Cassandra tries to repair these 1280 ranges every time I run a
nodetool repair on every node.
Regards,
Dennis
On 03.11.2013 at 03:27, Yuki Morishita wrote:
Hi Dennis,
As you can see in the output,
[2013
do at all.
Thanks for your help!
Dennis
--
Dennis Schwan
Oracle DBA
Mail Core
1&1 Internet AG | Brauerstraße 48 | 76135 Karlsruhe | Germany
Phone: +49 721 91374-8738
E-Mail: dennis.sch...@1und1.de | Web: www.1und1.de
Hauptsitz Montabaur, Amtsgericht Montabaur, HRB 6484
Vorstand: Ralph Dom
>
> On Wednesday, 23 February 2011 at 8:38 PM, Matthew Dennis wrote:
>
> The map returned by multiget_slice (what I suspect is the underlying thrift
> call for getColumnsFromRows) is not a order preserving map, it's a HashMap
> s
The map returned by multiget_slice (what I suspect is the underlying thrift
call for getColumnsFromRows) is not an order-preserving map, it's a HashMap,
so the order of the returned results cannot be depended on. Even if it were
an order-preserving map, not all languages would be able to make use of t
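If the caller needs the rows back in request order, it has to re-impose that order itself; a tiny generic sketch of the idea (variable names are made up):

    # keys in the order the caller asked for them
    requested_keys = ["row3", "row1", "row2"]

    # result of a multiget-style call: an unordered mapping of key -> columns
    unordered = {"row1": {"col": "a"}, "row2": {"col": "b"}, "row3": {"col": "c"}}

    # re-impose the requested order on the client side
    ordered = [(k, unordered[k]) for k in requested_keys if k in unordered]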
Data is in Memtables from writes before they get flushed (based on the first
of the ops/size/time thresholds exceeded; all are configurable) to SSTables on
disk.
There is a keycache and a rowcache. The keycache caches offsets into
SSTables for the rows. The rowcache caches the entire row. There is also
+1 on avoiding OPP
On Wed, Feb 16, 2011 at 3:27 PM, Tyler Hobbs wrote:
>
> Thanks for you input, but we have a set key that consists of name:timestamp
>> that we are using.. and we need to also retrieve the oldest data as well..
>>
>
> Then you'll need to denormalize and store every row three wa
Assuming you aren't changing the RC, the normal bootstrap process takes care
of all the problems like that, making sure things work correctly.
Most importantly, if something fails (either the new node or any of the
existing nodes) you can recover from it.
Just don't connect clients directly to th
You have a single HAProxy node in front of the cluster or you have a HAProxy
node on each machine that is a client of Cassandra that points at all the
nodes in the cluster?
The former has a SPOF and bottleneck (the HAProxy instance), the latter does
not (and is somewhat common, especially for thin
tor node could have been avoided somehow.
> Does the write on the coordinator node (incase it is not part of the N
> replica nodes for that key) get deleted before response of the write is
> returned back to the client ?
>
>
> On Tue, Feb 15, 2011 at 4:40 PM, Matthew Dennis wrote:
1. Yes, the coordinator node propagates requests to the correct nodes.
2. Most (all?) higher-level clients (pycassa, hector, etc.) load balance for
you. In general your client and/or the caller of the client needs to catch
exceptions and retry. If you're using RRDNS and some of the nodes are
temp
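As an illustration of letting the client library spread load across nodes, a pycassa-style sketch (hostnames, keyspace and column family are hypothetical; exact constructor arguments may differ between versions):

    import pycassa

    # the pool round-robins over the listed servers and retries failed operations
    pool = pycassa.ConnectionPool("MyKeyspace",
                                  server_list=["cass1:9160", "cass2:9160", "cass3:9160"])
    users = pycassa.ColumnFamily(pool, "Users")
    users.insert("some_row_key", {"name": "example"})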
But you cannot depend on such behavior. If you do a write and you get an
unavailable exception, the only thing you know is that at that time it was not
able to be placed on all the nodes required to meet your CL. It may
eventually end up on all those nodes, it may not be on any of the nodes, or
at the
0.7.1 is what I would go with right now. It's likely you'll eventually have
to upgrade that as well, but moving to other 0.7.x releases should be fairly
painless. Most development is happening on the 0.7 releases, which already
have lots of fixes over the 0.6 series (not to mention performance
im
Regardless of increasing RF or not, RR happens based on the
read_repair_chance setting. RR happens after the request has been replied
to, though, so it's possible that if you increase the RF and then read,
the read might get stale/missing data. RR would then put the correct value
on all the co
No, it's actually worse to do that.
1) you're introducing single points of failure (your array).
2) you're introducing complexity and expense
3) you're introducing latency
4) you're introducing bottlenecks
5) some other reasons...
You do want your commit log on a separate disk though. The o
On Mon, Feb 14, 2011 at 6:58 PM, Dan Hendry wrote:
> > 1) If I insert a key and want to verify which node it went to then how do
> I
> > do that?
>
> I don't think you can and there should be no reason to care. Cassandra
> abstracts where data is being stored, think in terms of consistency levels
> > Write Latency: NaN ms.
> > Pending Tasks: 0
> > Key cache capacity: 20
> > Key cache size: 0
> > Key cache hit rate: NaN
> > Row cache: disabled
> > Compacted row minimum size: 0
> > Compacted row maximum size: 0
> > Compacted row mean size: 0
On Mon, Feb 14, 2011 at 2:54 PM, Robert Coli wrote:
> Regarding very large memtables, it is important to recognize that
> throughput refers only to the size of the COLUMN VALUES, and not, for
> example, their names.
>
That would be a bug in its own right. There are lots of use cases that
only
On Mon, Feb 14, 2011 at 6:28 PM, Aaron Morton wrote:
> Will take a closer look at the code tonight; perhaps we should return an
> error if you try to use Network Topology and it cannot detect any DCs.
>
>
+1
Is your ReplicationFactor (RF) really set to 0? Don't do that; it needs to
be at least 1 and probably needs to be 3 in production if you care about
your data. It must be greater than 0 and no more than the number of nodes in
your ring. It represents the number of nodes to copy/replicate data to.
An
nodes contain data for (prevTokenInRing, nodesOwnToken] (i.e. exclusive of the
previous token up to and including the node's own token). So .179 will contain
things that hash in the range (152896308109140433971537345591636551711,0]
and .12 will contain things that hash in range
(0,152896308109140433971537345
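In other words, a key's hash belongs to the first node token that is >= the hash, wrapping back to the lowest token at the top of the ring. A small sketch of that lookup (the tokens are just the two from the example above):

    def owner_token(key_hash, ring_tokens):
        # the first token >= the hash owns it ...
        for t in sorted(ring_tokens):
            if key_hash <= t:
                return t
        # ... wrapping around past the highest token
        return min(ring_tokens)

    tokens = [0, 152896308109140433971537345591636551711]
    print(owner_token(42, tokens))  # owned by the node with the large token (.12 above)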
You need to specify your initial tokens. LoadBalance really doesn't do a
good job of balancing the load. Take a look at "Load Balancing" in
http://wiki.apache.org/cassandra/Operations, which has a little Python script
to help you pick tokens for a given cluster size.
If you don't want to
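That script essentially just divides the RandomPartitioner token space evenly; a sketch of the same calculation (assuming the default RandomPartitioner and its 2**127 token space):

    def initial_tokens(num_nodes, ring_max=2 ** 127):
        # evenly spaced initial tokens around the ring
        return [i * ring_max // num_nodes for i in range(num_nodes)]

    for i, t in enumerate(initial_tokens(4)):
        print("node %d: initial_token = %d" % (i, t))

Each node then gets one of these values as its initial_token in the configuration before it bootstraps.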
2 GiB is pretty small for a C* node. You can also try reducing all the
caching to zero with so little memory. If you have lots of CFs you probably
want to reduce the memtable throughput too.
On Wed, Oct 27, 2010 at 12:43 PM, Koert Kuipers <koert.kuip...@diamondnotch.com> wrote:
> While bootst
Also, in general, you probably want to set Xms = Xmx (regardless of the
value you eventually decide on for that).
If you set them equal, the JVM will just go ahead and allocate that amount
on startup. If they're different, then when you grow above Xms it has to
allocate more and move a bunch of s
Allan,
I'm confused about why removetoken doesn't do anything and would be interested
in finding out why, but to answer your question:
You can shut down your last node, nuke the system directory (make a
backup just in case), restart the node, load the schema (export it first if
need be) and be o
n CL.ONE.
On Thu, Oct 7, 2010 at 7:11 PM, David McIntosh wrote:
> Are there any data loss concerns if you have the commit log sync set to
> periodic and are writing with CL One or Any?
>
>
>
> *From:* Matthew Dennis [mailto:mden...@riptano.com]
> *Sent:* We
+1 on disabling swap
On Oct 7, 2010 3:27 PM, "Peter Schuller" wrote:
>> The nodes are still swapping, even though the swappiness is set to zero
>> right now. After swapping comes the OOM.
>
> In addition to what's already been said, consider just flat out
> disabling swap completely, unless you ha
Keep in mind that .7 and on will have per-CF settings for most things so
there will be even more control over the tuning...
On Oct 7, 2010 3:10 PM, "Peter Schuller" wrote:
>> What if there is more than one keyspace in the system ? Assuming each
>> keyspace has the same number of column familie
If I remember correctly the only operator supported for secondary indexes
right now is EQ, not LTE (or the others).
On Thu, Oct 7, 2010 at 6:13 AM, Christian Decker wrote:
> I'm currently trying to get started on secondary indices in Cassandra
> 0.7.0svn, but without any luck so far. I have the
Creating indexes takes extra space (does in MySQL, PGSQL, etc too).
https://issues.apache.org/jira/browse/CASSANDRA-749 has quite a bit of
detail about how the secondary indexes currently work.
On Wed, Oct 6, 2010 at 7:17 PM, Alvin UW wrote:
> Hello,
>
> Before 0.7, actually we can create an ex
Rob is correct.
drain is really only there for when you need the commit log to be empty (some
upgrades or a complete backup of a shutdown cluster).
There really is no point in using it to shut down C* normally, just kill it...
On Wed, Oct 6, 2010 at 4:18 PM, Rob Coli wrote:
> On 10/6/10 1:13 PM, Aar
The SCs are stored on disk in the order defined by the compareWith setting,
so if you want them back in a different order either someone is sorting them
(C*, which doesn't sort them right now, or the client, which doesn't make
much of a difference; it's just moving the load around) or you're
denorma
>
> PS. Are other ppl interested in this functionality ?
> I could file it to JIRA as well...
>
>
Yes, please file it to Jira. It seems like it would be pretty useful for
various things and fairly easy to change the code to move it to another
directory whenever C* thinks it should be deleted...
Some relevant reading if you're interested:
http://dslab.epfl.ch/pubs/crashonly/
http://web.archive.org/web/20060426230247/http://crash.stanford.edu/
On Wed, Oct 6, 2010 at 1:46 PM, Scott Mann wrote:
> Yes. ctrl-C if running in the foreground. Use kill , if running
> in the background (see the
uld my best bet be to simply get ALL of my users uuids and ages, then
> throw away all of those that do not meet the required test?
>
> Thank you.
>
> On Oct 6, 2010, at 2:09 PM, Matthew Dennis wrote:
>
> As Norman said, secondary indexes are only in .7 but you can create
> s
As Norman said, secondary indexes are only in .7, but you can create your own
indexes in both .6 and .7.
Basically, have an email_domain_idx CF where the row key is the domain and the
column names are the row ids of the users (the column value is unused in this
scenario). This sounds basically like wh
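A rough pycassa-flavored sketch of that manual index pattern (keyspace, CF and column names are all hypothetical):

    import pycassa

    pool = pycassa.ConnectionPool("MyKeyspace", server_list=["localhost:9160"])
    users = pycassa.ColumnFamily(pool, "Users")
    email_domain_idx = pycassa.ColumnFamily(pool, "email_domain_idx")

    def add_user(user_id, email):
        domain = email.split("@", 1)[1]
        users.insert(user_id, {"email": email})
        # index row key = the domain, column name = the user's row id, value unused
        email_domain_idx.insert(domain, {user_id: ""})

    def users_for_domain(domain, count=100):
        # the column names of the index row are the matching user ids
        return list(email_domain_idx.get(domain, column_count=count).keys())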
or d...@riptano.com
On Wed, Sep 29, 2010 at 11:43 AM, Jonathan Ellis wrote:
> We'll get those fixed.
>
> Here or tho...@riptano.com directly is fine.
>
> Thanks!
>
s should be set up on VMs, but
what about the Cassandra and Hadoop servers, should they be set up on VMs or
directly on physical machines? If they should be set up on VMs, should the data
of Cassandra and Hadoop be stored in local storage or a Storage Repository?
Thanks, Dennis