Re: Issue with Cassandra consistency in results

2017-03-16 Thread Ryan Svihla
If the data was written at ONE, consistency is not guaranteed... but > considering you just restored the cluster, there's a good chance something > else is off. > > On Thu, Mar 16, 2017 at 18:19 srinivasarao daruna > wrote: > >> Want to make read and write QUORUM as well. >
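
The remark about writes at ONE follows from simple replica counting: a read is only certain to touch a replica holding the latest acknowledged write when the replicas read plus the replicas written exceed the replication factor. The sketch below is a minimal illustration of that arithmetic only. The function names and the replication factor of 3 are assumptions made for the example, not anything taken from the thread or from a driver API.

```python
def quorum(replication_factor: int) -> int:
    """Replicas needed for QUORUM: a strict majority of the replicas."""
    return replication_factor // 2 + 1


def overlaps(write_replicas: int, read_replicas: int, replication_factor: int) -> bool:
    """True when every read set must intersect every write set (R + W > RF)."""
    return write_replicas + read_replicas > replication_factor


if __name__ == "__main__":
    rf = 3  # assumed replication factor, for illustration only
    # Write at ONE, read at ONE: no guaranteed overlap.
    print("ONE/ONE overlaps:", overlaps(1, 1, rf))
    # Write at QUORUM, read at QUORUM: the two sets always share a replica.
    print("QUORUM/QUORUM overlaps:", overlaps(quorum(rf), quorum(rf), rf))
```

Under this counting argument, moving both reads and writes to QUORUM (as proposed in the quoted message) closes the gap, although it does not repair data that was already written at ONE.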

Re: Issue with Cassandra consistency in results

2017-03-16 Thread Ryan Svihla
1.000MiB >> WARN [ScheduledTasks:1] 2017-03-14 14:58:37,141 QueryProcessor.java:103 >> - 88 prepared statements discarded in the last minute because cache limit >> reached (32 MB) >> The first api call returns 0 and the api calls later gives right values. >> >> Please let me know, if any other details needed. >> Could you please have a look at this issue once and kindly give me your >> inputs? This issue literally broke the confidence on Cassandra from our >> business team. >> >> Your inputs will be really helpful. >> >> Thank You, >> Regards, >> Srini >> > > > -- Thanks, Ryan Svihla

Re: TransportException - Consistency LOCAL_ONE - EC2

2017-03-15 Thread Ryan Svihla
y policy = new > TokenAwarePolicy(DCAwareRoundRobinPolicy.builder().build(), > false); > > Frank > > On 2017-03-15 13:45 (-), Ryan Svihla wrote: > > I don't see what getPolicy is retrieving but you want to use TokenAware > > with the shuffle false option in

Re: Change the IP of a live node

2017-03-15 Thread Ryan Svihla
10.179.xx.xx 3.45 TB 256 ? >>> 3b07df3b-683b-4e2d-b307-3c48190c8f1c RAC1 >>> DN 192.168.xx.xx ? 256 ? >>> 19636f1e-9417-4354-8364-6617b8d3d20b r1 >>> DN 192.168.xx.xx ? 256 ? >>> 9c65c71c-f5dd-4267-af9e-a20881cf3d48 r1 >>> DN 192.168.xx.xx ? 256 ? >>> ee75219f-0f2c-4be0-bd6d-038315212728 r1 >>> >>> Am I doing anything wrong? Thanks in advance >>> >>> Kind regards, >>> George >>> >> >> > -- Thanks, Ryan Svihla

Re: TransportException - Consistency LOCAL_ONE - EC2

2017-03-15 Thread Ryan Svihla
atedRanges = replicaCount.get(replica); > for(TokenRange tr : replicaCount.get(replica)){ > System.out.println(tr.getStart() + " to " + tr.getEnd()); > } > } > > //get a list of token ranges for this host > List tokenRangesForHost = replicaCount.get(lo

Re: TransportException - Consistency LOCAL_ONE - EC2

2017-03-15 Thread Ryan Svihla
- Should this happen when Im using consistency level LOCAL_ONE and just > doing reads ? > - Does this suggest non-local reads are happening ? > > Many thanks for any help/ideas. > > Frank > > > -- Thanks, Ryan Svihla

Re: HELP with bulk loading

2017-03-09 Thread Ryan Svihla
I suggest using cassandra loader https://github.com/brianmhess/cassandra-loader On Mar 9, 2017 5:30 PM, "Artur R" wrote: > Hello all! > > There are ~500gb of CSV files and I am trying to find the way how to > upload them to C* table (new empty C* cluster of 3 nodes, replication > factor 2) with

Re: A Single Dropped Node Fails Entire Read Queries

2017-03-09 Thread Ryan Svihla
-- Thanks, Ryan Svihla

Re: Disconnecting two data centers

2017-03-08 Thread Ryan Svihla
> > We would like to keep both of them for a while but we have a need to > disconnect them. How can this be done? > -- Thanks, Ryan Svihla

Re: Isolation in case of Single Partition Writes and Batching with LWT

2016-09-12 Thread Ryan Svihla
It was just the first place google turned up, I made an answer late in the evening trying to help someone out on my own free time. Regards, Ryan Svihla > On Sep 12, 2016, at 6:34 AM, Mark Thomas wrote: > >> On 11/09/2016 23:07, Ryan Svihla wrote: >> 1. A batch with u

Re: Isolation in case of Single Partition Writes and Batching with LWT

2016-09-11 Thread Ryan Svihla
ut this is the standard wisdom. Regards, Ryan Svihla > On Sep 11, 2016, at 3:49 PM, Jens Rantil wrote: > > Hi, > > This might be off-topic, but you could always use Zookeeper locking and/or > Apache Kafka topic keys for doing things like this. > > Cheers, > Jens >

Re: Read timeouts on primary key queries

2016-09-01 Thread Ryan Svihla
Have you looked at cfhistograms/tablehistograms? Your data may just be skewed (the most likely explanation is probably the correct one here). Regards, Ryan Svihla _ From: Joseph Tech Sent: Wednesday, August 31, 2016 11:16 PM Subject: Re: Read timeouts on

Re: Guidelines for configuring Thresholds for Cassandra metrics

2016-08-29 Thread Ryan Svihla
once you go through more thorough testing, all of which I said initially and I still think is a reasonable statement. -regards, Ryan Svihla On Sat, Aug 27, 2016 at 9:31 AM -0500, "Benedict Elliott Smith" wrote: I did not claim you had no evidence, only that your stat

Re: Guidelines for configuring Thresholds for Cassandra metrics

2016-08-27 Thread Ryan Svihla
better documentation, so the nuance is accounted for. On Friday, 26 August 2016, Ryan Svihla wrote: Forgot the most important thing. Logs: ERROR you should investigate. WARN you should have a list of known ones. Use case dependent. Ideally you change configuration accordingly. *PoolCleaner (slab or

Re: Guidelines for configuring Thresholds for Cassandra metrics

2016-08-26 Thread Ryan Svihla
me at present but that's a good start -regards, Ryan Svihla On Fri, Aug 26, 2016 at 7:21 AM -0500, "Ryan Svihla" wrote: Thomas, Not all metrics are KPIs; some are only useful when researching a specific issue or after a use-case-specific threshold has been set. The main "

Re: Guidelines for configuring Thresholds for Cassandra metrics

2016-08-26 Thread Ryan Svihla
o establish a baseline for when these metrics start to indicate a serious issue is occurring in that particular app. Basically when people notice a problem, what did these numbers look like in the minutes, hours and days prior? That's the way to establish the levels consistently. Regar

Re: Failure when setting up cassandra in cluster

2016-08-22 Thread Ryan Svihla
;SUPERUSER;" > > Step 10 fails with this error: > > Connection error: ('Unable to connect to any servers', {'127.0.0.1': > AuthenticationFailed(u'Failed to authenticate to 127.0.0.1: code=0100 > [Bad credentials] > message="org.apache.cassandra.exceptions.UnavailableException: Cannot > achieve consistency level QUORUM"',)}) > > > What am I missing? > > > Cheers > > Raimund > > > -- Regards, Ryan Svihla

Re: A question to updatesstables

2016-08-19 Thread Ryan Svihla
broken and the version mismatch is a false signal. Regards, Ryan Svihla > On Aug 18, 2016, at 10:18 PM, Lu, Boying wrote: > > Thanks a lot. > > I’m a little bit of confusing. If the ‘nodetool updatesstable’ doesn’t work > without Cassandra server running, > and Cassa

Re: A question to updatesstables

2016-08-18 Thread Ryan Svihla
n't if you followed the upgrade instructions properly > >> 3. What’s the best practice to avoid this error occurring again (e.g. >> upgrading Cassandra next time)? >> > Whether upgrading SSTables is required depends on the upgrade you're > running: if the SSTable layout changes you'll need to run it, > otherwise not, so there's nothing you can do to avoid it > >> >> >> Thanks >> >> >> >> Boying >> > > -- Regards, Ryan Svihla

Re: Replicating Cassandra data to HDFS

2016-08-09 Thread Ryan Svihla
nds really slow...I'd be curious of your thoughts on how to do that well..maybe I'm missing something. Regards, Ryan Svihla On Aug 9, 2016, 1:13 PM -0500, Jonathan Haddad , wrote: > I'm having a hard time seeing how anyone would be able to work with CDC in > it's current

Re: Replicating Cassandra data to HDFS

2016-08-09 Thread Ryan Svihla
it's detected (similar to an event sourcing pattern, but snapshotting data down to a single record when you encounter it on a read). Best of luck, this is a corner case that requires hard tradeoffs in all technology I've encountered. Regards, Ryan Svihla On Aug 9, 2016, 12:21 PM -0500, B

Re: Replicating Cassandra data to HDFS

2016-08-09 Thread Ryan Svihla
bring it up to save you the trouble in case you end up in the same path chasing for something more 'real time'. Regards, Ryan Svihla On Aug 9, 2016, 11:09 AM -0500, Ben Vogan , wrote: > Hi all, > > We are investigating using Cassandra in our data platform. We would like da

Re: a solution of getting cassandra cross-datacenter latency at a certain time

2016-08-08 Thread Ryan Svihla
The first issue I can think of is that the Latency table, if I understand you correctly, has an unbounded partition size for the partition key of DC, so it will just get larger over time as more measurements are recorded. Regards, Ryan Svihla > On Aug 8, 2016, at 2:58 AM, Stone Fang wrote: > > obje
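
One common way to bound a partition that grows with time is to add a time bucket to the partition key, so that each data center gets a fresh partition per period. The sketch below only shows how such a bucketed key could be derived. The `partition_key` helper, the choice of one bucket per UTC day, and the sample data center name are assumptions for illustration; the original schema is not visible in this snippet.

```python
from datetime import datetime, timezone
from typing import Tuple


def partition_key(dc: str, measured_at: datetime) -> Tuple[str, str]:
    """Return a (dc, day) pair so each partition holds one day of measurements.

    `measured_at` is expected to be timezone-aware; it is normalised to UTC so
    that all writers agree on which day a measurement belongs to.
    """
    day = measured_at.astimezone(timezone.utc).strftime("%Y-%m-%d")
    return (dc, day)


if __name__ == "__main__":
    # Hypothetical data center name and timestamp, for illustration only.
    sample = datetime(2016, 8, 8, 23, 30, tzinfo=timezone.utc)
    print(partition_key("dc1", sample))
```

The bucket size is a trade-off: smaller buckets keep partitions small but mean a query over a long time range has to touch more partitions.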

Re: Mutation of X bytes is too large for the maximum size of Y

2016-08-03 Thread Ryan Svihla
ow to find). Regards, Ryan Svihla > On Aug 3, 2016, at 4:21 PM, Jonathan Haddad wrote: > > I haven't verified, so i'm not 100% certain, but I believe you'd get back an > exception to the client. Yes, this belongs in the DB, but I don't think > you're

Re: Mutation of X bytes is too large for the maximum size of Y

2016-08-03 Thread Ryan Svihla
by the time he deploys hit production. Would save everyone a ton of brain cells if we just logged it. Regards, Ryan Svihla > On Aug 3, 2016, at 4:21 PM, Jonathan Haddad wrote: > > I haven't verified, so i'm not 100% certain, but I believe you'd get back an > ex

Re: Mutation of X bytes is too large for the maximum size of Y

2016-08-03 Thread Ryan Svihla
Made a Jira about it already https://issues.apache.org/jira/plugins/servlet/mobile#issue/CASSANDRA-12231 Regards, Ryan Svihla > On Aug 3, 2016, at 2:58 PM, Kevin Burton wrote: > > It seems these are basically impossible to track down. > > https://support.datastax.com/hc

Re: Read gets stale data after failure of commit phase in CAS operation

2016-07-24 Thread Ryan Svihla
are you using one of the SERIAL Consistency Levels? -- Ryan Svihla On July 24, 2016 at 8:08:01 PM, Yuji Ito (y...@imagine-orb.com) wrote: > Hi, > > I have another question about CAS operation. > > Can a read get stale data after failure in commit phase? > > According to

Re: My cluster shows high system load without any apparent reason

2016-07-22 Thread Ryan Svihla
You aren't using counters by chance? regards, Ryan Svihla On Jul 22, 2016, 2:00 PM -0500, Mark Rose , wrote: > Hi Garo, > > Are you using XFS or Ext4 for data? XFS is much better at deleting > large files, such as may happen after a compaction. If you have 26 TB > in just t

Re: Questions about anti-entropy repair

2016-07-22 Thread Ryan Svihla
for 3 years in a wide variety of pretty crazy situations I'm not confident I could keep a cluster healthy without running repair consistently. regards, Ryan Svihla On Jul 20, 2016, 10:32 AM -0500, daemeon reiydelle , wrote: > I don't know if my perspective on this will assist, so YMM

Re: Is my cluster normal?

2016-07-07 Thread Ryan Svihla
what version of cassandra and java? Regards, Ryan Svihla > On Jul 7, 2016, at 4:51 PM, Yuan Fang wrote: > > Yes, here is my stress test result: > Results: > op rate : 12200 [WRITE:12200] > partition rate: 12200 [WRITE:12200] > row rate

Re: Is my cluster normal?

2016-07-07 Thread Ryan Svihla
Lots of variables you're leaving out. Depends on write size, if you're using logged batch or not, what consistency level, what RF, if the writes come in bursts, etc, etc. However, that's all sort of moot for determining "normal"; really you need a baseline, as all those variables end up mattering

Re: What is the best way to model this JSON ??

2016-03-28 Thread Ryan Svihla
Lokesh, The modeling will change a bit depending on your queries, the rate of update and your tooling (Spring-data-cassandra makes a mess of updating collections for example). I suggest asking the Cassandra users mailing list for help since this list is for development OF Cassandra. > On Mar

Re: Keyspaces not found in cqlsh

2016-02-11 Thread Ryan Svihla
Kedar, I recommend asking the user list, user@cassandra.apache.org. This list is for the development of Cassandra, and you're more likely to find someone on the user list who may have hit this issue. Curious issue though, I haven't seen that myself. Regards, Ryan Svihla > On Feb 1

Re: Missing rows while scanning table using java driver

2016-02-02 Thread Ryan Svihla
leading to this situation, can I suggest you respond on the user list with the following: - Keyspace (RF especially), data center and table configuration. - Any errors in the logs on the Cassandra nodes. Regards, Ryan Svihla > On Feb 2, 2016, at 4:58 AM, Priyanka Gugale wrote: > > I

Re: Modeling nested collection with C* 2.0

2016-01-28 Thread Ryan Svihla
Ahmed, Just using text and serializing as JSON is the easy way and a common approach. However, this list is for Cassandra committer discussion; please be so kind as to use the regular user list for data modeling questions or for any future responses to this email thread. Regards, Ryan Svihla

Re: Help diagnosing performance issue

2015-11-30 Thread Ryan Svihla
gt;>>> - 015.621 % - Nov 16 14:26 >>>> * >>>> >>>> >>>> >>>> /var/lib/cassandra/data/views/views-451e4d8061ef11e5896f091196a360a0/la-5957-big-Data.db >>>> >>>> >>>> - 015.558 % - Nov 16 14:50 >>>> >>>> The SSTables that come before are all at about 0% and the >>>> ones that >>>> come after it are all at about 15%. >>>> >>>> As you can see the first SSTable at 15% date back from 24h. >>>> Given my >>>> application I'm pretty sure those are not from the reads >>>> (reads of >>>> data older than 1h is definitely under 0.1% of reads). >>>> Could it be >>>> that compaction is putting those in cache constantly ? >>>> If so, then I'm probably confused on the meaning/effect of >>>> max_sstable_age_days (set at 10 in my case) and >>>> base_time_seconds >>>> (not set in my case so the default of 3600 applies). I >>>> would not >>>> expect any compaction to happen beyond the first hour and >>>> the 10 >>>> days is here to make sure data still gets expired and >>>> SSTables >>>> removed (thus releasing disk space). I don't see where the >>>> 24h come >>>> from. >>>> If you guys can shed some light on this, it would be >>>> awesome. I'm >>>> sure I got something wrong. >>>> >>>> Regarding the heap configuration, both are very similar: >>>> * 32G machine: -Xms8049M -Xmx8049M -Xmn800M >>>> * 64G machine: -Xms8192M -Xmx8192M -Xmn1200M >>>> I think we can rule that out. >>>> >>>> Thanks again for you help, I truly appreciate it. >>>> >>>> A. 
>>>> >>>> On 11/17/2015 08:48 PM, Robert Coli wrote: >>>> >>>> On Tue, Nov 17, 2015 at 11:08 AM, Sebastian Estevez >>>> >>> <mailto:sebastian.este...@datastax.com> >>>> <mailto:sebastian.este...@datastax.com >>>> <mailto:sebastian.este...@datastax.com>> >>>> <mailto:sebastian.este...@datastax.com >>>> <mailto:sebastian.este...@datastax.com> >>>> <mailto:sebastian.este...@datastax.com >>>> <mailto:sebastian.este...@datastax.com>>>> >>>> wrote: >>>> >>>> You're sstables are probably falling out of page >>>> cache on the >>>> smaller nodes and your slow disks are killing your >>>> latencies. >>>> >>>> >>>> +1 most likely. >>>> >>>> Are the heaps the same size on both machines? >>>> >>>> =Rob >>>> >>>> >>>> -- >>>> Antoine Bonavita (anto...@stickyads.tv >>>> <mailto:anto...@stickyads.tv> >>>> <mailto:anto...@stickyads.tv >>>> <mailto:anto...@stickyads.tv>>) - CTO StickyADS.tv >>>> Tel: +33 6 34 33 47 36 >>>> /+33 9 50 >>>> 68 21 32 >>>> NEW YORK | LONDON | HAMBURG | PARIS | MONTPELLIER | MILAN | >>>> MADRID >>>> >>>> >>>> >>>> -- >>>> Antoine Bonavita (anto...@stickyads.tv >>>> <mailto:anto...@stickyads.tv>) - CTO StickyADS.tv >>>> Tel: +33 6 34 33 47 36 /+33 9 50 >>>> 68 21 32 >>>> NEW YORK | LONDON | HAMBURG | PARIS | MONTPELLIER | MILAN | MADRID >>>> >>>> >>>> >>> >> > -- > Antoine Bonavita (anto...@stickyads.tv) - CTO StickyADS.tv > Tel: +33 6 34 33 47 36/+33 9 50 68 21 32 > NEW YORK | LONDON | HAMBURG | PARIS | MONTPELLIER | MILAN | MADRID > -- Thanks, Ryan Svihla

Re: Cassandra Object Mapper - Dynamically pass keyspace value

2015-10-25 Thread Ryan Svihla
f files it become a big maintenance > issue > > @UDT (keyspace = "complex", name = "address")public class Address { > private String street; > private String city; > private int zipCode; > > -- Thanks, Ryan Svihla

Re: Is replication possible with already existing data?

2015-10-25 Thread Ryan Svihla
>> >>> ### >>> >>> >>> I have already tried :: >>> >>> 1) >>> Increasing driver-read-timeout from 12 seconds to 30 seconds. >>> >>> 2) >>> Increasing driver-connect-timeout from 5 seconds to 30 seconds. >>> >>> 3) >>> I have also confirmed that each of the 4 nodes are telnet-able over >>> ports 9042 and 9160 each. >>> >>> >>> Definitely seems to be some driver-issue, since >>> data-persistence/replication works perfect (with any permutation) if >>> data-persistence is done via "cqlsh". >>> >>> >>> Kindly provide some pointers. >>> Ultimately, it is the Java-driver that will be used in production, so it >>> is imperative that data-persistence/replication happens for any downing of >>> any permutation of node(s). >>> >>> >>> Thanks and Regards, >>> Ajay >>> >> >> >> >> -- >> Regards, >> Ajay >> > > > > -- > Regards, > Ajay > -- Thanks, Ryan Svihla

Re: How to read data from local cassandra cluster

2015-10-18 Thread Ryan Svihla
Sorry I forgot one / cfs:///filename On Sun, Oct 18, 2015 at 3:14 PM -0700, "Ryan Svihla" wrote: Not a Cassandra question so this isn't the right list, but you can just upload the file to CFS and then access it by the path "cfs://filename". However, s

Re: How to read data from local cassandra cluster

2015-10-18 Thread Ryan Svihla
Not a Cassandra question so this isn't the right list, but you can just upload the file to CFS and then access it by the path "cfs://filename". However, since you have DSE you may want to contact support for help with pathing in DSE using CFS and Spark. -Ryan Svihla On Fri, Oc

Re: Advice for asymmetric reporting cluster architecture

2015-10-18 Thread Ryan Svihla
n the filtered dataset. - Ryan Svihla On Sat, Oct 17, 2015 at 7:12 PM -0700, "Jack Krupansky" wrote: Yes, you can have all your normal data centers with DSE configured for real-time data access and then have a data center that shares the same data but has DSE Search (Solr in

Re: Realtime data and (C)AP

2015-10-11 Thread Ryan Svihla
t;>> despite the theoretical consistency issues. >>>> >>>> Nit-picky comment : if consistency is something important then reading >>>> at QUORUM is important. If read is ONE then the read operation *may* >>>> not see important update. The safest option is QUORUM for both write and >>>> read. Then depending on the business or feature the consistency may be >>>> tuned. >>>> >>>> — Brice >>>> ​ >>>> >>> >>> >>> >>> -- >>> Steve Robenalt >>> Software Architect >>> sroben...@highwire.org >>> (office/cell): 916-505-1785 >>> >>> HighWire Press, Inc. >>> 425 Broadway St, Redwood City, CA 94063 >>> www.highwire.org >>> >>> Technology for Scholarly Communication >>> >>> >> >> >> -- >> Steve Robenalt >> Software Architect >> sroben...@highwire.org >> (office/cell): 916-505-1785 >> >> HighWire Press, Inc. >> 425 Broadway St, Redwood City, CA 94063 >> www.highwire.org >> >> Technology for Scholarly Communication >> >> -- Thanks, Ryan Svihla

Re: High read latency

2015-09-25 Thread Ryan Svihla
On Fri, Sep 25, 2015 at 7:54 AM, Ryan Svihla wrote: > If you run: > > nodetool cfhistograms > > on the given table, that will tell you how wide your rows are getting. At > some point you can get wide enough rows that just the physics of

Re: To batch or not to batch: A question for fast inserts

2015-09-25 Thread Ryan Svihla
ting performance, looking to measure and >> optimize our ingestion rate. >> >> I side-tracked some punctual benchmarks and stumbled on the observations of >> unlogged inserts being *A LOT* faster than the async counterparts. >> >> In our tests, unlogged batch shows increased throughput and lower cluster >> CPU usage, so I'm wondering where the tradeoff might be. >> >> I compiled those observations in this document that I'm sharing and opening >> up for comments. Are we observing some artifact or should we set the record >> straight for unlogged batches to achieve better insertion throughput? >> >> https://docs.google.com/document/d/1qSIJ46cmjKggxm1yxboI-KhYJh1gnA6RK-FkfUg6FrI >> >> <https://docs.google.com/document/d/1qSIJ46cmjKggxm1yxboI-KhYJh1gnA6RK-FkfUg6FrI> >> >> Let me know. >> >> Kind regards, >> >> Gerard. >> > > Regards, > > Ryan Svihla > >

Re: memory usage problem of Metadata.tokenMap.tokenToHost

2015-09-25 Thread Ryan Svihla
In practice there are not many good reasons to use that many keyspaces and tables. If the use case is multi tenancy then you’re almost always better off just using a combination of version tables and tenantId to give you flexibility as well as separation of client data. If you have that many dat

Re: High read latency

2015-09-25 Thread Ryan Svihla
Regards, Ryan Svihla

Re: Seeing null pointer exception 2.0.14 after purging gossip state

2015-09-25 Thread Ryan Svihla
missioned from the cluster, you can see > from below exception that 10.0.0.1 has been already decommissioned. Below is > the exception snippet. > > Have you done : > > nodetool gossipinfo |grep SCHEMA |sort | uniq -c | sort -n > > and checked for schema agreement... ? > > =Rob > Regards, Ryan Svihla

Re: To batch or not to batch: A question for fast inserts

2015-09-25 Thread Ryan Svihla
ieve better insertion throughput? > > https://docs.google.com/document/d/1qSIJ46cmjKggxm1yxboI-KhYJh1gnA6RK-FkfUg6FrI > > Let me know. > > Kind regards, > > Gerard. > Regards, Ryan Svihla

Re: How to tune Cassandra or Java Driver to get lower latency when there are a lot of writes?

2015-09-25 Thread Ryan Svihla
ost time. I knew the Cassandra servers are busy in > writing, but I want to know what kinds of metrics can identify where is the > bottleneck so that I can tune it. > > I’m using Cassandra 2.1.8 and Cassandra Java Driver 2.1.5. > > > Regards, Ryan Svihla

Re: Querying on multiple columns

2015-09-07 Thread Ryan Svihla
> > Please let me know if a better solution is available. I am using 2.1.5 > version. > > Regards, > Sam > -- Thanks, Ryan Svihla

Re: How to prevent queries being routed to new DC?

2015-09-07 Thread Ryan Svihla
> routed to the new dc. >>> >> >> Other than CASSANDRA-9753, this is true. >> >> https://issues.apache.org/jira/browse/CASSANDRA-9753 (Unresolved; ): >> "LOCAL_QUORUM reads can block cross-DC if there is a digest mismatch" >> >> =Rob >> >> > -- Regards, Ryan Svihla

Re: Data Size on each node

2015-09-07 Thread Ryan Svihla
on >>> each node of the cluster is 1.2TB with spinning disk. Minor and Major >>> compactions are slowing down our Read queries. It has been suggested that >>> replacing Spinning disks with SSD might help. Has anybody done something >>> similar? If so what has been the results? >>> Also if we go with SSD, how big can each node get for commercially >>> available SSDs? >>> Regards >>> Sachin >>> >> >> > -- Regards, Ryan Svihla

Re: cassandra scalability

2015-09-07 Thread Ryan Svihla
0.0.208 128.73 KB 248 68.8% >> 6e7788f9-56bf-4314-a23a-3bf1642d0606 RAC1 >> UN 40.0.0.209 114.59 KB 249 67.8% >> 84f6f0be-6633-4c36-b341-b968ff91a58f RAC1 >> UN 40.0.0.205 129.53 KB 245 63.5% >> aa233dc2-a8ae-4c00-af74-0a119825237f RAC1 >> >> the result of the query select * from service_dictionary.table1; gave me >> 70 rows from 40.0.0.205 >> 64 from 40.0.0.209 >> 54 from 40.0.0.208 >> >> 2015-09-07 11:13 GMT+02:00 Edouard COLE : >> Could you provide the result of : >> - nodetool status >> - nodetool status YOURKEYSPACE >> >> >> > -- Regards, Ryan Svihla

Re: Convert joins in RDBMS to Cassandra

2015-09-07 Thread Ryan Svihla
on 2: * > > 1) Create a map table for every possible join. > > Drawbacks with this approach: > > I think this is not the right approach. So the join-to-table (map > table) mapping idea is not right. > > pastebin link for the same: http://pastebin.com/FRAyihPT > Please advise me on this. > > > > -- Thanks, Ryan Svihla

Re: Is Cassandra really Strong consistency?

2015-09-07 Thread Ryan Svihla
no matter the database technology. On Mon, Sep 7, 2015 at 6:20 AM, ibrahim El-sanosi wrote: > ""It you need strong consistency and don't mind lower transaction rate, > you're better off with base"" > I wish you can explain more how this statment relate to the my post? > Regards, > -- Thanks, Ryan Svihla

Re: Does nodetool repair stop the node to answer requests ?

2015-01-23 Thread Ryan Svihla
rectly" ? >>> >> >> I mean that if you are operating near failure, repair might trip a node >> into failure. But if you are operating correctly, repair should not. >> >> =Rob >> >> > > > > -- > Morgan SEGALIS > -- Thanks, Ryan Svihla

Re: Is there a way to add a new node to a cluster but not sync old data?

2015-01-22 Thread Ryan Svihla
sting nodes will stream all the missing data to the new >>>>> node. This will create more pressure on your cluster than just normal >>>>> bootstrapping would have. >>>>> >>>>> I can't think of any reason you'd want to do that unless you needed to >>>>> grow your cluster really quickly, and were ok with corrupting your old >>>>> data. >>>>> >>>>> On Sat, Jan 10, 2015 at 12:39 AM, Yatong Zhang >>>>> wrote: >>>>> >>>>>> Hi there, >>>>>> >>>>>> I am using C* 2.0.10 and I was trying to add a new node to a >>>>>> cluster(actually replace a dead node). But after added the new node some >>>>>> other nodes in the cluster had a very high work-load and affected the >>>>>> whole >>>>>> performance of the cluster. >>>>>> So I am wondering is there a way to add a new node and this node only >>>>>> afford new data? >>>>>> >>>>> >>>>> >>>> >>> >> > -- Thanks, Ryan Svihla

Re: C* throws OOM error despite use of automatic paging

2015-01-12 Thread Ryan Svihla
nually increased the > heap size to 8GB just to see how much heap C* consumes. With 10-15 minutes, > the heap usage climbs up to 7.6GB. That does not make sense. Either > automatic paging is not working or we are missing something. > > > > Does anybody have insights as to what could be happening? Thanks. > > > > Mohammed > > > > > > > -- Thanks, Ryan Svihla

Re: How to bulkload into a specific data center?

2015-01-08 Thread Ryan Svihla
Just noticed you'd sent this to the dev list. This is a question for the user list only; please do not send questions of this type to the developer list. On Thu, Jan 8, 2015 at 8:33 AM, Ryan Svihla wrote: > The nature of replication factor is such that writes will go wherever

Re: How to bulkload into a specific data center?

2015-01-08 Thread Ryan Svihla
ytics data center as > Initial address. > > However, I found my jobs were connecting to the REST service data center. > > How can I specify the data center? > -- Thanks, Ryan Svihla

Re: Are Triggers in Cassandra 2.1.2 performace Hog??

2015-01-07 Thread Ryan Svihla
>>>> river plugin uses select * from any table it seems to be bad performance >>>> choice. So i was thinking of inserting into elasticsearch using Cassandra >>>> trigger. >>>> So i wanted your view does a Cassandra Trigger impacts the performance >>>> of read/Write of Cassandra. >>>> >>>> Also any other way you guys achieve this please guide me. I am struck >>>> on this . >>>> >>>> Regards >>>> Asit >>>> >>>> >>> >> >> >> >> > -- Thanks, Ryan Svihla

Re:

2015-01-07 Thread Ryan Svihla
tus. And > would like to see if there is better way to design using supercolumn > families or materialized views which I am yet to explore. > > Materialized views are your friend; use them freely, but as always be mindful of real-world constraints and goals. > Regards, > Nageswara

Re: Re: Is it possible to implement a interface to replace a row in cassandra using cassandra.thrift?

2015-01-07 Thread Ryan Svihla
h delete and update use the client side > timestamp. > > The update timestamp should be always bigger than the deletion timestamp. > > > I wonder why the update failed in some cases? > > > thank you. > > > - 原始邮件 - > 发件人:Ryan Svihla > 收件人:use

Re: deletedAt and localDeletion

2015-01-06 Thread Ryan Svihla
ryFilter log. > > SliceQueryFilter.java (line 225) Read 6 live and 2688 tombstoned cells in > ks.mytable (see tombstone_warn_threshold). 10 columns was requested, > slices=[-], delInfo={deletedAt=-9223372036854775808, localDeletion= > 2147483647} > > Thanks, > -- Thanks, Ryan Svihla

Re: Implications of ramping up max_hint_window_in_ms

2015-01-06 Thread Ryan Svihla
58 PM, Ryan Svihla wrote: > as long as they know how to handle node recovery and don't inflict return > data back from the dead that was deleted. > > On Tue, Jan 6, 2015 at 12:52 PM, Robert Coli wrote: > >> On Tue, Jan 6, 2015 at 7:39 AM, Ryan Svihla wrote: >> &

Re: Question about `nodetool rebuild` finsh

2015-01-06 Thread Ryan Svihla
ng between two > datacenter, but there is little network traffic on my new data center nodes. > I want to know _how I can tell when the rebuild finishes_. > Thanks all for your reply. > > -- > All the best! > > http://luolee.me > -- Thanks, Ryan Svihla

Re: Re: Cassandra update row after delete immediately, and read that, the data not right?

2015-01-06 Thread Ryan Svihla
ndra-cli to check the data, found that column is not > exist. It seems insert partly. > My test program has 20 threads. the QPS 800 about > > What's wrong with cassandra?? > > > Thanks! > > > -- Thanks, Ryan Svihla

Re: Implications of ramping up max_hint_window_in_ms

2015-01-06 Thread Ryan Svihla
as long as they know how to handle node recovery and don't inflict return data back from the dead that was deleted. On Tue, Jan 6, 2015 at 12:52 PM, Robert Coli wrote: > On Tue, Jan 6, 2015 at 7:39 AM, Ryan Svihla wrote: > >> In general today, large amounts of hints still pr

Re: STCS limitation with JBOD?

2015-01-06 Thread Ryan Svihla
rows (changing the primary key) by > deleting and inserting as a new row. This is not something we would do on a > regular basis, but after or during the process a compact would greatly help > to clear out tombstones/rewritten data. > > @Ryan Svihla it also sounds like your suggestion

Re: Queries required before data modeling?

2015-01-06 Thread Ryan Svihla
to remodel the whole stuff when I get a query which I had not > thought off? > > Regards, > Seenu. > -- Thanks, Ryan Svihla

Re: Reload/resync system.peers table

2015-01-06 Thread Ryan Svihla
> > Probably best to be sure to specify initial_token here, though I'm not > sure if that's just FUD talking... > > =Rob > -- Thanks, Ryan Svihla

Re: STCS limitation with JBOD?

2015-01-06 Thread Ryan Svihla
way too much garbage, and major compaction can be a good response. > The docs' historic incoherent FUD notwithstanding. > > =Rob > > -- Thanks, Ryan Svihla

Re: Can't connect to cassandra node from different host

2015-01-06 Thread Ryan Svihla
> >>> Thank You! >>> -- >>> *Chamila Dilshan Wijayarathna,* >>> SMIEEE, SMIESL, >>> Undergraduate, >>> Department of Computer Science and Engineering, >>> University of Moratuwa. >>> >> > > > -- > *Chamila Dilshan Wijayarathna,* > SMIEEE, SMIESL, > Undergraduate, > Department of Computer Science and Engineering, > University of Moratuwa. > -- Thanks, Ryan Svihla

Re:

2015-01-06 Thread Ryan Svihla
d status = 0) >> >> The design works fine, except for the last query. Cassandra does not allow >> querying on status unless I fix the product id. I think defining a super >> column family which has the key "PRIMARY KEY((prodgroup), status, >> productid)" should work. Would like to get expert advice on other >> alternatives. >> -- >> Thanks, >> Nageswara Rao.V >> >> *"The LORD reigns"* >> > > -- Thanks, Ryan Svihla

Re: Cassandra consuming whole RAM (64 G)

2015-01-06 Thread Ryan Svihla
8248 to resolve > this. > > > On Tuesday, January 6, 2015, Ryan Svihla wrote: > >> Btw side note here, you're using GIANT Batches, and the logs are >> indicating such, this will cause a significant amount of heap pressure. >> >> The root cause fix is not to use g

Re: Implications of ramping up max_hint_window_in_ms

2015-01-06 Thread Ryan Svihla
ange that I haven’t >>> thought of? >>> >> >> Not really, though 24-48 hours of hints could be an awful lot of hints. I >> personally run with at least a 6 hour max_h_w_i_m. >> >> In older versions of Cassandra, 24-48 hours of hints could hose your node >> via ineffective constant compaction. >> >> =Rob >> > > -- Thanks, Ryan Svihla
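To put a rough number on "an awful lot of hints", here is a back-of-envelope sketch. The write rate and mutation size are made-up illustration values, not figures from this thread, and the estimate ignores compression and on-disk overhead; it only multiplies the writes destined for one down replica by the length of the hint window.

```python
def hours_to_hint_window_ms(hours: float) -> int:
    """Convert a window in hours to the millisecond value max_hint_window_in_ms expects."""
    return int(hours * 3600 * 1000)


def estimate_hint_bytes(writes_per_sec: float, avg_mutation_bytes: int, window_hours: float) -> int:
    """Rough upper bound on hint data accumulated for one down replica.

    Assumes every write in the window targets the down replica and is hinted;
    both inputs are hypothetical workload numbers supplied by the caller.
    """
    window_seconds = window_hours * 3600
    return int(writes_per_sec * avg_mutation_bytes * window_seconds)


if __name__ == "__main__":
    # Illustrative workload only: 1000 writes/s of ~1 KiB each.
    for hours in (3, 6, 24, 48):
        gib = estimate_hint_bytes(1000, 1024, hours) / 2**30
        print(f"{hours:>2} h window = {hours_to_hint_window_ms(hours)} ms, roughly {gib:.1f} GiB of hints")
```

The point is only that the volume grows linearly with the window, so going from a few hours to 24-48 hours multiplies whatever the node has to store and replay on recovery.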

Re: Is it possible to implement a interface to replace a row in cassandra using cassandra.thrift?

2015-01-06 Thread Ryan Svihla
to implement an interface to > replace a row in Cassandra? > Yeah, all updates work this way. Inserts are actually "UPSERTS", so you can go ahead and do two updates instead of an insert, a delete, and an update. > > > Thanks. > > -- Thanks, Ryan Svihla
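As a toy illustration of the upsert behaviour described above, here is a small in-memory model. It is not Cassandra code and `ToyTable` is a hypothetical name; it only mimics the idea that every write is applied per column and the newest timestamp wins, so there is no separate "insert" versus "update" path. Timestamp ties are not modelled.

```python
from typing import Any, Dict, Tuple

Cell = Tuple[int, Any]  # (write timestamp, value)


class ToyTable:
    """In-memory stand-in for a table with last-write-wins cells (illustration only)."""

    def __init__(self) -> None:
        self.rows: Dict[str, Dict[str, Cell]] = {}

    def write(self, key: str, columns: Dict[str, Any], timestamp: int) -> None:
        """Apply a write; the same call covers both 'insert' and 'update'."""
        row = self.rows.setdefault(key, {})
        for name, value in columns.items():
            current = row.get(name)
            # Keep the cell with the newer timestamp; older writes are ignored.
            if current is None or timestamp > current[0]:
                row[name] = (timestamp, value)

    def read(self, key: str) -> Dict[str, Any]:
        """Return the current value of every column in the row."""
        return {name: value for name, (_, value) in self.rows.get(key, {}).items()}


if __name__ == "__main__":
    table = ToyTable()
    table.write("row1", {"a": 1, "b": 2}, timestamp=1)  # creates the row
    table.write("row1", {"a": 10}, timestamp=2)         # overwrites only column a
    print(table.read("row1"))
```

Writing the row a second time simply overwrites the columns it names, which is why a "replace" can be expressed as plain writes.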

Re: ttl in collections

2015-01-06 Thread Ryan Svihla
we ran into > the "tombstone_failure_threshold" limit rather quickly, having thousands of > record updates per second. That left us with a CF containing millions of > records that we couldn't "select" the way we originally intended. > > Regards, > Jens > > -- Thanks, Ryan Svihla

Re: Cassandra consuming whole RAM (64 G)

2015-01-06 Thread Ryan Svihla
j < >>> rahul.bhard...@indiamart.com> wrote: >>> >>>> Hi Joe, >>>> >>>> PFA heap dump >>>> >>>> >>>> regards: >>>> Rahul Bhardwaj >>>> >>>> >>>> >>>> On Tue, Jan 6, 2015 at 11:35 AM, Joe Ramsey wrote: >>>> >>>>> Did you try generating a heap dump so you can look through it to see >>>>> what’s actually happened? >>>>> >>>>> >>>>> On Jan 6, 2015, at 12:58 AM, Rahul Bhardwaj < >>>>> rahul.bhard...@indiamart.com> wrote: >>>>> >>>>> Hi, >>>>> >>>>> We are using cassandra 2.1 version in a cluster of three machines each >>>>> with 64 GB RAM >>>>> >>>>> The processes are killed by kernel, coz they are eating all memory >>>>> (oom-killer). We have set JAVA heap to default (i.e. it is using 8G) >>>>> because we have 64 GB RAM. >>>>> >>>>> Please help. >>>>> >>>>> >>>>> Regards: >>>>> Rahul Bhardwaj >>>>> >>>>> >>>>> Follow IndiaMART.com <http://www.indiamart.com/> for latest updates >>>>> on this and more: <https://plus.google.com/+indiamart> >>>>> <https://www.facebook.com/IndiaMART> <https://twitter.com/IndiaMART> >>>>> Mobile Channel: >>>>> <https://itunes.apple.com/WebObjects/MZStore.woa/wa/viewSoftware?id=668561641&mt=8> >>>>> <https://play.google.com/store/apps/details?id=com.indiamart.m> >>>>> <http://m.indiamart.com/> >>>>> >>>>> <https://www.youtube.com/watch?v=DzORNbeSXN8&list=PL2o4J51MqpL0mbue6kzDa6eymLVUXtlR1&index=2> >>>>> Watch how Irrfan Khan gets his work done in no time on IndiaMART, >>>>> kyunki Kaam Yahin Banta Hai >>>>> <https://www.youtube.com/watch?v=hmS4Afl2bNU>!!! 
-- Thanks, Ryan Svihla

Re: CQL3 vs Thrift

2014-12-24 Thread Ryan Svihla
; of static information). >> >> More generally, do you find that tuned applications tend to use Thrift, a >> combination of Thrift and CQL3, or is CQL3 really expected to replace >> Thrift? >> >> Thanks again! >> >> On Mon, Dec 22, 2014 at 9:50 PM, Ryan S

Re: CQL3 vs Thrift

2014-12-24 Thread Ryan Svihla
cations tend to use Thrift, a > combination of Thrift and CQL3, or is CQL3 really expected to replace > Thrift? > > Thanks again! > > On Mon, Dec 22, 2014 at 9:50 PM, Ryan Svihla wrote: > >> Don't static columns get you what you want? >> >> >> ht

Re: [Cassandra] [Generation of SStableLoader slow]

2014-12-24 Thread Ryan Svihla
there > is other way to make it faster except adding CPUs and ram. > > *Best Regards!* > > > *Chao Yan--**My twitter:Andy Yan @yanchao727 > <https://twitter.com/yanchao727>* > > > *My Weibo:http://weibo.com/herewearenow > <http://weibo.com/herew

Re: Tombstones without DELETE

2014-12-24 Thread Ryan Svihla
; Thanks! > > To unsubscribe from this group and stop receiving emails from it, send an > email to java-driver-user+unsubscr...@lists.datastax.com. > -- [image: datastax_logo.png] <http://www.datastax.com/> Ryan Svihla Solution Architect [image: twitter.png] <https:

Re: 答复:

2014-12-24 Thread Ryan Svihla
Any suggestions? > > -- > > Atenciosamente, > Sávio S. Teles de Oliveira > > voice: +55 62 9136 6996 > http://br.linkedin.com/in/savioteles > > Mestrando em Ciências da Computação - UFG > Arquiteto de Software > > CUIA Internet Brasil > --

Re: [Merging data from memtables and 1 sstables] takes too much time.

2014-12-24 Thread Ryan Svihla
68aa211 and location_id = >> 2c269ea4-dbfd-32dd-9bd7-a5c22677d18b and user_serial_number = >> 'uI2201'; >> >> some times queries like above complete in 3-4 milliseconds, however >> few times they take around 80-90 milliseconds. The data is around 1 >> m

Re: [Cassandra] [Generation of SStableLoader slow]

2014-12-24 Thread Ryan Svihla
I think copying files that large would be slow even with just the cp command. Cassandra isn't doing anything especially strange here: you don't have much RAM or CPU, and I'm assuming the underlying disk is slow as well. Without more parameters and details, it's hard to tell whether there is an issue.

Re: CQL3 vs Thrift

2014-12-22 Thread Ryan Svihla
Don't static columns get you what you want? http://www.datastax.com/documentation/cql/3.1/cql/cql_reference/refStaticCol.html On Dec 22, 2014 10:50 PM, "David Broyles" wrote: > Although I used Cassandra 1.0.X extensively, I'm new to CQL3. Pages such > as http://wiki.apache.org/cassandra/Client

Re: Store counter with non-counter column in the same column family?

2014-12-22 Thread Ryan Svihla
> multiple tables for different query paths and Solr. If I switch to Spark, > do I still need to use counters, or will the counting be done by Spark on > a regular table? > > On Tue, Dec 23, 2014 at 11:31 AM, Ryan Svihla > wrote: > >> increment wouldn't be idempotent from the

Re: Store counter with non-counter column in the same column family?

2014-12-22 Thread Ryan Svihla
counts in different queries) I don't need a 100% accurate count > and strong consistency. Performance and application complexity is my main > concern. > > Thanks > > On Mon, Dec 22, 2014 at 10:37 PM, Ryan Svihla > wrote: > >> You can cheat it by using

Re: Connect to C* instance inside virtualbox

2014-12-22 Thread Ryan Svihla
th. So I didn't set rpc_address. Will double check > tomorrow. Thanks. > On Dec 22, 2014 9:17 PM, "Ryan Svihla" wrote: > >> if this helps..what did you change rpc_address to? >> >> On Mon, Dec 22, 2014 at 8:15 PM, Ryan Svihla >> wrote: >> >&

Re: Connect to C* instance inside virtualbox

2014-12-22 Thread Ryan Svihla
onnect 127.0.0.1:9042. > > On Mon, Dec 22, 2014 at 9:01 PM, Ryan Svihla wrote: > >> totally depends on how the implementation is handled in virtualbox, I'm >> assuming you're connecting to an IP that makes sense on the guest (ie >> nodetool -h 192.168.1.100 and

Re: Connect to C* instance inside virtualbox

2014-12-22 Thread Ryan Svihla
If this helps... what did you change rpc_address to? On Mon, Dec 22, 2014 at 8:15 PM, Ryan Svihla wrote: > right, that's localhost, you have to change it to match the IP of whatever > you changed rpc_address to > > On Mon, Dec 22, 2014 at 8:07 PM, Kai Wang wrote: > >>
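A quick way to check from the host whether the guest is actually listening on the address you set in rpc_address is a plain TCP connect. This is only a sketch using the Python standard library, not a driver call; `can_connect` is a hypothetical helper, and 9042 is the native transport port quoted elsewhere in this thread.

```python
import socket


def can_connect(host: str, port: int = 9042, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port can be opened within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, unreachable hosts, and timeouts.
        return False


if __name__ == "__main__":
    # 192.168.1.100 is just the example guest address used in this thread.
    for host in ("127.0.0.1", "192.168.1.100"):
        print(host, "reachable" if can_connect(host) else "not reachable")
```

If the loopback address answers but the guest's interface address does not, the node is most likely still bound to localhost, or the VM networking is not forwarding the port.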

Re: Connect to C* instance inside virtualbox

2014-12-22 Thread Ryan Svihla
a summit 2014 but >> didn't realize it applied in my case. >> >> Thanks. >> >> On Mon, Dec 22, 2014 at 4:43 PM, Ryan Svihla >> wrote: >> >>> what is rpc_address set to in cassandra.yaml? my gut is localhost, set >>> it to the interf

Re: Connect to C* instance inside virtualbox

2014-12-22 Thread Ryan Svihla
age: datastax_logo.png] <http://www.datastax.com/> Ryan Svihla Solution Architect [image: twitter.png] <https://twitter.com/foundev> [image: linkedin.png] <http://www.linkedin.com/pub/ryan-svihla/12/621/727/> DataStax is the fastest, most scalable distributed database technology,

Re: CF performance suddenly degraded

2014-12-22 Thread Ryan Svihla
There can be many root causes. I would need a lot more information, such as node hardware specs, cf histograms on the table, tpstats, GC settings (max heap, ParNew, JVM version), and logs, specifically any ERROR, WARN, or GCInspector messages. As a start a simple trace of the query in question is pr

Re: Multi DC informations (sync)

2014-12-22 Thread Ryan Svihla
cross regions is too small, the write will fail and so will the >>> HH, potentially. How can we detect that we lack cross-DC throughput, for >>> example? >>> >>> Repairs are indeed a good thing (we run them as a weekly routine, GC >>> grace period 10 sec),

Re: installing cassandra

2014-12-22 Thread Ryan Svihla
. > Those aren't easier until you are operating at a scale where you need to be > able to automate adding new nodes. > > > On Sun, Dec 21, 2014, 8:05 AM Ryan Svihla wrote: > >> Puppet, Chef, Ansible and I'm sure many others. I've personally worked >>

Re: Store counter with non-counter column in the same column family?

2014-12-22 Thread Ryan Svihla
You can cheat by using the non-counter column as part of your primary key (specifically as a clustering column), but the cases where this could work are limited and the places where it is a good idea are even rarer. Using counters in batches is already not a well-regarded concept, and counter

Re: Replacing nodes disks

2014-12-21 Thread Ryan Svihla
>> replicas. >>>>> >>>>> What do you think should be the procedure here? >>>>> >>>>> I'm guessing it should be something like this but I'm pretty sure it's >>>>> not enough. >>>>>

Re: installing cassandra

2014-12-21 Thread Ryan Svihla
a on my > cluster or is there some sort of built in network wide deployment in the > install process already? > > B. > -- [image: datastax_logo.png] <http://www.datastax.com/> Ryan Svihla Solution Architect [image: twitter.png] <https://twitter.com/foundev> [image: li
