Re: [External]Cassandra 5.0: Any Official Tests Supporting 'Free Performance Gains'

2025-04-05 Thread Patrick McFadin
video here: > https://youtu.be/eKxj6s4vzmI?list=PLqcm6qE9lgKKls90MlpejceYUU_0qVnWa&t=2075 > > All told, Branimir's work here is a Big Deal. We really should invest the > time in a blog post with more clarity on how impactful these changes are > for data density and performance;

Re: [External]Cassandra 5.0: Any Official Tests Supporting 'Free Performance Gains'

2025-03-20 Thread Brebner, Paul via user
Jon, that’s cool! Please consider submitting to Community over Code – here’s the CFP for the Performance Engineering track just up: https://www.linkedin.com/posts/paul-brebner-0a547b4_call-for-papers-for-the-7th-community-over-activity-7308439109490917376-tugf?utm_source=share&utm_me

Re: [External]Cassandra 5.0: Any Official Tests Supporting 'Free Performance Gains'

2025-03-20 Thread Jon Haddad
I’m not sure if I shared this to the user list… I’m doing a massive series on C* 5.0 performance and how it relates to node density and cost. First post is up now. http://rustyrazorblade.com/post/2025/03-streaming/ The benefit of any given feature depends on a lot of factors. Hardware and workload

Re: [External]Cassandra 5.0: Any Official Tests Supporting 'Free Performance Gains'

2025-03-20 Thread Josh McKenzie
clarity on how impactful these changes are for data density and performance; thanks for raising this question as it helps clarify that. ~Josh On Wed, Mar 19, 2025, at 10:35 AM, Jiri Steuer (EIT) wrote: > Hi FMH, > > I haven't seen these official tests and that was the reason I did the

RE: [External]Cassandra 5.0: Any Official Tests Supporting 'Free Performance Gains'

2025-03-19 Thread Jiri Steuer (EIT)
FMH Sent: Wednesday, March 19, 2025 3:14 PM To: Cassandra Support-user Subject: [External]Cassandra 5.0: Any Official Tests Supporting 'Free Performance Gains' This message is from an EXTERNAL SENDER - be CAUTIOUS, particularly with links and attachments. Please report all suspicio

Cassandra 5.0: Any Official Tests Supporting 'Free Performance Gains'

2025-03-19 Thread FMH
nd storage efficiency, providing a "free" performance" I have only found a single doc show-casing empirical evidence for such performance gains. As per this document, compared to version 4.1, C* 5 had ... - 38% better performance and 26% better response time for write operations - 12% better

Re: Query on Performance Dip

2024-04-05 Thread Jon Haddad
Subroto Barua wrote: > follow up question on performance issue with 'counter writes'- is there a > parameter or condition that limits the allocation rate for > 'CounterMutationStage'? I see 13-18mb/s for 4.1.4 Vs 20-25mb/s for 4.0.5. > > The back-end infra is

Re: Query on Performance Dip

2024-04-05 Thread Subroto Barua via user
follow up question on performance issue with 'counter writes'- is there a parameter or condition that limits the allocation rate for 'CounterMutationStage'? I see 13-18mb/s for 4.1.4 Vs 20-25mb/s for 4.0.5. The back-end infra is same for both the clusters and same test case

Re: Query on Performance Dip

2024-03-30 Thread Jon Haddad
s / second you're executing and there's no historical information to suggest if what you're seeing now is an anomaly or business as usual. If you want to test your theory that speculative retries are causing your performance issue, then you could try changing speculati

Re: Query on Performance Dip

2024-03-30 Thread ranju goel
Hi All, While debugging the cluster for the performance dip seen with 4.1.4, I found a high speculative retries value in nodetool tablestats during read operations. I ran the tablestats command below and checked its output every few seconds, and noticed that the retries keep rising. Also

Re: Query on Performance Dip

2024-03-27 Thread Subroto Barua via user
whether this could cause a performance degradation in 4.1 without changing compactionThroughput, as we are seeing a performance dip in reads/writes after upgrading from 4.0 to 4.1. Regards, Ranju

Query on Performance Dip

2024-03-27 Thread ranju goel
Hi All, I was going through this mail chain (https://www.mail-archive.com/user@cassandra.apache.org/msg63564.html) and was wondering whether this could cause a performance degradation in 4.1 without changing compactionThroughput, as we are seeing a performance dip in reads/writes after upgrading from 4.0 to

CFP for the 2nd Performance Engineering track at Community over Code NA 2023

2023-07-03 Thread Brebner, Paul
Hi Apache Cassandra people - There are only 10 days left to submit a talk proposal (title and abstract only) for Community over Code NA 2023! The 2nd Performance Engineering track is on this year so any Apache project-related performance and scalability talks are welcome, here's the trac

RE: about the performance of select * from tbl

2022-04-26 Thread Durity, Sean R
. Durity From: Joe Obernberger Sent: Tuesday, April 26, 2022 1:10 PM To: user@cassandra.apache.org; 18624049226 <18624049...@163.com> Subject: [EXTERNAL] Re: about the performance of select * from tbl This would be a good use case for Spark + Cassandra. -Joe On 4/26/2022 8:48 AM, 18624

Re: about the performance of select * from tbl

2022-04-26 Thread Joe Obernberger
million or more, what methods or parameters can improve the performance of this CQL?

Re: about the performance of select * from tbl

2022-04-26 Thread Jeff Jirsa
ss scenario. We must execute the following statement: >> >> select * from tbl; >> >> This CQL has no WHERE condition. >> >> What I want to ask is that if the data in this table is more than one >> million or more, what methods or parameters can improve the performance of >> this CQL? >> >

Re: about the performance of select * from tbl

2022-04-26 Thread 18624049226
* from tbl; This CQL has no WHERE condition. What I want to ask is that if the data in this table is more than one million or more, what methods or parameters can improve the performance of this CQL?

Re: about the performance of select * from tbl

2022-04-26 Thread Dor Laor
is CQL has no WHERE condition. > > What I want to ask is that if the data in this table is more than one > million or more, what methods or parameters can improve the performance of > this CQL? >

about the performance of select * from tbl

2022-04-26 Thread 18624049226
We have a business scenario in which we must execute the following statement: select * from tbl; This CQL has no WHERE condition. What I want to ask is: if this table holds a million rows or more, what methods or parameters can improve the performance of this CQL?

Apache Cassandra performance tuning - call for contribution

2022-02-09 Thread Daniel Seybold
Dear Apache Cassandra community, we plan to run a large case performance study for Apache Cassandra and MongoDB where the focus is not to compare both systems directly but to answer the question: /how much performance can you get out of each DBMS with an optimal configuration compared to the

Mutation dropped and Read-Repair performance issue

2020-12-19 Thread sunil pawar
Hi All, We are seeing failures in the Read-Repair stage with Digest Mismatch errors, at a count of 300+ per day per node. At the same time, nodes are getting overloaded for a few seconds at a time due to long GC pauses (of around 7-8 seconds). We are not running a repair o

Re: Multi DCs vs Single DC performance

2020-07-28 Thread onmstester onmstester
onmstester <mailto:onmstes...@zoho.com.invalid> wrote: Hi, Logically, i do not need to use multiple DCs(cluster is not geographically separated), but i wonder if splitting the cluster to two half (two separate dc) would decrease overhead of node ack/communication and result in better (wri

Re: Multi DCs vs Single DC performance

2020-07-28 Thread Jeff Jirsa
uster to two half (two separate > dc) would decrease overhead of node ack/communication and result in better > (write) performance? > > Sent using Zoho Mail <https://www.zoho.com/mail/> > > > > > >

Multi DCs vs Single DC performance

2020-07-28 Thread onmstester onmstester
Hi, Logically, I do not need to use multiple DCs (the cluster is not geographically separated), but I wonder if splitting the cluster into two halves (two separate DCs) would decrease the overhead of node ack/communication and result in better (write) performance? Sent using https://www.zoho.com/mail/

Re: Impact of enabling authentication on performance

2020-06-04 Thread Sam Tunnicliffe
DSR> From: Jeff Jirsa mailto:jji...@gmail.com>> > DSR> Sent: Tuesday, June 2, 2020 2:39 AM > DSR> To: user@cassandra.apache.org <mailto:user@cassandra.apache.org> > DSR> Subject: [EXTERNAL] Re: Impact of enabling authentication on performance > > DS

Re: Impact of enabling authentication on performance

2020-06-03 Thread Gil Ganz
rmissions is picked up (usually less). > > > DSR> Sean Durity > > DSR> -Original Message- > DSR> From: Jeff Jirsa > DSR> Sent: Tuesday, June 2, 2020 2:39 AM > DSR> To: user@cassandra.apache.org > DSR> Subject: [EXTERNAL] Re: Impact of enabling

Re: Impact of enabling authentication on performance

2020-06-03 Thread Alex Ott
> Subject: [EXTERNAL] Re: Impact of enabling authentication on performance DSR> Set the Auth cache to a long validity DSR> Don’t go crazy with RF of system auth DSR> Drop bcrypt rounds if you see massive cpu spikes on reconnect storms >> On Jun 1, 2020, at 11:26 PM, Gil Ganz

RE: Impact of enabling authentication on performance

2020-06-02 Thread Durity, Sean R
authentication on performance Set the Auth cache to a long validity. Don’t go crazy with RF of system auth. Drop bcrypt rounds if you see massive CPU spikes on reconnect storms. > On Jun 1, 2020, at 11:26 PM, Gil Ganz wrote: > > > Hi > I have a production 3.11.6 cluster which I'm

Re: Impact of enabling authentication on performance

2020-06-01 Thread Jeff Jirsa
> authentication in, I'm trying to understand what will be the performance > impact, if any. > I understand each use case might be different, trying to understand if there > is a common % people usually see their performance hit, or if s

Impact of enabling authentication on performance

2020-06-01 Thread Gil Ganz
Hi, I have a production 3.11.6 cluster in which I might want to enable authentication, and I'm trying to understand what the performance impact will be, if any. I understand each use case might be different; I'm trying to understand if there is a common % people usually see their performance

Re: Performance drop of current Java drivers

2020-05-07 Thread Matthias Pfau
Perfect, thanks for looking into this! Best, Matthias 5. Mai 2020, 20:01 von erik.mer...@datastax.com: > Matthias, > > Thanks for sharing your findings and test code. We were able to track this to > a regression in the underlying Netty library and already have a similar issue > reported here:

Re: Performance drop of current Java drivers

2020-05-05 Thread Erik Merkle
Matthias, Thanks for sharing your findings and test code. We were able to track this to a regression in the underlying Netty library and already have a similar issue reported here: https://datastax-oss.atlassian.net/browse/JAVA-2676 The regression seems to be with the upgrade to Netty version 4.1

Re: Performance drop of current Java drivers

2020-05-04 Thread Matthias Pfau
Hi Chris and Adam, thanks for looking into this! You can find my tests for old/new client here: https://gist.github.com/mpfau/7905cea3b73d235033e4f3319e219d15 https://gist.github.com/mpfau/a62cce01b83b56afde0dbb588470bc18 May 1, 2020, 16:22 by adam.holmb...@datastax.com: > Also, if you can shar

Re: Performance drop of current Java drivers

2020-05-01 Thread Adam Holmberg
Also, if you can share your schema and benchmark code, that would be a good start. On Fri, May 1, 2020 at 7:09 AM Chris Splinter wrote: > Hi Matthias, > > I have forwarded this to the developers that work on the Java driver and > they will be looking into this first thing next week. > > Will cir

Re: Performance drop of current Java drivers

2020-05-01 Thread Chris Splinter
Hi Matthias, I have forwarded this to the developers that work on the Java driver and they will be looking into this first thing next week. Will circle back here with findings, Chris On Fri, May 1, 2020 at 12:28 AM Erick Ramirez wrote: > Matthias, I don't have an answer to your question but I

Re: Performance drop of current Java drivers

2020-04-30 Thread Erick Ramirez
Matthias, I don't have an answer to your question but I just wanted to note that I don't believe the driver contributors actively watch this mailing list (I'm happy to be corrected 🙂 ) so I'd recommend you cross-post in the Java driver channels as well. Cheers!

Performance drop of current Java drivers

2020-04-30 Thread Matthias Pfau
Hi there, I just did some testing with the latest 3.x and 4.x versions of the Java driver. While async performance seems to be fine, sync performance degraded significantly with version 4.x. Reading 10,000 small columns from a local Cassandra instance took: * around 5 seconds with the old driver

Re: Performance of Data Types used for Primary keys

2020-03-06 Thread Reid Pinchback
ernal)" Reply-To: "user@cassandra.apache.org" Date: Friday, March 6, 2020 at 5:15 AM To: "user@cassandra.apache.org" Subject: Performance of Data Types used for Primary keys Message from External Sender Hi Cassandra folks, Is there any difference in performance of gene

RE: [EXTERNAL] Re: Performance of Data Types used for Primary keys

2020-03-06 Thread Durity, Sean R
I agree. Cassandra already hashes the partition key to a numeric token. Sean Durity From: Jon Haddad Sent: Friday, March 6, 2020 9:29 AM To: user@cassandra.apache.org Subject: [EXTERNAL] Re: Performance of Data Types used for Primary keys It's not going to matter at all. On Fri, Mar 6,

Re: Performance of Data Types used for Primary keys

2020-03-06 Thread Jon Haddad
It's not going to matter at all. On Fri, Mar 6, 2020, 2:15 AM Hanauer, Arnulf, Vodacom South Africa (External) wrote: > Hi Cassandra folks, > > > > Is there any difference in performance of general operations if using a > TEXT based Primary key versus a BIGINT Primary ke

Performance of Data Types used for Primary keys

2020-03-06 Thread Hanauer, Arnulf, Vodacom South Africa (External)
Hi Cassandra folks, Is there any difference in performance of general operations if using a TEXT based Primary key versus a BIGINT Primary key? Our use-case requires low latency reads, but currently the Primary key is TEXT based even though the data could work on BIGINT. We are trying to optimise where

cassandra collection best practices and performance

2020-01-07 Thread onmstester onmstester
Sweet spot for set and list item counts (in DataStax's documentation, the max is 2 billion)? Write and read performance of Set vs List vs simple partition row? Thanks in advance

RE: [EXTERNAL] performance

2019-12-02 Thread Durity, Sean R
I’m not sure this is the fully correct question to ask. The size of the data will matter. The importance of high availability matters. Performance can be tuned by taking advantage of Cassandra’s design strengths. In general, you should not be doing queries with a where clause on non-key columns

performance

2019-11-29 Thread hahaha sc
Query on a non-primary-key field that has a secondary index, and then update based on the primary key. Can it be more efficient than MySQL?

Re: Configurations for better performance

2019-10-17 Thread Max C .
I haven’t watched it yet, but Jon Haddad did a talk on performance optimization at the DataStax Accelerate conference (and another talk a year or two before): 10 Easy Ways to Tune Your Cassandra Cluster with John Haddad | DataStax Accelerate 2019 https://www.youtube.com/watch?v=swL7bCnolkU

Configurations for better performance

2019-10-16 Thread Ramnatthan Alagappan
ation settings people usually run or change to optimize for performance. Is there a specific configuration in which you set knobs to values other than the defaults for better performance? (e.g., higher row_cache_size_in_mb for read-heavy workloads etc.) Thank

Re: Performance impact with ALLOW FILTERING clause.

2019-08-17 Thread Devopam Mittra
> > > > I was going thru documentation and saw at many places saying ALLOW > FILTERING causes performance unpredictability. Our developers says ALLOW > FILTERING clause is implicitly added on bunch of queries by spark-Cassandra > connector and they cannot control it; ho

Re: Performance impact with ALLOW FILTERING clause.

2019-08-17 Thread Alex Ott
Date: Thursday 25 July 2019 at 15:49 JB> To: "user@cassandra.apache.org" JB> Subject: Performance impact with ALLOW FILTERING clause. JB> Hello Folks, JB> I was going thru documentation and saw at many places saying ALLOW FILTERING causes performance unpredictabil

Re: Cheat Sheet for Unix based OS, Performance troubleshooting

2019-07-30 Thread Julien Laurenceau
crit : > I have always found Amy's Cassandra 2.1 tuning guide great for the Linux > performance tuning: > https://tobert.github.io/pages/als-cassandra-21-tuning-guide.html > > Sent from my iPhone > > On 26 Jul 2019, at 23:49, Krish Donald wrote: > > Any one has Cheat Sheet for Unix based OS, Performance troubleshooting ? > >

CDC enabled settings and performance impact

2019-07-29 Thread Krish Donald
any performance impact you have seen? Should we keep cdc_raw_directory on a different volume than data volume? Thanks Krish

Re: Cheat Sheet for Unix based OS, Performance troubleshooting

2019-07-28 Thread Jon Haddad
http://www.brendangregg.com/linuxperf.html On Sat, Jul 27, 2019 at 2:45 AM Paul Chandler wrote: > I have always found Amy's Cassandra 2.1 tuning guide great for the Linux > performance tuning: > https://tobert.github.io/pages/als-cassandra-21-tuning-guide.html > > Sent from

Re: Cheat Sheet for Unix based OS, Performance troubleshooting

2019-07-27 Thread Paul Chandler
I have always found Amy's Cassandra 2.1 tuning guide great for the Linux performance tuning: https://tobert.github.io/pages/als-cassandra-21-tuning-guide.html Sent from my iPhone > On 26 Jul 2019, at 23:49, Krish Donald wrote: > > Any one has Cheat Sheet for Unix based O

Cheat Sheet for Unix based OS, Performance troubleshooting

2019-07-26 Thread Krish Donald
Does anyone have a cheat sheet for Unix-based OS performance troubleshooting?

Re: Performance impact with ALLOW FILTERING clause.

2019-07-26 Thread Christian Lorenz
Date: Thursday, 25 July 2019 at 20:05 To: "user@cassandra.apache.org" Subject: RE: Performance impact with ALLOW FILTERING clause. Thank you all for your insights. When spark-connector adds ALLOW FILTERING to a query, it makes the query just ‘run’ no matter if it is expensive for lar

Re: Performance impact with ALLOW FILTERING clause.

2019-07-25 Thread Jon Haddad
luence connector not to use > allow filtering. > > > > Thanks again. > > Asad > > > > > > > > *From:* Jeff Jirsa [mailto:jji...@gmail.com] > *Sent:* Thursday, July 25, 2019 10:24 AM > *To:* cassandra > *Subject:* Re: Performance impact with ALLOW FILTERING cl

RE: Performance impact with ALLOW FILTERING clause.

2019-07-25 Thread ZAIDI, ASAD A
Jirsa [mailto:jji...@gmail.com] Sent: Thursday, July 25, 2019 10:24 AM To: cassandra Subject: Re: Performance impact with ALLOW FILTERING clause. "unpredictable" is such a loaded word. It's quite predictable, but it's often mispredicted by users. "ALLOW FILTERING"

Re: Performance impact with ALLOW FILTERING clause.

2019-07-25 Thread Jeff Jirsa
and do the filtering / joining / searching in-memory in spark, rather than relying on cassandra to do the scanning/searching on disk. On Thu, Jul 25, 2019 at 6:49 AM ZAIDI, ASAD A wrote: > Hello Folks, > > > > I was going thru documentation and saw at many places saying ALLOW > FIL

Re: Performance impact with ALLOW FILTERING clause.

2019-07-25 Thread Jacques-Henri Berthemet
. Regards, Jacques-Henri Berthemet From: "ZAIDI, ASAD A" Reply to: "user@cassandra.apache.org" Date: Thursday 25 July 2019 at 15:49 To: "user@cassandra.apache.org" Subject: Performance impact with ALLOW FILTERING clause. Hello Folks, I was going thru documentation

Performance impact with ALLOW FILTERING clause.

2019-07-25 Thread ZAIDI, ASAD A
Hello Folks, I was going thru documentation and saw at many places saying ALLOW FILTERING causes performance unpredictability. Our developers says ALLOW FILTERING clause is implicitly added on bunch of queries by spark-Cassandra connector and they cannot control it; however at the same time

Re: Questions about C* performance related to tombstone

2019-04-10 Thread Li, George
= 'C' AND assignment_id = 'A2';" be > affected too? > > For query "SELECT * FROM myTable WHERE course_id = 'C';", to workaround > the tombstone problem, we are thinking about not doing hard deletes, > instead doing soft deletes. So instead

Re: Questions about C* performance related to tombstone

2019-04-10 Thread Alok Dwivedi
", to workaround the >> tombstone problem, we are thinking about not doing hard deletes, instead >> doing soft deletes. So instead of doing "DELETE FROM myTable WHERE course_id >> = 'C' AND assignment_id = 'A1';", we do "UPDATE myTable SET activ

Re: Questions about C* performance related to tombstone

2019-04-09 Thread Jon Haddad
und the > tombstone problem, we are thinking about not doing hard deletes, instead > doing soft deletes. So instead of doing "DELETE FROM myTable WHERE course_id > = 'C' AND assignment_id = 'A1';", we do "UPDATE myTable SET active = false > WHERE cour

Re: How to monitor datastax driver compression performance?

2019-04-09 Thread Jon Haddad
tlp-stress has support for customizing payloads, but it's not documented very well. For a given data model (say the KeyValue one), you can override what tlp-stress will send over. By default it's pretty small, a handful of bytes. If you pass --field.keyvalue.value (the table name + the field nam

Questions about C* performance related to tombstone

2019-04-09 Thread Li, George
in the application, we do query "SELECT * FROM myTable WHERE course_id = 'C';" and filter out records that have "active" equal to "false". I am not really sure this would improve performance because C* still has to scan through all records with the partition
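The soft-delete idea described in this thread can be sketched roughly as below. This is only an illustration of the client-side filter, not code from the thread: the column names `course_id`, `assignment_id` and `active` come from the quoted queries, while the sample rows and the `active_rows` helper are hypothetical. It shows the point the poster raises: every row in the partition is still read and returned, and the filtering happens afterwards in the application.

```python
from typing import Dict, List


def active_rows(rows: List[Dict]) -> List[Dict]:
    """Client-side filter: keep rows whose 'active' flag is not False."""
    return [row for row in rows if row.get("active", True)]


# Hypothetical rows standing in for the result of:
#   SELECT * FROM myTable WHERE course_id = 'C';
partition_rows = [
    {"course_id": "C", "assignment_id": "A1", "active": False},  # soft-deleted
    {"course_id": "C", "assignment_id": "A2", "active": True},
    {"course_id": "C", "assignment_id": "A3", "active": True},
]

visible = active_rows(partition_rows)
print(len(partition_rows), "rows read,", len(visible), "rows kept")
```

The number of rows read does not shrink with soft deletes; only the number handed on to the rest of the application does.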

Re: How to monitor datastax driver compression performance?

2019-04-09 Thread Gabriel Giussi
Does tlp-stress allow us to define the size of rows? Because I will see the benefit of compression in terms of request rates only if the compression ratio is significant, i.e. requires fewer network round trips. This could be done by generating bigger partitions with parameters -n and -p, i.e. decreasing the -p

Re: How to monitor datastax driver compression performance?

2019-04-08 Thread Jon Haddad
If it were me, I'd look at raw request rates (in terms of requests / second as well as request latency), network throughput and then some flame graphs of both the server and your application: https://github.com/jvm-profiling-tools/async-profiler. I've created an issue in tlp-stress to add compress

How to monitor datastax driver compression performance?

2019-04-08 Thread Gabriel Giussi
Hi, I'm trying to test if adding driver compression will bring me any benefit. I understand that the trade-off is less bandwidth but increased CPU usage in both cassandra nodes (compression) and client nodes (decompression) but I want to know what are the key metrics and how to monitor them to prob

Re: Does long latency affect Cassandra's performance

2018-12-15 Thread Nitan Kainth
e not local quorum. > does the long latency will block coordinate and affect the performance? > > dayu > > > dayu > Email: sdycre...@163.com > Signature is customized by Netease Mail Master > > On 12/16/2018 02:00, Nitan Kainth wrote: > Dayu, > > If you

Re: Does long latency affect Cassandra's performance

2018-12-15 Thread dayu
Nitan, thanks for your reply. The new node, although in a different physical DC, is added to the same logical DC, so I mean I use QUORUM or ALL across three replicas, not LOCAL_QUORUM. Will the long latency block the coordinator and affect performance? dayu | | dayu Email: sdycre...@163.com | Signature is customized

Re: Does long latency affect Cassandra's performance

2018-12-15 Thread Nitan Kainth
> Hi all, > I am adding a new node to cassandra,but the new node is at different > DC,the ping latency between them is 2 ms. >I am wondering does the long latency would affect cluster's performance > or throughput? > > Thank! > dayu > > dayu > Email: sdycre...@163.c

Does long latency affect Cassandra's performance

2018-12-15 Thread dayu
Hi all, I am adding a new node to Cassandra, but the new node is in a different DC; the ping latency between them is 2 ms. I am wondering whether the long latency would affect the cluster's performance or throughput? Thanks! dayu | | dayu Email: sdycre...@163.com | Signature is customiz

Re: High IO and poor read performance on 3.11.2 cassandra cluster

2018-09-11 Thread Elliott Sims
cache, and it rarely goes to disk Load is low enough that the read I/O amplification doesn't hurt performance Less likely but still possible is that there's a subtle difference in the way that 2.1 does reads vs 3.x that's affecting it. The less subtle explanation is that 3.x h

Re: High IO and poor read performance on 3.11.2 cassandra cluster

2018-09-09 Thread Laxmikant Upadhyay
Thank you so much, Alexander! Your suspicion was right. It was due only to the very high readahead value (4 MB). Although we had set the readahead value to 8 KB in our /etc/rc.local, somehow this was not taking effect. We are keeping the value at 64 KB, as this is giving better performance than 8 KB

Re: High IO and poor read performance on 3.11.2 cassandra cluster

2018-09-04 Thread Alexander Dejanovski
>> -Simon >> >> *From:* Laxmikant Upadhyay >> *Date:* 2018-09-05 01:01 >> *To:* user >> *Subject:* High IO and poor read performance on 3.11.2 cassandra cluster >> >> We have 3 node cassandra cluster (3.11.2) in single dc. >> >> We

Re: High IO and poor read performance on 3.11.2 cassandra cluster

2018-09-04 Thread CPC
> *Subject:* High IO and poor read performance on 3.11.2 cassandra cluster > We have 3 node cassandra cluster (3.11.2) in single dc. > > We have written 450 million records on the table with LCS. The write > latency is fine. After write we perform read and update operations. >

Re: High IO and poor read performance on 3.11.2 cassandra cluster

2018-09-04 Thread wxn...@zjqunshuo.com
How large is your row? You may be hitting the wide-row read problem. -Simon From: Laxmikant Upadhyay Date: 2018-09-05 01:01 To: user Subject: High IO and poor read performance on 3.11.2 cassandra cluster We have 3 node cassandra cluster (3.11.2) in single dc. We have written 450 million records on

High IO and poor read performance on 3.11.2 cassandra cluster

2018-09-04 Thread Laxmikant Upadhyay
the read latency and IO usage are under control. However, when we perform read+update on 1 million old records that are part of the 450 million records, we observe high read latency (performance drops by a factor of 4 compared with the first case). We have not observed major GC pauses. *system information

Re: Improve data load performance

2018-08-15 Thread Elliott Sims
>> things up. >> >> >> >> >> >> Sean Durity >> >> >> >> *From:* Elliott Sims >> *Sent:* Wednesday, August 15, 2018 1:13 PM >> *To:* user@cassandra.apache.org >> *Subject:* [EXTERNAL] Re: Improve data load performance

Re: Improve data load performance

2018-08-15 Thread Abdul Patel
latency?) > > > > On the client side, prepared statements and ExecuteAsync can really speed > things up. > > > > > > Sean Durity > > > > *From:* Elliott Sims > *Sent:* Wednesday, August 15, 2018 1:13 PM > *To:* user@cassandra.apache.org > *Subject

RE: [EXTERNAL] Re: Improve data load performance

2018-08-15 Thread Durity, Sean R
client side, prepared statements and ExecuteAsync can really speed things up. Sean Durity From: Elliott Sims Sent: Wednesday, August 15, 2018 1:13 PM To: user@cassandra.apache.org Subject: [EXTERNAL] Re: Improve data load performance Step one is always to measure your bottlenecks. Are you

Re: Improve data load performance

2018-08-15 Thread Abdul Patel
I didn't see any such bottlenecks; they are testing writing a JSON file as text into Cassandra, which is slow. The rest of the performance looks good. Regarding write threads, where can I check how many are configured and whether there is a bottleneck? On Wednesday, August 15, 2018, Elliott Sims wrote: > Step

Re: Improve data load performance

2018-08-15 Thread Elliott Sims
> >> I hope I am making it clear. Don't take it personally. >> >> Thanks >> >> On Wed, Aug 15, 2018 at 8:25 AM Abdul Patel wrote: >> >>> How can we improve data load performance? >> >>

Re: Improve data load performance

2018-08-15 Thread Abdul Patel
> > Thanks > > On Wed, Aug 15, 2018 at 8:25 AM Abdul Patel wrote: > >> How can we improve data load performance? > >

Re: Improve data load performance

2018-08-14 Thread @Nandan@
you getting and all.. I hope I am making it clear. Don't take it personally. Thanks On Wed, Aug 15, 2018 at 8:25 AM Abdul Patel wrote: > How can we improve data load performance?

Improve data load performance

2018-08-14 Thread Abdul Patel
How can we improve data load performance?

Re: Performance impact of using NetworkTopology with 3 node cassandra cluster in One DC

2018-08-02 Thread Kyrylo Lebediev
ich may cause "hot spots" in your cluster = performance impact), all racks should have the same number of nodes with approximately the same capacity. Also, sometimes CL=QUORUM isn't used correctly and CL=LOCAL_QUORUM should be used instead. There are no differences between the tw
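When comparing QUORUM with LOCAL_QUORUM, the standard quorum arithmetic may help: a quorum is floor(RF / 2) + 1, computed over the sum of replication factors of all DCs for QUORUM and over the local DC only for LOCAL_QUORUM. The sketch below is a minimal illustration of that formula; the DC names and replication factors are hypothetical, not taken from the thread.

```python
from typing import Dict


def quorum_size(rf_per_dc: Dict[str, int]) -> int:
    """Replicas needed for QUORUM: majority of all replicas in all DCs."""
    return sum(rf_per_dc.values()) // 2 + 1


def local_quorum_size(rf_per_dc: Dict[str, int], local_dc: str) -> int:
    """Replicas needed for LOCAL_QUORUM: majority of replicas in one DC."""
    return rf_per_dc[local_dc] // 2 + 1


# Hypothetical single-DC keyspace with RF=3.
single_dc = {"dc1": 3}
# Hypothetical two-DC keyspace with RF=3 in each DC.
two_dcs = {"dc1": 3, "dc2": 3}

print(quorum_size(single_dc), local_quorum_size(single_dc, "dc1"))
print(quorum_size(two_dcs), local_quorum_size(two_dcs, "dc1"))
```

With a single DC both levels need the same number of replicas; the counts only diverge once a second DC holds replicas.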

Performance impact of using NetworkTopology with 3 node cassandra cluster in One DC

2018-08-02 Thread Murtaza Talwari
NetworkTopology as replication strategy they might have multiple DataCenters configured. In our case we have only one DataCenter, * With that using the NetworkTopology as replication strategy will it cause any performance impact ? * As we are using QUORUM as Read/Write consistency which is

Re: Write performance degradation

2018-06-18 Thread onmstester onmstester
I think that may have pinpointed the problem: I have a table with a partition key related to the timestamp, so for one hour a large amount of data is inserted into one single node. This table creates very big partitions (300MB-600MB); whatever node the current partition of that table would be inserted to

Re: Write performance degradation

2018-06-18 Thread DuyHai Doan
Maybe the disk I/O cannot keep up with the high mutation rate ? Check the number of pending compactions On Sun, Jun 17, 2018 at 9:24 AM, onmstester onmstester wrote: > Hi, > > I was doing 500K inserts + 100K counter update in seconds on my cluster of > 12 nodes (20 core/128GB ram/4 * 600 HDD 10

Write performance degradation

2018-06-17 Thread onmstester onmstester
Hi, I was doing 500K inserts + 100K counter updates in seconds on my cluster of 12 nodes (20 core/128GB ram/4 * 600 HDD 10K) using batch statements with no problem. I saw a lot of warnings showing that most of the batches do not target a single node, so they should not be in a batch; on the other h
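The warnings mentioned here concern batches whose statements span several partitions. One common client-side response, sketched below under stated assumptions, is to group rows by partition key first so that each batch touches a single partition. The `sensor_id` key, the sample rows and the `group_by_partition` helper are hypothetical; the thread does not show the actual schema or batching code.

```python
from collections import defaultdict
from typing import Dict, Hashable, List


def group_by_partition(rows: List[Dict], partition_key: str) -> Dict[Hashable, List[Dict]]:
    """Group rows so that every group shares one partition key value."""
    groups = defaultdict(list)
    for row in rows:
        groups[row[partition_key]].append(row)
    return dict(groups)


# Hypothetical rows waiting to be written.
rows = [
    {"sensor_id": "s1", "value": 1},
    {"sensor_id": "s2", "value": 2},
    {"sensor_id": "s1", "value": 3},
]

# Each value in `batches` could be sent as one single-partition batch.
batches = group_by_partition(rows, "sensor_id")
for key, group in batches.items():
    print(key, len(group))
```

Whether batching per partition beats individual asynchronous writes depends on the workload, so this is a starting point to measure rather than a recommendation.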

Re: Time Series schema performance

2018-05-30 Thread Haris Altaf
> timestamp timestamp, >> > metricName1 BigInt, >> > metricName2 BigInt. >> > ... >> > . >> > metricName300 BigInt, >> >> > Primary Key (( day, bucketid ) , id, timestamp) >> > ) >> >> > BucketId is just a

Re: Time Series schema performance

2018-05-29 Thread sujeet jog
> > bucketid Int, > > date date, > > timestamp timestamp, > > metricName1 BigInt, > > metricName2 BigInt. > > ... > > . > > metricName300 BigInt, > > > Primary Key (( day, bucketid ) , id, timestamp) > > ) > > > BucketId is

Re: Time Series schema performance

2018-05-29 Thread Affan Syed
>> > date date, >> > timestamp timestamp, >> > metricName1 BigInt, >> > metricName2 BigInt. >> > ... >> > . >> > metricName300 BigInt, >> >> > Primary Key (( day, bucketid ) , id, timestamp) >> > ) >

Re: Time Series schema performance

2018-05-29 Thread Haris Altaf
hema 1 ) > > > create table ( > > id timeuuid, > > bucketid Int, > > date date, > > timestamp timestamp, > > metricName1 BigInt, > > metricName2 BigInt. > > ... > > . > > metricName300 BigInt, > > > Primary Key (( day, buck

Re: Time Series schema performance

2018-05-29 Thread Jonathan Haddad
t a murmur3 hash of the id which acts as a splitter to group id's in a partition > Pros : - > Efficient write performance, since data is written to minimal partitions > Cons : - > While the first schema works best when queried programmatically, but is a bit inflexible I

Re: Time Series schema performance

2018-05-29 Thread Jeff Jirsa
... > . > metricName300 BigInt, > > Primary Key (( day, bucketid ) , id, timestamp) > ) > > BucketId is just a murmur3 hash of the id which acts as a splitter to group > id's in a partition > > > Pros : - > > Efficient write performance, sinc

Time Series schema performance

2018-05-29 Thread sujeet jog
is just a murmur3 hash of the id which acts as a splitter to group id's in a partition Pros : - Efficient write performance, since data is written to minimal partitions Cons : - The first schema works best when queried programmatically, but is a bit inflexible if it has to be integrated
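The bucketing scheme described in this post (a hash of the id used as a splitter, combined with the day into the partition key) could be sketched as follows. This is an assumption-laden illustration: the post says murmur3, but Python's standard library has no murmur3, so BLAKE2b truncated to 64 bits is used here as a stand-in hash, and `NUM_BUCKETS` is a hypothetical value the thread does not specify.

```python
import hashlib
import uuid
from datetime import date
from typing import Tuple

NUM_BUCKETS = 16  # hypothetical; the thread does not state a bucket count


def bucket_id(row_id: uuid.UUID, num_buckets: int = NUM_BUCKETS) -> int:
    """Map an id to a stable bucket number.

    Stand-in hash: BLAKE2b truncated to 8 bytes, NOT the murmur3 hash
    mentioned in the post.
    """
    digest = hashlib.blake2b(row_id.bytes, digest_size=8).digest()
    return int.from_bytes(digest, "big") % num_buckets


def partition_key(day: date, row_id: uuid.UUID) -> Tuple[date, int]:
    """Build the (day, bucketid) composite partition key from the schema."""
    return (day, bucket_id(row_id))


example_id = uuid.UUID("12345678-1234-5678-1234-567812345678")
key = partition_key(date(2018, 5, 29), example_id)
print(key[0].isoformat(), 0 <= key[1] < NUM_BUCKETS)
```

Because the bucket is derived from the id alone, all rows for one id on one day land in the same partition, while different ids spread over at most `NUM_BUCKETS` partitions per day.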

Re: cassandra concurrent read performance problem

2018-05-28 Thread Alain RODRIGUEZ
Hi, Would you share some more context with us? - What Cassandra version do you use? - What is the data size per node? - How much RAM does the hardware have? - Does your client use paging? A few ideas to explore: - Try tracing the query, see what's taking time (and resources) - From the tracing,

cassandra concurrent read performance problem

2018-05-26 Thread onmstester onmstester
While reading 90 partitions concurrently (each larger than 200 MB), my single-node Apache Cassandra became unresponsive; no reads or writes worked for almost 10 minutes. I'm using these configs: memtable_allocation_type: offheap_buffers gc: G1GC heap: 128GB concurrent_reads: 128 (having more tha

Re: performance on reading only the specific nonPk column

2018-05-21 Thread sujeet jog
Thanks Kurt, that answers my question. @nandan, id, timestamp ensures unique primary-key. On Mon, May 21, 2018 at 2:23 PM, kurt greaves wrote: > Every column will be retrieved (that's populated) from disk and the > requested column will then be sliced out in memory and sent back. > > On 21 M
