ZSTD support for over-the-wire compression?

2025-03-27 Thread Steinmaurer, Thomas via user
Hello, any thoughts / reasons why ZSTD support for over-the-wire compression has been left out, despite ZSTD being available for SSTable on-disk compression? Is this intentional, or is there a technical reason behind it? Created https://issues.apache.org/jira/browse/CASSANDRA-20488, with the main

Re: Change the compression algorithm on a production table at runtime

2022-09-19 Thread C. Scott Andreas
You can switch to the Zstandard codec once you've upgraded to Cassandra 4.0+, which offers a great balance between compression ratio and throughput. On Sep 19, 2022, at 11:34 PM, Eunsu Kim wrote: Hi all, according to https://docs.datastax.com/en/cql-oss/3.3/cql/cql_reference/cqlAlterTable.
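
A minimal sketch of that switch on 4.0+ (keyspace and table names are made up, and the compression_level shown is just an illustration). New SSTables pick up the codec immediately, while existing SSTables keep their old codec until they are rewritten, e.g. by compaction or nodetool upgradesstables:

    ALTER TABLE my_keyspace.events
      WITH compression = {
        'class': 'ZstdCompressor',      -- ships with Cassandra 4.0+
        'compression_level': 3          -- higher levels trade CPU for a better ratio
      };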

Change the compression algorithm on a production table at runtime

2022-09-19 Thread Eunsu Kim
Is it risky to change the compression algorithm on an existing table in production? Currently the table is using DeflateCompressor, but I want to change it to LZ4Compressor (for performance). Cassandra version is 3.11.10. Thank you in advance.

Re: Using zstd compression on Cassandra 3.x

2022-09-12 Thread onmstester onmstester via user
I patched this on 3.11.2 easily: 1. build the jar file from source and put it in the cassandra/lib directory, 2. restart the Cassandra service, 3. alter the table to use zstd compression and rebuild the SSTables. But that was at a time when 4.0 was not available yet, and after that I upgraded to 4.0 immediately.
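
A rough sketch of step 3, assuming the third-party compressor jar is already in cassandra/lib; the fully-qualified class name below is only a placeholder, not the actual class from the project linked in this thread:

    ALTER TABLE my_keyspace.events
      WITH compression = {'class': 'com.example.zstd.ZstdCompressor'};  -- placeholder class name
    -- then rewrite existing SSTables, e.g.: nodetool upgradesstables -a my_keyspace events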

Re: Using zstd compression on Cassandra 3.x

2022-09-12 Thread Eunsu Kim
If you absolutely cannot upgrade, you can extract the implementation from 4.0 and use it. I would advise against this path though, as the zstd implementation is nuanced. (Dinesh) On Sep 12, 2022, at 7:09 PM, Eunsu Kim wrote: Hi all, since zstd comp

Re: Using zstd compression on Cassandra 3.x

2022-09-12 Thread Dinesh Joshi
Dinesh. On Sep 12, 2022, at 7:09 PM, Eunsu Kim wrote: Hi all, zstd is a very good compression algorithm (the overall performance and ratio are excellent) and it is available in Cassandra 4.0. There is open source available fo

Using zstd compression on Cassandra 3.x

2022-09-12 Thread Eunsu Kim
Hi all, zstd is a very good compression algorithm (the overall performance and compression ratio are excellent), and it is available in Cassandra 4.0. There is open source available for Cassandra 3.x: https://github.com/MatejTymes/cassandra-zstd. Do you have any experience applying this

Re: Cassandra on ZFS: disable compression?

2021-02-10 Thread Johnny Miller
I have done this several times, i.e. disabling compression at the table level. Never had any issues. On Wed, 27 Jan 2021 at 01:37, Elliott Sims wrote: The main downside I see is that you're hitting a less-tested codepath. I think very few installations have compression disabled to

Re: Cassandra on ZFS: disable compression?

2021-02-09 Thread Bowen Song
I'm running Cassandra 3.11 on ZFS on Linux. I use ZFS compression and have disabled Cassandra's SSTable compression. I can't comment on the possible pros you've mentioned, because I didn't do enough tests on them. I know the compression ratio is marginally
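
A minimal sketch of disabling SSTable compression on one table so that ZFS handles compression instead (keyspace and table names are hypothetical); on 3.0+ the 'enabled' flag is the usual switch, and SSTables written before the change stay compressed until they are rewritten:

    ALTER TABLE my_keyspace.events
      WITH compression = {'enabled': 'false'};
    -- optionally rewrite existing SSTables afterwards: nodetool upgradesstables -a my_keyspace events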

Re: Cassandra on ZFS: disable compression?

2021-01-26 Thread Elliott Sims
The main downside I see is that you're hitting a less-tested codepath. I think very few installations have compression disabled today. On Mon, Jan 25, 2021 at 7:06 AM Lapo Luchini wrote: Hi, I'm using a fairly standard install of Cassandra 3.11 on FreeBSD 12, by

Cassandra on ZFS: disable compression?

2021-01-25 Thread Lapo Luchini
Would it be a nice idea to disable Cassandra compression and rely only on the ZFS one? In principle I can see some pros: 1. it's done in the kernel, so it might be slightly faster; 2. it can (probably) compress more data, as I see a 1.02 compression factor on the filesystem even though I have compress

Re: How to monitor datastax driver compression performance?

2019-04-09 Thread Jon Haddad
If you can overlook the poor docs in the area, I think it might meet your needs. Regarding compression at the query level vs not, I think you should look at the overhead first. I'm betting you'll find it's insignificant. That said, you can always create two cluster objects with two radically different s

Re: How to monitor datastax driver compression performance?

2019-04-09 Thread Gabriel Giussi
Does tlp-stress allow us to define the size of rows? Because I will see the benefit of compression in terms of request rates only if the compression ratio is significant, i.e. it requires fewer network round trips. This could be done by generating bigger partitions with the -n and -p parameters, i.e. decreasing the

Re: How to monitor datastax driver compression performance?

2019-04-08 Thread Jon Haddad
I opened an issue on tlp-stress to add compression options for the driver: https://github.com/thelastpickle/tlp-stress/issues/67. If you're interested in contributing the feature, I think tlp-stress will more or less solve the remainder of the problem for you (the load part, not the OS numbers). Jon

How to monitor datastax driver compression performance?

2019-04-08 Thread Gabriel Giussi
Hi, I'm trying to test whether adding driver compression will bring me any benefit. I understand that the trade-off is less bandwidth but increased CPU usage on both Cassandra nodes (compression) and client nodes (decompression), but I want to know what the key metrics are and how to monitor th

Re: SSTable Compression Ratio -1.0

2018-08-28 Thread Vitaliy Semochkin
Thank you ZAIDI, can you please explain why the mentioned ratio is negative? On Tue, Aug 28, 2018 at 8:18 PM ZAIDI, ASAD A wrote: Compression ratio is the ratio of the compressed size to the original size; smaller is better. See it like compressed/uncompressed; 1 would mean no change i

RE: SSTable Compression Ratio -1.0

2018-08-28 Thread ZAIDI, ASAD A
Compression ratio is the ratio of the compressed size to the original size; smaller is better. See it like compressed/uncompressed; 1 would mean no change in size after compression!
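
A quick worked example of that formula, with made-up numbers:

    compression ratio = compressed size / original size
    e.g. 33 MB on disk / 100 MB of raw data  ->  ratio of ~0.33

As far as I know, the -1.0 in the original question is just the placeholder nodetool prints when there is nothing to compute a ratio from (for example, a table with no compressed SSTables); it is not a real ratio.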

SSTable Compression Ratio -1.0

2018-08-28 Thread Vitaliy Semochkin
Hello, nodetool tablestats my_keyspace returns SSTable Compression Ratio -1.0. Can someone explain what -1.0 means? Regards, Vitaliy

Re: Compression Tuning Tutorial

2018-08-13 Thread Kyrylo Lebediev
Thank you, Jon! On Thursday, August 9, 2018, Jonathan Haddad wrote: There's a discussion about direct I/O here you might find interesting: https://issues.apache.org/jira/browse/CASSANDRA-144

Re: Compression Tuning Tutorial

2018-08-09 Thread Jonathan Haddad
…stack wasn't implemented, except for lack of resources to do this? Regards, Kyrill

Re: Compression Tuning Tutorial

2018-08-09 Thread Kyrylo Lebediev
…wasn't implemented, except for lack of resources to do this? Regards, Kyrill. On Wednesday, August 8, 2018, Eric Plowe wrote: Great post, Jonathan! Thank you very much. ~Eric

Re: Compression Tuning Tutorial

2018-08-08 Thread Eric Plowe
Great post, Jonathan! Thank you very much. ~Eric On Wed, Aug 8, 2018 at 2:34 PM Jonathan Haddad wrote: Hey folks, we've noticed a lot over the years that people usually create tables leaving the default compression parameters, and have spent a lot of time helping

Compression Tuning Tutorial

2018-08-08 Thread Jonathan Haddad
Hey folks, we've noticed a lot over the years that people usually create tables leaving the default compression parameters, and we have spent a lot of time helping teams figure out the right settings for their cluster based on their workload. I finally managed to write some thoughts down along
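
One example of the kind of tuning the post covers (table name and values are only illustrative): on read-heavy tables with small rows, shrinking the compression chunk size reduces how much data must be decompressed per point read, usually at the cost of a slightly worse ratio:

    ALTER TABLE my_keyspace.users
      WITH compression = {
        'class': 'LZ4Compressor',
        'chunk_length_in_kb': 4    -- much smaller than the long-standing 64 KB default
      };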

Re: Merging cells in compaction / compression?

2016-08-05 Thread Jonathan Haddad
…Duy Hai was suggesting Spark Streaming, which gives you the tools to build exactly what you asked for: a custom compression system for packing batches of values for a partition into an optimized byte array.

Re: Merging cells in compaction / compression?

2016-08-05 Thread Jonathan Haddad
I think Duy Hai was suggesting Spark Streaming, which gives you the tools to build exactly what you asked for: a custom compression system for packing batches of values for a partition into an optimized byte array. On Fri, Aug 5, 2016 at 7:46 AM Michael Burman wrote: Hi, for sto

Re: Merging cells in compaction / compression?

2016-08-05 Thread Michael Burman
…wouldn't it make sense to also improve the storage efficiency? One of Cassandra 3.x's key improvements was the improved storage engine, but it's still far away from being efficient with time series data. Efficient compression methods for both floating points & int

Re: Merging cells in compaction / compression?

2016-08-05 Thread Jonathan Haddad
Hadoop and Cassandra have very different use cases. If the ability to write a custom compression system is the primary factor in how you choose your database, I suspect you may run into some trouble. Jon. On Fri, Aug 5, 2016 at 6:14 AM Michael Burman wrote: Hi, as Spark is an e

Re: Merging cells in compaction / compression?

2016-08-05 Thread Michael Burman
…implement a compression method. Might as well run Hadoop. - Micke. On Thursday, August 4, 2016, DuyHai Doan wrote: Looks like you're asking for some sort of ETL

Re: Merging cells in compaction / compression?

2016-08-04 Thread DuyHai Doan
Stevens" > To: user@cassandra.apache.org > Sent: Thursday, August 4, 2016 10:26:30 PM > Subject: Re: Merging cells in compaction / compression? > > When you say merge cells, do you mean re-aggregating the data into courser > time buckets? > > On Thu, Aug 4, 2016 at

Re: Merging cells in compaction / compression?

2016-08-04 Thread Michael Burman
…On Thursday, August 4, 2016, Eric Stevens wrote: When you say merge cells, do you mean re-aggregating the data into coarser time buckets? On Thu, Aug 4, 2016 at 5:59 AM Michael Burman wrote: Hi, considering the following example structur

Re: Merging cells in compaction / compression?

2016-08-04 Thread Eric Stevens
…timestamp, PRIMARY KEY((metric), time)) WITH CLUSTERING ORDER BY (time DESC). The natural inserting order is metric, value, timestamp pairs, one metric/value pair per second for example. That means creating more and more cells in the same partition, which creates a large amount

Merging cells in compaction / compression?

2016-08-04 Thread Michael Burman
…That means creating more and more cells in the same partition, which creates a large amount of overhead and reduces the compression ratio of LZ4 & Deflate (LZ4 reaches ~0.26 and Deflate ~0.10 ratios in some of the examples I've run). Now, to improve the compression ratio, how could I merge the
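
For reference, a sketch of the table being discussed, reconstructed from the fragments quoted in this thread (the keyspace name and column types are assumptions):

    CREATE TABLE metrics.samples (
        metric text,
        time   timestamp,
        value  double,
        PRIMARY KEY ((metric), time)
    ) WITH CLUSTERING ORDER BY (time DESC);

With one metric/value pair inserted per second, each partition keeps accumulating cells, which is what drives the per-cell overhead and the LZ4/Deflate ratios mentioned above.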

Re: compression cpu overhead

2015-11-04 Thread Dan Kinder
To clarify: writes have no *immediate* CPU cost from adding the write to the memtable; the compression overhead is paid when writing out a new SSTable (whether from flushing a memtable or compacting), correct? So it sounds like when reads >> writes, Tushar's comments a

Re: compression cpu overhead

2015-11-03 Thread Jon Haddad
…negligible. For reads it makes a significant difference for a high-TPS, low-latency workload. You would see up to 3x higher CPU with LZ4 vs no compression. It would be different for different h/w configurations. Thanks, Tushar

Re: compression cpu overhead

2015-11-03 Thread Tushar Agrawal
For writes it's negligible. For reads it makes a significant difference for a high-TPS, low-latency workload. You would see up to 3x higher CPU with LZ4 vs no compression. It would be different for different h/w configurations. Thanks, Tushar

Re: compression cpu overhead

2015-11-03 Thread Dan Kinder
…should make some difference; I didn't immediately find perf numbers though. On Nov 3, 2015, at 5:42 PM, Dan Kinder wrote: Hey all, just wondering if anyone has seen or done any benchmarking of the actual CPU overhead added by various compression algorith

Re: compression cpu overhead

2015-11-03 Thread Graham Sanderson
…Just wondering if anyone has seen or done any benchmarking of the actual CPU overhead added by various compression algorithms in Cassandra (at least LZ4) vs no compression. Clearly this is going to be workload dependent, but even a rough gauge would b

compression cpu overhead

2015-11-03 Thread Dan Kinder
Hey all, just wondering if anyone has seen or done any benchmarking of the actual CPU overhead added by various compression algorithms in Cassandra (at least LZ4) vs no compression. Clearly this is going to be workload dependent, but even a rough gauge would be helpful (ex. "Turning o

Re: Cassandra Data Stax java driver & Snappy Compression library

2015-08-05 Thread Janne Jalkanen
I’ve never used Astyanax, so it’s difficult to say, but if you can find the snappy-java in the classpath, it’s quite possible that compression is enabled for S1 and S2 automatically. You could try removing the snappy jar from S1 and see if that changes the latencies compared to S2. ;-) It

Re: Cassandra Data Stax java driver & Snappy Compression library

2015-08-04 Thread Sachin Nikam
…Services S1 and S2 is 50% of the write latency (tp99) for Service S3. I also noticed that S1 and S2, which use the Astyanax client library, also have compress-lzf.jar on their classpath, although the table is defined to use Snappy compression. Is this compression library, or some other

Re: Cassandra Data Stax java driver & Snappy Compression library

2015-08-04 Thread Sachin Nikam
…although the table is defined to use Snappy compression. Is this compression library, or some other transitive dependency pulled in by Astyanax, enabling compression of the payload sent over the wire, and does that account for the difference in tp99? Regards, Sachin

Re: Cassandra Data Stax java driver & Snappy Compression library

2015-08-03 Thread Janne Jalkanen
…production environments, if you choose not to enable compression now. /Janne. On 3 Aug 2015, at 08:40, Sachin Nikam wrote: Thanks Janne... To clarify, Service S3 should not run into any issues and I may choose not to fix the issue? Regards, Sachin

Re: Cassandra Data Stax java driver & Snappy Compression library

2015-08-02 Thread Sachin Nikam
…disk. To fix, make sure that the Snappy libraries are in the classpath of your S3 service application. As always, there's no guarantee that this improves your performance, since if your app is already CPU-heavy, the extra CPU overhead of compression *may* be a problem. So

Re: Cassandra Data Stax java driver & Snappy Compression library

2015-08-01 Thread Janne Jalkanen
…the extra CPU overhead of compression *may* be a problem. So measure :-) /Janne. On 02 Aug 2015, at 02:17, Sachin Nikam wrote: I am currently running a Cassandra 1.2 cluster. This cluster has 2 tables, i.e. TableA and TableB. TableA is read and written to by Service

Cassandra Data Stax java driver & Snappy Compression library

2015-08-01 Thread Sachin Nikam
…=com.datastax.driver.core.FrameCompressor; Cannot find Snappy class, you should make sure the Snappy library is in the classpath if you intend to use it. Snappy compression will not be available for the protocol. *** My questions are as follows: #1. Does the compression happen on the Cassandra client side or within

Re: Client-side compression, cassandra or both?

2014-11-03 Thread graham sanderson
I wouldn’t do both. Unless a little server CPU or disk space is an issue (and you’d have to measure it; I imagine it is probably not significant, since as you say C* has more context, and hopefully most things can compress “0, “ repeatedly), I wouldn’t bother compressing yourself. Compression

Re: Client-side compression, cassandra or both?

2014-11-03 Thread DuyHai Doan
Hello Robin, you have many options for compression in C*: 1) serialize in bytes instead of JSON, to save a lot of the space lost to String encoding; of course the data will then be opaque and not human readable. 2) Activate client-to-node data compression; in this case, do not forget to ship the LZ4 or Snappy

Client-side compression, cassandra or both?

2014-11-03 Thread Robin Verlangen
Hi there, we're working on a project which is going to store a lot of JSON objects in Cassandra. A large piece of this (90%) consists of an array of integers, which in a lot of cases contains a bunch of zeroes. The average JSON is 4 KB in size, and once GZIP (default compression) just

Re: Compression during bootstrap

2014-08-18 Thread Ruchir Jha
On Wednesday, August 13, 2014, Robert Coli wrote: On Wed, Aug 13, 2014 at 5:53 AM, Ruchir Jha wrote: We are adding nodes currently and it seems like compression is falling behind. I judge that by the fact that the new node, which has a 4.5T disk

Re: Compression during bootstrap

2014-08-13 Thread Robert Coli
On Wed, Aug 13, 2014 at 5:53 AM, Ruchir Jha wrote: We are adding nodes currently and it seems like compression is falling behind. I judge that by the fact that the new node, which has a 4.5T disk, fills up to 100% while it's bootstrapping. Can we avoid this problem with the LZ

Compression during bootstrap

2014-08-13 Thread Ruchir Jha
Hello, we are currently at C* 1.2 and are using the SnappyCompressor for all our CFs. Total data size is 24 TB, and it's a 12-node cluster. Avg node size is 2 TB. We are adding nodes currently and it seems like compression is falling behind. I judge that by the fact that the new node, which has

Re: SSTable compression ratio… percentage or 0.0 -> 1.0???

2014-06-30 Thread Robert Coli
…I can't find documentation on this... SSTable Compression Ratio: 0.31685324166491696. One entry on the DataStax si

Re: SSTable compression ratio… percentage or 0.0 -> 1.0???

2014-06-29 Thread Jack Krupansky
…SSTable compression ratio… percentage or 0.0 -> 1.0??? I can't find documentation on this... SSTable Compression Ratio: 0.31685324166491696. One entry on the DataStax site says that it's the "percentage", but is it 0.31% or 31%? I think it's 31% … but I don't

SSTable compression ratio… percentage or 0.0 -> 1.0???

2014-06-28 Thread Kevin Burton
I can't find documentation on this... SSTable Compression Ratio: 0.31685324166491696. One entry on the DataStax site says that it's the "percentage", but is it 0.31% or 31%? I think it's 31% … but I don't see where this is specified.

RE: Turn off compression (1.2.11)

2014-02-23 Thread Plotnik, Alexey
…Sent: 19 February 2014 10:21. I am new and trying to learn Cassandra. Based on my understanding of the problem, almost 2 GB of heap is taken up just for the compression headers. And 100 MB per SSTable, with about 30,000

Re: Turn off compression (1.2.11)

2014-02-18 Thread Robert Coli
On Tue, Feb 18, 2014 at 2:51 PM, Plotnik, Alexey wrote: My SSTable size is 100 MB. Last time I removed the leveled manifest, compaction was running for 3 months. At 3 TB per node, you are at, and probably exceeding, the maximum size anyone suggests for Cassandra 1.2.x. Add more nodes? =Rob

Re: Turn off compression (1.2.11)

2014-02-18 Thread Yogi Nerella
I am new and trying to learn Cassandra. Based on my understanding of the problem, almost 2 GB of heap is taken up just for the compression headers. And 100 MB per SSTable, with about 30,000 files, gives about 3 TB of data? What is the hardware and memory configuration you are using to provide this

RE: Turn off compression (1.2.11)

2014-02-18 Thread Plotnik, Alexey
Compression buffers are located in the heap; I saw them in a heap dump. That is: public class CompressedRandomAccessReader extends RandomAccessReader { ... private ByteBuffer compressed; // <-- THAT IS

RE: Turn off compression (1.2.11)

2014-02-18 Thread Plotnik, Alexey
My SSTable size is 100 MB. The last time I removed the leveled manifest, compaction was running for 3 months. On 19 February 2014, Robert Coli wrote: On Mon, Feb 17, 2014 at 4:35 PM, Plotnik

Re: Turn off compression (1.2.11)

2014-02-18 Thread Edward Capriolo
Personally I think having compression on by default is the wrong choice. Depending on your access patterns and row sizes, the overhead of compression can create more garbage collection and become your bottleneck before you potentially bottleneck your disk (SSD). On Tue, Feb 18, 2014 at 2:23

Re: Turn off compression (1.2.11)

2014-02-18 Thread Robert Coli
On Mon, Feb 17, 2014 at 4:35 PM, Plotnik, Alexey wrote: After analyzing the heap I saw this buffer has a size of about 70 KB per SSTable. I have more than 30K SSTables per node. I'm thinking your problem is not compression, it's using the old 5 MB default for Leveled Compactio

Turn off compression (1.2.11)

2014-02-17 Thread Plotnik, Alexey
Each compressed SSTable uses an additional transfer buffer in its CompressedRandomAccessReader instance. After analyzing the heap I saw this buffer has a size of about 70 KB per SSTable. I have more than 30K SSTables per node. I want to turn off compression for this column family to save some heap. How can
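
A sketch of what that change could look like on a 1.2-era cluster, where an empty sstable_compression disables compression (keyspace and column family names are hypothetical); SSTables written before the change stay compressed, and so keep their buffers, until they are rewritten, e.g. via scrub or upgradesstables:

    ALTER TABLE my_keyspace.my_cf
      WITH compression = {'sstable_compression': ''};   -- empty class name turns compression off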

Re: reads and compression

2013-11-29 Thread Edward Capriolo
The big * in the explanation: a smaller file-size footprint leads to better use of the disk cache; however, decompression adds work for the JVM to do and increases the churn of objects in the JVM. Additionally, compression block sizes might be 4 KB while for some use cases a small row may be 200 bytes. This means

Re: reads and compression

2013-11-29 Thread Artur Kronenberg
Hi John, I am trying again :) The way I understand it, compression gives you the advantage of using way less IO at the cost of more CPU. The bottleneck for reads is usually the IO time needed to read the data from disk. As a figure, we had about 25 reads/s reading from disk, while

reads and compression

2013-11-28 Thread John Sanda
This article [1] cites gains in read performance that can be achieved when compression is enabled. The more I thought about it, even after reading the DataStax docs about reads [2], I realized I do not understand how compression improves read performance. Can someone provide some details on this? Is the

Re: about compression enabled by default in Cassandra 1.1.

2013-10-22 Thread Tyler Hobbs
On Tue, Oct 22, 2013 at 9:29 AM, DE VITO Dominique wrote: Does compression work for whatever column value type? In all cases? For example, if my CF has columns with a value type of byte[] (or "blob" w

about compression enabled by default in Cassandra 1.1.

2013-10-22 Thread DE VITO Dominique
Hi, does compression work for whatever column value type, in all cases? For example, if my CF has columns with a value type of byte[] (or "blob" when speaking CQL), does C* still do compression? Thanks. Regards, Dominique

How can someone change compression type on system keyspace

2013-10-13 Thread David Beneš
Hello, I'm trying to change compression on the keyspace "system" from Snappy to LZ4 in Cassandra 1.2.10. This keyspace is read-only, and the command: USE system; ALTER TABLE "hints" WITH compression = { 'sstable_compression' : 'LZ4Compressor' }; fails w

Re: sstable compression

2013-09-12 Thread Robert Coli
On Thu, Sep 12, 2013 at 2:13 AM, Christopher Wirt wrote: I would like to switch to using LZ4 compression for my SSTables. Would simply altering the table definition mean that all newly written tables are LZ4 and can live in harmony with the existing Snappy SSTables? Yes, pe
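
A sketch of that change using the 1.2-era option name (keyspace and table names are illustrative); after the ALTER, newly written SSTables use LZ4 while the existing Snappy SSTables remain readable alongside them until they are rewritten:

    ALTER TABLE my_keyspace.events
      WITH compression = {'sstable_compression': 'LZ4Compressor'};
    -- optionally rewrite existing SSTables with the new codec: nodetool upgradesstables my_keyspace events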

sstable compression

2013-09-12 Thread Christopher Wirt
I currently use Snappy for my SSTable compression on Cassandra 1.2.8. I would like to switch to using LZ4 compression for my SSTables. Would simply altering the table definition mean that all newly written tables are LZ4 and can live in harmony with the existing Snappy SSTables? Then

Re: Compression ratio

2013-07-12 Thread cem
…cem wrote: Hi All, can anyone explain the compression ratio? Is it "compressed data / original" or "original / compressed"? Or something else? Thanks a lot. Best Regards, Cem

Re: Compression ratio

2013-07-12 Thread Yuki Morishita
It's compressed/original. https://github.com/apache/cassandra/blob/cassandra-1.1.11/src/java/org/apache/cassandra/io/sstable/SSTableMetadata.java#L124 On Fri, Jul 12, 2013 at 10:02 AM, cem wrote: Hi All, can anyone explain the compression ratio? Is it the "c

Compression ratio

2013-07-12 Thread cem
Hi all, can anyone explain the compression ratio? Is it "compressed data / original" or "original / compressed"? Or something else? Thanks a lot. Best Regards, Cem

Re: Cassanrda 1.1.11 compression: how to tell if it works ?

2013-05-10 Thread Robert Coli
On Thu, May 9, 2013 at 3:38 PM, aaron morton wrote: At what point does compression start? It starts for new SSTables created after the schema was altered. @OP: If you want to compress all existing SSTables, use "upgradesstables" or "cleanup", both of which

Re: Cassanrda 1.1.11 compression: how to tell if it works ?

2013-05-09 Thread aaron morton
At what point does compression start? It starts for new SSTables created after the schema was altered. How can I confirm it is working? Compressed SSTables include a -CompressionInfo.db component on disk. Cheers, Aaron Morton

Cassanrda 1.1.11 compression: how to tell if it works ?

2013-05-07 Thread Oleg Dulin
…command to view its contents. But it seems like I can view the contents like this: strings *-Data.db. At what point does compression start? How can I confirm it is working? Regards, Oleg Dulin

Re: Cassandra Compression and Wide Rows

2013-03-20 Thread aaron morton
Yes. The block size is specified as part of the compression options for the CF / table. Cheers, Aaron Morton. On 20/03/2013, at 5:31 AM, Drew Kutcharian wrote: Thanks Sylvain. S

Re: Cassandra Compression and Wide Rows

2013-03-19 Thread Drew Kutcharian
Thanks Sylvain. So C* compression is block-based and has nothing to do with the format of the rows. On Mar 19, 2013, at 1:31 AM, Sylvain Lebresne wrote: That's just describing what compression is about. Compression (not in C*, in general) is based on recognizing repeated patterns.

Re: Cassandra Compression and Wide Rows

2013-03-19 Thread Sylvain Lebresne
That's just describing what compression is about. Compression (not in C*, in general) is based on recognizing repeated patterns. So yes, in that sense, static column families are more likely to yield a better compression ratio, because they are more likely to have repeated patterns in the compr

Re: Cassandra Compression and Wide Rows

2013-03-18 Thread Drew Kutcharian
Edward/Sylvain, I also came across this post on DataStax's blog: "When to use compression: Compression is best suited for ColumnFamilies where there are many rows, with each row having the same columns, or at least many columns in common. For example, a ColumnFamily con

Re: Cassandra Compression and Wide Rows

2013-03-18 Thread Edward Capriolo
I feel this has come up before. I believe the compression is block based, so just because no two column names are the same does not mean the compression will not be effective. Possibly in their case the compression was not effective. On Mon, Mar 18, 2013 at 9:08 PM, Drew Kutcharian wrote

Re: Cassandra Compression and Wide Rows

2013-03-18 Thread Drew Kutcharian
…Sylvain Lebresne wrote: The way compression is implemented, it is oblivious to the CF being wide-row or narrow-row. There is nothing intrinsically less efficient in the compression for wide rows. -- Sylvain

Re: Cassandra Compression and Wide Rows

2013-03-18 Thread Edward Capriolo
IMHO it is probably more efficient for wide rows. When you decompress 8 KB blocks to get at a 200-byte row you create overhead, particularly in the young gen. On Monday, March 18, 2013, Sylvain Lebresne wrote: The way compression is implemented, it is oblivious to the CF being wide-row or narrow-row. Th

Re: Cassandra Compression and Wide Rows

2013-03-18 Thread Sylvain Lebresne
The way compression is implemented, it is oblivious to the CF being wide-row or narrow-row. There is nothing intrinsically less efficient in the compression for wide rows. -- Sylvain. On Fri, Mar 15, 2013 at 11:53 PM, Drew Kutcharian wrote: Hey guys, I remember reading somewhe

Cassandra Compression and Wide Rows

2013-03-15 Thread Drew Kutcharian
Hey guys, I remember reading somewhere that C* compression is not very effective when most of the CFs are in wide-row format, and some folks turn the compression off and use disk-level compression as a workaround. Considering that wide rows with composites are "first class citizens" i

Re: Enable compression Cassandra 1.1.2

2012-11-15 Thread Alain RODRIGUEZ
…On 14/11/2012, at 11:39 PM, Alain RODRIGUEZ wrote: Hi, I am running C* 1.1.2 and there is no way to turn the compression on for a CF. Here is

Re: Enable compression Cassandra 1.1.2

2012-11-14 Thread Alain RODRIGUEZ
…1.1.6? Cheers, Aaron Morton. On 14/11/2012, at 11:39 PM, Alain RODRIGUEZ wrote: Hi, I am running C* 1.1.2 and there is no way to turn the compre

Re: Enable compression Cassandra 1.1.2

2012-11-14 Thread aaron morton
…C* 1.1.2 and there is no way to turn the compression on for a CF. Here is the command I ran in the CLI: UPDATE COLUMN FAMILY data_action WITH compression_options={sstable_compression:SnappyCompressor, chunk_length_kb : 64}; Show schema: creat

Re: compression

2012-10-29 Thread aaron morton
…if I decide to change another CF to use compression I will have that issue again. Any clue how to avoid it? Thanks, Tamar Fraenkel

Re: compression

2012-10-29 Thread Alain RODRIGUEZ
…event and date separated by a sharp as I am doing right now? 4 - Would compression be a good idea in this case? Thanks for your help on any of these 4 points :). Alain. 2012/10/29 Tamar Fraenkel: Hi! Thanks Aaron! Today I restarted Cassandra on that node and ran scrub again, now

Re: compression

2012-10-29 Thread Tamar Fraenkel
Hi! Thanks Aaron! Today I restarted Cassandra on that node and ran scrub again; now it is fine. I am worried though that if I decide to change another CF to use compression I will have that issue again. Any clue how to avoid it? Thanks, Tamar Fraenkel

Re: compression

2012-10-24 Thread aaron morton
…Scrub of SSTableReader(path='/raid0/cassandra/data/tok/tk_usus_user-hc-340-Data.db') complete: 7037 rows in new sstable and 0 empty (tombstoned) rows dropped. I don't see any CompressionInfo.db files and the compression ratio is still 0.0 on this node only; on

Re: compression

2012-10-24 Thread Tamar Fraenkel
…new sstable and 0 empty (tombstoned) rows dropped. I don't see any CompressionInfo.db files, and the compression ratio is still 0.0 on this node only; on other nodes it is almost 0.5... Any idea? Thanks, Tamar Fraenkel

Re: compression

2012-09-28 Thread Tamar Fraenkel
…wrote: Check the logs on nodes 2 and 3 to see if the scrub started. The logs on 1 will be a good help with that. Cheers, Aaron Morton

Re: compression

2012-09-27 Thread Tamar Fraenkel
…I have replication factor 3. The size of the data on disk was cut in half on the first node, and in JMX I can see that indeed the compression ratio is 0.46. But on nodes 2 and 3 nothing happened. In JMX I can see that the compression ratio is 0 and the size

Re: Cassandra compression not working?

2012-09-25 Thread aaron morton
Nothing jumps out. Are you able to reproduce the fault on a test node? There were some schema change problems in the early 1.1.x releases. Did you enable compression via a schema change? Cheers, Aaron Morton

Re: compression

2012-09-25 Thread aaron morton
…the first node, and in JMX I can see that indeed the compression ratio is 0.46. But on nodes 2 and 3 nothing happened. In JMX I can see that the compression ratio is 0 and the size of the files on disk stayed the same. In cli

Re: Cassandra compression not working?

2012-09-24 Thread Fred Groen
…cluster for some time, with compression enabled on one column family in which text documents are stored. We enabled compression on the column family, utilizing the SnappyCompressor and a 64k chunk length. It was recently discovered that Cassandra was reporting a compression ratio

Re: Cassandra compression not working?

2012-09-24 Thread Mike
…cluster for some time, with compression enabled on one column family in which text documents are stored. We enabled compression on the column family, utilizing the SnappyCompressor and a 64k chunk length. It was recently discovered that Cassandra was reporting a compression ratio

Cassandra compression not working?

2012-09-24 Thread Michael Theroux
Hello, We are running into an unusual situation that I'm wondering if anyone has any insight on. We've been running a Cassandra cluster for some time, with compression enabled on one column family in which text documents are stored. We enabled compression on the column family, uti
