Re: Compression Tuning Tutorial

2018-08-13 Thread Kyrylo Lebediev
Thank you, Jon! From: Jonathan Haddad Sent: Thursday, August 9, 2018 7:29:24 PM To: user Subject: Re: Compression Tuning Tutorial There's a discussion about direct I/O here you might find interesting: https://issues.apache.org/jira/browse/CASSANDRA-144

Re: Compression Tuning Tutorial

2018-08-09 Thread Jonathan Haddad
stack wasn't > implemented, except lack of resources to do this? > > > Regards, > > Kyrill > -- > *From:* Eric Plowe > *Sent:* Wednesday, August 8, 2018 9:39:44 PM > *To:* user@cassandra.apache.org > *Subject:* Re: Compression Tuning Tutoria

Re: Compression Tuning Tutorial

2018-08-09 Thread Kyrylo Lebediev
asn't implemented, except lack of resources to do this? Regards, Kyrill From: Eric Plowe Sent: Wednesday, August 8, 2018 9:39:44 PM To: user@cassandra.apache.org Subject: Re: Compression Tuning Tutorial Great post, Jonathan! Thank you very much. ~Eric On Wed, A

Re: Compression Tuning Tutorial

2018-08-08 Thread Eric Plowe
Great post, Jonathan! Thank you very much. ~Eric On Wed, Aug 8, 2018 at 2:34 PM Jonathan Haddad wrote: > Hey folks, > > We've noticed a lot over the years that people create tables usually > leaving the default compression parameters, and have spent a lot of time > helping teams figure out the

Re: compression cpu overhead

2015-11-04 Thread Dan Kinder
To clarify, writes have no *immediate* cpu cost from adding the write to the memtable, however the compression overhead cost is paid when writing out a new SSTable (whether from flushing a memtable or compacting), correct? So it sounds like when reads >> writes then Tushar's comments are accurate,

Re: compression cpu overhead

2015-11-03 Thread Jon Haddad
You won't see any overhead on writes because you don't actually write to sstables when performing a write. Just the commit log & memtable. Memtables are flushed asynchronously. > On Nov 4, 2015, at 1:57 AM, Tushar Agrawal wrote: > > For writes it's negligible. For reads it makes a significan
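The write path described in this message can be sketched as a toy model. This is only an illustration under stated assumptions: `ToyNode` and its fields are hypothetical names, zlib stands in for whatever compressor a table uses, and the flush is a plain method call here rather than an asynchronous background task. It is not Cassandra's actual code.

```python
import zlib


class ToyNode:
    """Toy model of the write path described above; not Cassandra's code."""

    def __init__(self):
        self.commit_log = []     # append-only list of (key, value) records
        self.memtable = {}       # in-memory, uncompressed
        self.sstables = []       # compressed blobs standing in for files on disk
        self.compress_calls = 0  # counts how often compression work is done

    def write(self, key, value):
        # A write only touches the commit log and the memtable.
        self.commit_log.append((key, value))
        self.memtable[key] = value

    def flush(self):
        # The compression cost is paid here, when the memtable becomes an SSTable.
        payload = b"".join(
            k.encode() + b"=" + v + b"\n" for k, v in sorted(self.memtable.items())
        )
        self.sstables.append(zlib.compress(payload))
        self.compress_calls += 1
        self.memtable.clear()


node = ToyNode()
for i in range(1000):
    node.write(f"key{i}", b"value")
calls_before_flush = node.compress_calls  # no SSTable has been written yet
node.flush()
print(calls_before_flush, node.compress_calls)
```

In this sketch the thousand writes trigger no compression at all; the single flush does all of it, which matches the point that the cost shows up when an SSTable is written rather than at write time.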

Re: compression cpu overhead

2015-11-03 Thread Tushar Agrawal
For writes it's negligible. For reads it makes a significant difference for high tps and low latency workload. You would see up to 3x higher cpu with LZ4 vs no compression. It would be different for different h/w configurations. Thanks, Tushar (Sent from iPhone) > On Nov 3, 2015, at 5:51 PM, D

Re: compression cpu overhead

2015-11-03 Thread Dan Kinder
Most concerned about write since that's where most of the cost is, but perf numbers for any workload mix would be helpful. On Tue, Nov 3, 2015 at 3:48 PM, Graham Sanderson wrote: > On read or write? > > https://issues.apache.org/jira/browse/CASSANDRA-7039 and friends in 2.2 > should make some

Re: compression cpu overhead

2015-11-03 Thread Graham Sanderson
On read or write? https://issues.apache.org/jira/browse/CASSANDRA-7039 and friends in 2.2 should make some difference, I didn’t immediately find perf numbers though. > On Nov 3, 2015, at 5:42 PM, Dan Kinder wrote: > > Hey all, > > Just w

Re: Compression during bootstrap

2014-08-18 Thread Ruchir Jha
On Wednesday, August 13, 2014, Robert Coli wrote: > On Wed, Aug 13, 2014 at 5:53 AM, Ruchir Jha > wrote: > >> We are adding nodes currently and it seems like compression is falling >> behind. I judge that by the fact that the new node which has a 4.5T disk >> fills up to 100% while its bootstr

Re: Compression during bootstrap

2014-08-13 Thread Robert Coli
On Wed, Aug 13, 2014 at 5:53 AM, Ruchir Jha wrote: > We are adding nodes currently and it seems like compression is falling > behind. I judge that by the fact that the new node which has a 4.5T disk > fills up to 100% while it's bootstrapping. Can we avoid this problem with > the LZ4 compressor be

Re: Compression ratio

2013-07-12 Thread cem
Thank you very much! On Fri, Jul 12, 2013 at 5:59 PM, Yuki Morishita wrote: > it's compressed/original. > > > https://github.com/apache/cassandra/blob/cassandra-1.1.11/src/java/org/apache/cassandra/io/sstable/SSTableMetadata.java#L124 > > On Fri, Jul 12, 2013 at 10:02 AM, cem wrote: > > Hi All

Re: Compression ratio

2013-07-12 Thread Yuki Morishita
it's compressed/original. https://github.com/apache/cassandra/blob/cassandra-1.1.11/src/java/org/apache/cassandra/io/sstable/SSTableMetadata.java#L124 On Fri, Jul 12, 2013 at 10:02 AM, cem wrote: > Hi All, > > Can anyone explain the compression ratio? > > Is it the "compressed data / original" o
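As a rough illustration of the "compressed/original" definition given here, the sketch below computes the same kind of ratio for an in-memory buffer. It uses Python's zlib purely as a stand-in compressor, and `compression_ratio` is a hypothetical helper; it is not the SSTableMetadata code linked in the message.

```python
import zlib


def compression_ratio(original: bytes) -> float:
    """Return compressed size / original size (smaller means better compression)."""
    if not original:
        raise ValueError("original data must be non-empty")
    compressed = zlib.compress(original)
    return len(compressed) / len(original)


# Highly repetitive input compresses well, so the ratio is well below 1.0.
sample = b"sensor_reading=42;" * 1000
ratio = compression_ratio(sample)
print(f"ratio = {ratio:.3f}")
```

With this definition a ratio of 0.25 would mean the compressed data takes a quarter of the original space, and a ratio near 1.0 would mean compression is saving almost nothing.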

Re: compression

2012-10-29 Thread aaron morton
Fax: +972 2 5612956 >> >> >> >> >> >> On Sun, Sep 23, 2012 at 8:21 PM, Hiller, Dean wrote: >> As well as your unlimited column names may all have the same prefix, right? >> Like "accounts".rowkey56, "accounts".rowkey78

Re: compression

2012-10-29 Thread Alain RODRIGUEZ
Media >>> >>> >>> >>> >>> ta...@tok-media.com >>> Tel: +972 2 6409736 >>> Mob: +972 54 8356490 >>> Fax: +972 2 5612956 >>> >>> >>> >>> >>> >>> On Sun, Sep 23, 2012

Re: compression

2012-10-29 Thread Tamar Fraenkel
may all have the same prefix, >>> right? Like "accounts".rowkey56, "accounts".rowkey78, etc. etc. so the >>> "accounts gets a ton of compression then. >>> >>> Later, >>> Dean >>> >>> From: Tyler Hobbs mailto:ty...@d

Re: compression

2012-10-24 Thread aaron morton
>> effect >> >> >> Tamar Fraenkel >> Senior Software Engineer, TOK Media >> >> >> >> >> ta...@tok-media.com >> Tel: +972 2 6409736 >> Mob: +972 54 8356490 >> Fax: +972 2 5612956 >> >> >>

Re: compression

2012-10-24 Thread Tamar Fraenkel
> right? Like "accounts".rowkey56, "accounts".rowkey78, etc. etc. so the >>> "accounts gets a ton of compression then. >>> >>> Later, >>> Dean >>> >>> From: Tyler Hobbs mailto:ty...@datastax.com>> >>>

Re: compression

2012-09-28 Thread Tamar Fraenkel
> Tel: +972 2 6409736 >> Mob: +972 54 8356490 >> Fax: +972 2 5612956 >> >> >> >> >> >> On Mon, Sep 24, 2012 at 8:37 AM, Tamar Fraenkel wrote: >> >>> Thanks all, that helps. Will start with one - two CFs and let you know >>> th

Re: compression

2012-09-27 Thread Tamar Fraenkel
612956 >> >> >> >> >> >> On Sun, Sep 23, 2012 at 8:21 PM, Hiller, Dean wrote: >> >>> As well as your unlimited column names may all have the same prefix, >>> right? Like "accounts".rowkey56, "accounts".rowkey78, etc. etc. so t

Re: compression

2012-09-25 Thread aaron morton
e same prefix, right? > Like "accounts".rowkey56, "accounts".rowkey78, etc. etc. so the "accounts > gets a ton of compression then. > > Later, > Dean > > From: Tyler Hobbs mailto:ty...@datastax.com>> > Reply-To: "user@cassandra.apache.org<

Re: compression

2012-09-24 Thread Tamar Fraenkel
er, >> Dean >> >> From: Tyler Hobbs mailto:ty...@datastax.com>> >> Reply-To: "user@cassandra.apache.org<mailto:user@cassandra.apache.org>" < >> user@cassandra.apache.org<mailto:user@cassandra.apache.org>> >> Date: Sunday, September 23, 2012 11:46 AM

Re: compression

2012-09-23 Thread Tamar Fraenkel
bbs mailto:ty...@datastax.com>> > Reply-To: "user@cassandra.apache.org<mailto:user@cassandra.apache.org>" < > user@cassandra.apache.org<mailto:user@cassandra.apache.org>> > Date: Sunday, September 23, 2012 11:46 AM > To: "user@cassandra.apache.org<mailto:use

Re: compression

2012-09-23 Thread Hiller, Dean
ser@cassandra.apache.org<mailto:user@cassandra.apache.org>" mailto:user@cassandra.apache.org>> Date: Sunday, September 23, 2012 11:46 AM To: "user@cassandra.apache.org<mailto:user@cassandra.apache.org>" mailto:user@cassandra.apache.org>> Subject: Re: compr

Re: compression

2012-09-23 Thread Tyler Hobbs
Due to repetition in the column metadata, you're still likely to get a reasonable amount of compression. This is especially true if there is some amount of repetition in the column names, values, or TTLs in wide rows. Compression will almost always be beneficial unless you're already somehow CPU b
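The effect of repeated column names can be seen with a small experiment. This is only a sketch under assumptions: the column names are made up in the spirit of the "accounts".rowkey56 example from this thread, zlib stands in for the table's compressor, and seeded random bytes stand in for data with no repetition at all.

```python
import random
import zlib

# Hypothetical column names that all share the same prefix.
prefixed = "".join(f"accounts.rowkey{i}," for i in range(5000)).encode()

# Random bytes of the same length: a stand-in for data with no repetition.
rng = random.Random(0)
unrelated = bytes(rng.getrandbits(8) for _ in range(len(prefixed)))

prefixed_ratio = len(zlib.compress(prefixed)) / len(prefixed)
unrelated_ratio = len(zlib.compress(unrelated)) / len(unrelated)

print(f"shared prefix: {prefixed_ratio:.2f}, random: {unrelated_ratio:.2f}")
```

The shared-prefix input shrinks considerably while the random input does not shrink, which is the intuition behind expecting a reasonable amount of compression from repeated column metadata.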

Re: Compression on client side vs server side

2012-04-03 Thread Віталій Тимчишин
We are using client-side compression for the following reasons. Can you confirm they are valid? 1) Server-side compression uses replication-factor times more CPU (3 times more with a replication factor of 3). 2) Network usage is higher by the compression factor (as you are sending uncompressed data over the wire). 4

Re: Compression on client side vs server side

2012-04-02 Thread Ben McCann
Thanks Jeremiah, that's what I had suspected. I appreciate the confirmation. Martin, there's no built-in support for doing compression client side, but it'd be easy for me to do manually since I just have one column with all my serialized data, which is why I was considering it. On Mon, Apr 2,

Re: Compression on client side vs server side

2012-04-02 Thread Martin Junghanns
Hi, how do you select between client- and serverside compression? i'm using hector and i set compression when creating a cf, so the compression executes when inserting the data "on the server" oO greetings, martin On 02.04.2012 17:42, Ben McCann wrote: Hi, I was curious if I compress my

RE: Compression on client side vs server side

2012-04-02 Thread Jeremiah Jordan
The server side compression can compress across columns/rows so it will most likely be more efficient. Whether you are CPU bound or IO bound depends on your application and node setup. Unless your working set fits in memory you will be IO bound, and in that case server side compression helps be

Re: Compression on secondary indexes

2012-03-31 Thread aaron morton
I've not checked the code but (reading https://issues.apache.org/jira/browse/CASSANDRA-3877) I would guess it is not possible to set compression on secondary indexes pre 1.1. Cheers - Aaron Morton Freelance Developer @aaronmorton http://www.thelastpickle.com On 30/03/2012, at

Re: compression for regular column names?

2011-06-16 Thread Ryan King
On Thu, Jun 16, 2011 at 3:41 PM, E R wrote: > Hi all, > > As a way of gaining familiarity with Cassandra I am migrating a table > that is currently stored in a relational database and mapping it into > a Cassandra column family. We add about 700,000 new rows a day to this > table, and the average

Re: Compression in Cassandra

2011-01-20 Thread Stu Hood
Also note that an improved and compressible file format has been in the works for a while now. https://issues.apache.org/jira/browse/CASSANDRA-674 I am endlessly optimistic that it will make it into the 'next' version; in particular, the current hope is 0.8 On Jan 20, 2011 6:34 AM, "Terje Marthi

Re: Compression in Cassandra

2011-01-20 Thread Terje Marthinussen
A 3-7x increase in data size is perfectly normal, depending on your data schema. Regards, Terje On 20 Jan 2011, at 23:17, "akshatbakli...@gmail.com" wrote: > I just did a du -h DataDump which showed 40G > and du -h CassandraDataDump which showed 170G > > am i doing something wrong. > have you o

Re: Compression in Cassandra

2011-01-20 Thread akshatbakli...@gmail.com
I just did a du -h DataDump which showed 40G and du -h CassandraDataDump which showed 170G am i doing something wrong. have you observed some compression in it. On Thu, Jan 20, 2011 at 6:57 PM, Javier Canillas wrote: > How do you calculate your 40g data? When you insert it into Cassandra, you >

Re: Compression in Cassandra

2011-01-20 Thread Javier Canillas
How do you calculate your 40g data? When you insert it into Cassandra, you need to convert the data into a Byte[], maybe your problem is there. On Thu, Jan 20, 2011 at 10:02 AM, akshatbakli...@gmail.com < akshatbakli...@gmail.com> wrote: > Hi all, > > I am experiencing a unique situation. I loade

Re: Re: compression

2010-04-01 Thread casablinca126.com
hi, Great! thanks to Rao and Tatu :) I will test them and let you know what I found. regards, Cao Jiguang - From: Tatu Saloranta Sent: 2010-04-02 01:08:52 To: u...@cassandra.apache.org Cc: Subject: Re: compression

Re: compression

2010-04-01 Thread Tatu Saloranta
On Thu, Apr 1, 2010 at 8:27 AM, Rao Venugopal wrote: > To Cao Jiguang > > I was watching this presentation on bigtable yesterday > http://video.google.com/videoplay?docid=7278544055668715642# > > and Jeff mentioned that they compared three different compression libraries > BMDiff, LZO and gzip.  

Re: compression

2010-04-01 Thread Rao Venugopal
To Cao Jiguang I was watching this presentation on bigtable yesterday http://video.google.com/videoplay?docid=7278544055668715642# and Jeff mentioned that they compared three different compression libraries BMDiff, LZO and gzip. Apparently, gzip was the most cpu intensive and they ended up goin

RE: compression

2010-04-01 Thread Weijun Li
The Thrift client doesn’t seem to compress anything unless you change the Thrift protocol or use a transport that supports compression. I modified TSocket to support compression, but it occasionally hits a broken pipe error due to crappy Java zlib support (so that clients have to reconnect to get around the s

Re: compression

2010-04-01 Thread casablinca126.com
hi Ran, I think there's no compression on the server end. I am doing the gzip compression on the client side myself. cheers, Cao Jiguang 2010-04-01 casablinca126.com From: Ran Tavory Sent: 2010-04-01 14:37:59 To: user@cassandra.apache.org Cc: Subject: compression What sort of compres
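Client-side gzip compression of the kind mentioned in this message can be sketched as follows. This is a minimal example, assuming the application stores one serialized value per column; `pack` and `unpack` are hypothetical helper names and no Cassandra client call is shown.

```python
import gzip


def pack(value: bytes) -> bytes:
    """Compress a value on the client before handing it to the storage client."""
    return gzip.compress(value)


def unpack(stored: bytes) -> bytes:
    """Reverse pack() after reading the value back."""
    return gzip.decompress(stored)


original = b'{"user": "example", "tags": ["a", "b", "c"]}' * 50
stored = pack(original)
restored = unpack(stored)
print(len(original), "->", len(stored), "bytes")
```

The server only ever sees the output of `pack`, so every reader of the column has to know to call `unpack` on it.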