IMO slightly bigger memory requirements in exchange for substantial improvements
is a good trade, especially for a 4.0 release of the database. Optane and
lots of other memory are coming down the hardware pipeline, and risk-wise
almost all Cassandra people know to testbed the major versions, so major
versions…
Looks straightforward, I can review today.
On Mon, Oct 29, 2018 at 12:25 PM Ariel Weisberg wrote:
> Hi,
>
> Seeing too many -'s for changing the representation and essentially no +1s
> so I submitted a patch for just changing the default. I could use a
> reviewer for https://issues.apache.org/jira/browse/CASSANDRA-13241
Hi,
Seeing too many -'s for changing the representation and essentially no +1s so I
submitted a patch for just changing the default. I could use a reviewer for
https://issues.apache.org/jira/browse/CASSANDRA-13241
I created https://issues.apache.org/jira/browse/CASSANDRA-14857 "Use a more
space efficient representation for compressed chunk offsets".
+1. I use the smiley to let you know I'm mostly just giving you shit. ;)
On Wed, Oct 24, 2018 at 11:43 AM Benedict Elliott Smith
wrote:
> If you undertake sufficiently many low risk things, some will bite you, I
> think everyone understands that. It’s still valuable to factor a risk
> assessment into the equation, I think?
If you undertake sufficiently many low risk things, some will bite you, I think
everyone understands that. It’s still valuable to factor a risk assessment
into the equation, I think?
Either way, somebody asked who didn’t have the context to easily answer, so I
did my best to offer them that in…
| The risk from such a patch is very low
If I had a nickel for every time I've heard that... ;)
I'm neutral on the default change, -.5 (i.e. don't agree with it but won't
die on that hill) on the data structure change post-freeze. We put this in,
and that's a slippery slope, as I'm sure we can find…
My objection (-0.5) is based on the freeze, not on code complexity.
--
Jeff Jirsa
> On Oct 23, 2018, at 8:59 AM, Benedict Elliott Smith
> wrote:
>
> To discuss the concerns about the patch for a more efficient representation:
>
> The risk from such a patch is very low. It’s a very simple in-memory data
> structure that we can introduce thorough fuzz tests for.
To discuss the concerns about the patch for a more efficient representation:
The risk from such a patch is very low. It’s a very simple in-memory data
structure that we can introduce thorough fuzz tests for. The reason to
exclude it would be wanting to begin strictly enforcing the freeze…
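For anyone wondering what those fuzz tests could look like, here is a rough sketch (not the actual patch; the ChunkOffsets interface and encode() stand-in below are made up for illustration). It just round-trips random monotone offset sequences against a plain long[] reference:

    import java.util.Random;

    public class ChunkOffsetsFuzzTest {
        // Hypothetical API for the compact structure; the real patch may differ.
        interface ChunkOffsets {
            long get(int chunkIndex);
        }

        // Stand-in encoder so the sketch compiles; a real test would build the
        // compact representation from the patch here instead.
        static ChunkOffsets encode(final long[] offsets) {
            return i -> offsets[i];
        }

        public static void main(String[] args) {
            Random rnd = new Random(42);
            for (int iter = 0; iter < 10_000; iter++) {
                int chunks = 1 + rnd.nextInt(10_000);
                long[] reference = new long[chunks];
                long offset = 0;
                for (int i = 0; i < chunks; i++) {
                    reference[i] = offset;
                    offset += 1 + rnd.nextInt(64 * 1024); // random compressed chunk sizes
                }
                ChunkOffsets compact = encode(reference);
                for (int i = 0; i < chunks; i++)
                    if (compact.get(i) != reference[i])
                        throw new AssertionError("mismatch at chunk " + i + " in iteration " + iter);
            }
            System.out.println("all iterations passed");
        }
    }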
Hi,
I just asked Jeff. He is -0 and -0.5 respectively.
Ariel
On Tue, Oct 23, 2018, at 11:50 AM, Benedict Elliott Smith wrote:
> I’m +1 change of default. I think Jeff was -1 on that though.
>
>
> > On 23 Oct 2018, at 16:46, Ariel Weisberg wrote:
> >
> > Hi,
> >
> > To summarize who we have heard from so far:
I’m +1 change of default. I think Jeff was -1 on that though.
> On 23 Oct 2018, at 16:46, Ariel Weisberg wrote:
>
> Hi,
>
> To summarize who we have heard from so far
>
> WRT to changing just the default:
>
> +1:
> Jon Haddad
> Ben Bromhead
> Alain Rodriguez
> Sankalp Kohli (not explicit)
Hi,
To summarize who we have heard from so far:
WRT to changing just the default:
+1:
Jon Haddad
Ben Bromhead
Alain Rodriguez
Sankalp Kohli (not explicit)
-0:
Sylvain Lebresne
Jeff Jirsa
Not sure:
Kurt Greaves
Joshua McKenzie
Benedict Elliott Smith
WRT to changing the representation:
+1:
There…
Sorry, to be clear - I'm +1 on changing the configuration default, but I
think changing the in-memory compression representation warrants further
discussion and investigation before making a case for or against it.
An optimization that reduces in-memory cost by over 50% sounds pretty good
and…
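To put some rough numbers on the memory question (my own back-of-the-envelope, not figures from the ticket, and assuming today's layout of roughly one 8-byte offset per chunk): shrinking the chunk from 64KB to 16KB quadruples the offset overhead, and a representation at ~37% of today's cost claws most of that back.

    public class ChunkOffsetOverhead {
        public static void main(String[] args) {
            long dataBytes = 1L << 40; // assume 1 TiB of uncompressed data on a node (illustrative)
            for (int chunkKb : new int[] { 64, 16, 4 }) {
                long chunks = dataBytes / (chunkKb * 1024L);
                long plainBytes = chunks * 8L;                   // one long offset per chunk today
                long compactBytes = (long) (plainBytes * 0.37);  // the 37% figure quoted on the ticket
                System.out.printf("%2dKB chunks: %4d MB of offsets, ~%4d MB with a compact representation%n",
                                  chunkKb, plainBytes >> 20, compactBytes >> 20);
            }
        }
    }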
I think we should try to do the right thing for as many people as we
can. The number of folks impacted by 64KB is huge. I've worked on a lot
of clusters created by a lot of different teams, going from brand new to
pretty damn knowledgeable. I can't think of a single time over the last 2
years…
(We should definitely harden the definition of the freeze in a separate thread.)
My thinking is that this is the best time to do this change, as we have not even
cut an alpha or beta. All the people involved in testing will definitely be
testing it again when we have these releases.
> On Oct 19, 2018
On 10/19/18 9:16 AM, Joshua McKenzie wrote:
>
> At the risk of hijacking this thread, when are we going to transition from
> "no new features, change whatever else you want including refactoring and
> changing years-old defaults" to "ok, we think we have something that's
> stable, time to start testing…
Shall we move this discussion to a separate thread? I agree it needs to be
had, but this will definitely derail this discussion.
To respond only to the relevant portion for this thread:
> changing years-old defaults
I don’t see how age is relevant? This isn’t some ‘battle hardened’ feature
w…
>
> The predominant phrase used in that thread was 'feature freeze'.
At the risk of hijacking this thread, when are we going to transition from
"no new features, change whatever else you want including refactoring and
changing years-old defaults" to "ok, we think we have something that's
stable, time to start testing…
Hi,
I ran some benchmarks on my laptop
https://issues.apache.org/jira/browse/CASSANDRA-13241?focusedCommentId=16656821&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-16656821
For a random read workload, varying chunk size:
Chunk size   Time
64k          25:20
…
The change of the default property doesn’t seem to violate the freeze? The
predominant phrase used in that thread was 'feature freeze'. A lot of people
are now interpreting it more broadly, so perhaps we need to revisit, but that’s
probably a separate discussion?
The current default is really bad…
Agree with Sylvain (and I think Benedict) - there’s no compelling reason to
violate the freeze here. We’ve had the wrong default for years - add a note to
the docs that we’ll be changing it in the future, but let’s not violate the
freeze now.
--
Jeff Jirsa
> On Oct 19, 2018, at 10:06 AM, Sylvain Lebresne wrote:
Fwiw, as much as I agree this is a change worth doing in general, I am
-0 for 4.0. Both the "compact sequencing" and the change of default, really.
We're closing in on 2 months into the freeze, and for me a freeze does include
not changing defaults, because changing a default ideally implies a decent
amount of…
Hi,
For those who were asking about the performance impact of block size on
compression I wrote a microbenchmark.
https://pastebin.com/RHDNLGdC
[java] Benchmark                    Mode  Cnt  Score  Error  Units
[java] CompactIntegerSequenceB…
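The pastebin output got cut off above; for anyone who just wants the shape of such a microbenchmark, a sketch along these lines (using lz4-java and JMH directly — this is not Ariel's pastebin code) measures per-block decompression cost at different block sizes:

    import java.util.Random;
    import java.util.concurrent.TimeUnit;

    import net.jpountz.lz4.LZ4Compressor;
    import net.jpountz.lz4.LZ4Factory;
    import net.jpountz.lz4.LZ4FastDecompressor;

    import org.openjdk.jmh.annotations.Benchmark;
    import org.openjdk.jmh.annotations.BenchmarkMode;
    import org.openjdk.jmh.annotations.Mode;
    import org.openjdk.jmh.annotations.OutputTimeUnit;
    import org.openjdk.jmh.annotations.Param;
    import org.openjdk.jmh.annotations.Scope;
    import org.openjdk.jmh.annotations.Setup;
    import org.openjdk.jmh.annotations.State;

    @State(Scope.Benchmark)
    @BenchmarkMode(Mode.AverageTime)
    @OutputTimeUnit(TimeUnit.MICROSECONDS)
    public class BlockSizeDecompressBench {
        @Param({ "4096", "16384", "65536" })
        int blockSize;

        byte[] compressed;
        byte[] restored;
        LZ4FastDecompressor decompressor;

        @Setup
        public void setup() {
            LZ4Factory factory = LZ4Factory.fastestInstance();
            LZ4Compressor compressor = factory.fastCompressor();
            decompressor = factory.fastDecompressor();
            byte[] block = new byte[blockSize];
            new Random(0).nextBytes(block); // note: real SSTable data compresses far better than random bytes
            compressed = compressor.compress(block);
            restored = new byte[blockSize];
        }

        @Benchmark
        public byte[] decompressOneBlock() {
            // Decompressing one chunk is the unit of work a point read pays for.
            decompressor.decompress(compressed, 0, restored, 0, blockSize);
            return restored;
        }
    }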
FWIW, I’m not -0, just think that long after the freeze date a change like this
needs a strong mandate from the community. I think the change is a good one.
> On 17 Oct 2018, at 22:09, Ariel Weisberg wrote:
>
> Hi,
>
> It's really not appreciably slower compared to the decompression we are
> going to do, which is going to take several microseconds.
Hi,
It's really not appreciably slower compared to the decompression we are going
to do, which is going to take several microseconds. Decompression is also going
to be faster because we are going to do less unnecessary decompression, and the
decompression itself may be faster since it may fit in cache.
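The "less unnecessary decompression" point is easy to quantify. As a purely illustrative example (my numbers, not from the ticket), a small point read still has to read and decompress an entire chunk, so the chunk size sets a floor on the work per read:

    public class ReadAmplification {
        public static void main(String[] args) {
            int rowBytes = 200; // a small point read (illustrative)
            for (int chunkKb : new int[] { 64, 16, 4 }) {
                int chunkBytes = chunkKb * 1024;
                System.out.printf("%2dKB chunks: decompress %6d bytes to serve %d bytes (~%.0fx amplification)%n",
                                  chunkKb, chunkBytes, rowBytes, (double) chunkBytes / rowBytes);
            }
        }
    }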
I think if we're going to drop it to 16k, we should invest in the compact
sequencing as well. Just lowering it to 16k will potentially have a painful
impact on anyone running low-memory nodes, but if we can do it without the
memory impact I don't think there's any reason to wait another major
version.
+1
I would guess a lot of C* clusters/tables have this option set to the
default value, and not many of them need to read such big
chunks of data.
I believe this will greatly limit disk overreads for a fair share (a big
majority?) of new users. It seems fair enough to change this default…
Hi,
This would only impact new tables, existing tables would get their
chunk_length_in_kb from the existing schema. It's something we record in a
system table.
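(As an aside for anyone checking their own clusters: the value is recorded per table in system_schema.tables, so a quick look with the Java driver — driver 4.x here, keyspace/table names are placeholders — shows what an existing table will keep using, and an ALTER opts it in to a smaller chunk explicitly. The new chunk length only applies to SSTables written afterwards; nodetool upgradesstables -a rewrites the existing ones.)

    import java.util.Map;

    import com.datastax.oss.driver.api.core.CqlSession;
    import com.datastax.oss.driver.api.core.cql.Row;

    public class ShowChunkLength {
        public static void main(String[] args) {
            try (CqlSession session = CqlSession.builder().build()) {
                // What the existing schema records for this table:
                Row row = session.execute(
                    "SELECT compression FROM system_schema.tables " +
                    "WHERE keyspace_name = 'my_ks' AND table_name = 'my_table'").one();
                Map<String, String> compression = row.getMap("compression", String.class, String.class);
                System.out.println("chunk_length_in_kb = " + compression.get("chunk_length_in_kb"));

                // Opting an existing table in to the proposed 16KB chunks:
                session.execute(
                    "ALTER TABLE my_ks.my_table WITH compression = " +
                    "{'class': 'LZ4Compressor', 'chunk_length_in_kb': 16}");
            }
        }
    }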
I have an implementation of a compact integer sequence that only requires 37%
of the memory required today. So we could do this with only…
> On Oct 12, 2018, at 6:46 AM, Pavel Yaskevich wrote:
>
>> On Thu, Oct 11, 2018 at 4:31 PM Ben Bromhead wrote:
>>
>> This is something that's bugged me for ages, tbh the performance gain for
>> most use cases far outweighs the increase in memory usage and I would even
>> be in favor of changing the default now, optimizing the storage cost later…
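For the curious, the general idea behind a compact offsets structure (this is only a sketch of one possible shape, not the implementation attached to CASSANDRA-14857) is that consecutive chunk offsets differ by at most roughly one chunk length, so most of each 8-byte offset is redundant. Keeping a full offset only every N chunks and small deltas in between already roughly halves the cost; packing the deltas into fewer bits gets closer to the 37% Ariel quotes.

    // Sketch: full 8-byte offsets every SAMPLE chunks, 4-byte deltas in between.
    // With 16KB chunks, a window of SAMPLE compressed chunks is far below
    // Integer.MAX_VALUE, so an int delta from the preceding sample is safe.
    public class CompactChunkOffsets {
        private static final int SAMPLE = 16;
        private final long[] samples; // offsets[0], offsets[SAMPLE], offsets[2 * SAMPLE], ...
        private final int[] deltas;   // offset minus the preceding sample, for every chunk

        public CompactChunkOffsets(long[] offsets) {
            samples = new long[(offsets.length + SAMPLE - 1) / SAMPLE];
            deltas = new int[offsets.length];
            for (int i = 0; i < offsets.length; i++) {
                if (i % SAMPLE == 0)
                    samples[i / SAMPLE] = offsets[i];
                deltas[i] = Math.toIntExact(offsets[i] - samples[i / SAMPLE]);
            }
        }

        public long get(int chunk) {
            return samples[chunk / SAMPLE] + deltas[chunk];
        }
    }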
I think 16k is a better default, but it should only affect new tables. Whoever
changes it, please make sure you think about the upgrade path.
> On Oct 12, 2018, at 2:31 AM, Ben Bromhead wrote:
>
> This is something that's bugged me for ages, tbh the performance gain for
> most use cases far outweighs the increase in memory usage…
On Thu, Oct 11, 2018 at 4:31 PM Ben Bromhead wrote:
> This is something that's bugged me for ages, tbh the performance gain for
> most use cases far outweighs the increase in memory usage and I would even
> be in favor of changing the default now, optimizing the storage cost later
> (if it's found to be worth it).
This is something that's bugged me for ages, tbh the performance gain for
most use cases far outweighs the increase in memory usage and I would even
be in favor of changing the default now, optimizing the storage cost later
(if it's found to be worth it).
For some anecdotal evidence:
4kb is usually…
Hi,
This is regarding https://issues.apache.org/jira/browse/CASSANDRA-13241
This ticket has languished for a while. IMO it's too late in 4.0 to implement a
more memory efficient representation for compressed chunk offsets. However, I
don't think we should put out another release with the current default.