Re: [DISCUSS] Next release date

2023-04-19 Thread Henrik Ingo
I'm going to repeat the point from my own thread: rather than thinking of
this as some kind of concession to two exceptional CEPs, could we rather
take the point of view that they get their own space and time precisely
because they are large and invasive and both the merge and testing of them
will benefit from everything else in the branch quieting down?

I'm also not particularly interested in a long feature freeze beyond the 1-3
months that would serve the above purpose well.

In short: the proposal should not be that everyone else just has to sit
still and wait for two late stragglers. The proposal is merely to organise
work such that we maximise velocity and quality for merging cep-15&21.
Anything beyond that should be judged differently.

On Tue, 18 Apr 2023, 23:48 J. D. Jordan wrote:

> I also don’t really see the value in “freezing with exceptions for two
> giant changes to come after the freeze”.
>
> -Jeremiah
>
> On Apr 18, 2023, at 1:08 PM, Caleb Rackliffe wrote:
>
> > Caleb, you appear to be the only one objecting, and it does not appear
> > that you have made any compromises in this thread.
>
> All I'm really objecting to is making special exceptions for particular
> CEPs in relation to our freeze date. In other words, let's not have a
> pseudo-freeze date and a "real" freeze date, when the thing that makes the
> latter supposedly necessary is a very invasive change to the database that
> risks our desired GA date. Also, again, I don't understand how cutting a
> 5.0 branch makes anything substantially easier to start testing. Perhaps
> I'm the only one who thinks this. If so, I'm not going to make further
> noise about it.
>
> On Tue, Apr 18, 2023 at 7:26 AM Henrik Ingo wrote:
>
>> I forgot one last night:
>>
>> From Benjamin we have a question that I think went unanswered?
>>
>> *> Should it not facilitate the work if the branch stops changing
>> heavily?*
>>
>> This is IMO a good perspective. To me it seems weird to be too hung up on
>> a "hard limit" on a specific day, when we are talking about merges where a
>> single merge / rebase takes more than one day. We will have to stop merging
>> smaller work to trunk anyway, when CEP-21 is being merged. No?
>>
>> henrik
>>
>> On Tue, Apr 18, 2023 at 3:24 AM Henrik Ingo wrote:
>>
>>> Trying to collect a few loose ends from across this thread
>>>
>>> *> I'm receptive to another definition of "stabilize", *
>>>
>>> I think the stabilization period implies more than just CI, which is
>>> mostly a function of unit tests working correctly. For example, at Datastax
>>> we have run a "large scale" test with >100 nodes, over several weeks, both
>>> for 4.0 and 4.1. For obvious reasons such tests can't run in nightly CI
>>> builds.
>>>
>>> Also, it is not unusual that during the testing phase developers or
>>> specialized QA engineers develop new tests (which may then be added to
>>> CI) to improve coverage, especially for new features in the release. For
>>> example, the fixes to Paxos v2 were found by such work before 4.1.
>>>
>>> Finally, maybe it's a special case relevant only for this release, but
>>> because a significant part of the Datastax team has been focused on porting
>>> these large existing features from DSE and getting them merged before the
>>> original May date, we also have tens of bug fixes waiting to be upstreamed.
>>> (It used to be an even 100, but I'm unsure what the count is today.)
>>>
>>> In fact! If you are worried about how to occupy yourself between a May
>>> "soft freeze" and September'ish hard freeze, you are welcome to chug on
>>> that backlog. The bug fixes are already public and ASL licensed, in the 4.0
>>> based branch here
>>> .
>>>
>>> *> 3a. If we allow merge of CEP-15 / CEP-21 after branch, we risk
>>> invalidating stabilization and risk our 2023 GA date*
>>>
>>> I think this is the assumption that I personally disagree with. If this
>>> is true, why do we even bother running any CI before the CEP-21 merge? It
>>> will all be invalidated anyway, right?
>>>
>>> In my experience, it is beneficial to test as early as possible, and at
>>> different checkpoints during development. If we didn't do that and found
>>> some issue in late November, the window to search for the commit that
>>> introduced the regression would reach all the way back to the 4.1 GA. If,
>>> on the other hand, the same test had already been run during the soft
>>> freeze, we would know to focus our search on CEP-15 and CEP-21.
>>>
>>>
>>> *> get comfortable with cutting feature previews or snapshot alphas like
>>> we agreed to for earlier access to new stuff*
>>>
>>> Snapshots are in fact a valid compromise proposal: A snapshot would
>>> provide a constant version / point in time to focus testing on, but on the
>>> other hand would allow trunk (or the 5.0 branch, in other proposals) to
>>> remain open to new commits. Somewh

[COMPRESSION PARAMETERS] Question

2023-04-19 Thread Claude Warren, Jr via dev
Currently the compression parameters have an option called enabled.  When
enabled=false, all the other options have to be removed.  But it seems to me
that we should support enabled=false without removing all the other
parameters, so that users can disable compression for testing or problem
resolution without losing any of the other parameter settings.  So to be
clear:

The following is valid:
hints_compression:
- class_name: foo
  parameters:
   - chunk_length_in_kb : 16 ;

But this is not:
hints_compression:
- class_name: foo
  parameters:
   - chunk_length_in_kb : 16 ;
  enabled : false ;

Currently, when enabled is set to false, it constructs a default
CompressionParam object with the class name set to null.

Is there a reason to keep this or should we accept the parameters and
construct a CompressionParam with the parameters while continuing to set
the class name to null?

Claude
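
For illustration, the behaviour asked about above might look roughly like the
following Java sketch. It is not Cassandra's actual parsing code: the class
CompressionSettingsSketch and its fromOptions method are hypothetical, and it
assumes the "enabled" flag arrives in the same option map as the other
settings.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical stand-in for illustration; not Cassandra's actual compression parameter class. */
final class CompressionSettingsSketch
{
    final String className;               // null when compression is disabled
    final Map<String, String> parameters; // retained even when disabled
    final boolean enabled;

    private CompressionSettingsSketch(String className, Map<String, String> parameters, boolean enabled)
    {
        this.className = className;
        this.parameters = parameters;
        this.enabled = enabled;
    }

    /** Accepts "enabled: false" alongside other options instead of rejecting them. */
    static CompressionSettingsSketch fromOptions(String className, Map<String, String> options)
    {
        // Copy so the caller's map is left untouched.
        Map<String, String> remaining = new LinkedHashMap<>(options);
        String flag = remaining.remove("enabled");
        // A missing flag is treated as enabled (an assumption of this sketch).
        boolean enabled = flag == null || Boolean.parseBoolean(flag.trim());
        // Disabled: the class name is still null, but the other parameters survive.
        return new CompressionSettingsSketch(enabled ? className : null, remaining, enabled);
    }
}
```

With chunk_length_in_kb=16 and enabled=false in the map, the resulting object
has a null class name but still carries chunk_length_in_kb, so flipping the
flag back does not require re-entering the other settings.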


Re: [DISCUSS] Next release date

2023-04-19 Thread Josh McKenzie
Let me try to break this down another way:

I see a few competing concerns, each with QA related time requirements 
(asserting 8 weeks minimum, 16 weeks maximum we should plan for to stabilize a 
GA):
 1. A freeze to a branch to stabilize for release (8-16 weeks of QA required 
after we branch)
 2. A freeze to a branch to make room for large complex work to have increased 
velocity on merge due to having a more stable destination (8-16 weeks of QA 
required after they merge)
 3. A commitment to release once a year (for our purposes, we've defined this 
as calendar year) (8-16 weeks of QA required *before*).

If we walk backwards from Dec 1, that means our latest date to freeze and 
validate a 5.0 branch would be Friday August 11; let's go with 1st Friday in 
August for simplicity, 2023-08-04. That would give us just over 16 weeks 
worst-case to stabilize.
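
The date arithmetic above can be checked with a few lines of java.time. This
is only a sketch: the class and method names (FreezeDateSketch, latestFreeze,
firstFridayOfMonth) are made up for illustration, and the GA target of
2023-12-01 and the 16-week maximum are taken from the text above.

```java
import java.time.DayOfWeek;
import java.time.LocalDate;
import java.time.temporal.ChronoUnit;
import java.time.temporal.TemporalAdjusters;

final class FreezeDateSketch
{
    /** Latest branch date: count the maximum QA window back from the GA target. */
    static LocalDate latestFreeze(LocalDate gaTarget, int maxQaWeeks)
    {
        return gaTarget.minusWeeks(maxQaWeeks);
    }

    /** First Friday of the month that contains the given date. */
    static LocalDate firstFridayOfMonth(LocalDate date)
    {
        return date.with(TemporalAdjusters.firstInMonth(DayOfWeek.FRIDAY));
    }

    public static void main(String[] args)
    {
        LocalDate ga = LocalDate.of(2023, 12, 1);
        LocalDate latest = latestFreeze(ga, 16);
        LocalDate simplified = firstFridayOfMonth(latest);
        System.out.println(latest + " " + latest.getDayOfWeek());
        System.out.println(simplified + ", weeks to GA: " + ChronoUnit.WEEKS.between(simplified, ga));
    }
}
```

Run as written, this prints 2023-08-11 FRIDAY and then 2023-08-04, weeks to
GA: 17, so rounding down to the first Friday in August leaves 17 full weeks
before Dec 1.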

So we branch for 5.0 *at the latest* on 2023-08-04; I think we can all agree on 
this?

So the next question: when do we branch for 5.0 *at the earliest*? Pros and 
cons of an earlier branch:
Pros:
 • Earlier start of validation testing on a more stable base (no improvements 
or new features excepting CEP-15 and CEP-21)
 • Theoretically higher velocity of completion of CEP-15 and CEP-21 (the team 
doing this can speak to the degree to which this is true)
Cons:
 • Fewer improvements and new features go into 5.0
 • The rest of the dev community has another branch they need to target with 
bugfixes (annoying but not _too_ costly since bugfixes are often a bit smaller 
in scope)

Through this lens, we are weighing the belief that CEP-15 and CEP-21 will land 
by August 1st and be accelerated by branching early against the belief that 
other improvements and features will go in if we branch later; if we freeze 
today and neither CEP-15 nor CEP-21 land for unforeseen reasons, we will have a 
GA release that had a shortened amount of time for new features and 
improvements to be merged in.

Lastly, as input data to the discussion, here's a list of all the new features 
and improvements in 5.0 as of today; hypothetically were we to freeze 5.0 today 
and worst-case unforeseen things lead to CEP-21 and CEP-15 not landing by 
cutoff, this would be our feature-set for our next GA: 
https://issues.apache.org/jira/issues/?jql=project%20%3D%20CASSANDRA%20and%20fixversion%20%3D%205.0%20and%20fixversion%20not%20in%20(4.0.x%2C%204.1.x%2C%204.0%2C%204.1%2C%204.1.1%2C%204.1.2%2C%204.0.8%2C%204.1-alpha1%2C%204.1-alpha1%2C%204.1-beta2%2C%204.1-beta1%2C%204.1-rc1)%20and%20type%20in%20(%22New%20Feature%22%2C%20%22Improvement%22)%20and%20component%20!%3D%20Accord%20order%20by%20type%20desc%2C%20resolved%20desc

Phew. Ok, so using the above framework, I'm personally ok with us freezing 5.0 
earlier than August 1st if the engineers actively on CEP-15 and CEP-21 indicate 
that it will appreciably increase their velocity. The list of improvements and 
features is substantial enough that an earlier freeze would still have enough 
in it to be "meaty" in my opinion; especially given the likelihood of CEP-25 
(Trie-indexed SSTable format) landing relatively soon.

So the next question to me is: "when"? On that I defer to Sam, Alex, Benedict, 
Blake, David, et al.: how much would freezing 5.0 early help in terms of your 
development velocity on TrM and Accord?


On Wed, Apr 19, 2023, at 6:22 AM, Henrik Ingo wrote:
> I'm going to repeat the point from my own thread: rather than thinking of 
> this as some kind of concession to two exceptional CEPs, could we rather take 
> the point of view that they get their own space and time precisely because 
> they are large and invasive and both the merge and testing of them will 
> benefit from everything else in the branch quieting down?
> 
> I'm also not particularly interested in a long feature freeze beyond the 1-3 
> months that would serve the above purpose well.
> 
> In short: the proposal should not be that everyone else just has to sit 
> still and wait for two late stragglers. The proposal is merely to organise 
> work such that we maximise velocity and quality for merging cep-15&21. 
> Anything beyond that should be judged differently.
> 
> On Tue, 18 Apr 2023, 23:48 J. D. Jordan wrote:
>> 
>> I also don’t really see the value in “freezing with exceptions for two giant 
>> changes to come after the freeze”.
>> 
>> -Jeremiah
>> 
>>> On Apr 18, 2023, at 1:08 PM, Caleb Rackliffe  
>>> wrote:
>>> 
>>> > Caleb, you appear to be the only one objecting, and it does not appear 
>>> > that you have made any compromises in this thread.
>>> 
>>> All I'm really objecting to is making special exceptions for particular 
>>> CEPs in relation to our freeze date. In other words, let's not have a 
>>> pseudo-freeze date and a "real" freeze date, when the thing that makes the 
>>> latter supposedly necessary is a very invasive change to the database that 
>>> risks our desired GA date. Also, again, I don't understand how cutting a 
>>> 5.0 branch

Re: [COMPRESSION PARAMETERS] Question

2023-04-19 Thread Maxim Muzafarov
Hello Claude,

I have seen two approaches, and the one you mentioned is probably a
third way of disabling a feature :-)

So, we have

1.
public class TransparentDataEncryptionOptions
{
    public boolean enabled = false;
    public ParameterizedClass key_provider;
}

2.
public boolean cdc_enabled = false;
public boolean materialized_views_enabled = false;


So, in my humble opinion, both approaches are in use for now, and since
the discussion [1] is not finished, we can probably use either of them
for the case you mentioned: create a nested wrapper class, or
keep it plain with the right prefix, e.g. hints_compression_enabled.
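
As a rough sketch of what the two styles could look like for hints
compression: both classes below are hypothetical, and every field name except
hints_compression_enabled is an assumption made for illustration rather than
an existing Cassandra config field.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Style 1: nested wrapper class, modelled on TransparentDataEncryptionOptions. Hypothetical. */
final class HintsCompressionOptionsSketch
{
    public boolean enabled = true;
    public String class_name;
    public Map<String, String> parameters = new LinkedHashMap<>();
}

/** Style 2: flat fields sharing a prefix, modelled on cdc_enabled. Hypothetical. */
final class FlatConfigSketch
{
    public boolean hints_compression_enabled = true;
    public String hints_compression_class_name;
    public Map<String, String> hints_compression_parameters = new LinkedHashMap<>();
}
```

In either style the flag is independent of the other settings, so setting it
to false leaves the class name and parameters in place.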


[1] Move cassandra.yaml toward a nested structure around major database concepts:
https://issues.apache.org/jira/browse/CASSANDRA-17292

On Wed, 19 Apr 2023 at 14:07, Claude Warren, Jr via dev
 wrote:
>
> Currently the compression parameters have an option called enabled.  When 
> enabled=false, all the other options have to be removed.  But it seems to me 
> that we should support enabled=false without removing all the other 
> parameters, so that users can disable compression for testing or problem 
> resolution without losing any of the other parameter settings.  So to be clear:
>
> The following is valid:
> hints_compression:
> - class_name: foo
>   parameters:
>    - chunk_length_in_kb : 16 ;
>
> But this is not:
> hints_compression:
> - class_name: foo
>   parameters:
>    - chunk_length_in_kb : 16 ;
>   enabled : false ;
>
> Currently, when enabled is set to false, it constructs a default CompressionParam 
> object with the class name set to null.
>
> Is there a reason to keep this or should we accept the parameters and 
> construct a CompressionParam with the parameters while continuing to set the 
> class name to null?
>
> Claude


Re: [COMPRESSION PARAMETERS] Question

2023-04-19 Thread Miklosovic, Stefan
> But it seems to me that we should support enabled=false without removing all 
> the other parameters so that users can disable the compression for testing or 
> problem resolution without losing any of the other parameter settings.

Yes, I agree with this. A mere "enabled: false" should be enough.

If this is false, we might default to the NoOp compressor.


From: Claude Warren, Jr via dev 
Sent: Wednesday, April 19, 2023 14:06
To: dev
Subject: [COMPRESSION PARAMETERS] Question

Currently the compression parameters have an option called enabled.  When 
enabled=false, all the other options have to be removed.  But it seems to me that 
we should support enabled=false without removing all the other parameters, so 
that users can disable compression for testing or problem resolution 
without losing any of the other parameter settings.  So to be clear:

The following is valid:
hints_compression:
- class_name: foo
  parameters:
   - chunk_length_in_kb : 16 ;

But this is not:
hints_compression:
- class_name: foo
  parameters:
   - chunk_length_in_kb : 16 ;
  enabled : false ;

Currently, when enabled is set to false, it constructs a default CompressionParam 
object with the class name set to null.

Is there a reason to keep this or should we accept the parameters and construct 
a CompressionParam with the parameters while continuing to set the class name 
to null?

Claude