Re: [EXTERNAL] [DISCUSS] Next release date

2023-04-02 Thread Mick Semb Wever
>
> I'd be happier with something concrete like the following expected release
> flow:
>
> 1) We freeze a branch
> 2) To hit RC, we need green circle + no regression on ASF (or green ASF in
> the future when stable)
> 3) We need N weeks in this frozen state for people to test it out
> 4) Once we have both 2 and 3, we RC and GA
>


Yeah, I was thinking (1) would include the beta1 release if we're already
green, i.e. meaning we'd skip (2), or alpha1 if not yet green.
3) would still hold, but would be N weeks from first beta to first rc.

That is, something like…

1) branch. if green cut beta1 else cut alpha1
  1a) when green then cut beta1
2) wait N weeks from beta1. if no blockers cut rc1
3) wait 2 weeks. if no blockers cut GA
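
For illustration, the flow above can be read as a small decision function. This is only a sketch: `next_step`, its parameters, and the default of 8 weeks for N are hypothetical names and values, not anything the project has agreed on.

```python
def next_step(last_cut, ci_green, weeks_since_cut, open_blockers, n_weeks=8):
    """Return the next artifact to cut, or None to keep waiting.

    last_cut is None right after branching, else "alpha1", "beta1" or "rc1".
    n_weeks stands in for the N discussed in the thread (assumed default).
    """
    if last_cut is None:
        # Step 1: at branch time, cut beta1 if CI is green, else alpha1.
        return "beta1" if ci_green else "alpha1"
    if last_cut == "alpha1":
        # Step 1a: move to beta1 as soon as CI goes green.
        return "beta1" if ci_green else None
    if last_cut == "beta1":
        # Step 2: N weeks of beta testing and no blockers before rc1.
        ready = weeks_since_cut >= n_weeks and open_blockers == 0
        return "rc1" if ready else None
    if last_cut == "rc1":
        # Step 3: two further weeks and no blockers before GA.
        ready = weeks_since_cut >= 2 and open_blockers == 0
        return "GA" if ready else None
    return None
```

Under this reading, a branch that is already green at freeze time skips alpha1 entirely, which is the incentive described below.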

As evident from both 4.0 and 4.1, the alpha-to-beta timeframe hurts, and
our Stable Trunk (and CI) efforts are to minimise/remove this.  That is,
this incentivises us to get on top of CI issues and flakies ahead of branch
time.


> Is it too prescriptive to say "we'll be frozen on a branch for at least 8
> weeks so folks can test out the betas"? (I ask because I know I can get a
> little "structure-happy" at times).
>


6-8 weeks feels right, if we want to be prescriptive. And there needs to be
a sense of urgency when we make this call to action to downstream testers.
As a release manager I know that an error margin of two weeks is typical.


Re: CASSANDRA-14227 removing the 2038 limit

2023-04-02 Thread Berenguer Blasi

Hi all,

assuming lazy consensus here

Regards

On 22/3/23 15:55, Berenguer Blasi wrote:


Hi all,

14227 has undergone review and perf numbers look ok. Now I have to 
tackle the downgradability issue and hopefully then merge. This is 
what I have gathered from the many conversations; please let me know 
if this is correct or if I am missing something:


- Everything will be based off a feature flag. I will add a transient 
feature flag while waiting for CASSANDRA-18301 to land. I will merge 
to trunk and when CASSANDRA-18301 lands it should replace it. That 
makes CASSANDRA-18301 a release blocker (think multiple feature flags, 
avoid future feature flag deprecations,...). If the effort for the TTL 
feature flag is comparable to implementing CASSANDRA-18301 I might 
just do that (TBD).


- My code will have to behave as it always has and produce sstables 
_not_ in the new format. Once that feature flag toggles I can write 
sstables in the _new_ format with the new behavior. I will add testing 
for both behaviors and synthetically emulate the flag toggle.


- Providing a tool to downgrade sstables already written in the _new_ 
format to the _previous_ format is not in scope for 14227. That would 
be CASSANDRA-8928 in any case.
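
A minimal sketch of the flag-gated behavior described in the points above, including the synthetic toggle a test could use. All names here (`SSTableFormat`, `SSTableWriterFactory`, `new_format_enabled`) are hypothetical and are not taken from the patch or from CASSANDRA-18301.

```python
from enum import Enum


class SSTableFormat(Enum):
    PREVIOUS = "previous"  # format that existing releases can read
    NEW = "new"            # format carrying the extended time range


class SSTableWriterFactory:
    """Picks the on-disk format for each new sstable from a feature flag."""

    def __init__(self, new_format_enabled=False):
        # Hypothetical flag, off by default so behavior is unchanged.
        self.new_format_enabled = new_format_enabled

    def format_for_next_sstable(self):
        # The flag is read per sstable, so a toggle only affects later writes.
        if self.new_format_enabled:
            return SSTableFormat.NEW
        return SSTableFormat.PREVIOUS


# Synthetic flag toggle, as a test might emulate it.
factory = SSTableWriterFactory()
before = factory.format_for_next_sstable()
factory.new_format_enabled = True
after = factory.format_for_next_sstable()
```

In this sketch sstables written before the toggle stay in the previous format; nothing here rewrites or downgrades existing files.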


Is this correct?

Thx in advance.

On 3/2/23 15:24, Henrik Ingo wrote:
In that case I agree that increasing from 20 years is an interesting 
opportunity but clearly out of scope for your current ticket.


On Fri, Feb 3, 2023 at 3:48 PM Berenguer Blasi wrote:


Hi,

20y is the current and historic value. 68y is what an integer can
accommodate hence the current 2038 limit since the 1970 Unix
epoch. I wouldn't make it a configurable value, off the top of my
head it would make for some interesting bugs and debugging
sessions when nodes had different values. Food for another ticket
in any case imo.
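
The arithmetic behind these figures can be checked with a short sketch. It assumes timestamps are whole seconds since the 1970 Unix epoch and that the 20y cap means 20 * 365 days; the variable names are illustrative only. The last value is one reading that is consistent with the 2086 date quoted further down this thread for the unsigned-int version.

```python
from datetime import datetime, timedelta, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)
SECONDS_PER_YEAR = 365.25 * 24 * 60 * 60

INT_MAX = 2**31 - 1    # largest signed 32-bit value
UINT_MAX = 2**32 - 1   # largest unsigned 32-bit value
MAX_TTL = timedelta(days=20 * 365)  # assumed definition of the 20y cap

signed_years = INT_MAX / SECONDS_PER_YEAR      # about 68 years
unsigned_years = UINT_MAX / SECONDS_PER_YEAR   # about 136 years

signed_limit = EPOCH + timedelta(seconds=INT_MAX)     # falls in 2038
unsigned_limit = EPOCH + timedelta(seconds=UINT_MAX)  # falls in 2106

# Last moment a write with the maximum TTL still expires within range.
last_safe_write = unsigned_limit - MAX_TTL            # falls in 2086
```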

Regards

On 3/2/23 14:18, Henrik Ingo wrote:

Naive PHB questions to follow...

Why are 68y and 20y special? Could you pick any value? Could we
allow it to be configurable? (Last one probably overkill, just
asking to understand...)

If we can pick any values we want, instinctively I would
personally suggest to have TTL higher than 20 years, but also
kicking the can further than 2035, which is only 13 years from
now. Just to suggest a specific number, why not 35y and 2071?

henrik

On Fri, Feb 3, 2023 at 12:32 PM Berenguer Blasi wrote:

Hi All,

a version using Uints, 20y max TTL and kicking the can down
the road until 2086 has been put up for review #justfyi

Regards

On 15/11/22 7:06, Berenguer Blasi wrote:


Hi all,

thanks for your answers!

To Benedict's point about the uvint encoding of deletionTime: it is
true that it happens here

https://github.com/apache/cassandra/blob/trunk/src/java/org/apache/cassandra/db/SerializationHeader.java#L170.
But we also have a DeletionTime serializer here

https://github.com/apache/cassandra/blob/trunk/src/java/org/apache/cassandra/db/DeletionTime.java#L166
that is writing an int and a long that would now write 2 longs.
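
As a rough sketch of what that change means per serialized entry, the two fixed-width layouts can be compared with Python's `struct` module. The field names in the comments are assumptions for illustration, and this models only the fixed-width serializer mentioned above, not the vint path in SerializationHeader.

```python
import struct

# Big-endian, no padding: "i" is a 4-byte int, "q" is an 8-byte long.
OLD_LAYOUT = struct.Struct(">iq")  # int local deletion time + long timestamp
NEW_LAYOUT = struct.Struct(">qq")  # both fields written as longs

old_size = OLD_LAYOUT.size         # 12 bytes
new_size = NEW_LAYOUT.size         # 16 bytes
extra_bytes = new_size - old_size  # 4 bytes more per entry
```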

TTL itself (the delta) remains an int in the new PR so it
should have no effect in size.

Did I reference the correct parts of the codebase? No
sstable expert here.

On 14/11/22 19:28, Josh McKenzie wrote:

in 2035 we'd hit the same problem again.

In terms of "kicking a can down the road", this would be a
pretty vigorous kick. I wouldn't push back against this
deferral. :)

On Mon, Nov 14, 2022, at 9:28 AM, Benedict wrote:


I’m confused why we see *any* increase in sstable size -
TTLs and deletion times are already written as unsigned
vints as offsets from an sstable epoch for each value.

I would dig in more carefully to explore why you’re
seeing this increase? For the same data there should be
no change to size on disk.



On 14 Nov 2022, at 06:36, C. Scott Andreas wrote:
A 2-3% increase in storage volume is roughly equivalent
to giving up the gain from LZ4 -> LZ4HC, or a one to
two-level bump in Zstandard compression levels. This
regression could be very expensive for storage-bound use
cases.

From the perspective of storage overhead, the unsigned
int approach sounds preferable.


On Nov 13, 2022, at 10:13 PM, Berenguer Blasi wrote:


Hi all,

We have done some more research on c14227. The current
patch for CASSANDRA-14227 solves the TTL limit issue by
switching TTL to long instead of int. This approach
does not have a negative impact on memtable memory
usage, as C* controls the memory used b