Re: [DISCUSS] CASSANDRA-13994
I think our "pre-beta" criteria should also be our "not in a major" criteria.

If work is prohibited because it invalidates our pre-release verification, then it should not land until we next perform pre-release verification, which only currently happens once per major.

This could mean either landing less in a major, or permitting more in beta etc.

On 26/05/2020, 19:24, "Joshua McKenzie" wrote:

    I think an interesting question that informs when to stop accepting specific changes in a release is when we expect any extensive pre-release testing to take place.

    If we go by our release lifecycle, gutting deprecated code seems compatible w/Alpha but I wouldn't endorse merging it into Beta:
    https://cwiki.apache.org/confluence/display/CASSANDRA/Release+Lifecycle.
    Since almost all of the 40_quality_testing epic stuff is also beta phase and hasn't really taken off yet, it also seems like there will be extensive testing after this phase transition.

    All that being said, I'd advocate for marking FixVer 4.x to indicate optionality and disallow merge of tickets like this after we're done w/alpha phase in keeping w/our lifecycle doc in general.

    Does that make sense? Should we consider revisiting and revising the lifecycle doc re: larger deprecation / changes and cycle stages?

    On Tue, May 26, 2020 at 12:53 PM Oleksandr Petrov <oleksandr.pet...@gmail.com> wrote:

    > > 1) Would you block the release over this ticket?
    >
    > I would definitely not block the release on this ticket.
    >
    > > 2) Would you prioritize this ticket over testing?
    >
    > Same here, I would prioritise testing.
    >
    > > 3) Does fixing this ticket make 4.0 a more stable release?
    >
    > I wanted to give some context: I wrote that in August 2018. While I still believe it is important to get rid of this code, I'm disinclined to merge it into 4.0.
    >
    > Given that the patch is rather big (421 additions and 1,480 deletions) and touches many important places, including parser, I would be extremely cautious to merge it that late in release cycle. It would be great to also hear arguments that would justify the risk.
    >
    > Thank you for starting this discussion,
    > -- Alex
    >
    > On Tue, May 26, 2020 at 5:20 PM Ekaterina Dimitrova <ekaterina.dimitr...@datastax.com> wrote:
    >
    > > Dear all,
    > >
    > > Following the ticket review sent on 12th May I wanted to bring up https://issues.apache.org/jira/browse/CASSANDRA-13994: Remove COMPACT STORAGE internals before 4.0 release.
    > >
    > > It is already under review by Dinesh Joshi and Alex Petrov. Not a blocker but already under review.
    > >
    > > Below are my responses to the questions brought up.
    > >
    > > 1) Would you block the release over this ticket? - probably not
    > >
    > > 2) Would you prioritize this ticket over testing? - already implemented but if there are some big changes needed after the review, I doubt it we will want to prioritize over the testing
    > >
    > > 3) Does fixing this ticket make 4.0 a more stable release? - I will just cite Alex Petrov who reported this Jira and I think the rest of us would agree with him here.
    > >
    > > "I would say it's quite important to clean up compact storage internals in 4.0 before the release. It should have no visible side-effects, but it'd be very good to have as it simplifies multiple code paths."
    > >
    > > Ekaterina Dimitrova
    > > e. ekaterina.dimitr...@datastax.com
    > > w. www.datastax.com
    >
    > --
    > alex p

To unsubscribe, e-mail: dev-unsubscr...@cassandra.apache.org
For additional commands, e-mail: dev-h...@cassandra.apache.org
Fwd: [DISCUSS] CASSANDRA-13994
Thank you all for your input.
I think an important topic is again to revise the lifecycle and ensure we really have the vision on what is left until beta. I will start a separate thread on the flaky tests situation soon.

For this particular ticket I see a couple of things:
- There are a lot of deletions of already not used code
- I implemented it still in alpha as per our agreement that this will give us enough time for testing. Probably Dinesh as a reviewer can give some valuable feedback/opinion on the patch.
- It definitely touches around important places but the important thing is to see how exactly it touches, I think
- Considering it for alpha before the major testing in beta sounds reasonable to me but I guess it also depends on people availability to review it in detail and the exact test plans afterwards

On Wed, 27 May 2020 at 7:14, Benedict Elliott Smith wrote:

> I think our "pre-beta" criteria should also be our "not in a major" criteria.
>
> If work is prohibited because it invalidates our pre-release verification, then it should not land until we next perform pre-release verification, which only currently happens once per major.
>
> This could mean either landing less in a major, or permitting more in beta etc.
Re: [DISCUSS] CASSANDRA-13994
> because it invalidates our pre-release verification, then it should not
> land until we next perform pre-release verification

At least for me there's a little softness around our collective alignment on when pre-release verification takes place. If it's between alpha-1 and ga we don't want changes that would invalidate those changes to land during that time frame. Different for beta-1 to ga. We also risk invalidating testing if we do any of that testing before wherever that cutoff is, and a lack of clarity on that cutoff further muddies those waters.

My very loosely held perspective is that beta-1 to ga is the window in which we apply the "don't do things that will invalidate verification", and we plan to do that verification during the beta phase. I *think* this is consistent w/the current framing of the lifecycle doc. That being said, I don't have strong religion on this so if we collectively want to call it "don't majorly disrupt from alpha-1 to ga", we can formalize that in the docs and go ahead and triage current open scope for 4.0 and move things out.

On Wed, May 27, 2020 at 12:59 PM Ekaterina Dimitrova <ekaterina.dimitr...@datastax.com> wrote:

> Thank you all for your input.
> I think an important topic is again to revise the lifecycle and ensure we really have the vision on what is left until beta. I will start a separate thread on the flaky tests situation soon.
>
> For this particular ticket I see a couple of things:
> - There are a lot of deletions of already not used code
> - I implemented it still in alpha as per our agreement that this will give us enough time for testing. Probably Dinesh as a reviewer can give some valuable feedback/opinion on the patch.
> - It definitely touches around important places but the important thing is to see how exactly it touches, I think
> - Considering it for alpha before the major testing in beta sounds reasonable to me but I guess it also depends on people availability to review it in detail and the exact test plans afterwards
Re: [DISCUSS] CASSANDRA-13994
I'm not sure if I communicated my point very well. I mean to say that if the reason we are prohibiting a patch to land post-beta is because it invalidates work we only perform pre-ga, then it probably should not be permitted to land post-ga either, since it must also invalidate the same work?

That is to say, if we're comfortable with work landing post-ga because we believe it to be safe to release without our pre-major-release verification, we should be comfortable with it landing at any time pre-ga too. Anything else seems inconsistent to me, and we should examine what assumptions we're making that permit this inconsistency to arise.

On 27/05/2020, 18:49, "Joshua McKenzie" wrote:

    > because it invalidates our pre-release verification, then it should not
    > land until we next perform pre-release verification

    At least for me there's a little softness around our collective alignment on when pre-release verification takes place. If it's between alpha-1 and ga we don't want changes that would invalidate those changes to land during that time frame. Different for beta-1 to ga. We also risk invalidating testing if we do any of that testing before wherever that cutoff is, and a lack of clarity on that cutoff further muddies those waters.

    My very loosely held perspective is that beta-1 to ga is the window in which we apply the "don't do things that will invalidate verification", and we plan to do that verification during the beta phase. I *think* this is consistent w/the current framing of the lifecycle doc. That being said, I don't have strong religion on this so if we collectively want to call it "don't majorly disrupt from alpha-1 to ga", we can formalize that in the docs and go ahead and triage current open scope for 4.0 and move things out.
Re: [DISCUSS] CASSANDRA-13994
I'm being told this still isn't clear, so let me try in a bullet-point timeline:

* 4.0 Beta
* 4.0 Verification Work
* [Merge Window]
* 4.0 GA
* 4.0 Minor Releases
* ...
* 5.0 Dev
* ...
* 5.0 Verification Work
* GA 5.0

I think that anything that is prohibited from "[Merge Window]" because it invalidates "4.0 Verification Work" must also be prohibited until "5.0 Dev" because the next equivalent work that can now validate it occurs only at "5.0 Verification Work"

On 27/05/2020, 19:05, "Benedict Elliott Smith" wrote:

    I'm not sure if I communicated my point very well. I mean to say that if the reason we are prohibiting a patch to land post-beta is because it invalidates work we only perform pre-ga, then it probably should not be permitted to land post-ga either, since it must also invalidate the same work?

    That is to say, if we're comfortable with work landing post-ga because we believe it to be safe to release without our pre-major-release verification, we should be comfortable with it landing at any time pre-ga too. Anything else seems inconsistent to me, and we should examine what assumptions we're making that permit this inconsistency to arise.
Re: [DISCUSS] CASSANDRA-13994
In this hypothetical, certainly not post-ga, and I'd argue we shouldn't allow it post-beta1 and we need a clear demarcation of "this type of work is ok to merge before X, it's not ok after X. Validation testing *will not occur* before X, and will start after X".

It's a bit rigid, but it's the only way to have a clear inflection point where you know subsequent work won't be invalidated. Otherwise we end up in "I'm pretty sure the validation for disruptive thing X hasn't occurred so I'm going to merge it now" hell.

(does what I'm typing here make sense given the context of what you said? Had a little trouble parsing, I think because there's fuzziness around "alpha 1 release" vs. "alpha phase" on how we label things in the project. Maybe)

On Wed, May 27, 2020 at 2:05 PM Benedict Elliott Smith wrote:

> I'm not sure if I communicated my point very well. I mean to say that if the reason we are prohibiting a patch to land post-beta is because it invalidates work we only perform pre-ga, then it probably should not be permitted to land post-ga either, since it must also invalidate the same work?
>
> That is to say, if we're comfortable with work landing post-ga because we believe it to be safe to release without our pre-major-release verification, we should be comfortable with it landing at any time pre-ga too. Anything else seems inconsistent to me, and we should examine what assumptions we're making that permit this inconsistency to arise.
Re: [DISCUSS] CASSANDRA-13994
+1 strongly agree. If we aren’t going to let something go into 4.0.0 because it would "invalidate testing" then we can not let such a thing go into 4.0.1 unless we plan to re-do said testing for the patch release.

> On May 27, 2020, at 1:31 PM, Benedict Elliott Smith wrote:
>
> I'm being told this still isn't clear, so let me try in a bullet-point timeline:
>
> * 4.0 Beta
> * 4.0 Verification Work
> * [Merge Window]
> * 4.0 GA
> * 4.0 Minor Releases
> * ...
> * 5.0 Dev
> * ...
> * 5.0 Verification Work
> * GA 5.0
>
> I think that anything that is prohibited from "[Merge Window]" because it invalidates "4.0 Verification Work" must also be prohibited until "5.0 Dev" because the next equivalent work that can now validate it occurs only at "5.0 Verification Work"
Re: [DISCUSS] CASSANDRA-13994
That makes sense to me, yep.

My hope and expectation is that the time required for "verification work" will shrink dramatically in the not too distant future - ideally to a period of less than a month. In this world, the cost of missing one train is reduced to catching the next one.

One of the main goals in shifting focus from "testing" and "test plans" to "test engineering" is automating as many aspects of release qualification as possible, with an asymptotic ideal as a function of compute capacity and time. While such automation will never be complete (it's likely that development of new features will/must include qualification infra changes to exercise them), if we're able to apply the same rigor to major releases as we are to patchlevel builds with little incremental effort, I'd be thrilled.

This is mostly a way of saying:
– I like the cadence/sequencing Benedict proposes below.
– I think improvements in test engineering can reduce/eliminate invalidation and may increase the scope of what can be a candidate for merge on a given branch
– And if not, the cost of missing the train is lower because we'll be able to deliver major releases more often.

Scott

From: Jeremiah D Jordan
Sent: Wednesday, May 27, 2020 11:54 AM
To: Cassandra DEV
Subject: Re: [DISCUSS] CASSANDRA-13994

+1 strongly agree. If we aren’t going to let something go into 4.0.0 because it would "invalidate testing" then we can not let such a thing go into 4.0.1 unless we plan to re-do said testing for the patch release.

> On May 27, 2020, at 1:31 PM, Benedict Elliott Smith wrote:
>
> I'm being told this still isn't clear, so let me try in a bullet-point timeline:
>
> * 4.0 Beta
> * 4.0 Verification Work
> * [Merge Window]
> * 4.0 GA
> * 4.0 Minor Releases
> * ...
> * 5.0 Dev
> * ...
> * 5.0 Verification Work
> * GA 5.0
>
> I think that anything that is prohibited from "[Merge Window]" because it invalidates "4.0 Verification Work" must also be prohibited until "5.0 Dev" because the next equivalent work that can now validate it occurs only at "5.0 Verification Work"
>
> On 27/05/2020, 19:05, "Benedict Elliott Smith" wrote:
>
>> I'm not sure if I communicated my point very well. I mean to say that if the reason we are prohibiting a patch to land post-beta is because it invalidates work we only perform pre-ga, then it probably should not be permitted to land post-ga either, since it must also invalidate the same work?
>>
>> That is to say, if we're comfortable with work landing post-ga because we believe it to be safe to release without our pre-major-release verification, we should be comfortable with it landing at any time pre-ga too. Anything else seems inconsistent to me, and we should examine what assumptions we're making that permit this inconsistency to arise.
>>
>> On 27/05/2020, 18:49, "Joshua McKenzie" wrote:
>>
>>>> because it invalidates our pre-release verification, then it should not land until we next perform pre-release verification
>>>
>>> At least for me there's a little softness around our collective alignment on when pre-release verification takes place. If it's between alpha-1 and ga we don't want changes that would invalidate those changes to land during that time frame. Different for beta-1 to ga. We also risk invalidating testing if we do any of that testing before wherever that cutoff is, and a lack of clarity on that cutoff further muddies those waters.
>>>
>>> My very loosely held perspective is that beta-1 to ga is the window in which we apply the "don't do things that will invalidate verification", and we plan to do that verification during the beta phase. I *think* this is consistent w/the current framing of the lifecycle doc. That being said, I don't have strong religion on this so if we collectively want to call it "don't majorly disrupt from alpha-1 to ga", we can formalize that in the docs and go ahead and triage current open scope for 4.0 and move things out.
>>>
>>> On Wed, May 27, 2020 at 12:59 PM Ekaterina Dimitrova <ekaterina.dimitr...@datastax.com> wrote:
>>>
>>>> Thank you all for your input.
>>>> I think an important topic is again to revise the lifecycle and ensure we really have the vision on what is left until beta. I will start a separate thread on the flaky tests situation soon.
>>>>
>>>> For this particular ticket I see a couple of things:
>>>> - There are a lot of deletions of already not used code
>>>> - I implemented it still in alpha as per our agreement that this will give us enough time for testing. Probably Dinesh as a reviewer can give some valuable feedback/opinion on the patch.
>>>> - It definitely touches around important places but the important thing is to see how exactly it touches, I think
>>>> - Considering it for
Re: [DISCUSS] CASSANDRA-13994
I think we're all on the same page here; I was focusing more on the release lifecycles and sequencing than the entire version cycle. Good to broaden scope I think.

One thing we're not considering is the separation of API changes from major changes and how that intersects with release milestones.

Meaning:
1. alpha phase
2. Milestone: API freeze (all API changes pushed to next major)
3. beta phase
4. Verification phase (all major disruptive pushed to next major)

A clear point to cut RC's doesn't surface from the above for me. Releasing an RC before broad verification seems wrong, and cutting an RC after the 4 points above may as well be GA because it's all known scope.

Thoughts?

On Wed, May 27, 2020 at 3:28 PM Scott Andreas wrote:
> That makes sense to me, yep.
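The consistency argument running through this part of the thread (a patch barred from the merge window because it invalidates verification work should stay barred until the next development cycle) can be sketched as a small eligibility check. This is only an illustration: the `Phase` tuple, the `may_land` function, and the choice to treat the pre-verification beta phase as open are assumptions, not anything the project has agreed on.

```python
from typing import NamedTuple


class Phase(NamedTuple):
    major: str  # release line the phase belongs to, e.g. "4.0"
    name: str   # phase label taken from the bullet-point timeline


# Order follows the timeline quoted in the thread (the "..." entries are omitted).
TIMELINE = [
    Phase("4.0", "Beta"),
    Phase("4.0", "Verification Work"),
    Phase("4.0", "Merge Window"),
    Phase("4.0", "GA"),
    Phase("4.0", "Minor Releases"),
    Phase("5.0", "Dev"),
    Phase("5.0", "Verification Work"),
    Phase("5.0", "GA"),
]


def may_land(phase: Phase, invalidates_verification: bool) -> bool:
    """Hypothetical rule: a patch that invalidates verification work may land
    only if a verification phase for the same major still lies ahead."""
    if not invalidates_verification:
        return True
    later = TIMELINE[TIMELINE.index(phase) + 1:]
    return any(
        p.major == phase.major and p.name == "Verification Work" for p in later
    )


if __name__ == "__main__":
    for phase in TIMELINE:
        print(phase.major, phase.name, may_land(phase, invalidates_verification=True))
```

Under these assumptions the check gives the same answer for the merge window, GA, and minor releases of a major, which is the inconsistency the thread asks to avoid.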
Re: [DISCUSS] CASSANDRA-13994
> A clear point to cut RC's doesn't surface from the above for me. Releasing
> an RC before broad verification seems wrong, and cutting an RC after the 4
> points above may as well be GA because it's all known scope.

Isn’t the whole point of an RC that it could be the GA? It is a “release candidate”, meaning that if no one finds any issues with it, it can then become the release. So that seems like exactly the right time to make RC releases?

> On May 27, 2020, at 2:45 PM, Joshua McKenzie wrote:
>
> I think we're all on the same page here; I was focusing more on the release
> lifecycles and sequencing than the entire version cycle. Good to broaden
> scope I think.
Re: [DISCUSS] CASSANDRA-13994
Absolutely my understanding.

On Wed, May 27, 2020, 2:49 PM Jeremiah D Jordan wrote:

> > A clear point to cut RC's doesn't surface from the above for me. Releasing
> > an RC before broad verification seems wrong, and cutting an RC after the 4
> > points above may as well be GA because it's all known scope.
>
> Isn’t the whole point of an RC that it could be the GA? It is a
> “release candidate”, meaning that if no one finds any issues with it, it can
> then become the release. So that seems like exactly the right time to make
> RC releases?
Re: [DISCUSS] CASSANDRA-13994
Maybe. Do we just time box, say we're going to cut an RC and give it 4 weeks, if nothing awful surfaces we GA?

On Wed, May 27, 2020 at 4:12 PM Brandon Williams wrote:

> Absolutely my understanding.
[DISCUSSION] Flaky tests
Dear all,

I spent some time these days looking into the Release Lifecycle document. As we keep saying that we are approaching Beta based on the Jira board, I was curious what the exact criteria for cutting it are.

Looking at all the latest reports (thanks to everyone who was working on that; I think having an overview of what's going on is always a good thing), I have the feeling that the main thing preventing us from cutting beta at the moment is flaky tests. According to the lifecycle document:

- No flaky tests - All tests (Unit Tests and DTests) should pass consistently. A failing test, upon analyzing the root cause of failure, may be “ignored in exceptional cases”, if appropriate, for the release, after discussion in the dev mailing list.

Now the related questions that popped into my mind:
- "ignored in exceptional cases" - examples?
- No flaky tests according to Jenkins or CircleCI? Also, some people run the free tier, others take advantage of premium CircleCI. What should be the framework?
- Furthermore, flaky tests with what frequency? (This is a tricky question, I know)

In different conversations with colleagues from the C* community I got the impression that a canonical suite (in this case Jenkins) might be the right direction to follow.

To be clear, I always check any failures seen in any environment and I truly believe they are worth checking. I am not advocating skipping anything! But I also feel that in many cases CircleCI provides input that is worth tracking but is less likely to point to product flakes. Am I right? In addition, different people use different CircleCI configs and see different output. Not to mention flaky tests on a Mac running with two cores... Yes, this is sometimes the only way to reproduce some of the reported test issues...

So my idea was to suggest to start tracking an exact Jenkins report maybe? Anything reported outside of it would still be checked, but could potentially be left for Beta if we don't feel it shows a product defect. One more thing to consider is that the big Test epic is primarily happening in beta.

Curious to hear what the community thinks about this topic. Probably people also have additional thoughts based on experience from previous releases. How did those things work in the past? Any lessons learned? What is our "plan Beta"?

Ekaterina Dimitrova
e. ekaterina.dimitr...@datastax.com
w. www.datastax.com
Re: [DISCUSS] CASSANDRA-13994
On Wed, May 27, 2020 at 1:23 PM Joshua McKenzie wrote:

> Maybe. Do we just time box, say we're going to cut an RC and give it 4
> weeks, if nothing awful surfaces we GA?

I've seen that work well in the past on other projects. I agree with the notion that RCs are real candidates for release if no one finds issues with them. Ideally we would have as few RCs as possible and more alphas/betas.
Re: [DISCUSSION] Flaky tests
> > So my idea was to suggest to start tracking an exact Jenkins report maybe? Basing our point of view on the canonical test runs on apache infra makes sense to me, assuming that infra is behaving these days. :) Pretty sure Mick got that in working order. At least for me, what I learned in the past is we'd drive to a green test board and immediately transition it as a milestone, so flaky tests would reappear like a disappointing game of whack-a-mole. They seem frustratingly ever-present. I'd personally advocate for us taking the following stance on flaky tests from this point in the cycle forward: - Default posture to label fix version as beta - *excepting* on case-by-case basis, if flake could imply product defect that would greatly impair beta testing we leave alpha - Take current flakes and go fixver beta - Hard, no compromise position on "we don't RC until all flakes are dead" - Use Jenkins as canonical source of truth for "is beta ready" cutoff I'm personally balancing the risk of flaky tests confounding beta work against my perceived value of being able to widely signal beta's availability and encourage widespread user testing. I believe the value in the latter justifies the risk of the former (I currently perceive that risk as minimal; I could be wrong). I am also weighting the risk of "test failures persist to or past RC" at 0. That's a hill I'll die on. On Wed, May 27, 2020 at 5:13 PM Ekaterina Dimitrova < ekaterina.dimitr...@datastax.com> wrote: > Dear all, > I spent some time these days looking into the Release Lifecycle document. > As we keep on saying we approach Beta based on the Jira board, I was > curious what is the exact borderline to cut it. > > Looking at all the latest reports (thanks to everyone who was working on > that; I think having an overview on what's going on is always a good > thing), I have the feeling that the thing that prevents us primarily from > cutting beta at the moment is flaky tests. 
> According to the lifecycle document:
>
>    - No flaky tests - All tests (Unit Tests and DTests) should pass
>    consistently. A failing test, upon analyzing the root cause of failure,
>    may be “ignored in exceptional cases”, if appropriate, for the release,
>    after discussion in the dev mailing list."
>
> Now the related questions that popped up into my mind:
> - "ignored in exceptional cases" - examples?
> - No flaky tests according to Jenkins or CircleCI? Also, some people run
> the free tier, others take advantage of premium CircleCI. What should be
> the framework?
> - Furthermore, flaky tests with what frequency? (This is a tricky question,
> I know)
>
> In different conversations with colleagues from the C* community I got the
> impression that canonical suite (in this case Jenkins) might be the right
> direction to follow.
>
> To be clear, I am always checking any failures seen in any environment and
> I truly believe that they are worth it to be checked. Not advocating to
> skip anything! But also, sometimes I feel in many cases CircleCI could
> provide input worth tracking but less likely to be product flakes. Am I
> right? In addition, different people use different CircleCI config and see
> different output. Not to mention flaky tests on Mac running with two
> cores... Yes, this is sometimes the only way to reproduce some of the
> reported tests' issues...
>
> So my idea was to suggest to start tracking an exact Jenkins report maybe?
> Anything reported out of it also to be checked but potentially to be able
> to leave it for Beta in case we don't feel it shows a product defect. One
> more thing to consider is that the big Test epic is primarily happening in
> beta.
>
> Curious to hear what the community thinks about this topic. Probably people
> also have additional thoughts based on experience from the previous
> releases. How those things worked in the past? Any lessons learned? What is
> our "plan Beta"?
>
> Ekaterina Dimitrova
> e. ekaterina.dimitr...@datastax.com
> w. www.datastax.com