I agree, Benedict. I don't think we can provide a clear advisory to our users, so I would approve of not sharing anything in the release notes. But if someone posts an issue (likely to the user ML) related to streaming / bootstrapping on 4.1.0, then we should engage with the knowledge that it might be related to CASSANDRA-18110.
> On Dec 12, 2022, at 5:06 PM, Benedict <[email protected]> wrote:
>
> I’m unsure that without more information it is very helpful to highlight in
> the release notes. We don’t even have a strong hypothesis tying this issue to
> 4.1.0 specifically, and don’t have a general policy of highlighting
> undiagnosed issues in release notes?
>
>> On 13 Dec 2022, at 00:48, Jon Meredith <[email protected]> wrote:
>>
>> Thanks for the extra time to investigate. Unfortunately no progress on
>> finding the root cause for this issue, just successful bootstraps in our
>> attempts to reproduce. I think highlighting the ticket in the release notes
>> is sufficient and resolving this issue should not hold up the release.
>>
>> I agree with Jeff that the multiple concurrent bootstraps are unlikely to be
>> the issue - I only mentioned in the ticket in case I am wrong. Abe or I will
>> update the ticket if we find anything new.
>>
>> On Sun, Dec 11, 2022 at 12:33 PM Jeff Jirsa <[email protected]> wrote:
>> Concurrent shouldn’t matter (they’re non-overlapping in the repro). And I’d
>> personally be a bit surprised if table count matters that much.
>>
>> It probably just requires high core count and enough data that the streams
>> actually interact with the rate limiter
>>
>>> On Dec 11, 2022, at 10:32 AM, Mick Semb Wever <[email protected]> wrote:
>>>
>>> On Sat, 10 Dec 2022 at 23:09, Abe Ratnofsky <[email protected]> wrote:
>>> Sorry - responded on the take1 thread:
>>>
>>> Could we defer the close of this vote til Monday, December 12th after 6pm
>>> Pacific Time?
>>>
>>> Jon Meredith and I have been working thru an issue blocking streaming on
>>> 4.1 for the last couple months, and are now testing a promising fix. We're
>>> currently working on a write-up, and we'd like to hold the release until
>>> the community is able to review our findings.
>>>
>>> Update on behalf of Jon and Abe.
>>>
>>> The issue raised is CASSANDRA-18110.
>>> Concurrent, or nodes with high cpu count and number of tables performing,
>>> host replacements can fail.
>>>
>>> It is still unclear if this is applicable to OSS C*, and if so to what
>>> extent users might ever be impacted.
>>> More importantly, there's a simple workaround for anyone that hits the
>>> problem.
>>>
>>> Without further information on the table, I'm inclined to continue with
>>> 4.1.0 GA (closing the vote in 32 hours), but add a clear message to the
>>> release announcement of the issue and workaround. Interested in hearing
>>> others' positions, don't be afraid to veto if that's where you're at.
