Re: [DISCUSS] CEP-21: Transactional Cluster Metadata

2022-09-02 Thread Benedict
Unmesh, LWTs today repair themselves periodically already, and do not rely on a 
later proposer.

Also, the CMS will naturally use a single partition key for each log it needs 
to maintain, else they would not be linearised.
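
As a rough illustration of the point above, here is a minimal in-memory sketch of a log kept under a single key and appended to with a conditional "insert if not exists" step. It is not the CEP-21 implementation: the class and function names are hypothetical, and a plain lock stands in for the Paxos round that an LWT would run.

```python
import threading


class SinglePartitionLog:
    """Toy stand-in for one partition that holds an append-only log (hypothetical)."""

    def __init__(self):
        self._entries = {}             # epoch -> payload
        self._lock = threading.Lock()  # assumption: stands in for the Paxos round

    def insert_if_not_exists(self, epoch, payload):
        """Conditional write: succeeds only if no entry exists at this epoch."""
        with self._lock:
            if epoch in self._entries:
                return False
            self._entries[epoch] = payload
            return True

    def next_epoch(self):
        """Return the first epoch after the highest one written so far."""
        with self._lock:
            return max(self._entries, default=0) + 1

    def entries(self):
        """Return payloads in epoch order."""
        with self._lock:
            return [self._entries[e] for e in sorted(self._entries)]


def append(log, payload):
    """Retry the conditional write until some epoch is claimed for the payload."""
    while True:
        epoch = log.next_epoch()
        if log.insert_if_not_exists(epoch, payload):
            return epoch
```

Because every append contends on the same key, two writers can never both claim the same epoch; the loser of the conditional write simply retries at the next epoch.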

> On 2 Sep 2022, at 05:01, Unmesh Joshi  wrote:
> 
> 
>> I think implementation has to work according to expectations described in 
>> CEP, and have enough tests to prove it. You can follow the progress of the 
>> patch whenever CEP is accepted and code is published to learn about the 
>> details. 
> 
> 
> Thanks, will follow the implementation.
> 
>> If you'd like to learn more about incomplete Paxos writes (I'm assuming you 
>> mean dealing with inability of proposer to collect a second quorum), you can 
>> refer to Cassandra Paxos implementation. In our prototypes, we were able to 
>> simply use Cassandra Paxos out of the box, and everything related to Paxos 
>> is hidden from us behind CQL syntax.
>  
> Yes, it's the inability of the proposer to collect a second quorum. As I 
> understand it, the existing LWT Paxos implementation runs a separate Paxos 
> instance per key. Being a key-value setup, it can always repair incomplete 
> Paxos runs when the key is read. For immutable log entries for CMS, it needs 
> to be different. (LWT also expects a mutable operation on the key, so it 
> requires resetting the Paxos state on commit and handling committed values 
> separately as part of prepare. For immutable entries, that's not required.) 
> But I will wait for the Jira and PR to understand the proposed approach better.
> 
> Thanks,
> Unmesh
> 
> 
>> 
>>> On Thu, Sep 1, 2022, at 11:08 AM, Unmesh Joshi wrote:
>>> On Thu, Sep 1, 2022 at 11:20 AM Alex Petrov  wrote:
>>> 
>>> There will be no changes required to our existing Paxos implementation. We 
>>> can just use it. Besides, Paxos is only used as K-sequencer. There is no 
>>> need to use Raft, and both existing LWTs (with Multi-Paxos) and Accord 
>>> aren't tied to a single leader, which is well in the spirit of Cassandra.
>>> 
>>> Will the CMS log implementation be documented in another CEP? There are 
>>> subtle things, like dealing with uncommitted incomplete writes, 
>>> propagating committed log entries to all the CMS replicas, and deciding 
>>> how to maintain a commit index for the log, that would be good details to 
>>> add. The LWT Paxos implementation does this for the per-key instance of 
>>> Paxos when a new Paxos read/write is triggered (with special handling of 
>>> committed values).
>>> 
>>> Thanks,
>>> Unmesh
>> 


Re: [DISSCUSS] Access to JDK internals only after dev mailing list consensus?

2022-09-02 Thread Ekaterina Dimitrova
“ A quick heads up to the dev list with the jira would be sufficient for
anybody interested in discussing it further to comment on the jira.”

Agreed, I didn’t mean voting, but more or less the lazy consensus we have, or
sharing concerns. Discussing them on a ticket should be enough, but it needs
to happen. Also, it shouldn’t be more often than adding dependencies, I
guess.

The JDK team keeps closing off more and more internals and warning us about
potential breakages. I want to save us from urgent fixes in patch
releases and to ease maintenance.

I think ensuring that it is clearly documented why an exception is
acceptable and what options were considered will be of benefit for
maintenance. We can revise in time what has changed.

“Unless absolutely needed we should avoid accessing the internals. Folks
on this project should understand why. We can make the dangers of this
explicit in our contributor documentation.”
+1

On Fri, 2 Sep 2022 at 1:26, Dinesh Joshi  wrote:

> Personally not opposed to this. However, this is something that should be
> vetted closely by the reviewers. Unless absolutely needed we should avoid
> accessing the internals. Folks on this project should understand why. We
> can make the dangers of this explicit in our contributor documentation.
> However, requiring an entire dev list discussion around it seems
> unnecessary. A quick heads up to the dev list with the jira would be
> sufficient for anybody interested in discussing it further to comment on
> the jira. WDYT?
>
> Dinesh
>
> On Sep 1, 2022, at 8:31 AM, Ekaterina Dimitrova 
> wrote:
>
> Hi everyone,
>
>
> Some time ago we added a note to the project Cassandra Code Style:
> “New dependencies should not be included without community consensus first
> being obtained via a [DISCUSS] thread on the dev@cassandra.apache.org
> mailing list”
>
> I would also like to suggest adding a point about accessing JDK
> internals: any patch that accesses internals and/or adds even more
> add-opens/add-exports should be approved on the mailing list prior to commit.
>
> It seems to me the project can only benefit from this visibility. If
> something is accepted as an exception, we need to have the right
> understanding and visibility of why; in some cases maybe to look for
> alternatives, to have follow-up tickets opened, ownership taken. In my
> opinion this will be very helpful for maintaining the codebase.
>
> If others agree with that I can add a sentence to the Code Style. Please
> let me know what you think.
>
> Best regards,
> Ekaterina
>
>
>


Re: [DISCUSS] Removing support for java 8

2022-09-02 Thread Josh McKenzie
+1 on removing JDK8 support from trunk.

On Wed, Aug 31, 2022, at 12:07 PM, David Capwell wrote:
> +1 to remove from trunk
> 
>> On Aug 30, 2022, at 7:54 PM, Caleb Rackliffe  
>> wrote:
>> 
>> +1 on removing 8 for trunk
>> 
>> On Tue, Aug 30, 2022 at 2:42 PM Jon Haddad  
>> wrote:
>>> +1 to removal of 8 in trunk.
>>> 
>>> On 2022/08/29 20:09:55 Blake Eggleston wrote:
>>> > Hi all, I wanted to propose removing jdk8 support for 4.1. Active support 
>>> > ended back in March of this year, and I believe the community has built 
>>> > enough confidence in java 11 to make it an uncontroversial change for our 
>>> > next major release. Let me know what you think.
>>> > 
>>> > Thanks,
>>> > 
>>> > Blake


Re: [DISCUSS] LWT UPDATE semantics with + and - when null

2022-09-02 Thread Josh McKenzie
+1 to matching SQL. If we look at our population of users that are going to run 
into this, my intuition is that more of them will be familiar with SQL 
semantics than counters, so there's the angle where "the more consistent 
option" here is to follow SQL convention.

On Wed, Aug 31, 2022, at 12:19 PM, Benjamin Lerer wrote:
> Approach 2) is the one used by CQL operators: 
> SELECT v + 1 FROM t WHERE pk = 1; will return null if the row exists but 
> v is null.
> 
> Le mer. 31 août 2022 à 18:05, David Capwell  a écrit :
>> Sounds like matching SQL is the current favorite. The current patch matches 
>> this, so I will leave this thread open a while longer before trying to merge 
>> the patch.
>> 
>>> On Aug 31, 2022, at 5:07 AM, Ekaterina Dimitrova  
>>> wrote:
>>> 
>>> I am also +1 to match SQL, option 2. Also, I like Andres’ suggestion
>>> 
>>> On Wed, 31 Aug 2022 at 7:15, Claude Warren via dev 
>>>  wrote:
>>>> I like this approach.  However, in light of some of the discussions on 
>>>> view and the like perhaps the function is  (column value as returned by 
>>>> select ) + 42
>>>>
>>>> So a null counter column becomes 0 before the update calculation is 
>>>> applied.
>>>>
>>>> Then any null can be considered null unless addressed by IfNull(), or 
>>>> zeroIfNull()
>>>>
>>>> Any operation on null returns null.
>>>>
>>>> I think this follows what would be expected by most users in most cases.
>>>>
>>>> On 31/08/2022 11:55, Andrés de la Peña wrote:
>>>>> I think I'd prefer 2), the SQL behaviour. We could also get the 
>>>>> convenience of 3) by adding CQL functions such as "ifNull(column, 
>>>>> default)" or "zeroIfNull(column)", as it's done by other dbs. So we could 
>>>>> do things like "UPDATE ... SET name = zeroIfNull(name) + 42".
>>>>> 
>>>>> On Wed, 31 Aug 2022 at 04:54, Caleb Rackliffe  
>>>>> wrote:
>>>>>> Also +1 on the SQL behavior here. I was uneasy w/ coercing to "" / 0 / 1 
>>>>>> (depending on the type) in our previous discussion, but for some reason 
>>>>>> didn't bring up the SQL analog :-|
>>>>>> 
>>>>>> On Tue, Aug 30, 2022 at 5:38 PM Benedict  wrote:
>>>>>>> I’m a bit torn here, as consistency with counters is important. But 
>>>>>>> they are a unique eventually consistent data type, and I am inclined to 
>>>>>>> default standard numeric types to behave as SQL does, since they write 
>>>>>>> a new value rather than a “delta” 
>>>>>>> 
>>>>>>> It is far from optimal to have divergent behaviours, but also 
>>>>>>> suboptimal to diverge from relational algebra, and probably special 
>>>>>>> casing counters is the least bad outcome IMO.
>>>>>>> 
>>>>>>> 
>>>>>>>> On 30 Aug 2022, at 22:52, David Capwell  wrote:
>>>>>>>>
>>>>>>>> 4.1 added the ability for LWT to support "UPDATE ... SET name = name + 
>>>>>>>> 42", but we never really fleshed out with the larger community what 
>>>>>>>> the semantics should be in the case where the column or row are NULL; 
>>>>>>>> I opened up https://issues.apache.org/jira/browse/CASSANDRA-17857 for 
>>>>>>>> this issue. 
>>>>>>>>
>>>>>>>> As I see it there are 3 possible outcomes:
>>>>>>>> 1) fail the query
>>>>>>>> 2) null + 42 = null (matches SQL)
>>>>>>>> 3) null + 42 == 0 + 42 = 42 (matches counters)
>>>>>>>>
>>>>>>>> In SQL you get NULL (option 2), but CQL counters treat NULL as 0 
>>>>>>>> (option 3) meaning we already do not match SQL (though counters are 
>>>>>>>> not a standard SQL type so might not be applicable).  Personally I 
>>>>>>>> lean towards option 3 as the "zero" for addition and subtraction is 0 
>>>>>>>> (1 for multiplication and division).
>>>>>>>>
>>>>>>>> So looking for feedback so we can update in CASSANDRA-17857 before 4.1 
>>>>>>>> release.
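
The three candidate semantics discussed in this thread can be compared side by side with a small sketch. This is only an illustration in Python, not Cassandra code: `None` stands in for a CQL null, and the function names are hypothetical (`zero_if_null` mirrors the `zeroIfNull(column)` function suggested in the thread).

```python
from typing import Optional


def add_or_fail(current: Optional[int], delta: int) -> int:
    """Option 1: reject the update when the existing value is null."""
    if current is None:
        raise ValueError("cannot apply '+' to a null value")
    return current + delta


def add_sql(current: Optional[int], delta: int) -> Optional[int]:
    """Option 2: any operation on null yields null (SQL behaviour)."""
    return None if current is None else current + delta


def add_counter(current: Optional[int], delta: int) -> int:
    """Option 3: treat null as 0 before adding (counter behaviour)."""
    return (0 if current is None else current) + delta


def zero_if_null(value: Optional[int]) -> int:
    """Explicit opt-in to option 3 on top of option 2."""
    return 0 if value is None else value
```

With option 2 as the default, `add_sql(zero_if_null(None), 42)` gives the counter-style result only where the caller asks for it, which is the combination Andrés proposed.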
 
 


Re: [DISSCUSS] Access to JDK internals only after dev mailing list consensus?

2022-09-02 Thread Derek Chen-Becker
I think it's fine to state it explicitly rather than making it an
assumption. Are we tracking any usage of internals in the codebase
currently?

Cheers,

Derek

-- 
+---+
| Derek Chen-Becker |
| GPG Key available at https://keybase.io/dchenbecker and   |
| https://pgp.mit.edu/pks/lookup?search=derek%40chen-becker.org |
| Fngrprnt: EB8A 6480 F0A3 C8EB C1E7  7F42 AFC5 AFEE 96E4 6ACC  |
+---+


Re: [DISSCUSS] Access to JDK internals only after dev mailing list consensus?

2022-09-02 Thread Ekaterina Dimitrova
Git and jira , nothing specific

On Fri, 2 Sep 2022 at 12:51, Derek Chen-Becker 
wrote:

> I think it's fine to state it explicitly rather than making it an
> assumption. Are we tracking any usage of internals in the codebase
> currently?
>
> Cheers,
>
> Derek


Cassandra Token ownership split-brain (3.0.14)

2022-09-02 Thread Jaydeep Chovatia
Hi,

We are running a production Cassandra version (3.0.14) with a 256-token
v-node configuration. Occasionally, we see that different nodes show
different ownership for the same key. Only a node restart corrects this;
otherwise, the node continues to behave in a split-brain fashion.

Say, for example,

*NodeA*
nodetool getendpoints ks1 table1 10
- n1
- n2
- n3

*NodeB*
nodetool getendpoints ks1 table1 10
- n1
- n2
*- n5*

If I restart NodeB, then it shows the correct ownership {n1,n2,n3}. The
majority of the nodes in the ring show correct ownership {n1,n2,n3}, only a
few show this issue, and restarting them solves the problem.

It seems to me that Cassandra's Gossip cache and StorageService cache
(TokenMetadata) are having some sort of cache coherence issue.

Has anyone observed this behavior?
Any help would be highly appreciated.

Jaydeep
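
One way to spot the nodes in this state is to collect the `nodetool getendpoints` output from every node for the same key and compare each view against the majority. The sketch below assumes the outputs have already been gathered into a dictionary; the function name and data layout are hypothetical, and it does not run nodetool itself.

```python
from collections import Counter


def find_divergent_nodes(views):
    """Compare per-node replica views for one key.

    views maps a node name to the list of endpoints that node reported.
    Returns the majority replica set and the nodes that disagree with it.
    """
    # Order of endpoints does not matter, so compare as sets.
    normalized = {node: frozenset(eps) for node, eps in views.items()}
    majority, _ = Counter(normalized.values()).most_common(1)[0]
    divergent = {
        node: sorted(eps)
        for node, eps in normalized.items()
        if eps != majority
    }
    return sorted(majority), divergent
```

For the example above, passing `{"NodeA": ["n1", "n2", "n3"], "NodeB": ["n1", "n2", "n5"]}` plus the views of the other nodes would flag NodeB as the one to investigate or restart.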


[Marketing] For Review: Changelog blog #19 - August

2022-09-02 Thread Chris Thornett
Changelog #19, the overview blog for August is open for 72-hr community
review. Permissions are set for comment only for amends and issues:
https://docs.google.com/document/d/1I5_6pUTWpfKfX16cc-eDS9AjMhe0kPimOpKgFAiS8c4/edit?usp=sharing

Thanks,

-- 

Chris Thornett
Senior Content Strategist, Constantia.io