Re: [VOTE][IP CLEARANCE] Spark-Cassandra-Connector

2025-04-05 Thread Mick Semb Wever


On Tue, 18 Mar 2025 at 09:13, Mick Semb Wever  wrote:

> (general@incubator cc'd)
>
> Please vote on the acceptance of the Spark-Cassandra-Connector and its
> IP Clearance:
>
> https://incubator.apache.org/ip-clearance/cassandra-spark-cassandra-connector.html
>
> …
> PMC members, please check carefully the IP Clearance requirements before
> voting.
>
> The vote will be open for 72 hours (or longer). Votes by PMC members
> are considered binding. A vote passes if there are at least three
> binding +1s and no -1's.



Vote passes with 11 +1 votes (9 binding) and no vetoes.


Re: Inconsistent null handling between WHERE and IF clauses

2025-04-05 Thread Benedict
Modifying the behaviour of IF clauses is a major breaking change that could
have disastrous effects for customers, and one that would be very hard to audit
applications for on upgrade, so I think that option is a non-starter.

I would support an effort to introduce a new session mode where we make
ourselves more ANSI-SQL like, introduce IS NULL as a concept, and use it
consistently (along with any other appropriate changes).

FWIW, I think the invalid request is probably thrown because ALLOW FILTERING 
isn’t really expected to be used and a NULL primary key column cannot match 
since we don’t have JOIN. Since we now have SAI perhaps we have more reason to 
support NULL in WHERE clauses but I think without introducing a new mode, if we 
want to support it, we have to treat NULL like we do in IF - even if it’s not 
how we want it to work.

> On 24 Mar 2025, at 23:45, David Capwell  wrote:
> 
> In fuzz testing I have found some differences between `WHERE` and `IF`
> clauses that I want to get feedback on from the broader community.
> 
> If you try to query with a `null` we will reject it
> 
> ```
> @Test
> public void test() throws IOException
> {
>     try (Cluster cluster = Cluster.build(1).start())
>     {
>         init(cluster);
>         cluster.schemaChange(withKeyspace("CREATE TABLE %s.tbl(pk int, ck int, v0 int, v1 int, primary key(pk, ck))"));
>         var inst = cluster.coordinator(1);
>
>         inst.execute(withKeyspace("INSERT INTO %s.tbl (pk, ck, v0) VALUES (?, ?, ?)"), ConsistencyLevel.ALL, 0, 0, 0);
>         AssertUtils.assertRows(inst.execute(withKeyspace("SELECT * FROM %s.tbl WHERE pk=? AND ck=? and v1=? ALLOW FILTERING"), ConsistencyLevel.ALL, 0, 0, null),
>                                rows());
>     }
> }
> ```
> 
> This fails as follows
> 
> ```
> org.apache.cassandra.exceptions.InvalidRequestException: Invalid null value 
> for column v1
> ```
> 
> But if you do this in the `IF` clause it is accepted
> 
> ```
> @Test
> public void test() throws IOException
> {
>     try (Cluster cluster = Cluster.build(1).start())
>     {
>         init(cluster);
>         cluster.schemaChange(withKeyspace("CREATE TABLE %s.tbl(pk int, ck int, v0 int, v1 int, primary key(pk, ck))"));
>         var inst = cluster.coordinator(1);
>
>         inst.execute(withKeyspace("UPDATE %s.tbl SET v1=0 WHERE pk=0 AND ck=0 IF v0=?"), ConsistencyLevel.QUORUM, new Object[]{null});
>         AssertUtils.assertRows(inst.execute(withKeyspace("SELECT * FROM %s.tbl WHERE pk=? AND ck=?"), ConsistencyLevel.SERIAL, 0, 0, null),
>                                rows());
>     }
> }
> ```
> 
> CAS accepts this and will apply the `UPDATE` (the row doesn't exist, so `null 
> = null => true`; this behavior isn't consistent).
> 
> Most of the project treats `null` as something that won't ever match, which 
> is consistent with other DBs
> 
> ```
> sqlite> select * from employees;
> sqlite> insert into employees (id, name, age, department) values (0, "name", 42, "cassandra");
> sqlite> insert into employees (id, name, age) values (1, "name2", 42);
> sqlite> select * from employees where department = null;
> sqlite> select * from employees where department is null;
>         id = 1
>       name = name2
>        age = 42
> department = NULL
> sqlite>
> ```
> 
> ```
> postgres=# select * from employees where department = null;
>  id | name | age | department
> ----+------+-----+------------
> (0 rows)
> postgres=# select * from employees where department is null;
>  id | name  | age | department
> ----+-------+-----+------------
>   1 | name2 |  42 |
> (1 row)
> ```
> 
> So I guess my main question is: is this a bug or a feature?
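
For illustration only, here is a minimal sketch of the two comparison semantics discussed in this thread. It is not Cassandra code: the class `NullSemantics` and its methods are hypothetical names, and the model assumes plain equality on nullable `Integer` values. `whereMatches` follows the SQL-style rule that a comparison involving `null` never matches, while `ifConditionApplies` follows the `IF` behaviour reported above, where `null = null` evaluates to true.

```java
import java.util.Objects;
import java.util.Optional;

// Hypothetical model of the two behaviours; none of these names exist in Cassandra.
public class NullSemantics {

    // SQL-style three-valued equality: any comparison with null is UNKNOWN,
    // represented here as an empty Optional.
    public static Optional<Boolean> sqlEquals(Integer left, Integer right) {
        if (left == null || right == null) {
            return Optional.empty();
        }
        return Optional.of(left.equals(right));
    }

    // WHERE-style filter: a row is kept only when the predicate is definitely TRUE,
    // so UNKNOWN is treated as "no match".
    public static boolean whereMatches(Integer column, Integer bound) {
        return sqlEquals(column, bound).orElse(false);
    }

    // IF-style condition as reported in the thread: null compared with null is true.
    public static boolean ifConditionApplies(Integer column, Integer bound) {
        return Objects.equals(column, bound);
    }

    public static void main(String[] args) {
        Integer missing = null; // stands in for an unset column such as v0 or v1
        System.out.println("WHERE v1 = null matches: " + whereMatches(missing, null));
        System.out.println("IF v0 = null applies:    " + ifConditionApplies(missing, null));
    }
}
```

Under these assumptions the two methods agree for non-null operands and differ only when `null` is involved, which is the inconsistency the original message asks about.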


Huge NetApp donation of hardware for ci-cassandra

2025-04-05 Thread Mick Semb Wever
Under an ASF targeted sponsorship, NetApp (Instaclustr) has been very
generous with the community and donated ten beefy (AMD EPYC 9454P Genoa
48-Core, 256 GB RAM) servers to be used with our ci-cassandra.apache.org
infrastructure.

On each server we fit 6 jenkins executors, increasing our ci-cassandra.a.o
executor count by 42!
(60 new, minus 18 old executors from Instaclustr now removed).

This raises our executor count from 98 to 140, a net gain equal to 30% of the
new total, and means NetApp's donated servers are currently running 60 of the
project's 140 CI executors (about 43%)!

This is a big deal for the project, adding both stability and improved
throughput of CI for the community.
https://github.com/apache/cassandra-builds/blob/trunk/ASF-jenkins-agents.md

A very big thank you to NetApp, and to all our contributors employed there
to help make this happen.