Re: Inconsistent null handling between WHERE and IF clauses

Benedict Sat, 05 Apr 2025 10:34:18 -0700

Modifying the behaviour for IF clauses is a major breaking change that could 
have disastrous effects for customers, that would be very hard to audit 
applications for on upgrade, so I think that option is a non-starter.


I would support an effort to introduce a new session mode where we make 
ourselves more ANSI-SQL like, and introduce IS NULL as a concept, and use it 
consistently (along with any other appropriate changes)

FWIW, I think the invalid request is probably thrown because ALLOW FILTERING 
isn’t really expected to be used and a NULL primary key column cannot match 
since we don’t have JOIN. Since we now have SAI perhaps we have more reason to 
support NULL in WHERE clauses but I think without introducing a new mode, if we 
want to support it, we have to treat NULL like we do in IF - even if it’s not 
how we want it to work.

> On 24 Mar 2025, at 23:45, David Capwell <[email protected]> wrote:
> 
> In fuzz testing I have found some differences between `WHERE` and `IF` 
> clauses that want to get feedback from the broader community.
> 
> If you try to query with a `null` we will reject it
> 
> ```
> @Test  
> public void test() throws IOException  
> {  
>    try (Cluster cluster = Cluster.build(1).start())  
>    {  
>        init(cluster);  
>        cluster.schemaChange(withKeyspace("CREATE TABLE %s.tbl(pk int, ck int, 
> v0 int, v1 int, primary key(pk, ck))"));  
>        var inst = cluster.coordinator(1);  
> 
>        inst.execute(withKeyspace("INSERT INTO %s.tbl (pk, ck, v0) VALUES (?, 
> ?, ?)"), ConsistencyLevel.ALL, 0, 0, 0);  
>        AssertUtils.assertRows(inst.execute(withKeyspace("SELECT * FROM %s.tbl 
> WHERE pk=? AND ck=? and v1=? ALLOW FILTERING"), ConsistencyLevel.ALL, 0, 0, 
> null),  
>                               rows());  
>    }  
> }
> ```
> 
> This fails as follows
> 
> ```
> org.apache.cassandra.exceptions.InvalidRequestException: Invalid null value 
> for column v1
> ```
> 
> But if you do this in the `IF` clause it is accepted
> 
> ```
> @Test  
> public void test() throws IOException  
> {  
>    try (Cluster cluster = Cluster.build(1).start())  
>    {  
>        init(cluster);  
>        cluster.schemaChange(withKeyspace("CREATE TABLE %s.tbl(pk int, ck int, 
> v0 int, v1 int, primary key(pk, ck))"));  
>        var inst = cluster.coordinator(1);  
> 
> 
>        inst.execute(withKeyspace("UPDATE %s.tbl SET v1=0 WHERE pk=0 AND ck=0 
> IF v0=?"), ConsistencyLevel.QUORUM, new Object[]{null});  
>        AssertUtils.assertRows(inst.execute(withKeyspace("SELECT * FROM %s.tbl 
> WHERE pk=? AND ck=?"), ConsistencyLevel.SERIAL, 0, 0, null),  
>                               rows());  
>    }  
> }
> ```
> 
> CAS accepts this and will apply the `UPDATE` (the row doesn't exist, so `null 
> = null => true`; this behavior isn't consistent).
> 
> Most of the project treats `null` as something that won't ever match, which 
> is consistent with other DBs
> 
> ```
> sqlite> select * from employees;
> sqlite> insert into employees (id, name, age, department) values (0, "name", 
> 42, "cassandra");
> sqlite> insert into employees (id, name, age) values (1, "name2", 42);
> sqlite> select * from employees where department = null;
> sqlite> sqlite> select * from employees where department is null;
>        id = 1
>      name = name2
>       age = 42
> department = NULL
> sqlite>
> ```
> 
> ```
> postgres=# select * from employees where department = null;
> id | name | age | department
> ----+------+-----+------------
> (0 rows)
> postgres=# select * from employees where department is null;
> id | name  | age | department
> ----+-------+-----+------------
>  1 | name2 |  42 |
> (1 row)
> ```
> 
> So I guess my main question; is this a bug or a feature?

Re: Inconsistent null handling between WHERE and IF clauses

Reply via email to