Sorry I didn’t see on my phone this has already been brought up.

On Fri, Feb 14, 2025, at 9:17 PM, Ariel Weisberg wrote:
> Hi,
>
> This is already specified in SQL. Columns that aren’t nullable need 
> either to have a default value or a value specified in each 
> insert/update.
>
> Unless I am wrong about the standard or there is a persuasive reason to 
> deviate from the standard I would argue this is actually a bug.
>
> A persuasive reason to deviate from the standard would be something 
> like the standard is just bad and not useful, but in this instance the 
> expectation with a not null constraint is that the data is not null.
>
> There is a separate discussion to be had regarding what to do with data 
> that is already null, but for now I think it is fine to make that a 
> separate enhancement.
>
> Ariel
>
> On Mon, Feb 10, 2025, at 9:49 AM, Bernardo Botella wrote:
>> Hi everyone,
>>
>> Stefan Miklosovic and I have been working on a NOT_NULL 
>> (https://github.com/apache/cassandra/pull/3867) constraint to be added 
>> to the constraints tool belt, and a really interesting conversation 
>> came up.
>>
>> First, as a problem statement, let's consider this:
>>
>> -----------------------------------------
>> CREATE TABLE ks.tb2 (
>>     id int,
>>     cl1 int,
>>     cl2 int,
>>     val text CHECK NOT_NULL(val),
>>     PRIMARY KEY (id, cl1, cl2)
>> ) 
>>
>> cassandra@cqlsh> INSERT INTO ks.tb2 (id, cl1, cl2, val) VALUES ( 1, 2, 
>> 3, null);
>> InvalidRequest: Error from server: code=2200 [Invalid query] 
>> message="Column value does not satisfy value constraint for column 
>> 'val' as it is null."
>>
>> cassandra@cqlsh> INSERT INTO ks.tb2 (id, cl1, cl2, val) VALUES ( 1, 2, 
>> 3, “text");
>> cassandra@cqlsh> select * from ks.tb2;
>>
>>  id | cl1 | cl2 | val
>> ----+-----+-----+------
>>   1 |   2 |   3 | text
>>
>> (1 rows)
>> cassandra@cqlsh> INSERT INTO ks.tb2 (id, cl1, cl2) VALUES ( 1, 2, 4);
>> cassandra@cqlsh> select * from ks.tb2;
>>
>>  id | cl1 | cl2 | val
>> ----+-----+-----+------
>>   1 |   2 |   3 | text
>>   1 |   2 |   4 | null
>>
>> -----------------------------------------
>>
>> As you see, we have a hole in which a 'null' value is getting written 
>> on column val even if we have a NOT_NULL on that particular column 
>> whenever the column is NOT specified on the write. That raises the 
>> question on how this particular constraint should behave.
>>
>> If we consider the other constraints (scalar constraint and length 
>> constraint so far), this particular behavior is fine. But, if the 
>> constraint is NOT_NULL, then it becomes a little bit trickier.
>>
>> The conclusions we have reached is that the meaning of constraints 
>> should be interpreted like: I check whatever you give me as part of the 
>> write, ignoring everything else. Let me elaborate:
>> If we decide to treat this particular NOT_NULL constraint differently, 
>> and check if the value for that column is present in the insert 
>> statement, we then open a different can of worms. What happens if the 
>> row already exists with a valid value, and that insert statement is 
>> only trying to do an update to a different column in the row? If that 
>> was the case, we would be forcing the user to specify the 'val' column 
>> value for every update, even if it is not needed. 
>>
>> Mainly for this reason, we think it is better to treat this NOT_NULL 
>> constraint just like the other constraints, and execute it ONLY on the 
>> values that are present on the insert statement.
>>
>> The main con is that it may lead to a little bit of confussion (as in, 
>> why I just added a null value to the table even if I have a NOT_NULL 
>> constraint?). We have thought on aliviating this particular confusion 
>> by:
>> - Extensive documentation. Let's be upfront on what this constraint 
>> does and does not. 
>> (https://github.com/apache/cassandra/blob/ed58c404e8c880b69584e71a3690d3d9f73ef9fa/doc/modules/cassandra/pages/developing/cql/constraints.adoc#not_null-constraint)
>> - Adding, as part of this patch, yet another constraint 
>> (STRICTLY_NOT_NULL), that checks for the actual column value to be 
>> present in the insert statement..
>>
>> If you've made it until here, that means you are really interested in 
>> constraints. Thanks! The question for you is, would you have any 
>> concern with this approach?
>>
>> Thanks,
>> Bernardo

Reply via email to