In my head NOT_NULL constraint implies that the column must be specified on
each write and must not be NULL. If a column with the NOT_NULL constraint
is omitted during a write then shouldn’t it be treated as if it was
specified and set to NULL?

If the column has a non-NULL value that was previously written and you’re
updating the rest of the columns, you still have to force the user to
specify it otherwise you will have to perform a read before write to
validate that the column was not NULL. I think this is a fine compromise
given that the goal here is to ensure that an application shouldn’t
inadvertently write a NULL value for a column specified as NOT_NULL.

On Mon, Feb 10, 2025 at 6:50 AM Bernardo Botella <
conta...@bernardobotella.com> wrote:

> Hi everyone,
>
> Stefan Miklosovic and I have been working on a NOT_NULL (
> https://github.com/apache/cassandra/pull/3867) constraint to be added to
> the constraints tool belt, and a really interesting conversation came up.
>
> First, as a problem statement, let's consider this:
>
> -----------------------------------------
> CREATE TABLE ks.tb2 (
>     id int,
>     cl1 int,
>     cl2 int,
>     val text CHECK NOT_NULL(val),
>     PRIMARY KEY (id, cl1, cl2)
> )
>
> cassandra@cqlsh> INSERT INTO ks.tb2 (id, cl1, cl2, val) VALUES ( 1, 2, 3,
> null);
> InvalidRequest: Error from server: code=2200 [Invalid query]
> message="Column value does not satisfy value constraint for column 'val' as
> it is null."
>
> cassandra@cqlsh> INSERT INTO ks.tb2 (id, cl1, cl2, val) VALUES ( 1, 2, 3,
> “text");
> cassandra@cqlsh> select * from ks.tb2;
>
>  id | cl1 | cl2 | val
> ----+-----+-----+------
>   1 |   2 |   3 | text
>
> (1 rows)
> cassandra@cqlsh> INSERT INTO ks.tb2 (id, cl1, cl2) VALUES ( 1, 2, 4);
> cassandra@cqlsh> select * from ks.tb2;
>
>  id | cl1 | cl2 | val
> ----+-----+-----+------
>   1 |   2 |   3 | text
>   1 |   2 |   4 | null
>
> -----------------------------------------
>
> As you see, we have a hole in which a 'null' value is getting written on
> column val even if we have a NOT_NULL on that particular column whenever
> the column is NOT specified on the write. That raises the question on how
> this particular constraint should behave.
>
> If we consider the other constraints (scalar constraint and length
> constraint so far), this particular behavior is fine. But, if the
> constraint is NOT_NULL, then it becomes a little bit trickier.
>
> The conclusions we have reached is that the meaning of constraints should
> be interpreted like: I check whatever you give me as part of the write,
> ignoring everything else. Let me elaborate:
> If we decide to treat this particular NOT_NULL constraint differently, and
> check if the value for that column is present in the insert statement, we
> then open a different can of worms. What happens if the row already exists
> with a valid value, and that insert statement is only trying to do an
> update to a different column in the row? If that was the case, we would be
> forcing the user to specify the 'val' column value for every update, even
> if it is not needed.
>
> Mainly for this reason, we think it is better to treat this NOT_NULL
> constraint just like the other constraints, and execute it ONLY on the
> values that are present on the insert statement.
>
> The main con is that it may lead to a little bit of confussion (as in, why
> I just added a null value to the table even if I have a NOT_NULL
> constraint?). We have thought on aliviating this particular confusion by:
> - Extensive documentation. Let's be upfront on what this constraint does
> and does not. (
> https://github.com/apache/cassandra/blob/ed58c404e8c880b69584e71a3690d3d9f73ef9fa/doc/modules/cassandra/pages/developing/cql/constraints.adoc#not_null-constraint
> )
> - Adding, as part of this patch, yet another constraint
> (STRICTLY_NOT_NULL), that checks for the actual column value to be present
> in the insert statement..
>
> If you've made it until here, that means you are really interested in
> constraints. Thanks! The question for you is, would you have any concern
> with this approach?
>
> Thanks,
> Bernardo

Reply via email to