If we're going to introduce a feature that looks like SQL constraints, we should make sure it's "reasonably" compliant with the semantics users already expect from SQL constraints. In particular, we should avoid situations where a user creates a constraint, writes some data, then reads data that violates that constraint, unless they've expressed that violations on read would be acceptable.
For Postgres, when adding a new constraint you can specify NOT VALID to avoid scanning all existing relevant data [1] (sketched below). If we want to avoid scan-on-DDL, that tradeoff needs to be made clear to users.

As we've already discussed, constraints must deal with operations that appear to be within limits on the write path but, once reconciled at read time or during compaction, produce a violation. Appending to non-frozen collections is one example (also sketched below). Expecting users to understand the write path for collections feels unrealistic to me; I wonder if we should express in the constraint itself that it only applies during writes.

Anything that uses "nodetool import" (including cassandra-analytics) could theoretically push constraint-violating mutations into a table. We could update import to scan table contents first, or add a flag to trust the data in imported SSTables and make cassandra-analytics executors aware of table-level constraints.

Some client implementations read the system_schema tables to build their object mappers; I'd like to confirm that nothing here will require clients to be aware of these new schema constructs.

Overall, I'm supportive of the distinctions discussed between constraints and guardrails and like the direction this is heading. I'd just like to make sure the more detailed semantics aren't confusing or misleading for our users, since semantics are much harder to change in the future.

[1]: https://www.postgresql.org/docs/current/sql-altertable.html
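For reference, a minimal sketch of the Postgres behavior I'm referring to (table, column, and constraint names are made up):

    -- new writes are checked immediately, but existing rows are NOT scanned
    ALTER TABLE orders ADD CONSTRAINT price_positive CHECK (price > 0) NOT VALID;

    -- later, scan existing rows and mark the constraint as validated
    ALTER TABLE orders VALIDATE CONSTRAINT price_positive;

Until VALIDATE CONSTRAINT runs, reads can still return rows that violate the constraint; that is the window we'd be asking users to accept if we skip the scan on DDL.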
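And a sketch of the non-frozen collection case. The constraint here is hypothetical (I'm only assuming some way to cap the size of the collection, and the keyspace/table names are made up), but the reconciliation behavior is plain CQL:

    CREATE TABLE ks.tagged (pk int PRIMARY KEY, tags set<text>);
    -- assume a (hypothetical) constraint that tags may hold at most 3 elements

    -- each write only carries its own delta, so each one looks fine in isolation
    UPDATE ks.tagged SET tags = tags + {'a', 'b'} WHERE pk = 1;
    UPDATE ks.tagged SET tags = tags + {'c', 'd'} WHERE pk = 1;

    -- once the two updates reconcile on read or during compaction,
    -- tags = {'a', 'b', 'c', 'd'}: four elements, violating the intended cap

No single mutation exceeds the limit, so a write-path check passes both updates; the violation only materializes after merge, which is why I'd like the constraint to state explicitly that it's enforced at write time only.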