Re: [DISCUSS] SQL support in Cassandra

Josh McKenzie Tue, 04 Nov 2025 05:42:51 -0800

+1 to Mick and Aleksey. I think the key for me was this:
> One is Cassandra’s wide-partition model with flexible clustering columns, 
> which supports very large, ordered partitions (e.g. time-series and efficient 
> range scans), rather than a strictly normalised, join-centric model. These 
> patterns don’t always map cleanly to SQL semantics, and CQL’s query-driven, 
> table-per-query modelling helps move users toward designs that scale 
> predictably.


We'd need really robust EXPLAIN / EXPLAIN ANALYZE support (see here 
<https://www.postgresql.org/docs/current/sql-explain.html>) for users to be 
able to make sense of how their SQL queries translate into underlying disk 
access patterns. Handing users a wide-open field of full SQL compliance that 
they then need to learn how to constrain in order to get horizontal scale 
would be *much more challenging* than the already somewhat "new" cognitive 
muscle our users have to build to realize that horizontal scaling of data 
access doesn't come for free.
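
To make the access-pattern point concrete, here is a minimal Python sketch. 
It is not Cassandra code: ToyTable and its methods are hypothetical names, and 
it assumes a toy in-memory model with one sorted row list per partition. It 
only illustrates why a partition-scoped range scan touches few rows while an 
unconstrained filter touches every row in every partition.

```python
from bisect import bisect_left


class ToyTable:
    """Hypothetical toy model: partition key -> rows sorted by clustering key."""

    def __init__(self):
        # partition key -> sorted list of (clustering_key, value)
        self.partitions = {}

    def insert(self, pk, ck, value):
        rows = self.partitions.setdefault(pk, [])
        # (ck,) sorts before any (ck, value), so rows stay ordered by ck.
        rows.insert(bisect_left(rows, (ck,)), (ck, value))

    def range_scan(self, pk, lo, hi):
        """Rows in one partition with lo <= ck < hi, plus rows touched."""
        rows = self.partitions.get(pk, [])
        start = bisect_left(rows, (lo,))
        end = bisect_left(rows, (hi,))
        return rows[start:end], end - start

    def filter_all(self, predicate):
        """Unconstrained filter: visits every row of every partition."""
        matches, touched = [], 0
        for pk, rows in self.partitions.items():
            for ck, value in rows:
                touched += 1
                if predicate(value):
                    matches.append((pk, ck, value))
        return matches, touched
```

The "rows touched" count returned by each method is the kind of signal an 
EXPLAIN-style tool would need to surface: the first query's cost tracks the 
size of the requested slice, the second's tracks the size of the whole table.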

I think that would give us a future state of "Use SQL when you need / want a 
lot of expressivity, use CQL when you need to be constrained to language 
primitives that keep your data access scalable". The part that makes me wary 
here is how we've run into pain in the past trying to be both a database that 
allows more query expressivity (ALLOW FILTERING, legacy 2i come to mind) and a 
database that also wants horizontal scale.

I'd love us to be able to have our cake and eat it too but I don't know if 
that's possible. So at the very least I'd advocate for SQL + CQL going forward, 
or SQL + a constrained "CQL-like" mode that gives the same constraints CQL does 
today on modeling that guide people towards that very partitionable path.

On Tue, Nov 4, 2025, at 8:12 AM, Aleksey Yeshchenko wrote:
> I don’t mind us implementing some Postgres syntax support in some capacity, 
> but I do not like the idea of limiting what Cassandra is allowed to do, or 
> expose via CQL, to what is expressible by Postgres’s SQL.
> 
> Many moons ago, before we started work on native protocol and CQL, I could 
> perhaps see a bigger benefit to going the Postgres route - for the client protocol 
> and the language. We could piggyback on existing client infrastructure and 
> SQL familiarity. But at this stage, when we have already made the effort to 
> develop decent drivers, and CQL is fleshed out, and C* is quite mature 
> overall, how much would we gain from this transition?
> 
> I’m broadly with Mick here. And I support using Postgres’ SQL as inspiration 
> for implementing new CQL features wherever it makes sense - it’s something 
> we’ve been doing for a decade already. But I don’t believe that deprecating 
> CQL is the way to go at this point.
> 
> > On 4 Nov 2025, at 06:38, Mick <[email protected]> wrote:
> > 
> > 
> > 
> >> On 3 Nov 2025, at 20:32, Joel Shepherd <[email protected]> wrote:
> >> 
> >> At the same time, my personal opinion is that if SQL compatibility is 
> >> pursued, then the end game should be to deprecate CQL. That will probably 
> >> take years, but at the limit I don't see a lot of benefit to supporting 
> >> both.
> > 
> > 
> > 
> > We want SQL, but _why_ (in all its nuances) do we want SQL ?  A lot is 
> > obvious, but it is a very broad question.
> > 
> > The adoption and standardisation benefits are obvious, but CQL has 
> > strengths relative to SQL in Cassandra’s context.  
> > 
> > One is Cassandra’s wide-partition model with flexible clustering columns, 
> > which supports very large, ordered partitions (e.g. time-series and 
> > efficient range scans), rather than a strictly normalised, join-centric 
> > model. These patterns don’t always map cleanly to SQL semantics, and CQL’s 
> > query-driven, table-per-query modelling helps move users toward designs 
> > that scale predictably.
> > 
> > I can see CQL continuing as Cassandra’s high-throughput, query-driven DSL, 
> > while we pursue SQL compatibility.  I appreciate Dinesh’s ‘lanes’ framing, 
> > e.g. eventually default to a SQL interface (with Accord) for the broadest 
> > UX, while CQL remains a high-throughput path.
> > 
> > Should we also be discussing storage-engine implications ?  Cassandra’s 
> > LSMT/SSTable design optimises write paths, while SQL presents a logical 
> > view without constraining physical layout, so data on disk stays optimised 
> > for dominant access patterns.  I can also see the need to discuss transport 
> > vs query language differences.
> > 
> > Are we after both SQL's DML and DDL abilities ?  Beyond accessibility and 
> > exploration, SQL often comes with mature tooling for schema change 
> > management. Cassandra supports online schema changes (e.g., ALTER TABLE), 
> > but cross-table/primary-key changes remain constrained. A SQL interface 
> > alone won’t ‘solve’ this: it’s about migration tooling and engine 
> > capabilities; changing data models at-scale faces separate challenges.
> > 
> > Especially outside of early-stage apps and ad-hoc exploration I find SQL 
> > less interesting and its ergonomics less aligned with Cassandra’s runtime 
> > performance model.  That doesn't make me opposed to the endeavour of SQL 
> > compatibility; it pushes me on the "why" question a bit more, for clearer 
> > alignment with our strengths.
> 
>
