There is no regression here yet, however static columns turn out to be insanely useful (perhaps for reasons other than originally intended), so I am curious what the general approach is to changes in what is essentially a very new feature.
In this context, I am showing something that works as intended today (at the
bottom - hopefully in courier new), which is sufficiently interesting I think
to be added as a regression test (assuming people agree it is valid)
That is the long and short of my question, for more context, this is what I am
trying to do:
=====
Background: We store millions of lets say “records” in C*, and as we stream
data through our systems we like to work with timestamped deltas to these
records (lets say changes to a subset of fields). We do want to be able to
recreate the data at a specific point in time (so we munge sequences of deltas
read from cassandra together) both because historical state is a use case, but
also because we need MVCC, so the “readers" don’t actually want to see the
latest deltas since they may be logically in the future compared to the
readers’ “now"
Currently, we start with a full version of the record in the first
row(/clustering column value) for the record and add deltas forward in time.
This is bad because you have to munge a lot of deltas to get to the state near
now, and because the deltas and the initial state are naturally immutable,
there is little point in using a map, since we do no map-like operations on
them.
With a map in a static column, we can apply map operations to update the
“latest” state, and use the rows to store immutable “backwards” deltas to take
you from a newer state to an older state. Note now you can even TTL old deltas,
and for MVCC of a “readers” current, you can sort by decreasing time stamp, and
select only very few (mostly 1 perhaps) rows to get from the “latest” state to
the “now” state from a MVCC perspective.
All this is very nice and sweet and works.
So, as I said above… I’d kind of like to use this (in something we are
replacing anyway), but I would like anyones opinion on whether the following
(particularly the UPDATE statements to insert rows, and make non-replacing
updates to the static map) are entirely reasonable and something we should
consider valid syntax.
Thanks,
Graham.
cqlsh:test> CREATE TABLE backwards_time_series ( id text, update_time bigint,
backwards_delta_json text, latest_state map<text, text> static, PRIMARY KEY
(id, update_time) ) WITH CLUSTERING ORDER BY (update_time DESC);
cqlsh:test>
cqlsh:test> // initial insert of row and static column value for partition id =
'id1';
cqlsh:test> INSERT INTO backwards_time_series (id, update_time, latest_state)
VALUES ( 'id1', 100, { 'field1' : 'a', 'field2' : 'b', 'field3' : 'c'});
cqlsh:test>
cqlsh:test> // Because we are going backwards in time we generally want a lower
bound near to now, so including that as part of the test
cqlsh:test> SELECT * from backwards_time_series WHERE id = 'id1' AND
UPDATE_TIME >= 100;
id | update_time | latest_state |
backwards_delta_json
-----+-------------+-----------------------------------------------+----------------------
id1 | 100 | {'field1': 'a', 'field2': 'b', 'field3': 'c'} |
null
(1 rows)
cqlsh:test>
cqlsh:test> // Use map operation on static column to update latest state of
field2 while inserting a new regular row with clustering key update_time = 200
cqlsh:test> UPDATE backwards_time_series SET backwards_delta_json =
'{"field2":"b"}', latest_state['field2'] = 'newb' WHERE id='id1' AND
update_time = 200;
cqlsh:test>
cqlsh:test> SELECT * from backwards_time_series WHERE id = 'id1' AND
UPDATE_TIME >= 100;
id | update_time | latest_state |
backwards_delta_json
-----+-------------+--------------------------------------------------+----------------------
id1 | 200 | {'field1': 'a', 'field2': 'newb', 'field3': 'c'} |
{"field2":"b"}
id1 | 100 | {'field1': 'a', 'field2': 'newb', 'field3': 'c'} |
null
(2 rows)
cqlsh:test>
cqlsh:test> // Same as above, but we both update a map keys value, add a new
map key/value, and delete a new map key all in the same operation while
inserting the new row
cqlsh:test> UPDATE backwards_time_series SET backwards_delta_json = '.. omitted
for brevity ..', latest_state['field3'] = 'newc', latest_state['field4'] =
'value4', latest_state['field2'] = null WHERE id='id1' and update_time = 300;
cqlsh:test>
cqlsh:test> SELECT * from backwards_time_series WHERE id = 'id1' AND
UPDATE_TIME >= 100;
id | update_time | latest_state |
backwards_delta_json
-----+-------------+-------------------------------------------------------+---------------------------
id1 | 300 | {'field1': 'a', 'field3': 'newc', 'field4': 'value4'} | ..
omitted for brevity ..
id1 | 200 | {'field1': 'a', 'field3': 'newc', 'field4': 'value4'} |
{"field2":"b"}
id1 | 100 | {'field1': 'a', 'field3': 'newc', 'field4': 'value4'} |
null
(3 rows)
smime.p7s
Description: S/MIME cryptographic signature
