> I’m still not sure if having tombstones vs. empty values / frozen UDTs
> will have the same results.

When in doubt, benchmark.

Good luck,
Jon

On Wed, Jan 9, 2019 at 3:02 PM Tomas Bartalos wrote:

> Losing atomic updates is a good point, but in my use case it’s not a
> problem, since I always overwrite the whole record (no partial updates). …

Losing atomic updates is a good point, but in my use case it’s not a problem,
since I always overwrite the whole record (no partial updates).
I’m still not sure if having tombstones vs. empty values / frozen UDTs will
have the same results.
When I update one row with 10 null columns it will create …

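(A minimal sketch of what Tomas describes, against the DataStax Java driver
3.x; the events table, its columns, and the values are invented for
illustration. Binding explicit nulls is what produces the tombstones in
question.)

import com.datastax.driver.core.PreparedStatement;
import com.datastax.driver.core.Session;

// Assumes a Session already connected to the cluster.
void overwriteWithNulls(Session session) {
    PreparedStatement ps = session.prepare(
        "INSERT INTO events (id, f1, f2) VALUES (?, ?, ?)");
    // Each explicitly bound null writes a cell tombstone for that column,
    // just as if the column had been deleted.
    session.execute(ps.bind(42, null, null));
}
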
The idea of storing your data as a single blob can be dangerous.
Indeed, you lose the ability to perform an atomic update on each column.
In Cassandra, LWW (last write wins) is the rule. Suppose two concurrent
updates on the same row: the 1st update changes column Firstname (let's say
it's a Person record) and the 2nd update …

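(To make the LWW point concrete, a sketch with a hypothetical person table:
per-column writes merge, a single blob column cannot.)

import com.datastax.driver.core.Session;

void concurrentColumnUpdates(Session session) {
    // Writer A:
    session.execute("UPDATE person SET firstname = 'Alice' WHERE id = 1");
    // Writer B, racing with A:
    session.execute("UPDATE person SET lastname = 'Smith' WHERE id = 1");
    // Under last-write-wins every CELL keeps its own timestamp, so both
    // changes survive. If the whole record were one serialized blob,
    // whichever blob landed last would silently erase the other change.
}
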
Those are two different cases though. It *sounds like* (again, I may be
missing the point) you're trying to overwrite a value with another value.
You're either going to serialize a blob and overwrite a single cell, or
you're going to overwrite all the cells and include a tombstone.
When you do a …

Hello Jon,

I thought having tombstones is much higher overhead than just overwriting
values. The compaction overhead can be similar, but I think the read
performance is much worse.
Tombstones accumulate and hang around for 10 days (by default) before they
are eligible for compaction.
Also we have tom…

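(The 10-day window is the table's gc_grace_seconds, which defaults to 864000.
It is tunable per table; a sketch with an invented table name. Note the window
should stay longer than your repair interval, or deleted data can resurrect.)

import com.datastax.driver.core.Session;

void tuneGcGrace(Session session) {
    // Default is 864000 seconds (10 days). Tombstones become eligible for
    // purging by compaction only after this window has passed.
    session.execute("ALTER TABLE events WITH gc_grace_seconds = 86400"); // 1 day
}
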
If you're overwriting values, it really doesn't matter much if it's a
tombstone or any other value; they still need to be compacted and have the
same overhead at read time.
Tombstones are problematic when you try to use Cassandra as a queue (or
something like a queue) and you need to scan over tho…

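(A sketch of the queue pattern Jon warns about, with invented names: every
consumed item is deleted, so each read scans past a growing pile of
tombstones at the head of the partition.)

import com.datastax.driver.core.Row;
import com.datastax.driver.core.Session;

void popOne(Session session) {
    // "Pop" = read the head of the partition, then delete it.
    Row next = session.execute(
        "SELECT id, payload FROM queue WHERE shard = 0 LIMIT 1").one();
    session.execute(
        "DELETE FROM queue WHERE shard = 0 AND id = ?", next.getUUID("id"));
    // After N pops, the next SELECT must skip N tombstones before it finds
    // a live row, so reads get slower the more the queue is used.
}
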
Hello,

I believe your approach is the same as using Spark with
"spark.cassandra.output.ignoreNulls=true".
This will not cover the situation when a value has to be overwritten with
null.
I found one possible solution - change the schema to keep only primary key
fields and move all other fields to …

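(For reference, the Spark write Tomas refers to might look like this; a
sketch with invented keyspace/table names, and it assumes the Spark Cassandra
Connector accepts the property as a per-write DataFrame option rather than
only via SparkConf.)

import org.apache.spark.sql.Dataset;
import org.apache.spark.sql.Row;

void writeIgnoringNulls(Dataset<Row> df) {
    df.write()
      .format("org.apache.spark.sql.cassandra")
      .option("keyspace", "ks")
      .option("table", "events")
      // Null dataframe fields are left unset instead of becoming tombstones.
      .option("spark.cassandra.output.ignoreNulls", "true")
      .mode("append")
      .save();
}
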
"The problem is I can't know the combination of set/unset values" --> Just
for this requirement, Achilles has a working solution for many years using
INSERT_NOT_NULL_FIELDS strategy:
https://github.com/doanduyhai/Achilles/wiki/Insert-Strategy
Or you can use the Update API that by design only perf
Hello,

The problem is I can't know the combination of set/unset values. From my
perspective every value should be set. The event from Kafka represents the
complete state of the happening at a certain point in time. In my table I
want to store the latest event, so the most recent state of the happenin…

Depending on the use case, creating separate prepared statements for each
combination of set/unset values in large INSERT/UPDATE statements may be
prohibitive.
Instead, you can look into driver-level support for UNSET values. Requires
Cassandra 2.2 or later, IIRC.
See:
Java Driver:
https://docs…

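(A minimal sketch of the unset approach with the DataStax Java driver 3.x;
it needs native protocol v4, i.e. Cassandra 2.2+, and the table and column
names are made up. One prepared statement covers every set/unset combination.)

import com.datastax.driver.core.BoundStatement;
import com.datastax.driver.core.PreparedStatement;
import com.datastax.driver.core.Session;

void insertWithUnset(Session session) {
    PreparedStatement ps = session.prepare(
        "INSERT INTO events (id, firstname, lastname) VALUES (?, ?, ?)");
    BoundStatement bs = ps.bind();
    bs.setInt("id", 1);
    bs.setString("firstname", "Alice");
    // lastname is simply never bound: with protocol v4 it is sent as UNSET,
    // the existing cell is left untouched, and no tombstone is written.
    session.execute(bs);
}
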
You say the events are incremental updates. I am interpreting this to mean
that only some columns are updated; others should keep their original values.
You are correct that inserting null creates a tombstone.
Can you insert only the columns that actually have new values? Just skip the
columns with …

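(One way to skip the unchanged columns dynamically, sketched with the driver
3.x QueryBuilder; the map of changed values and all names are hypothetical.)

import java.util.Map;

import com.datastax.driver.core.Session;
import com.datastax.driver.core.querybuilder.Insert;
import com.datastax.driver.core.querybuilder.QueryBuilder;

void upsertChangedOnly(Session session, int id, Map<String, Object> changed) {
    // Build the INSERT from only the columns that actually have new values,
    // so no nulls -- and therefore no tombstones -- are ever written.
    Insert insert = QueryBuilder.insertInto("ks", "events").value("id", id);
    for (Map.Entry<String, Object> e : changed.entrySet()) {
        insert.value(e.getKey(), e.getValue());
    }
    session.execute(insert);
}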