Hi Martin,

Many thanks for the succinct and clear response.
I've got some pointers to move me in the right direction, many thanks. However, as a final point of clarification, is there a particular reason that insert does not raise an exception when trying to insert over an existing key, or when the key points to a tombstone record?

--Jools

On 9 June 2010 09:53, Dr. Martin Grabmüller <martin.grabmuel...@eleven.de> wrote:

> Hi Jools,
>
> what happens in Cassandra with your scenario is the following:
>
> 1) insert new record
>    -> the record is added to Cassandra's dataset (with the given timestamp)
>
> 2) delete record
>    -> a tombstone is added to the dataset (with the timestamp of the deletion,
>       which should be larger than the timestamp in 1), otherwise the delete
>       will be lost.
>
> 3) insert new record with same key as deleted record
>    -> the record is added as in 1), but the timestamp should be larger than
>       the timestamps from both 1) and 2)
>
> When you compact between 2) and 3), the record inserted at 1) will be thrown
> away, but the tombstone from 2) will not be thrown away *unless* the tombstone
> was created more than GCGraceSeconds (a configuration option) before the
> compaction.
>
> If you do not compact, all records and tombstones will be present in Cassandra's
> dataset, and each read operation checks which of the records has the highest
> timestamp before returning the most current record (or reporting an error, if
> the tombstone has the highest timestamp).
>
> So whether you compact or not does not make a difference for your scenario,
> as long as all replicas see the tombstone before GCGraceSeconds have elapsed.
> If that is not the case, it is possible that deleted records come alive again,
> because tombstones are deleted before all replicas had a chance to remove the
> deleted record.
>
> Your question about concurrently inserting the same key from different clients
> is another beast. The simple answer is: don't do it.
>
> The longer answer: either you use some external synchronisation mechanism
> (e.g., Zookeeper), or you make sure that all clients use disjoint keys (UUIDs,
> or keys derived from the client's IP address + timestamp, that sort of thing).
>
> For keys representing user accounts or something similar, I would recommend
> using an external synchronisation mechanism, because for actions like account
> registration, latency caused by such a mechanism is usually not a problem.
>
> For data coming in quickly, where the overhead of synchronisation is not
> acceptable, use the UUID variant and reconcile the data on read.
>
> HTH,
>   Martin
>
> ------------------------------
> From: Jools [mailto:jool...@gmail.com]
> Sent: Wednesday, June 09, 2010 10:39 AM
> To: user@cassandra.apache.org
> Subject: Inserting new data, where the key points to a tombstone record.
>
> Hi,
>
> I've been developing a system against Cassandra over the last few weeks,
> and I'd like to ask the community some advice on the best way to deal with
> inserting new data where the key is currently a tombstone record.
>
> As with all distributed systems, this is always a tricky thing to deal with,
> so I thought I'd throw it to a wider audience.
>
> 1) insert new record.
> 2) delete record.
> 3) insert record with same key as deleted record.
>
> Now I know I can make this work if I flush and compact between 2 and 3.
> However, I don't want to rely on a flush and compact, and I'd like to code
> defensively against this scenario, so I've ended up checking whether the key
> exists: if it does, I know I can't insert the data; if it does not, I attempt
> the insert.
>
> Now, here lies the issue. If I have more than one client doing this at the
> same time, both trying to insert using the same key, one will succeed and
> one will fail. However, neither insert will give me an indication of which
> one actually succeeded.
>
> So should an insert against an existing key, or a deleted key, produce some
> kind of exception?
>
> Cheers,
>
> --Jools
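[Editor's sketch] To make the timestamp reasoning in Martin's reply concrete, here is a small, self-contained Python sketch. It is not a real Cassandra client: the ToyRow class and its method names are purely illustrative. It only models the reconciliation rule described above, where a delete writes a tombstone and a read returns whichever entry (value or tombstone) carries the highest timestamp.

```python
# Toy model of per-key reconciliation: every write, including a delete
# (which writes a tombstone), carries a timestamp, and a read returns
# whichever entry has the highest timestamp.
class ToyRow:
    def __init__(self):
        self.entries = []                 # list of (timestamp, value_or_None)

    def insert(self, timestamp, value):
        self.entries.append((timestamp, value))

    def delete(self, timestamp):
        self.entries.append((timestamp, None))   # None stands in for a tombstone

    def read(self):
        ts, value = max(self.entries, key=lambda e: e[0])  # highest timestamp wins
        if value is None:
            raise KeyError('row is deleted (tombstone is the newest entry)')
        return value


row = ToyRow()
row.insert(100, 'first value')     # step 1: insert
row.delete(200)                    # step 2: delete -> tombstone at ts 200
row.insert(300, 'second value')    # step 3: re-insert with a *larger* timestamp
print(row.read())                  # prints 'second value'

# Had the re-insert in step 3 used a timestamp <= 200 (e.g. a client with a
# skewed clock), the tombstone would remain the newest entry and the read
# would fail -- which is why the timestamp in step 3 must exceed both earlier
# ones, regardless of whether a compaction happened in between.
```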
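[Editor's sketch] For the concurrent-writer case, the "disjoint keys" option Martin mentions can be as simple as each client generating its row key with Python's standard uuid module, so two clients can never race on the same key. The sketch below is a minimal illustration under that assumption; the column names and the build_insert helper are made up for the example, and reconciliation of duplicates is deferred to read time rather than relying on an insert failing for an existing key.

```python
import time
import uuid


def make_row_key():
    # uuid1() embeds the host identifier and a timestamp, which is close to the
    # "IP address + timestamp" scheme mentioned above; uuid4() (purely random)
    # works just as well if you prefer not to leak host information into keys.
    return str(uuid.uuid1())


def build_insert(payload):
    # Each client writes under its own freshly generated key, so concurrent
    # inserts from different clients cannot collide on the same row.
    key = make_row_key()
    columns = {
        'created_at': str(time.time()),
        'payload': payload,
    }
    return key, columns   # hand these to your Cassandra client's insert call


key, columns = build_insert('some event data')
print(key, columns)

# Reads then gather all rows belonging to the same logical entity (for example
# via an index row you maintain yourself) and reconcile duplicates in the
# application, instead of expecting the insert to raise for an existing or
# tombstoned key.
```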