Re: Possible issue with read repair?

Jonathan Ellis Wed, 10 Oct 2012 14:31:42 -0700

You're both right -- "read repair" as a concept is indeed performed
asynchronously, but RowRepairResolver is used for synchronous, high-CL
reads as well, which is the code Niklas is referring to.


Niklas, can you create a ticket to fix this officially?

On Wed, Oct 10, 2012 at 3:31 PM, Mikhail Panchenko <[email protected]> wrote:
> I'll take a stab:
>
> Without looking at the code, that seems perfectly fine - the purpose of
> read repair is to repair potentially stale data out of band. It is
> acceptable (from the viewpoint of the datastore) to have "stale" reads
> while read-repair happens in the background. Once the repair is completed,
> future reads will have the correct data ("eventually"). Reads do not and
> should not block on read repair tasks. See
> http://www.datastax.com/docs/1.1/cluster_architecture/about_client_requests#about-read-requests
> for more info.
>
> In order to achieve what you're looking for and eliminate the window you
> are describing, one would write and read at QUORUM consistency level.
>
> On Wed, Oct 10, 2012 at 1:25 PM, Niklas Ekström <[email protected]> wrote:
>
>> Hi,
>>
>> I’m looking in the file StorageProxy.java (Cassandra 1.1.5), and line 766
>> seems odd to me.
>>
>> FBUtilities.waitOnFutures() is called with the repairResults from the
>> RowRepairResolver resolver.
>>
>> The problem though is that repairResults is only assigned in two places:
>> when the object is created at line 737 in StorageProxy.java, where it is
>> set to Collections.emptyList(), and in the resolve() method in
>> RowRepairResolver, which is indirectly called from line 771 in
>> StorageProxy.java, that is, after the call to FBUtilities.waitOnFutures().
>>
>> So the effect is that line 766 in StorageProxy.java is essentially a no-op.
>>
>> If on the other hand line 766 is moved down to just below the try-catch
>> block under it (to line 777), the effect of the call to
>> FBUtilities.waitOnFutures() would be to wait for responses to the
>> READ_REPAIR message. Not waiting for responses to read repair messages
>> opens a window of time in which stale reads can happen.
>>
>> Does this sound reasonable or am I overlooking something?
>>
>> Regards,
>> Niklas
>>
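
For readers following the thread, the ordering problem Niklas describes can
be reproduced in isolation. The sketch below is not the Cassandra source:
Resolver, waitOnFutures(), and both driver methods are hypothetical
stand-ins, and the repair acknowledgement is modelled as an
already-completed future so the example terminates immediately. It only
shows that waiting on a list that is still empty awaits nothing, while
waiting after resolve() has replaced the list awaits the repair future.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.Future;

class RepairWaitOrder {
    // Hypothetical stand-in for RowRepairResolver: the list of repair
    // futures starts out empty and is only replaced inside resolve().
    static class Resolver {
        List<Future<?>> repairResults = Collections.emptyList();

        void resolve() {
            List<Future<?>> sent = new ArrayList<>();
            // Assumption: one repair message, modelled as a future that
            // has already completed so the demo never blocks.
            sent.add(CompletableFuture.completedFuture(null));
            repairResults = sent;
        }
    }

    // Hypothetical stand-in for FBUtilities.waitOnFutures(): blocks on each
    // future and reports how many futures it actually waited on.
    static int waitOnFutures(List<Future<?>> futures) {
        int awaited = 0;
        for (Future<?> f : futures) {
            try {
                f.get();
            } catch (InterruptedException | ExecutionException e) {
                throw new IllegalStateException(e);
            }
            awaited++;
        }
        return awaited;
    }

    // Order described in the report: wait first, resolve afterwards.
    static int waitThenResolve() {
        Resolver resolver = new Resolver();
        int awaited = waitOnFutures(resolver.repairResults); // list is still empty
        resolver.resolve();
        return awaited;
    }

    // Proposed order: resolve first, then wait on the futures it produced.
    static int resolveThenWait() {
        Resolver resolver = new Resolver();
        resolver.resolve();
        return waitOnFutures(resolver.repairResults);
    }

    public static void main(String[] args) {
        System.out.println("wait before resolve awaited: " + waitThenResolve());
        System.out.println("wait after resolve awaited:  " + resolveThenWait());
    }
}
```

In this model the first ordering awaits zero futures and the second awaits
one, which is the difference between a no-op and actually blocking on the
repair responses.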

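As a side note on Mikhail's suggestion to write and read at QUORUM: the
usual reasoning is that a read is guaranteed to touch at least one replica
that acknowledged the latest write whenever the read and write replica
counts together exceed the replication factor. The sketch below only does
that arithmetic; the class and method names are made up for illustration,
and it assumes a single datacenter and ignores failures and timeouts.

```java
class QuorumOverlap {
    // QUORUM is a majority of the replicas: floor(RF / 2) + 1.
    static int quorum(int replicationFactor) {
        return replicationFactor / 2 + 1;
    }

    // True when any set of `reads` replicas must intersect any set of
    // `writes` replicas, i.e. reads + writes > replication factor.
    static boolean overlaps(int replicationFactor, int writes, int reads) {
        return reads + writes > replicationFactor;
    }

    public static void main(String[] args) {
        int rf = 3; // assumed replication factor for the example
        int q = quorum(rf);
        System.out.println("QUORUM for RF=3: " + q);
        System.out.println("QUORUM write + QUORUM read overlap: " + overlaps(rf, q, q));
        System.out.println("ONE write + ONE read overlap: " + overlaps(rf, 1, 1));
    }
}
```

With a replication factor of 3, QUORUM is 2, so a QUORUM write and a QUORUM
read always share a replica, while ONE and ONE do not.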


-- 
Jonathan Ellis
Project Chair, Apache Cassandra
co-founder of DataStax, the source for professional Cassandra support
http://www.datastax.com
