You're both right -- "read repair" as a concept is indeed performed asynchronously, but RowRepairResolver is used for synchronous, high-CL reads as well, which is the code Niklas is referring to.
Niklas, can you create a ticket to fix this officially?

On Wed, Oct 10, 2012 at 3:31 PM, Mikhail Panchenko <[email protected]> wrote:
> I'll take a stab:
>
> Without looking at the code, that seems perfectly fine - the purpose of
> read repair is to repair potentially stale data out of band. It is
> acceptable (from the viewpoint of the datastore) to have "stale" reads
> while read-repair happens in the background. Once the repair is completed,
> future reads will have the correct data ("eventually"). Reads do not and
> should not block on read repair tasks. See
> http://www.datastax.com/docs/1.1/cluster_architecture/about_client_requests#about-read-requests
> for more info.
>
> In order to achieve what you're looking for and eliminate the window you
> are describing, one would write and read at QUORUM consistency level.
>
> On Wed, Oct 10, 2012 at 1:25 PM, Niklas Ekström <[email protected]> wrote:
>
>> Hi,
>>
>> I’m looking in the file StorageProxy.java (Cassandra 1.1.5), and line 766
>> seems odd to me.
>>
>> FBUtilities.waitOnFutures() is called with the repairResults from the
>> RowRepairResolver resolver.
>>
>> The problem though is that repairResults is only assigned when the object
>> is created at line 737 in StorageProxy.java, and there it is assigned to
>> Collections.emptyList(), and in the resolve() method in RowRepairResolver,
>> which is indirectly called from line 771 in StorageProxy.java, that is,
>> after the call to FBUtilities.waitOnFutures().
>>
>> So the effect is that line 766 in StorageProxy.java is essentially a no-op.
>>
>> If on the other hand line 766 is moved down to just below the try-catch
>> block under it (to line 777), the effect of the call to
>> FBUtilities.waitOnFutures() would be to wait for responses to the
>> READ_REPAIR message. Not waiting for responses to read repair messages
>> opens a window of time in which stale reads can happen.
>>
>> Does this sound reasonable or am I overlooking something?
>>
>> Regards,
>> Niklas
>>

--
Jonathan Ellis
Project Chair, Apache Cassandra
co-founder of DataStax, the source for professional Cassandra support
http://www.datastax.com
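
For anyone reading the thread without the source open, here is a minimal sketch of the ordering problem Niklas describes. It is not the Cassandra 1.1.5 code: SketchResolver and the local waitOnFutures() are hypothetical stand-ins for RowRepairResolver and FBUtilities.waitOnFutures(), and the "repairs" are empty tasks. The only thing it tries to show is that waiting on repairResults before resolve() has run waits on an empty list, while waiting afterwards blocks on the repair futures.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.Future;

public class ReadRepairOrderingSketch {

    /** Hypothetical stand-in for RowRepairResolver; not the real Cassandra class. */
    static class SketchResolver {
        // Starts out empty, mirroring the Collections.emptyList() assignment
        // described in the thread.
        List<Future<?>> repairResults = Collections.emptyList();

        // Stand-in for resolve(): only here are the repair futures assigned.
        String resolve(int replicasToRepair) {
            List<Future<?>> results = new ArrayList<>();
            for (int i = 0; i < replicasToRepair; i++) {
                // An empty task standing in for sending a repair message.
                results.add(CompletableFuture.runAsync(() -> { }));
            }
            repairResults = results;
            return "resolved row";
        }
    }

    /**
     * Hypothetical stand-in for FBUtilities.waitOnFutures().
     * Blocks on every future and returns how many it waited on.
     */
    static int waitOnFutures(List<Future<?>> futures) {
        int waited = 0;
        try {
            for (Future<?> f : futures) {
                f.get();
                waited++;
            }
        } catch (InterruptedException | ExecutionException e) {
            throw new RuntimeException(e);
        }
        return waited;
    }

    public static void main(String[] args) {
        // Order described for line 766: wait first, resolve afterwards.
        SketchResolver early = new SketchResolver();
        int waitedBefore = waitOnFutures(early.repairResults); // list is still empty
        early.resolve(2);

        // Proposed order: resolve first, then wait on the repair futures.
        SketchResolver late = new SketchResolver();
        late.resolve(2);
        int waitedAfter = waitOnFutures(late.repairResults);

        System.out.println("waited before resolve(): " + waitedBefore); // prints 0
        System.out.println("waited after resolve():  " + waitedAfter);  // prints 2
    }
}
```

In the first ordering the wait returns immediately because nothing has been assigned yet, which matches the "essentially a no-op" observation above.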
