[ https://issues.apache.org/jira/browse/GEODE-2654?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15925271#comment-15925271 ]

Dan Smith commented on GEODE-2654:
----------------------------------

Darrel asked me to clarify the suggestion about getting a lock on the oplog in 
lockStoreBeforeBackup.

We're not trying to ensure that all members have exactly the same data in their 
oplogs. It's okay for some members to have the put to region A, and others to 
not have seen the put. We'll sort that out when we recover region A from disk 
and we'll make the redundant copies of region A consistent. We're just trying 
to prevent ourselves from backing up the put to region B without backing up the 
put to region A.

The locking mechanism works because it forces all of the members to come to a 
stop. The point at which they are stopped is the point in time that is backed 
up, so we don't end up backing up different members at different points in time.
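The comment's point can be sketched with a toy two-phase coordinator (hypothetical names, not Geode's actual classes): ordinary writes share a read lock, and the prepare phase takes the write lock on every member, so all members are frozen at the same moment before any data is copied.

```java
import java.util.List;
import java.util.concurrent.locks.ReentrantReadWriteLock;

// Hypothetical sketch of the prepare/finish idea. "Member" and
// "BackupCoordinator" are illustration names, not Geode classes.
class Member {
    // Inverted use of a read-write lock: many writers may hold the
    // read lock concurrently; the backup excludes them all at once
    // by taking the write lock.
    private final ReentrantReadWriteLock backupLock = new ReentrantReadWriteLock();
    int lastWrite = 0; // stand-in for this member's oplog contents

    void put(int value) {
        backupLock.readLock().lock();   // normal writes share the lock
        try { lastWrite = value; }
        finally { backupLock.readLock().unlock(); }
    }

    // Phase 1: freeze writes on this member.
    void prepareBackup() { backupLock.writeLock().lock(); }

    // Phase 2: snapshot while frozen, then resume writes.
    int finishBackup() {
        try { return lastWrite; }
        finally { backupLock.writeLock().unlock(); }
    }
}

class BackupCoordinator {
    // All members stop (phase 1) before any member is backed up
    // (phase 2), so every snapshot reflects the same point in time.
    static int[] backup(List<Member> members) {
        members.forEach(Member::prepareBackup);
        return members.stream().mapToInt(Member::finishBackup).toArray();
    }
}
```

The inverted read/write-lock usage is the key design choice: regular puts stay cheap and concurrent, and only the (rare) backup pays the cost of stopping the world.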

> Backups can capture different members from different points in time
> -------------------------------------------------------------------
>
>                 Key: GEODE-2654
>                 URL: https://issues.apache.org/jira/browse/GEODE-2654
>             Project: Geode
>          Issue Type: Bug
>          Components: persistence
>            Reporter: Dan Smith
>
> Geode backups should behave the same as recovering from disk after killing 
> all of the members.
> Unfortunately, the backups instead can backup data on different members at 
> different points in time, resulting in application level inconsistency. 
> Here's an example of what goes wrong:
> # Start a Backup
> # Do a put in region A
> # Do a put in region B
> # The backup finishes
> # Recover from the backup
> # You may see the put to region B, but not A, even if the data is colocated.
> We ran into this with Lucene indexes - see GEODE-2643. We've worked 
> around GEODE-2643 by putting all data into the same region, but we're worried 
> that we still have a problem with the async event queue. With an async event 
> listener that writes to another Geode region, it's possible to recover 
> different points in time for the async event queue and the region, resulting 
> in missed events. 
> The issue is that there is no locking or other mechanism to prevent different 
> members from backing up their data at different points in time. Colocating 
> data does not avoid this problem, because when we recover from disk we may 
> recover region A's bucket from one member and region B's bucket from another 
> member.
> The backup operation does have a mechanism for making sure that it gets a 
> point in time snapshot of *metadata*. It sends a PrepareBackupRequest to all 
> members which causes them to lock their init file. Then it sends a 
> FinishBackupRequest which tells all members to backup their data and release 
> the lock. This ensures that a backup doesn't completely miss a bucket or get 
> corrupt metadata about which members host a bucket. See the comments in 
> DiskStoreImpl.lockStoreBeforeBackup.
> We should extend this Prepare/Finish mechanism to make sure we get a point in 
> time snapshot of region data as well. One way to do this would be to get a 
> lock on the *oplog* in lockStoreBeforeBackup to prevent writes and hold it 
> until releaseBackupLock is called.
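As a rough illustration of that proposal (hypothetical names and a string stand-in for the oplog; not the real DiskStoreImpl API), the oplog's write path would share a lock that lockStoreBeforeBackup acquires exclusively and releaseBackupLock releases:

```java
import java.util.concurrent.locks.ReentrantReadWriteLock;

// Hypothetical sketch: guard oplog appends with a lock that is held
// exclusively from lockStoreBeforeBackup until releaseBackupLock.
class DiskStoreSketch {
    private final ReentrantReadWriteLock oplogLock = new ReentrantReadWriteLock();
    private final StringBuilder oplog = new StringBuilder(); // stand-in for the real oplog

    // Normal write path: appends share the lock with each other.
    void append(String record) {
        oplogLock.readLock().lock();
        try { oplog.append(record).append('\n'); }
        finally { oplogLock.readLock().unlock(); }
    }

    // Called on PrepareBackupRequest: blocks all appends.
    void lockStoreBeforeBackup() { oplogLock.writeLock().lock(); }

    // Safe to copy while the backup lock is held.
    String backupSnapshot() { return oplog.toString(); }

    // Called on FinishBackupRequest: writes resume.
    void releaseBackupLock() { oplogLock.writeLock().unlock(); }
}
```

Because every member blocks appends between prepare and finish, no member can record the put to region B in its backup while another member's backup predates the put to region A.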



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
