[ 
https://issues.apache.org/jira/browse/SOLR-13608?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jason Gerlowski updated SOLR-13608:
-----------------------------------
    Description: 
SIP-12 lays out a plan for adding support for incremental backups to Solr.  At 
a high level, the idea is that Solr will be able to store multiple backups in 
the same location, and backups beyond the first one will only upload those 
files that were not uploaded by previous backups.

This involves changes to the file structure within a particular backup 
location.  It also entails changes to some of the backup/restore API parameters 
and semantics, to accommodate storing multiple backups in the same place, etc.

This ticket covers the changes required for this functionality, as described in 
SIP-12 unless mentioned specifically below.  It does not implement all of 
[SIP-12|https://cwiki.apache.org/confluence/display/SOLR/SIP-12%3A+Incremental+Backup+and+Restore].
  Same-collection-restoration, support for popular proprietary blob stores, 
etc. are left for separate tickets in an attempt to keep PRs manageable and 
conceptually cohesive. 



  was:
Currently, every call to the backup API backs up the whole index under a 
different backupName. This is very costly and nearly useless for large, 
frequently changing indexes.

Lucene index files are written only once, and they also contain checksum 
information about the files. We can rely on these properties to support 
incremental backup: only upload files that are not already present in the 
repository.

The design for this issue is as follows:
* Add another parameter named {{incremental}} to the backup API.
* Add new methods to {{BackupRepository}}, such as computing a checksum and 
deleting files.
* {{SnapShooter}} will skip uploading a local file if the file in the 
repository matches it in checksum and length.
* Segments_N will be copied last to guarantee that, even if the backup process 
is interrupted midway, the old backup can still be used.
* We only keep the last {{IndexCommit}}; therefore, after uploading Segments_N 
successfully, any file not needed for the last {{IndexCommit}} will be 
deleted. We will try to improve this situation in another issue.
* Any files in ZK will be re-uploaded.
** The ZK files corresponding to the first backup will be stored in the same 
location as today (to maintain backward compatibility).
** On subsequent backups, ZK files will be stored in folder {{gen-ith}}.
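
The upload rules in the earlier design above (skip a file when checksum and 
length match, copy Segments_N last) might be sketched roughly as follows. This 
is only an illustration under assumptions: {{IncrementalUploadPlan}}, 
{{FileMeta}} and {{plan}} are hypothetical names, not part of Solr or of 
{{BackupRepository}}, and the deletion of unneeded files is not covered.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

/** Hypothetical sketch; none of these names come from the Solr code base. */
public class IncrementalUploadPlan {

    /** Checksum and length of one index file (illustrative type). */
    public static final class FileMeta {
        final long checksum;
        final long length;

        public FileMeta(long checksum, long length) {
            this.checksum = checksum;
            this.length = length;
        }

        boolean sameContentAs(FileMeta other) {
            return other != null && checksum == other.checksum && length == other.length;
        }
    }

    /**
     * Returns the names of local files to upload, in upload order.
     * Files whose checksum and length match the repository copy are skipped,
     * and any segments_N file is moved to the end of the list.
     */
    public static List<String> plan(Map<String, FileMeta> local,
                                    Map<String, FileMeta> repository) {
        List<String> regular = new ArrayList<>();
        List<String> segments = new ArrayList<>();
        // A TreeMap copy gives a deterministic (sorted) iteration order.
        for (Map.Entry<String, FileMeta> e : new TreeMap<>(local).entrySet()) {
            if (e.getValue().sameContentAs(repository.get(e.getKey()))) {
                continue; // identical file already present in the repository
            }
            if (e.getKey().startsWith("segments_")) {
                segments.add(e.getKey());
            } else {
                regular.add(e.getKey());
            }
        }
        regular.addAll(segments); // segments_N is uploaded last
        return regular;
    }

    public static void main(String[] args) {
        Map<String, FileMeta> repo = Map.of(
            "_0.cfs", new FileMeta(111L, 10L),
            "segments_1", new FileMeta(222L, 5L));
        Map<String, FileMeta> local = Map.of(
            "_0.cfs", new FileMeta(111L, 10L),     // unchanged, skipped
            "_1.cfs", new FileMeta(333L, 20L),     // new segment file
            "segments_2", new FileMeta(444L, 6L)); // new commit point
        System.out.println(plan(local, repo)); // prints [_1.cfs, segments_2]
    }
}
```

Placing the commit-point file at the end of the plan mirrors the stated goal: 
if the upload is interrupted before that last step, the previous backup is 
still described by its own, untouched Segments_N file.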




> Incremental backup for Solr
> ---------------------------
>
>                 Key: SOLR-13608
>                 URL: https://issues.apache.org/jira/browse/SOLR-13608
>             Project: Solr
>          Issue Type: Sub-task
>      Security Level: Public(Default Security Level. Issues are Public) 
>            Reporter: Cao Manh Dat
>            Assignee: Jason Gerlowski
>            Priority: Major
>          Time Spent: 20m
>  Remaining Estimate: 0h


