On 3/28/2012 12:46 PM, Artem Shnayder wrote:
Does anyone know of any work done to automatically run a backup prior to a
DataImportHandler full-import?

I've asked this question on #solr and was pointed to
https://wiki.apache.org/solr/SolrReplication?highlight=%28backup%29#HTTP_API
which is helpful but is not an automatic backup in the context of full-imports.
I'm wondering if anyone else has done this work yet.

I have located a previous message from you where you mention that you are on Ubuntu. If that's true, you can use hard links to make nearly instantaneous backups with a single command:

ln /path/to/index/* /path/to/backup/.

One caveat: the backup directory must be on the same filesystem as the index, because hard links cannot cross filesystems. If keeping backups on another filesystem (or even another computer) is important, then treat the hard link backup as a temporary directory. Copy the files from that directory to your remote location, then delete the hard links in the temporary directory.
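As a rough sketch of the two steps above (hard-link snapshot, then copy elsewhere and clean up), something like the following could work. The function names `snapshot_index` and `ship_snapshot` and all paths are made up for illustration, and it assumes the index directory is flat (no subdirectories) and that nothing else writes into the backup directory.

```shell
#!/bin/sh
# Hypothetical helper: hard-link every file in the index directory into a
# backup directory. Both directories must be on the same filesystem,
# otherwise ln fails.
snapshot_index() {
    index_dir=$1
    backup_dir=$2
    mkdir -p "$backup_dir" || return 1
    # ln without -s creates hard links, so no file data is copied.
    ln "$index_dir"/* "$backup_dir"/
}

# Hypothetical helper: copy the snapshot to its final location (another
# filesystem or a mounted remote share), then remove the temporary links.
ship_snapshot() {
    backup_dir=$1
    dest_dir=$2
    mkdir -p "$dest_dir" || return 1
    # Only delete the temporary snapshot if the copy succeeded.
    cp -p "$backup_dir"/* "$dest_dir"/ && rm -rf "$backup_dir"
}
```

Usage would be along the lines of `snapshot_index /path/to/index /path/to/backup && ship_snapshot /path/to/backup /some/other/filesystem/backup`. To get a backup before every full-import, a wrapper script could run the snapshot first and only then request the DataImportHandler URL with `command=full-import` (for example with curl); the host, port and handler path depend on your setup, so I have left that part out.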

This works because of the way that Lucene (and by extension Solr) manages files on disk - existing segment files are never modified. If they get merged, new files are created before the old ones are deleted. There is only one file in an index directory that does change without getting a new name - segments.gen. I have verified (on Solr 3.5) that even this file is properly handled so that a hard link backup keeps the correct version.
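To illustrate why the "never modified, only replaced" behaviour matters, here is a small sketch using throwaway files. It shows generic filesystem behaviour only and says nothing about how Lucene writes its files internally; the function names and the file names `live` and `backup` are invented for the demonstration.

```shell
#!/bin/sh
# Case 1: the live file is deleted and a new file is created under the
# same name. The new file is a different inode, so the hard-linked backup
# still holds the old contents.
backup_after_replace() {
    d=$(mktemp -d) || return 1
    echo "generation 1" > "$d/live"
    ln "$d/live" "$d/backup"
    rm "$d/live"
    echo "generation 2" > "$d/live"
    cat "$d/backup"
    rm -rf "$d"
}

# Case 2: the live file is overwritten in place. The redirection truncates
# and rewrites the same inode, so the backup changes along with it.
backup_after_overwrite() {
    d=$(mktemp -d) || return 1
    echo "generation 1" > "$d/live"
    ln "$d/live" "$d/backup"
    echo "generation 2" > "$d/live"
    cat "$d/backup"
    rm -rf "$d"
}

backup_after_replace     # prints "generation 1"
backup_after_overwrite   # prints "generation 2"
```

A hard link backup is only safe for files that behave like the first case, which is why the handling of a file that keeps its name matters.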

For people running on Windows, this particular method won't work. Newer Windows server versions do have one feature that might actually make it possible to do something similar - shadow copies. I do not know how to leverage the feature, though.

Thanks,
Shawn
