There are two options as I see it.

1. Do something like you describe: create a secondary index, index into it, then switch. Personally, I would create a completely separate SolrCloud cluster alongside the existing one rather than a new core in the same cloud, since the indexing load can have a negative impact on GC in the cloud that is serving queries.
2. Tag each record with a field (e.g. "generation") that identifies which generation of data the record is from. When querying, filter on only the generation of data that is complete; new records get a new generation. The only problem with this is that changing field types doesn't really work with the same field names. If you use dynamic fields instead of static ones, though, the field name would change anyway, so that isn't a problem.

We use both of these patterns in different applications.

steve

On Wed, Jul 6, 2016 at 1:27 PM Steven White <swhite4...@gmail.com> wrote:

> Hi everyone,
>
> In my environment, I have use cases where I need to fully re-index my
> data. This happens because Solr's schema requires changes based on changes
> made to my data source, the DB. For example, my DB schema may change so
> that it now has a whole new set of fields added or removed (on records), or
> the data type changed (on fields). When that happens, the only solution I
> have right now is to drop the current Solr index, update Solr's schema.xml,
> and re-index my data (I use Solr's core admin to dynamically do all this).
>
> The issue with my current solution is that during the re-indexing, which
> right now takes 10 hours (expect it to take over 30 hours as my data keeps
> on growing), search via Solr is not available. Sure, I can enable search
> while the data is being re-indexed, but then I get partial results.
>
> My question is this: how can I avoid this so there is minimal downtime,
> under 1 min.? I was thinking of creating a second core (again dynamically)
> and re-indexing into it (after setting up the new schema) and, once the
> re-index is fully done, switching over to the new core, dropping the index
> from the old core, deleting the old core, and renaming the new core to the
> old core (original core).
>
> Would the above work or is there a better way to do this? How do you guys
> solve this problem?
>
> Again, my goal is to minimize downtime during re-indexing when Solr's
> schema is drastically changed (requiring re-indexing).
>
> Thanks in advance.
>
> Steve
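
For the "index into a second core, then switch" approach, the switch itself can be a single CoreAdmin SWAP call once the rebuild has finished. The sketch below only builds that URL. It assumes a standalone (non-cloud) Solr, and the base URL and core names are hypothetical placeholders, not anything from this thread.

```python
from urllib.parse import urlencode


def swap_cores_url(base_url, live_core, rebuilt_core):
    """Build a CoreAdmin SWAP URL that exchanges the two cores' names.

    base_url, live_core and rebuilt_core are illustrative assumptions.
    """
    params = urlencode({"action": "SWAP", "core": live_core, "other": rebuilt_core})
    return f"{base_url.rstrip('/')}/admin/cores?{params}"


if __name__ == "__main__":
    # Hypothetical host and core names, for illustration only.
    print(swap_cores_url("http://localhost:8983/solr", "products", "products_rebuild"))
```

After the swap, queries sent to the original core name are served by the freshly built index, and the old index can be dropped at leisure. With a completely separate cloud, the switch would instead happen in whatever sits in front of Solr (client configuration, load balancer, and so on).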
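
The generation-tag approach can be sketched in a few lines as well. This is a minimal illustration, not how any particular application does it: the field name "generation" comes from the example in the reply, while the helper names and sample values are made up. Documents are tagged at index time, and every search adds a filter query for the last generation that finished indexing.

```python
# Field name taken from the "generation" example above; adjust to your schema.
GENERATION_FIELD = "generation"


def tag_documents(docs, generation):
    """Return copies of the documents with the generation tag added."""
    return [{**doc, GENERATION_FIELD: generation} for doc in docs]


def select_params(user_query, live_generation):
    """Build query parameters that only match the completed generation."""
    return {
        "q": user_query,
        # fq restricts results without affecting relevance scoring.
        "fq": f"{GENERATION_FIELD}:{live_generation}",
    }


if __name__ == "__main__":
    # Generation 8 is being indexed while searches still filter on generation 7.
    print(tag_documents([{"id": "1", "title": "example"}], 8))
    print(select_params("title:example", 7))
```

Once the new generation is fully indexed, switching over means changing the single number that searches filter on.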