First, using Solr as a repository is pretty risky. I would keep the official copy of the data in a database, not in Solr.
Second, you can’t “migrate tables” because Solr doesn’t have tables. You need to turn the tables into documents, then index the documents. It can take a lot of joins to flatten a relational schema into Solr documents.

Solr does not support schema migration, so yes, you will need to save off all the documents, then reload them. I would save them to files. It makes no sense to put them in another copy of Solr.

Changing the schema will be difficult and time-consuming, but you’ll probably run into much worse problems trying to use Solr as a repository.

wunder
Walter Underwood
wun...@wunderwood.org
http://observer.wunderwood.org/  (my blog)

> On Jun 9, 2016, at 8:50 AM, Hui Liu <h...@opentext.com> wrote:
>
> Hi,
>
> We are porting an application currently hosted in Oracle 11g to
> Solr Cloud 6.x, i.e., we plan to migrate all tables in Oracle as
> collections in Solr, index them, and build search tools on top of this;
> the goal is that we won't be using Oracle at all after this has been
> implemented. Every field in Solr will have 'stored=true', and a selected
> subset of searchable fields will have 'indexed=true'.
>
> The question is what steps we should follow if we need to re-index a
> collection after making some schema changes. Mostly we only add new
> fields to store, or make a non-indexed field indexed; we normally do not
> delete or rename any existing fields. According to this URL:
> https://wiki.apache.org/solr/HowToReindex it seems we need to set up an
> 'intermediate' Solr1 to only store the data itself without any indexing,
> then have another Solr2 set up to store the indexed data, and in case of
> a re-index, just delete all the documents in Solr2 for the collection and
> re-import the data from Solr1 into Solr2 using SolrEntityProcessor (from
> the dataimport handler)? Is this still the recommended approach?
>
> The downside I can see with this approach is that if we have a tremendous
> amount of data for a collection (some of our collections could have
> several billion documents), re-importing it from Solr1 to Solr2 may take
> a few hours or even days, and during this time users cannot query the
> data. Is there any better way to do this and avoid this type of downtime?
> Any feedback is appreciated!
>
> Regards,
> Hui Liu
> Opentext, Inc.
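
For what it's worth, here is a rough sketch of the "flatten with joins, then save the documents to files" idea from the reply above. Everything in it is an assumption for illustration: the customers/orders/order_items tables, the document field names, the helper names, and SQLite standing in for Oracle. It does not talk to Solr at all; it only shows how joined rows might collapse into one flat document per order and be written out one JSON document per line, so they could be reloaded after a schema change.

```python
import json
import sqlite3


def build_demo_db():
    """Create a tiny in-memory relational schema (hypothetical tables)."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT);
        CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER);
        CREATE TABLE order_items (order_id INTEGER, sku TEXT);
        INSERT INTO customers VALUES (1, 'Acme'), (2, 'Globex');
        INSERT INTO orders VALUES (10, 1), (11, 2);
        INSERT INTO order_items VALUES (10, 'A-1'), (10, 'B-2');
        """
    )
    return conn


def flatten_orders(conn):
    """Join parent and child tables and yield one flat document per order."""
    rows = conn.execute(
        """
        SELECT o.id, c.name, i.sku
        FROM orders o
        JOIN customers c ON c.id = o.customer_id
        LEFT JOIN order_items i ON i.order_id = o.id
        ORDER BY o.id, i.sku
        """
    )
    doc = None
    for order_id, customer_name, sku in rows:
        # A new order id means the previous document is complete.
        if doc is None or doc["id"] != str(order_id):
            if doc is not None:
                yield doc
            doc = {
                "id": str(order_id),
                "customer_name": customer_name,
                "item_skus": [],  # child rows become a multi-valued field
            }
        if sku is not None:  # LEFT JOIN yields NULL for orders without items
            doc["item_skus"].append(sku)
    if doc is not None:
        yield doc


def dump_jsonl(docs, path):
    """Write one JSON document per line; return how many were written."""
    count = 0
    with open(path, "w", encoding="utf-8") as f:
        for d in docs:
            f.write(json.dumps(d) + "\n")
            count += 1
    return count


if __name__ == "__main__":
    written = dump_jsonl(flatten_orders(build_demo_db()), "orders.jsonl")
    print(f"wrote {written} documents")
```

Files written this way are independent of any Solr schema, which is the point of saving to files rather than to a second Solr: after a schema change the same files can be fed back through whatever indexing path is in use.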