On Sun, Oct 4, 2009 at 8:05 PM, Paul Rosen <p...@performantsoftware.com> wrote:
> Hi,
>
> I've been trying to experiment with merging, but have been running into
> some problems.
>
> First, I'm using ruby and the solr-ruby-0.0.7 gem. It looks like there is
> no support in that gem for merging. Have I overlooked something?
>
> Second, I was attempting to just follow the instructions in
> http://wiki.apache.org/solr/MergingSolrIndexes so I could see merging
> work. I just tried putting the sample url in the address bar of my browser,
> but it just sent me to the admin page. (It does the same thing as if I had
> left off all the parameters.) Here is the URL I constructed:
>
> http://localhost:8983/solr/merged/admin/?action=mergeindexes&core=merged&indexDir=/Users/my/path/solr_1.4/solr/data/reindexed_marc/index&indexDir=/Users/my/path/solr_1.4/solr/data/reindexed_rdf/index
>
> Why didn't that work? Do I have to POST that instead of using GET?
>

The path on the wiki page was wrong. You need to use the adminPath in the
URL. Look at the adminPath attribute in solr.xml. It is typically
/admin/cores.

So the correct path for you would be:

http://localhost:8983/solr/admin/cores?action=mergeindexes&core=merged&indexDir=/Users/my/path/solr_1.4/solr/data/reindexed_marc/index&indexDir=/Users/my/path/solr_1.4/solr/data/reindexed_rdf/index

I've fixed the wiki too.

> Alternately, is there a way to specify merging from the admin interface?
>
> Third, I've googled for info about merging and not come up with any
> solutions, but I did see a possible concern:
>
> Is it true that after merging, that your index can have duplicate
> documents? If so, then I need to create a step after merging for deleting
> the old copy of everything I merged.
>

Yes, it can have duplicate documents. Merge is handled by Lucene, which does
not have the concept of a uniqueKey. I'm not sure how you can do that in a
separate step.

> Given all the above, I'm wondering if it would make more sense to just
> retrieve each document from the old index and add it to the new index and
> forget about merging. I know that would be a slow process, but I'm not sure
> how much slower that would be than doing the merge (how long does that
> take?), then going through the entire index and eliminating duplicates.
>

It could be slow. But if in the end you need to merge, can you skip the
intermediate Lucene index completely?

--
Regards,
Shalin Shekhar Mangar.
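
For reference, here is a minimal sketch of how the corrected merge URL from the reply above could be assembled programmatically rather than typed by hand. It assumes adminPath is /admin/cores, as in the reply; check the value in your own solr.xml. The helper name build_merge_url is hypothetical and not part of Solr or any client library. The sketch only builds the URL string and does not send a request.

```python
from urllib.parse import urlencode


def build_merge_url(base_url, admin_path, target_core, index_dirs):
    """Assemble a mergeindexes URL (hypothetical helper, for illustration).

    base_url    -- Solr root, e.g. "http://localhost:8983/solr"
    admin_path  -- value of the adminPath attribute in solr.xml (assumed here)
    target_core -- name of the core that receives the merged index
    index_dirs  -- paths of the source index directories
    """
    # A list of pairs lets the indexDir parameter repeat, once per source index.
    params = [("action", "mergeindexes"), ("core", target_core)]
    params += [("indexDir", path) for path in index_dirs]
    # safe="/" leaves the slashes in the directory paths unescaped.
    query = urlencode(params, safe="/")
    return base_url.rstrip("/") + admin_path + "?" + query


url = build_merge_url(
    "http://localhost:8983/solr",
    "/admin/cores",
    "merged",
    [
        "/Users/my/path/solr_1.4/solr/data/reindexed_marc/index",
        "/Users/my/path/solr_1.4/solr/data/reindexed_rdf/index",
    ],
)
print(url)
```

The point to note is that the request goes to the adminPath under the Solr root, not to the per-core /solr/merged/admin/ page, which is why the original URL only showed the admin screen.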