Thanks for the help, Yonik & Shalin. It really makes things easier for me if I
do not have to stop/start the SOLR app during the merge operations.

The reason I have to do this many times a day is that I am implementing a
simple-minded entity-extraction procedure for the content I am indexing. I
have a user-defined taxonomy into which the current documents, and any new
documents, should be classified. The taxonomy defines the nested facet
fields for SOLR. When a new document is posted, the user expects to have it
available in the right facet right away. My classification procedure when a
new document is added is as follows.

1. Create a new temporary index with that document (no taxonomy fields at
this time)
2. Search this index with each of the taxonomy terms (synonyms are employed
as well through synonyms.txt) and find out which of these categories is a
hit for this document.
3. Add a new "<field ..." line into the document for each category that is a
match for this document.
4. Repost this updated document.

Now I have a new index that facets this document the same way the big index
does.

5. I merge these two indices now so that the new document is also part of the
big index.

6. Delete the temporary index
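
For illustration, here is a minimal Python sketch of steps 2 and 3 only. It is
not what I actually run: a plain in-memory term match stands in for the search
against the temporary index, the synonyms are listed inline instead of coming
from synonyms.txt, and only single-word terms are handled. The taxonomy
entries and the field names ("id", "text", "category") are made up for the
example.

```python
import re
import xml.etree.ElementTree as ET

# Hypothetical taxonomy: category path -> terms, synonyms included.
# In the real setup the synonyms would come from synonyms.txt.
TAXONOMY = {
    "animals/dogs": ["dog", "canine"],
    "animals/cats": ["cat", "feline"],
    "vehicles/cars": ["car", "automobile"],
}


def classify(text, taxonomy):
    """Step 2 (simplified): return the categories with a term in the text."""
    words = set(re.findall(r"[a-z0-9]+", text.lower()))
    return sorted(
        category
        for category, terms in taxonomy.items()
        if any(term in words for term in terms)
    )


def build_add_xml(doc_id, text, taxonomy):
    """Step 3: build an <add> message with one field per matching category."""
    add = ET.Element("add")
    doc = ET.SubElement(add, "doc")
    for name, value in (("id", doc_id), ("text", text)):
        ET.SubElement(doc, "field", name=name).text = value
    for category in classify(text, taxonomy):
        # "category" is an assumed field name, not one from my schema.
        ET.SubElement(doc, "field", name="category").text = category
    return ET.tostring(add, encoding="unicode")


if __name__ == "__main__":
    print(build_add_xml("doc-1", "My canine friend drives an automobile", TAXONOMY))
```

The resulting XML is what would be reposted in step 4.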

The reason for a new temporary index is that step 2 is A LOT quicker
with a single document (or a handful). If I simply posted this new doc into
the big index and then tried to classify it, the search would take a while.
I have over 200 nested taxonomy fields to search over.

Are there better approaches?

Thanks

- ashok



Yonik Seeley wrote:
> 
> On Thu, Dec 4, 2008 at 6:39 PM, ashokc <[EMAIL PROTECTED]> wrote:
>>
>> The SOLR wiki says
>>
>>>>3. Make sure both indexes you want to merge are closed.
>>
>> What exactly does 'closed' mean?
> 
> If you do a commit, and then prevent updates, the index should be
> closed (no open IndexWriter).
> 
>> 1. Do I need to stop SOLR search on both indexes before running the merge
>> command? So a brief downtime is required?
>> Or do I simply prevent any 'updates/deletes' to these indices during the
>> merge time so they can still serve up results (read only?) while I am
>> creating a new merged index?
> 
> Preventing updates/deletes should be sufficient.
> 
>> 2. Before the new index replaces the old index, do I need to stop SOLR for
>> that instance? Or can I simply move the old index out and place the new
>> index in the same place, without having to stop SOLR?
> 
> Yes, simply moving the index should work if you are careful to avoid
> any updates since the last commit.
> 
>> 3. If SOLR has to be stopped during the merge operation, can we work with a
>> redundant/failover instance and stagger the merge so the search service will
>> not go down? Any guidelines here are welcome.
>>
>> Thanks
>>
>> - ashok
>> --
>> View this message in context:
>> http://www.nabble.com/Merging-Indices-tp20845009p20845009.html
>> Sent from the Solr - User mailing list archive at Nabble.com.
>>
>>
> 
> 

