Newbie Design Questions

Gunaranjan Chandraraju Tue, 20 Jan 2009 15:46:09 -0800

Hi All

We are considering Solr for a large database of XML documents. I have some newbie questions. If there is a place I can go to read about them, do let me know and I will go read up :)

1. Currently we are able to pull the XMLs from the file system using FileDataSource. The DIH is convenient since I can map my XML fields using the XPathEntityProcessor. This works for an initial load. However, after the initial load, we would like to 'post' changed XMLs to Solr whenever the XML is updated in a separate system. I know we can post XMLs with 'add', but I am not sure how to do this and still keep the DIH mapping I use in data-config.xml. I don't want to save the file to disk and then call the DIH; I would prefer to post it directly. Do I need to use SolrJ for this?

2. If my Solr schema.xml changes, do I HAVE to reindex all the old documents? Suppose in future we have newer XML documents that contain an additional XML field. The old documents that are already indexed don't have this field, so I don't need to search on them with this field. However, the new ones need to be searchable on this new field. Can I just add this new field to the Solr schema, restart the servers and post only the new documents, or do I need to reindex everything?

3. Can I back up the index directory, so that in case of a disk crash I can restore this directory and bring Solr up? I realize that any documents indexed after this backup would be lost. I can, however, keep track of these outside and simply re-index documents 'newer' than the backup date. This question is really important to me in the context of using a master server with a replicated index. I would like to run this backup on the master.

4. In general, what happens when the Solr application is bounced? Is the index affected (for example, anything maintained in memory)?
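
To make question 1 concrete, here is a rough sketch of what "posting directly" might involve if the XPath mapping were applied on the client side instead of by the DIH. It only builds the XML 'add' message that Solr's update handler accepts; it does not send it. The field names, XPaths, sample record, and the helper build_add_message are all hypothetical and are not taken from any real data-config.xml.

```python
import xml.etree.ElementTree as ET

# Hypothetical mapping of Solr field name -> XPath in the source XML.
# In a real setup this would mirror the mapping kept in data-config.xml.
FIELD_XPATHS = {
    "id": "./id",
    "title": "./meta/title",
    "author": "./meta/author",
}


def build_add_message(source_xml, field_xpaths):
    """Build a Solr XML 'add' message from one source XML document."""
    root = ET.fromstring(source_xml)
    add = ET.Element("add")
    doc = ET.SubElement(add, "doc")
    for field_name, xpath in field_xpaths.items():
        # An XPath may match several nodes; emit one <field> per match.
        for node in root.findall(xpath):
            if node.text and node.text.strip():
                field = ET.SubElement(doc, "field", {"name": field_name})
                field.text = node.text.strip()
    return ET.tostring(add, encoding="unicode")


# Hypothetical sample record, for illustration only.
SAMPLE = (
    "<record><id>42</id><meta><title>Hello</title>"
    "<author>A</author><author>B</author></meta></record>"
)

message = build_add_message(SAMPLE, FIELD_XPATHS)
print(message)
```

The resulting string could then be sent to the update handler with any HTTP client, followed by a commit. Whether duplicating the mapping on the client is acceptable, or whether the DIH mapping can be reused for pushed documents, is exactly what I am unsure about.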


Regards
Guna
