Re: Newbie Design Questions

Gunaranjan Chandraraju Wed, 21 Jan 2009 17:33:12 -0800

Thanks

Yes, the source of the data is a DB. However, the XML is also posted on updates via a publish framework, so I can just plug in an adapter here to listen for changes and post to SOLR. I was trying to use the XPathProcessor inside the SQLEntityProcessor and this did not work (using 1.3 - I did see support in 1.4). That is not a show stopper for me: I can just post them via the framework and use files for the first-time load.
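For illustration, a minimal sketch of the posting step such an adapter might perform. It assumes Solr's XML update handler is reachable at http://localhost:8983/solr/update (an assumed URL, adjust to the installation) and that the adapter has already mapped the source XML to schema field names itself, since documents posted this way do not pass through the DIH mapping in data-config.xml. The function names and field names are hypothetical.

```python
import urllib.request
import xml.etree.ElementTree as ET

# Assumed endpoint; adjust host, port, and path to your installation.
SOLR_UPDATE_URL = "http://localhost:8983/solr/update"


def build_add_xml(fields):
    """Build a Solr <add><doc>...</doc></add> message from a dict.

    Keys are schema field names (hypothetical here). A list or tuple
    value produces one <field> element per item (multi-valued field).
    """
    add = ET.Element("add")
    doc = ET.SubElement(add, "doc")
    for name, value in fields.items():
        values = value if isinstance(value, (list, tuple)) else [value]
        for item in values:
            field = ET.SubElement(doc, "field", name=name)
            field.text = str(item)
    # Returns UTF-8 encoded bytes, ready to be used as a request body.
    return ET.tostring(add, encoding="utf-8")


def post_to_solr(payload, url=SOLR_UPDATE_URL):
    """POST an XML update message (bytes) to Solr and return the response body."""
    request = urllib.request.Request(
        url,
        data=payload,
        headers={"Content-Type": "text/xml; charset=utf-8"},
    )
    with urllib.request.urlopen(request) as response:
        return response.read()
```

The adapter would call post_to_solr(build_add_xml({...})) for each changed document and then post_to_solr(b"<commit/>") to make the changes visible to searches.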

I have seen a couple of answers on backups for crash scenarios. I just wanted to confirm: if I replace the index with the backed-up files, then I can simply start solr up again and reindex the documents changed since the last backup? Am I right? The slaves will also automatically adjust to this.
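A minimal sketch of the "reindex what changed since the backup" step, assuming the source system keeps its own change log of (document id, last-modified time) pairs outside Solr. The record layout and function name are hypothetical.

```python
from datetime import datetime


def changed_since(records, backup_time):
    """Return ids of records modified at or after backup_time.

    records: iterable of (doc_id, last_modified) pairs, where
    last_modified is a datetime. The comparison is inclusive, so a
    document modified exactly at backup time is re-indexed too;
    re-posting a document with the same unique key just overwrites it.
    """
    return [doc_id for doc_id, modified in records if modified >= backup_time]
```

The returned ids are the documents to re-post to the restored master.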


Thanks
Guna

On Jan 20, 2009, at 9:37 PM, Noble Paul നോബിള്‍नोब्ळ् wrote:

On Wed, Jan 21, 2009 at 5:15 AM, Gunaranjan Chandraraju
<[email protected]> wrote:
Hi All
We are considering SOLR for a large database of XMLs. I have some newbie
questions - if there is a place I can go read about them do let me know and
I will go read up :)

1. Currently we are able to pull the XMLs from a file system using
FileDataSource. The DIH is convenient since I can map my XML fields using
the XPathProcessor. This works for an initial load. However, after the
initial load, we would like to 'post' changed XMLs to SOLR whenever the XML
is updated in a separate system. I know we can post XMLs with 'add'; however,
I was not sure how to do this and maintain the DIH mapping I use in
data-config.xml. I don't want to save the file to the disk and then call
the DIH - would prefer to directly post it. Do I need to use solrj for
this?
What is the source of your new data? Is it a DB?
2. If my solr schema.xml changes then do I HAVE to reindex all the old
documents? Suppose in future we have newer XML documents that contain a new
additional XML field. The old documents that are already indexed don't
have this field and (so) I don't need to search on them with this field.
However, the new ones need to be searchable on this new field. Can I
just add this new field to the SOLR schema, restart the servers, and just post
the new documents, or do I need to reindex everything?
3. Can I back up the index directory, so that in case of a disk crash I
can restore this directory and bring solr up? I realize that any documents
indexed after this backup would be lost - I can however keep track of these
outside and simply re-index documents 'newer' than that backup date. This
question is really important to me in the context of using a Master Server
with replicated index. I would like to run this backup for the 'Master'.
The snapshot script can be used to take backups on commit.
4. In general what happens when the solr application is bounced? Is the
index affected (anything maintained in memory)?

Regards
Guna
--
--Noble Paul
