On 10/9/2011 7:00 PM, Pulkit Singhal wrote:
I know that Solr accepts xml with Solr specific elements that are commands
that only it understands ... such as <add/>, <commit/> etc.

Question: Is there some way to ask Solr to dump out whatever it has in its
index already ... as a Solr xml document?

Plan: I intend to massage that xml dump (add the field + value that I need
in every doc's xml element) and then I should be able to push this dump back
to Solr to get data indexed again, I hope.

I don't know whether Solr will dump a format with the add tags, but I am guessing that it won't.

Although it is possible to use Solr as a data storage mechanism, it's not really designed for that role. If you have not set all your fields to stored=true in schema.xml, you won't be able to do what you are thinking about at all, because only stored fields can be retrieved from the index. Most Solr installations do not store every field, because doing so makes the index huge.
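If every field does happen to be stored, the conversion step itself is simple. Here is a rough sketch in Python, assuming you have already pulled the documents out of Solr with ordinary queries and have them in memory as dicts. The function name docs_to_add_xml and the sample field names are made up for illustration; nothing here is built into Solr.

```python
import xml.etree.ElementTree as ET


def docs_to_add_xml(docs, extra_field, extra_value):
    """Wrap already-fetched documents in a Solr <add> update message.

    docs is a list of dicts mapping field name -> value (or list of
    values for multiValued fields). extra_field/extra_value is the new
    field to stamp onto every document. All names are hypothetical.
    """
    add = ET.Element("add")
    for doc in docs:
        doc_el = ET.SubElement(add, "doc")
        fields = dict(doc)              # copy so the caller's dict is untouched
        fields[extra_field] = extra_value
        for name, value in fields.items():
            # A multiValued field becomes one <field> element per value.
            values = value if isinstance(value, list) else [value]
            for v in values:
                field_el = ET.SubElement(doc_el, "field", name=name)
                field_el.text = str(v)
    return ET.tostring(add, encoding="unicode")


if __name__ == "__main__":
    sample = [{"id": "1", "title": "first"}, {"id": "2", "tags": ["a", "b"]}]
    print(docs_to_add_xml(sample, "source", "legacy"))
```

The resulting string is what you would POST to the update handler, followed by a <commit/>. Any field that is indexed but not stored will simply be missing from the rebuilt documents, which is why this only works when everything is stored.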

For best results, you should be prepared at any time to rebuild your index from the original data source. Have you incorporated the extra field into your original data source and normal DIH mechanism? If you have, simply run a full-import and you're in business. Hopefully you've got a robust installation with multiple copies of the index, and can take one copy offline to do the rebuild.

At the Lucene Revolution conference in Boston last year, I saw a presentation where one company was getting data from multiple sources into one or more staging Solr instances, and using that as a data source for their real index. I believe it was the Hathi Trust, but I may be wrong there. Whoever it was, I don't know if they have released source for this mechanism or not.

Thanks,
Shawn
