Some work has been done in this general area; see SOLR-445. That might give you some pointers.
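
In the meantime, here is a rough sketch of the skip-and-continue pattern you describe. It is plain Java with stand-in classes: DocProcessor, StrictIndexer and SkipFaultyDocProcessor are made up for illustration and are not Solr classes. In a real plugin you would extend Solr's UpdateRequestProcessor and wrap the super.processAdd(cmd) call in the same way. This assumes the failure is raised downstream of your processor in the chain; errors thrown before your processor runs would not be caught.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

/** Hypothetical stand-in for a processor in an update chain (not the Solr class). */
abstract class DocProcessor {
    protected final DocProcessor next;

    DocProcessor(DocProcessor next) {
        this.next = next;
    }

    /** Default behaviour: hand the document to the next stage, if there is one. */
    void processAdd(Map<String, Object> doc) throws Exception {
        if (next != null) {
            next.processAdd(doc);
        }
    }
}

/** Toy downstream stage: rejects a list value in the single-valued "title" field. */
class StrictIndexer extends DocProcessor {
    final List<Object> indexedIds = new ArrayList<>();

    StrictIndexer() {
        super(null);
    }

    @Override
    void processAdd(Map<String, Object> doc) throws Exception {
        if (doc.get("title") instanceof List) {
            throw new IllegalArgumentException(
                "multiple values for non-multiValued field title");
        }
        indexedIds.add(doc.get("id"));
    }
}

/** Wraps the downstream call so one bad document does not abort the batch. */
class SkipFaultyDocProcessor extends DocProcessor {
    final List<String> skippedIds = new ArrayList<>();

    SkipFaultyDocProcessor(DocProcessor next) {
        super(next);
    }

    @Override
    void processAdd(Map<String, Object> doc) {
        try {
            super.processAdd(doc);
        } catch (Exception e) {
            // Record the failure and return normally so the caller moves on
            // to the next document instead of aborting.
            skippedIds.add(String.valueOf(doc.get("id")));
            System.err.println("Skipping doc " + doc.get("id") + ": " + e.getMessage());
        }
    }
}
```

Feeding three documents through new SkipFaultyDocProcessor(new StrictIndexer()), where the second has a list in "title", should index the first and third and record the second's id in skippedIds. Whether swallowing the exception is safe depends on what the downstream stage has already done with the document, so treat this as a starting point only.
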

Best
Erick

On Mon, Oct 17, 2011 at 11:00 AM, samuele.mattiuzzo <samum...@gmail.com> wrote:
> Hi all, as far as I know, when Solr finds a faulty document (inside an XML
> file containing, let's say, 1000 docs) it skips the whole file and the
> indexing process exits with an exception (am I correct?)
>
> I'm using a custom indexing plugin, and I can trap the exception. Instead of
> using "default" values if that exception is raised, I would like to skip the
> document raising the error (example: sometimes I try to insert a string
> inside a "string" field, but Solr exits saying it's expecting a multiValued
> field... I guess it's because of some ASCII chars within the text, something
> like \n or the like), maybe logging it somewhere, and pass to the next one.
> We're indexing millions of them, and we don't care much if we lose 10-20%
> of them, so the best solution is to skip the single faulty doc and continue
> with the rest.
>
> I guess I have to work on the super.processAdd() call, but I don't know
> where I can find info about it. Can anybody help me? Is there a book talking
> about advanced Solr plugin development I could read?
>
> Thanks!
>
> --
> View this message in context:
> http://lucene.472066.n3.nabble.com/Solr-indexing-plugin-skip-single-faulty-document-tp3427646p3427646.html
> Sent from the Solr - User mailing list archive at Nabble.com.