Thank you, I am already on 4.0alpha. The patch feels a little too unstable for my needs and my familiarity with the code.
What about something around multiple cores? Could I have the full-text fields stored in separate cores and somehow (again, with minimum hand-coding) search against all those cores and get back a combined list of document IDs? Or would that make comparative ranking/sorting impossible?

Regards,
   Alex.
Personal blog: http://blog.outerthoughts.com/
LinkedIn: http://www.linkedin.com/in/alexandrerafalovitch
- Time is the quality of nature that keeps events from happening all
at once. Lately, it doesn't seem to be working. (Anonymous - via GTD
book)

On Sun, Jul 15, 2012 at 12:08 PM, Erick Erickson <erickerick...@gmail.com> wrote:
> You've got a couple of choices. There's a new patch in town
> https://issues.apache.org/jira/browse/SOLR-139
> that allows you to update individual fields in a doc if (and only if)
> all the fields in the original document were stored (actually, all the
> non-copy fields).
>
> So if you're storing (stored="true") all your metadata information, you can
> just update the document when the text becomes available assuming you
> know the uniqueKey when you update.
>
> Under the covers, this will find the old document, get all the fields, add the
> new fields to it, and re-index the whole thing.
>
> Otherwise, your fallback idea is a good one.
>
> Best
> Erick
>
> On Sat, Jul 14, 2012 at 11:05 PM, Alexandre Rafalovitch
> <arafa...@gmail.com> wrote:
>> Hello,
>>
>> I have a database of metadata and I can inject it into SOLR with DIH
>> just fine. But then, I also have the documents to extract full text
>> from that I want to add to the same records as additional fields. I
>> think DIH allows to run Tika at the ingestion time, but I may not have
>> the full-text files at that point (they could arrive days later). I
>> can match the file to the metadata by a file name matching a field
>> name.
>>
>> What is the best approach to do that staggered indexing with minimum
>> custom code? I guess my fallback position is a custom full-text
>> indexer agent that re-adds the metadata fields when the file is being
>> indexed. Is there anything better?
>>
>> I am a newbie using v4.0alpha of SOLR (and loving it).
>>
>> Thank you,
>> Alex.
>> Personal blog: http://blog.outerthoughts.com/
>> LinkedIn: http://www.linkedin.com/in/alexandrerafalovitch
>> - Time is the quality of nature that keeps events from happening all
>> at once. Lately, it doesn't seem to be working. (Anonymous - via GTD
>> book)
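
For illustration only, the update behaviour described in the quoted reply (find the old document, take its stored fields, add the new fields, re-index the whole thing) can be modelled with a toy in-memory index. This is a rough sketch and not Solr code: the names index, add_document and update_fields, and the sample document, are all made up for the example.

```python
# Toy in-memory model of a "read stored fields, merge, re-index" update.
# NOT Solr code: index, add_document and update_fields are hypothetical names.

index = {}  # uniqueKey -> dict of stored fields


def add_document(doc, key_field="id"):
    """Index (or fully replace) a document under its unique key."""
    index[doc[key_field]] = dict(doc)


def update_fields(key, new_fields, key_field="id"):
    """Merge new fields into an existing document and re-index the result."""
    old = index.get(key)
    if old is None:
        raise KeyError(f"no document with {key_field}={key!r}")
    merged = dict(old)          # only possible if every original field was stored
    merged.update(new_fields)   # add or overwrite the late-arriving fields
    add_document(merged, key_field)
    return merged


# Metadata is indexed first (for example from the database import) ...
add_document({"id": "report-17.pdf", "title": "Quarterly report", "author": "A. Smith"})
# ... and the extracted full text is attached when the file shows up later.
update_fields("report-17.pdf", {"fulltext": "Revenue grew in the third quarter."})
```

The same merge step is what the fallback indexer agent would have to do by hand: look up the metadata by file name, combine it with the extracted text, and send the complete document again.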