Got an answer from the excellent support folks at LucidWorks:

Currently Lucene/Solr can't do field-level updates. Whenever a new document is indexed with the same unique identifier field (in this case "id") as an existing document, the new document replaces the older one entirely. There is an open JIRA issue in Lucene to add field-level updates, but it is a major, long-term task.
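Since the whole document gets replaced, the usual workaround is to resend every field together with the new metadata. Below is a rough Python sketch of that idea, not a tested recipe: it merges the fields of an existing document with the extra file metadata and builds the `<add><doc>` message to send. The field names ("content", "owner", "group") and the sample values are made up for illustration, the extra fields would have to be defined in your schema, and the existing fields would have to be fetched from Solr first, which only works for fields that are stored.

```python
import xml.etree.ElementTree as ET


def merge_fields(existing, extra):
    """Return a new dict holding every existing field plus the extra metadata."""
    merged = dict(existing)
    merged.update(extra)
    return merged


def build_add_message(fields):
    """Serialize a dict of fields as a Solr XML <add><doc> update message."""
    add = ET.Element("add")
    doc = ET.SubElement(add, "doc")
    for name, value in fields.items():
        # A multi-valued field becomes one <field> element per value.
        values = value if isinstance(value, (list, tuple)) else [value]
        for v in values:
            field = ET.SubElement(doc, "field", name=name)
            field.text = str(v)
    return ET.tostring(add, encoding="unicode")


# Hypothetical document as it might come back from a query on its "id".
existing = {"id": "/home/docs/report.pdf", "content": "quarterly report text"}
# Hypothetical metadata gathered by walking the file system again.
extra = {"owner": "alice", "group": "staff"}

merged = merge_fields(existing, extra)
message = build_add_message(merged)
print(message)
```

The resulting message would then be POSTed to Solr's update handler and followed by a commit. Because it carries the same "id", it replaces the old document, but this time with the old fields still in it.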
Thanks,

On Thu, Aug 25, 2011 at 6:45 PM, Goran Pocina <gpoc...@gmail.com> wrote:
> New to Solr and Lucene. We're indexing text, pdf, html docs located on
> local Unix file systems, and need the ability to search for file owner,
> group, and other Linux file metadata, in addition to the file contents. It
> would be great if we could use nutch to index everything, and then crawl
> through the file system again with a 10 line shell script that passed the
> missing metadata to solr, and updated the existing docs.
>
> But <add><doc> deletes all the old fields even if they're not present in
> the new document.
>
> If partial updates aren't possible, what would be the best way to
> accomplish what we need? Do we want to modify the source code for each
> of the different doc format parsers to add support for this metadata?
>
> Thanks,