Thanks for your very fast response :-)
> > 2.)
> > The documentation from DataImportHandler describes the index update
> > process for SQL databases only...
> >
> > My scenario:
> > - My application creates, deletes and modifies files from /tmp/files
> >   every night.
> > - delta-import / DataImportHandler should "mirror" _all_ these changes to
> >   my lucene index (=> create, delete, update documents).
>
> The only EntityProcessor which supports delta is SqlEntityProcessor.
> The XPathEntityProcessor has not implemented it, because we do not
> know of a consistent way of finding deltas for XML. So,
> unfortunately, no delta support for XML. But that said, you can
> implement those methods in XPathEntityProcessor. The methods are
> explained in EntityProcessor.java. If you have questions specific to
> this I can help. Probably we can contribute it back.
>
> > ===> Is this possible with delta-import / DataImportHandler?
> > ===> If not: Do you have any suggestions on how to do this?

OK, so at the moment I have to do a full-import to update my index.
What happens to (user) queries while a full-import is running? Does Solr
block these queries until the import is finished? Which configuration
options control this behavior?

> > My scenario:
> > - /tmp/files contains 682 'myDoc_.*\.xml' XML files.
> > - Each XML file contains 12 XML elements (e.g. <title>foo</title>).
> > - DataImportHandler transfers only 5 of these 12 elements to the lucene
> >   index.
> >
> >
> > I don't understand the output from 'solr/dataimport' (=> status):
> >
> > ###
> > <response>
> > ...
> > <lst name="statusMessages">
> > <str name="Total Requests made to DataSource">0</str>
> > <str name="Total Rows Fetched">1363</str>
> > <str name="Total Documents Skipped">0</str>
> > <str name="Full Dump Started">2008-10-24 13:19:03</str>
> > <str name="">
> > Indexing completed. Added/Updated: 681 documents. Deleted 0 documents.
> > </str>
> > <str name="Committed">2008-10-24 13:19:05</str>
> > <str name="Optimized">2008-10-24 13:19:05</str>
> > <str name="Time taken ">0:0:2.648</str>
> > </lst>
> > ...
> > </response>
> >
> > ===> Why does the "Added/Updated" counter show 681 and not 682?
>
> Added/Updated is the number of docs. How do you know the number is not
> accurate?

/tmp/files$ ls myDoc_*.xml | wc -l
682

But "Added/Updated" shows 681. Does this mean that one file has an XML
error? If so, why does the statistic say "Total Documents Skipped" = 0?!

> > 4.)
> > And my last questions about Solr statistics/information...
> >
> > ===> Is it possible to get information (number of indexed documents,
> > stored values from documents etc.) from the current lucene index?
> > ===> The admin web interface shows 'numDocs' and 'maxDoc' in
> > 'statistics/core'. Is 'numDocs' = number of indexed documents? What does
> > 'maxDoc' mean?

Do you have answers for these questions too?

Bye,
 Simon
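
P.S. One way to narrow down the 682 vs. 681 discrepancy would be to check each file for well-formedness outside Solr. The following is only a minimal sketch and not part of DataImportHandler: the function name find_malformed is hypothetical, and it assumes the files live in /tmp/files and match myDoc_*.xml as in the ls command above. A file that parses cleanly here could still produce no document for other reasons, so this can only rule out one possible cause.

```python
import glob
import os
import xml.etree.ElementTree as ET


def find_malformed(directory, pattern="myDoc_*.xml"):
    """Return (path, error message) pairs for files that are not well-formed XML.

    Hypothetical helper for illustration; it does not touch Solr at all.
    """
    bad = []
    for path in sorted(glob.glob(os.path.join(directory, pattern))):
        try:
            ET.parse(path)  # raises ParseError on malformed or empty XML
        except ET.ParseError as exc:
            bad.append((path, str(exc)))
    return bad


if __name__ == "__main__":
    # /tmp/files is the directory from the scenario above.
    for path, error in find_malformed("/tmp/files"):
        print(path, error)
```

If this prints exactly one file, that file is a likely candidate for the missing document; if it prints nothing, the cause is probably somewhere else.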