On Wed, Sep 15, 2010 at 4:21 PM, yklxmas <yklx...@gmail.com> wrote:
[...]
>> I'm using standard data import handler with file data source and xpath
>> processor. so my script will be calling
>> http://host:8983/solr/dataimport?command=full-import

I am not sure if you are aware of this, but unless you are doing some
further processing in your data import handler, or have additional
requirements, it would be simpler to just POST the file to the
appropriate Solr URL (please see the "Indexing Data" section under
http://lucene.apache.org/solr/tutorial.html for an example). I think
that this method would solve a lot of your problems, unless there is a
good reason for you to use a data import handler.

[...]
>> Does the fresh import go in a queue or simply won't start at all? If it
>> won't start, that means I need to find out the status and start again if
>> necessary.

It does not go into a queue, so you will need to check the status, and
start a new import.

[...]
>> we need to push the changes out immediately.

The script does not have to run periodically. Depending on how the
files are edited, it should be possible to trigger a POST to Solr when
the file is saved. Of course, this is more problematic if you have to
do a full-import with a data import handler.

>> i need to
>> rethink the process
>> and suggest my team a new approach more along what you suggested.
>> currently we haven't got that much data but constant full importing will
>> cause problems for sure.

If you are forced to use a data import handler, a delta-import would
probably make sense. Do not know offhand, and am away from a Solr setup
at the moment, but it ought to be possible to tell XPathEntityProcessor
how to look at the file-modification date.

Regards,
Gora
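
P.S. Since an import requested while another is running is not queued, the check-then-start step could be scripted roughly as below. This is only a sketch, not tested against a live Solr: it assumes the handler answers `command=status` with XML containing `<str name="status">idle</str>` or `busy`, reuses the URL from your message, and the function names and polling intervals are made up for illustration.

```python
import time
import urllib.request
import xml.etree.ElementTree as ET

# URL taken from the original message; adjust host and core as needed.
DIH_URL = "http://host:8983/solr/dataimport"


def parse_status(xml_text):
    """Return the text of <str name="status"> ("idle"/"busy"), or None."""
    root = ET.fromstring(xml_text)
    # Only <str> elements are inspected, so the <int name="status"> in the
    # response header is not mistaken for the import status.
    for node in root.iter("str"):
        if node.get("name") == "status":
            return node.text
    return None


def dih_command(command, base_url=DIH_URL):
    """Send a command to the data import handler and return the response body."""
    with urllib.request.urlopen(f"{base_url}?command={command}") as resp:
        return resp.read().decode("utf-8")


def import_when_idle(poll_seconds=5, max_polls=60, fetch=dih_command):
    """Poll the status; start a full-import as soon as the handler is idle.

    Returns True if an import was started, False if the handler stayed
    busy for all max_polls attempts.
    """
    for _ in range(max_polls):
        if parse_status(fetch("status")) == "idle":
            fetch("full-import")
            return True
        time.sleep(poll_seconds)
    return False
```

Calling `import_when_idle()` from the script that currently hits the full-import URL would make it wait for a running import to finish instead of silently losing the request. The `fetch` parameter is there only so the logic can be exercised without a Solr instance.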