On Wed, Sep 15, 2010 at 4:21 PM, yklxmas <yklx...@gmail.com> wrote:
[...]
>> I'm using standard data import handler with file data source and xpath
>> processor. so my script will be calling
>> http://host:8983/solr/dataimport?command=full-import

I am not sure if you are aware of this, but unless you are doing some further
processing in your data import handler, or have additional requirements, it
would be simpler to just POST the file to the appropriate Solr URL (please
see the "Indexing Data" section under
http://lucene.apache.org/solr/tutorial.html
for an example). I think that this method would solve a lot of your problems,
unless there is a good reason for you to use a data import handler.
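
For what it is worth, a rough sketch of such a POST in Python follows. It assumes that the file is already in Solr's XML update format (<add><doc>...</doc></add>), and that the update handler lives at /solr/update on the same host as in your URL. The function names are made up for illustration.

```python
import urllib.request

# Assumed update URL; "host" is the placeholder used in the thread.
SOLR_UPDATE_URL = "http://host:8983/solr/update"


def build_update_request(xml_bytes, url=SOLR_UPDATE_URL):
    """Build (but do not send) a POST request carrying Solr update XML."""
    return urllib.request.Request(
        url,
        data=xml_bytes,
        headers={"Content-Type": "text/xml; charset=utf-8"},
        method="POST",
    )


def post_file(path, url=SOLR_UPDATE_URL):
    """POST one update-XML file to Solr, then send a commit."""
    with open(path, "rb") as f:
        body = f.read()
    # The documents only become searchable after the commit.
    for payload in (body, b"<commit/>"):
        with urllib.request.urlopen(build_update_request(payload, url)) as resp:
            resp.read()
```

If your files are not in that format, and you rely on the xpath processor to map them, then this shortcut does not apply as is.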

[...]
>> Does the fresh import go in a queue or simply won't start at all? If it
>> won't start, that means I need to find out the status and start again if
>> necessary.

It does not go into a queue, so you will need to check the status, and
start a new import.
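
A minimal sketch of that check in Python, assuming that command=status on the same handler returns XML with a <str name="status"> element that reads "idle" or "busy". The function names, the polling interval, and the retry count are arbitrary choices of mine.

```python
import time
import urllib.request
import xml.etree.ElementTree as ET

DIH_URL = "http://host:8983/solr/dataimport"  # handler URL from the thread


def http_get(url):
    """Fetch a URL and return the raw response body."""
    with urllib.request.urlopen(url) as resp:
        return resp.read()


def dih_status(xml_bytes):
    """Return the text of <str name="status"> in a DIH response, or None."""
    root = ET.fromstring(xml_bytes)
    for el in root.findall("str"):
        if el.get("name") == "status":
            return el.text
    return None


def import_when_idle(fetch=http_get, base=DIH_URL, wait=5.0, tries=60):
    """Poll until the handler reports idle, then request a full-import.

    Returns True if the import was requested, False if we gave up.
    """
    for _ in range(tries):
        if dih_status(fetch(base + "?command=status")) == "idle":
            fetch(base + "?command=full-import")
            return True
        time.sleep(wait)
    return False
```

Note that there is a small window between the status check and the import request, so two scripts running at once could still collide.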

[...]
>> we need to push the changes out immediately.

The script does not have to run periodically. Depending on how the files are
edited, it should be possible to trigger a POST to Solr when the file is saved.
Of course, this is more problematic if you have to do a full-import with a
data import handler.
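
I do not know how your files get edited, so as one possible approach, here is a sketch that detects saved files by comparing modification times. The helper name and the idea of keeping the last-seen times in a dict are my own assumptions.

```python
import os


def changed_since(paths, seen):
    """Return the paths whose mtime differs from the one recorded in `seen`.

    `seen` maps path -> last observed mtime in nanoseconds, and is
    updated in place, so a second call only reports newer saves.
    """
    changed = []
    for p in paths:
        mtime = os.stat(p).st_mtime_ns
        if seen.get(p) != mtime:
            seen[p] = mtime
            changed.append(p)
    return changed
```

Each path returned could then be POSTed to Solr right away, rather than waiting for a scheduled run.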

>> i need to rethink the process
>> and suggest my team a new approach more along what you suggested.
>> currently we haven't got that much data but constant full importing will
>> cause problems for sure.

If you are forced to use a data import handler, a delta-import would probably
make sense. I do not know the details offhand, and am away from a Solr setup
at the moment, but it ought to be possible to tell XPathEntityProcessor to
look at the file-modification date, so that only changed files are re-indexed.

Regards,
Gora
