Attached is a Python script I use, with slight redactions, on several data 
import jobs.  The main points here are:

* Watch the job until the import finishes
* Always send email whether it succeeds or fails
* Put the hostname, and whether the import succeeded, in the subject so the 
message can be triaged or deleted quickly
* Always include both text/html and text/plain parts so that Outlook/Exchange 
does not strip the line breaks
* Put some available statistics into the email body
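
For anyone who cannot get at the attachment, here is a rough sketch of those points using only the standard library. It is not the attached script. The URL, core name, and mail addresses are placeholders, and the success check (no status message key mentioning a failure) is my assumption about the status response, so verify it against what your handler actually returns.

```python
import html
import json
import smtplib
import socket
import time
import urllib.request
from email.message import EmailMessage

# Placeholder values (assumptions): substitute your own host, core, and addresses.
DIH_URL = "http://localhost:8983/solr/mycore/dataimport"
SENDER = "solr-import@example.com"
RECIPIENT = "ops@example.com"


def dih(command):
    """Send one command to the Data Import Handler and return the parsed JSON."""
    url = f"{DIH_URL}?command={command}&wt=json"
    with urllib.request.urlopen(url, timeout=60) as resp:
        return json.load(resp)


def wait_for_import(fetch_status, poll_seconds=30, sleep=time.sleep):
    """Poll until the handler stops reporting 'busy'; return the last status."""
    while True:
        status = fetch_status()
        if status.get("status") != "busy":
            return status
        sleep(poll_seconds)


def looks_successful(status):
    """Assumed heuristic: no status message key mentions a failure."""
    messages = status.get("statusMessages", {})
    return not any("fail" in key.lower() for key in messages)


def build_message(hostname, success, stats):
    """Build a mail with the outcome and hostname in the subject, plus both body parts."""
    outcome = "SUCCESS" if success else "FAILURE"
    lines = [f"{key}: {value}" for key, value in stats.items()]
    msg = EmailMessage()
    msg["Subject"] = f"[{outcome}] Solr import on {hostname}"
    msg["From"] = SENDER
    msg["To"] = RECIPIENT
    # text/plain part first, then a text/html alternative with explicit <br> breaks.
    msg.set_content("\n".join(lines) + "\n")
    rows = "<br>\n".join(html.escape(line) for line in lines)
    msg.add_alternative(f"<html><body><p>{rows}</p></body></html>", subtype="html")
    return msg


def main():
    """Start the import, wait for it, and mail the result whether or not it worked."""
    try:
        dih("full-import")
        status = wait_for_import(lambda: dih("status"))
        success = looks_successful(status)
        stats = status.get("statusMessages", {})
    except Exception as exc:  # still send mail if Solr is unreachable
        success, stats = False, {"error": repr(exc)}
    msg = build_message(socket.gethostname(), success, stats)
    with smtplib.SMTP("localhost") as smtp:  # assumes a local mail relay
        smtp.send_message(msg)
```

Calling main() from a cron job gives one mail per run. The polling function takes the status fetcher as an argument so it can be exercised without a live Solr.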

At some point, I wanted to make this run *anywhere* in the cluster and use the 
Python client for ZooKeeper to keep track of whether it has already run.
You could, for instance, have a crontab start it many times per day, and have 
ZooKeeper arbitrate whether some other node has already done the work.
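
I never built that part, but the arbitration could look roughly like the sketch below, assuming the kazoo client library. The znode path and the ZooKeeper host list are made-up placeholders. The idea is that whichever node first creates the marker znode for today does the work, and every other node sees that it already exists and exits.

```python
import datetime
import socket

try:
    from kazoo.client import KazooClient
    from kazoo.exceptions import NodeExistsError
except ImportError:
    # kazoo is not installed; define stand-ins so the sketch still loads.
    KazooClient = None

    class NodeExistsError(Exception):
        pass


def claim_todays_run(zk, base_path="/jobs/solr-import", today=None):
    """Return True if this node created today's marker znode, False if it already existed."""
    today = today or datetime.date.today()
    path = f"{base_path}/{today.isoformat()}"  # hypothetical layout: one znode per day
    try:
        # Record which host claimed the run; makepath creates missing parent znodes.
        zk.create(path, socket.gethostname().encode("utf-8"), makepath=True)
    except NodeExistsError:
        return False
    return True


def run_once_per_day(job, hosts="zk1:2181,zk2:2181,zk3:2181"):
    """Connect to ZooKeeper (placeholder hosts) and run job() only if the claim succeeds."""
    zk = KazooClient(hosts=hosts)
    zk.start()
    try:
        if claim_todays_run(zk):
            job()
    finally:
        zk.stop()
```

Note that the marker is a persistent znode, so a run that fails after claiming the day is not retried by another node, and old markers pile up until something deletes them. Whether that is what you want depends on the job.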

For most of us, that is overkill, and for those for whom it matters, you 
can run something like this as an AWS Lambda instead, and then AWS is in charge 
of scheduling it.


-----Original Message-----
From: Erik Hatcher [mailto:erik.hatc...@gmail.com] 
Sent: Friday, April 28, 2017 2:45 PM
To: solr-user@lucene.apache.org
Subject: Re: Import Handler using shell scripts

Yes, via the HTTP API (via curl or other tool).  See the commands and URL 
examples here: 
https://cwiki.apache.org/confluence/display/solr/Uploading+Structured+Data+Store+Data+with+the+Data+Import+Handler#UploadingStructuredDataStoreDatawiththeDataImportHandler-DataImportHandlerCommands
 


> On Apr 28, 2017, at 2:14 PM, Vijay Kokatnur <kokatnur.vi...@gmail.com> wrote:
> 
> Is it possible to call dataimport handler from a shell script?  I have 
> not found any documentation regarding this. Any pointers?
> 
> --
> Best,
> Vijay
