Hello Olivier,

for real production use, you won't really want to use toys like post.jar or curl. You want a decent connector to whatever data source there is: one that fetches data, possibly massages it a bit, and then feeds it into Solr, either by means of SolrJ or directly into the Solr web service via binary protocols. This way, you can properly handle incremental feeding, processing of data from remote locations (with the connector being closer to the data source), and also source data security.

Also think about what happens if you process incoming documents inside Solr. What happens if Tika runs out of memory because of PDF problems? What if this crashes your Solr node? In our Solr projects, we generally do not do any sizable processing within Solr, as document processing, document indexing and querying all have different scaling properties.
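To make the connector idea a bit more concrete, here is a minimal sketch of the fetch / massage / feed loop with incremental feeding. It is only an illustration under assumptions, not a finished connector: the record layout, the field names `title_s` and `modified_l`, the core name `collection1` and the batch size are all made up for the example, and the transport is plain JSON over HTTP to the update handler for brevity, where in practice you would more likely put SolrJ or a binary protocol behind the `send` callback.

```python
import json
import urllib.request
from typing import Callable, Dict, Iterable, Iterator, List, Tuple

Doc = Dict[str, object]


def changed_since(records: Iterable[Doc], last_run: int) -> Iterator[Doc]:
    """Incremental feeding: keep only records modified after the last run."""
    return (r for r in records if r["modified"] > last_run)


def massage(record: Doc) -> Doc:
    """Light processing outside Solr; the target field names are hypothetical."""
    return {
        "id": str(record["id"]),
        "title_s": str(record["title"]).strip(),
        "modified_l": record["modified"],
    }


def batches(docs: Iterable[Doc], size: int) -> Iterator[List[Doc]]:
    """Group documents so that each request carries at most `size` of them."""
    batch: List[Doc] = []
    for doc in docs:
        batch.append(doc)
        if len(batch) == size:
            yield batch
            batch = []
    if batch:
        yield batch


def post_to_solr(
    batch: List[Doc],
    update_url: str = "http://localhost:8983/solr/collection1/update",  # assumed URL
) -> None:
    """Send one batch as a JSON array to the Solr update handler."""
    request = urllib.request.Request(
        update_url + "?commitWithin=10000",
        data=json.dumps(batch).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(request) as response:
        response.read()


def run_connector(
    records: Iterable[Doc],
    last_run: int,
    send: Callable[[List[Doc]], None] = post_to_solr,
    batch_size: int = 100,
) -> Tuple[int, int]:
    """Feed changed records; return (documents sent, newest timestamp seen)."""
    sent, newest = 0, last_run
    changed = map(massage, changed_since(records, last_run))
    for batch in batches(changed, batch_size):
        send(batch)
        sent += len(batch)
        newest = max(newest, max(d["modified_l"] for d in batch))
    return sent, newest
```

The connector would store the returned timestamp and pass it in as `last_run` on the next run. Because the heavy lifting (here just `massage`) runs in the connector process, a failure there does not take the Solr node down with it.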
"Production use" is typically not achieved by deploying a vanilla Solr, but rather by adding a bit more glue and wrapping, so that the whole fits your requirements in terms of functionality, scaling, monitoring and robustness. Some similar platforms like Elasticsearch try to alleviate these pains of moving to a production-style infrastructure, but that comes at the expense of flexibility and with limitations.

For proof-of-concept or demonstrator-style applications, the plain tools out of the box will be fine. For production applications, you want to have more robust components.

Best regards,
--Jürgen

On 28.10.2014 22:12, Olivier Austina wrote:
> Hi All,
>
> I am reading the Solr documentation. I have understood that post.jar
> <http://wiki.apache.org/solr/ExtractingRequestHandler#SimplePostTool_.28post.jar.29>
> is not meant for production use, cURL
> <https://cwiki.apache.org/confluence/display/solr/Introduction+to+Solr+Indexing>
> is not recommended. Is SolrJ better for production? Thank you.
> Regards
> Olivier

-- 
i.A. Jürgen Wagner
Head of Competence Center "Intelligence" & Senior Cloud Consultant
Devoteam GmbH, Industriestr. 3, 70565 Stuttgart, Germany