On Fri, 2016-03-04 at 12:41 +0530, Aneesh Mon N wrote:
>    - is there any difference in posting the data in json format vs xml?
>    - do we get any performance improvement if we generate the json/xml
>    files, scp to the solr server and then push via curl command

I have not tested that, but while performance testing indexing, I
achieved a marked increase in performance when I used CSV. That was
for very small documents though. I do not know how well it works for
large ones.
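
A minimal sketch of what the CSV route could look like, assuming flat
documents with a fixed set of fields. The function names, the field
names and the base URL are hypothetical; the /update path and the
application/csv content type follow the Solr CSV update handler, but
check them against your Solr version.

```python
import csv
import io
import urllib.request


def docs_to_csv(docs, fieldnames):
    """Serialize a list of flat dicts to CSV text with a header row."""
    buf = io.StringIO()
    writer = csv.DictWriter(buf, fieldnames=fieldnames, lineterminator="\n")
    writer.writeheader()
    writer.writerows(docs)
    return buf.getvalue()


def build_csv_update_request(base_url, csv_text):
    """Build (but do not send) a POST request carrying the CSV payload.

    base_url is an assumption, e.g. "http://localhost:8983/solr/mycore".
    """
    return urllib.request.Request(
        base_url + "/update?commit=true",
        data=csv_text.encode("utf-8"),
        headers={"Content-Type": "application/csv; charset=utf-8"},
        method="POST",
    )


if __name__ == "__main__":
    payload = docs_to_csv(
        [{"id": "1", "title": "first"}, {"id": "2", "title": "second"}],
        ["id", "title"],
    )
    request = build_csv_update_request(
        "http://localhost:8983/solr/mycore", payload
    )
    # Sending is left commented out so the sketch runs without a Solr server:
    # urllib.request.urlopen(request)
    print(payload)
```

Sending many rows per request rather than one request per document is
probably where most of the gain comes from, but that is a guess and
should be measured.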

Standard sanity check: Have you tried piping the result from Pentaho
into /dev/null, to see if it is Solr or the extraction part that is the
heavy one?
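
The same check can be sketched in code, assuming the extraction can be
driven as a Python generator. Everything here (time_pipeline, discard,
fake_extract) is a hypothetical stand-in, not part of Pentaho or Solr:
run the extraction once into a sink that throws records away, once into
the real indexing sink, and compare the timings.

```python
import time


def time_pipeline(produce, sink):
    """Run produce() to exhaustion, hand each record to sink,
    and return (record_count, elapsed_seconds)."""
    start = time.perf_counter()
    count = 0
    for record in produce():
        sink(record)
        count += 1
    return count, time.perf_counter() - start


def discard(record):
    """The /dev/null equivalent: accept the record and do nothing."""
    pass


def fake_extract():
    """Hypothetical stand-in for the real extraction step."""
    for i in range(1000):
        yield {"id": str(i), "title": "doc %d" % i}


if __name__ == "__main__":
    count, seconds = time_pipeline(fake_extract, discard)
    print("extraction only: %d records in %.4f s" % (count, seconds))
    # Swap discard for a sink that posts to Solr and compare the two numbers.
```

If the discard run takes nearly as long as the full run, the extraction
is the bottleneck and tuning the Solr side will not help much.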

- Toke Eskildsen, State and University Library, Denmark

