Hi Kevin,
I'm also a newbie, but here are some thoughts:
+) For evaluating Solr we used a less exotic setup for the data import,
based on Pnuts (a JVM-based scripting language) ... :-) ... but Groovy
would do as well if you feel at home with Java.
+) my colleague just finished a database import service running within
the servlet container to avoid writing out the data to the file system
and transmitting it over HTTP.
+) I think there was some discussion regarding a generic database
importer, but nothing concrete that I'm aware of.
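+) On your first question, one thing that may help regardless of the
language you pick is to send many documents in a single update request
instead of one POST per record, since Solr's XML update format accepts
several <doc> elements inside one <add>. Below is a minimal sketch in
Python, not a tested importer: the function names, the batch size of 500
and the update URL you pass in are my assumptions for illustration, and
`conn` stands for any DB-API connection (for example one to your MySQL
database).

```python
import urllib.request
from itertools import islice
from xml.sax.saxutils import escape, quoteattr


def build_add_xml(rows):
    """Render a list of dicts as one Solr <add> message (NULL values are skipped)."""
    docs = []
    for row in rows:
        fields = "".join(
            "<field name=%s>%s</field>" % (quoteattr(name), escape(str(value)))
            for name, value in row.items()
            if value is not None
        )
        docs.append("<doc>%s</doc>" % fields)
    return "<add>%s</add>" % "".join(docs)


def batches(rows, size):
    """Yield lists of at most `size` items from any iterable."""
    it = iter(rows)
    while True:
        chunk = list(islice(it, size))
        if not chunk:
            return
        yield chunk


def post_xml(url, xml):
    """POST one XML message to the Solr update handler and return the HTTP status."""
    req = urllib.request.Request(
        url,
        data=xml.encode("utf-8"),
        headers={"Content-Type": "text/xml; charset=utf-8"},
    )
    with urllib.request.urlopen(req) as resp:
        return resp.status


def index_all(conn, query, url, size=500):
    """Run `query`, send the result to Solr in batches, then commit once.

    `url` is assumed to point at the update handler of your Solr instance.
    """
    cur = conn.cursor()
    cur.execute(query)
    names = [col[0] for col in cur.description]
    # fetchone() returns None when the result set is exhausted.
    rows = (dict(zip(names, r)) for r in iter(cur.fetchone, None))
    total = 0
    for chunk in batches(rows, size):
        post_xml(url, build_add_xml(chunk))
        total += len(chunk)
    post_xml(url, "<commit/>")  # one commit at the end, not one per batch
    return total
```

This still goes over HTTP, so it does not answer your second question,
but it keeps everything in one process and cuts the number of requests
and commits considerably.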
Cheers,
Siegfried Goeschl
Kevin Holmes wrote:
I inherited an existing (working) Solr indexing script that runs like
this:
A Python script queries the MySQL DB, then calls a bash script.
The bash script performs a curl POST submit to Solr.
We're injecting about 1000 records / minute (constantly), frequently
pushing the edge of our CPU / RAM limitations.
I'm in the process of building a Perl script to use DBI and
LWP::Simple::Post that will perform this all from a single script
(instead of 3).
Two specific questions:
1: Does anyone have a clever (or better) way to perform this process
efficiently?
2: Is there a way to inject into Solr without using POST / curl / HTTP?
Admittedly, I'm no Solr expert - I'm starting from someone else's setup,
trying to reverse-engineer my way out. Any input would be greatly
appreciated.