Condensing the loader into a single executable sounds right if
you have performance problems. ;-)

You could also try adding multiple <doc>s in a single POST if it
turns out your problem is TCP connection setup time, though for
localhost connections that overhead should be minimal.
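
Something along these lines to batch a bunch of rows per request
(untested sketch in Python, since that's what your current loader
uses; the localhost URL and the id/title fields are just placeholders
for whatever your schema actually has):

import urllib.request
from xml.sax.saxutils import escape, quoteattr

# Assumed default Solr port and update path; adjust to your install.
SOLR_UPDATE_URL = "http://localhost:8983/solr/update"

def add_docs(docs):
    """Build one <add> holding every doc and send it in a single POST.

    docs is a list of dicts mapping Solr field name -> value.
    """
    parts = ["<add>"]
    for doc in docs:
        parts.append("<doc>")
        for field, value in doc.items():
            parts.append("<field name=%s>%s</field>"
                         % (quoteattr(field), escape(str(value))))
        parts.append("</doc>")
    parts.append("</add>")
    body = "".join(parts).encode("utf-8")

    req = urllib.request.Request(
        SOLR_UPDATE_URL,
        data=body,
        headers={"Content-Type": "text/xml; charset=utf-8"},
    )
    with urllib.request.urlopen(req) as resp:
        return resp.read()

# One request for a whole batch instead of one request per record.
add_docs([{"id": "1", "title": "first record"},
          {"id": "2", "title": "second record"}])

A few hundred docs per POST, followed by one <commit/> at the end
instead of a commit per record, usually cuts the overhead way down.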

If you're already local to the Solr server, you might check out the
CSV update handler ("CSV slurper"): http://wiki.apache.org/solr/UpdateCSV
It's a little specialized, but it skips the per-document XML entirely.
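
Roughly like this (again untested; the /update/csv path, the
commit=true parameter, and the text/plain content type are what the
wiki shows, so double-check them against your solrconfig.xml):

import urllib.request

# Assumed handler path per the wiki page above.
CSV_URL = "http://localhost:8983/solr/update/csv?commit=true"

def post_csv(path):
    """Upload a CSV file whose first line names the Solr fields."""
    with open(path, "rb") as f:
        data = f.read()
    req = urllib.request.Request(
        CSV_URL,
        data=data,
        headers={"Content-Type": "text/plain; charset=utf-8"},
    )
    with urllib.request.urlopen(req) as resp:
        return resp.read()

post_csv("records.csv")  # hypothetical export of your MySQL rows

If I remember the wiki right, you can also enable remote streaming in
solrconfig.xml and pass a stream.file parameter pointing at the file
on the server's disk, which is where being local to the server really
pays off, since nothing gets uploaded at all.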

And then, of course, there's the question of whether each run is a
full re-index or an incremental index of just the changed records.
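
If it's incremental (or could be), a high-water mark on a
last-modified column keeps each run small. Purely illustrative
sketch; the records table, updated_at column, and last_indexed.txt
file are all made up, and conn is whatever DB-API connection your
MySQL driver gives you:

import os

# Hypothetical spot to persist the high-water mark between runs.
WATERMARK_FILE = "last_indexed.txt"

def load_watermark():
    if os.path.exists(WATERMARK_FILE):
        with open(WATERMARK_FILE) as f:
            return f.read().strip()
    return "1970-01-01 00:00:00"

def save_watermark(ts):
    with open(WATERMARK_FILE, "w") as f:
        f.write(ts)

def fetch_changed_rows(conn):
    """Pull only the rows modified since the last indexing run."""
    cur = conn.cursor()
    cur.execute(
        "SELECT id, title, updated_at FROM records "
        "WHERE updated_at > %s ORDER BY updated_at",
        (load_watermark(),),
    )
    rows = cur.fetchall()
    if rows:
        save_watermark(str(rows[-1][2]))  # newest timestamp just indexed
    return rows

Feed whatever that returns into the batched add_docs() above and you
only ever re-send what changed.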

--cw


On 8/9/07, Kevin Holmes <[EMAIL PROTECTED]> wrote:
>
> I inherited an existing (working) Solr indexing script that runs like
> this:
>
>
>
> The Python script queries the MySQL DB, then calls a bash script
>
> The bash script performs a curl POST submit to Solr
>
>
>
> We're injecting about 1000 records / minute (constantly), frequently
> pushing the edge of our CPU / RAM limitations.
>
>
>
> I'm in the process of building a Perl script that uses DBI and
> lwp::simple::post to do all of this from a single script (instead of
> three).
>
>
>
> Two specific questions:
>
> 1: Does anyone have a clever (or better) way to perform this process
> efficiently?
>
>
>
> 2: Is there a way to inject into Solr without using POST / curl / HTTP?
>
>
>
> Admittedly, I'm no Solr expert - I'm starting from someone else's
> setup, trying to reverse-engineer my way out. Any input would be
> greatly appreciated.
>
>
