On 17-Jun-08, at 12:55 AM, Bram de Jong wrote:
It looks like all my tests with Solr have been very conclusive: it's the way
to go.
Glad to hear it!
Sadly enough, neither I nor our sysadmin has any experience with setting up
Tomcat, Jetty, Orion, <insert your servlet container here>.
We have plenty of experience with other servers like lighttpd and Apache,
but that doesn't particularly help.
What would be the easiest roadmap to set up Solr in our live environment, and
would that easy roadmap (whatever it is) be good enough for us (given the
data below)?
I'd suggest using the Jetty that comes with the distribution. Treat it as
you would any Unix process, with 'java -jar start.jar' as the launch
command and SIGTERM to stop it. You'll want to give it more
RAM with the -Xmx and -Xms parameters, as well as specify -server.
Other than that, you'll need a way to keep an eye on the log file.
That's about it. One quite useful debugging tool is to send SIGQUIT
to the Solr process, which prints a stack trace for every live thread
at that moment.
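As a rough sketch of what that looks like in practice, here is a small Python wrapper around the launch, thread-dump, and stop steps. The heap size, the install directory, and the log path are placeholders I made up, so adjust them for your machine. Note that the JVM options have to come before -jar.

```python
import signal
import subprocess


def build_command(heap="1g", jar="start.jar"):
    """Return the JVM command line; JVM options must precede -jar."""
    return ["java", "-server", f"-Xms{heap}", f"-Xmx{heap}", "-jar", jar]


def start(solr_dir, log_path, heap="1g"):
    """Launch the bundled Jetty from solr_dir, appending its output to log_path."""
    log = open(log_path, "ab")
    return subprocess.Popen(
        build_command(heap),
        cwd=solr_dir,              # start.jar is resolved relative to this directory
        stdout=log,
        stderr=subprocess.STDOUT,  # keep everything in one log file
    )


def thread_dump(proc):
    """Send SIGQUIT so the JVM prints a stack trace for every live thread."""
    proc.send_signal(signal.SIGQUIT)


def stop(proc, timeout=30):
    """Send SIGTERM and wait for the JVM to exit."""
    proc.terminate()
    return proc.wait(timeout=timeout)


if __name__ == "__main__":
    # Hypothetical paths and heap size, for illustration only.
    proc = start("/opt/solr/example", "/var/log/solr.log", heap="1g")
    print("Solr started with pid", proc.pid)
```

The thread dump goes to the JVM's standard output, so with this setup it ends up in the same log file you are already watching.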
Tech data:
There are 60K documents (and growing slowly at << 100/day) and about
20K-30K searches per day (growing faster than the number of documents, but
not that fast either).
Solr will have to share a machine (quad Xeon, 12GB of RAM, SAS disks) with
PostgreSQL.
In all my tests (replaying stored searches) I had 0 cache misses and
between 0.7 and 0.99 hit rate for all 3 caches.
I will use plenty of faceting to create various tag clouds in various
places.
That should be fine. You might want to schedule an occasional
optimize (you can do that with a cron job and a shell script).
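For illustration, an optimize is just an <optimize/> message POSTed to Solr's update handler, so the scheduled job can be tiny. The sketch below uses Python in place of a shell script; the base URL assumes the default port of the bundled Jetty, and the timeout is an arbitrary guess.

```python
import urllib.request


def optimize_request(base_url="http://localhost:8983/solr"):
    """Build the POST request that asks Solr to optimize its index."""
    return urllib.request.Request(
        base_url.rstrip("/") + "/update",
        data=b"<optimize/>",
        headers={"Content-Type": "text/xml; charset=utf-8"},
    )


def optimize(base_url="http://localhost:8983/solr", timeout=600):
    """Send the optimize request and return the HTTP status code."""
    with urllib.request.urlopen(optimize_request(base_url), timeout=timeout) as resp:
        return resp.status


if __name__ == "__main__":
    print("optimize returned HTTP", optimize())
```

A crontab entry such as "0 4 * * 0 python3 /path/to/optimize.py" (the path is hypothetical) would run it once a week at 4 AM on Sunday.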
-Mike