On 5/13/2012 3:49 PM, Spadez wrote:
> So, for the XML it's easy, just update this index once a day. For the
> database, should I have it incrementally reindexed into SOLR in real time,
> or do it every hour or two?
> Realtime is better, but I don't know how much strain this would put on my
> server. If it's an incremental update, surely not too much?

What sort of turnaround means "realtime" to you? If several seconds
from start of update to document availability would work, and you don't
need to do an update more often than every few minutes, Solr can
probably keep up with no trouble. It might even be able to do it more
often than that.
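
For the "incremental" part, one common approach is to keep a watermark of the last change you indexed and only pull rows modified after it. Here is a minimal Python sketch of that idea. The `items` table, its `modified` column (an integer timestamp), and both function names are made up for illustration; your schema will differ, and sqlite3 is used only so the example is self-contained.

```python
import sqlite3


def fetch_changed_rows(conn, since):
    """Return (id, title, modified) rows changed after `since`, oldest first."""
    cur = conn.execute(
        "SELECT id, title, modified FROM items "
        "WHERE modified > ? ORDER BY modified",
        (since,),
    )
    return cur.fetchall()


def next_watermark(rows, since):
    """Advance the watermark to the newest `modified` value seen, if any."""
    return rows[-1][2] if rows else since
```

Each update cycle would call `fetch_changed_rows` with the stored watermark, send those rows to Solr, and then save the value from `next_watermark` for the next run. If nothing changed, the watermark stays where it was and the cycle has nothing to send.
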
Real world example: With Solr 3.5, which doesn't include realtime
capabilities, I do updates once a minute. A typical update cycle, which
includes a handful of deletes and reinserts on seven shards and several
dozen new document inserts on one shard, takes only a few seconds to
run. Because of autowarming and some really hairy filter queries
occasionally used by our application, the commit that follows will
sometimes take 15-30 seconds, but usually it's less than 10 seconds. I
have no way of knowing how long a commit would take on your Solr
install, but with proper tuning you could probably get the commit
down to a few seconds.
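
To see where the time goes on your own install, you can time the update and the commit separately. The following is a rough Python sketch of one cycle using Solr's XML update messages (`<delete>`, `<add>`, `<commit/>`). It is not my actual indexing code; the URL is an assumed single-core default, and the function names are invented for the example.

```python
import time
import urllib.request
import xml.etree.ElementTree as ET

# Assumed location of the update handler; adjust for your host and core.
SOLR_UPDATE_URL = "http://localhost:8983/solr/update"


def build_add(docs):
    """Build an <add> message from a list of {field_name: value} dicts."""
    add = ET.Element("add")
    for doc in docs:
        doc_el = ET.SubElement(add, "doc")
        for name, value in doc.items():
            ET.SubElement(doc_el, "field", name=name).text = str(value)
    return ET.tostring(add, encoding="utf-8")


def build_delete(ids):
    """Build a <delete> message listing the unique keys to remove."""
    delete = ET.Element("delete")
    for doc_id in ids:
        ET.SubElement(delete, "id").text = str(doc_id)
    return ET.tostring(delete, encoding="utf-8")


def post_xml(body, url=SOLR_UPDATE_URL):
    """POST one XML update message and return the HTTP status code."""
    req = urllib.request.Request(
        url, data=body, headers={"Content-Type": "text/xml; charset=utf-8"}
    )
    with urllib.request.urlopen(req) as resp:
        return resp.status


def run_cycle(deleted_ids, docs, post=post_xml):
    """Send deletes, then adds, then a commit.

    Returns (update_seconds, commit_seconds) so the two can be compared.
    """
    start = time.monotonic()
    if deleted_ids:
        post(build_delete(deleted_ids))
    if docs:
        post(build_add(docs))
    updated = time.monotonic()
    post(b"<commit/>")
    done = time.monotonic()
    return updated - start, done - updated
```

Running `run_cycle` from cron or a loop once a minute and logging the two numbers will tell you whether the update or the commit is the slow part. If the commit dominates, autowarming settings are the first place to look.
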
There are realtime features in Solr trunk, which will become 4.0. I'm
pretty sure that when the Solr devs say "realtime" they are talking
about turnaround times that are significantly less than one second. If
I'm wrong about that, I'm sure one of them will pipe up and say so.

Thanks,
Shawn