What's the bottleneck?

Jason Rennie Thu, 11 Sep 2008 08:25:26 -0700

We have a 14-million-document index that we use only for querying
(optimized, read-only).  Queries with a few relatively rare words return
quickly.  Longer queries that use more common words (hitting, say, ~1
million docs) can take seconds to return.  I'd like to know: what's the
bottleneck?  It doesn't seem to be disk: I/O wait times on the machine are
much lower than on our database servers (e.g. 3% vs. 50%).  Our search
server is an 8-core machine and we regularly see CPU usage holding above
100%, so CPU seems plausible, but would it really take that long to compute
scores?


We're using DisMax.  We search over a number of different fields (5, to be
exact).  We also have an fq on a single-digit status field.  Does it make
sense that computation time could easily exceed a second?  If CPU is the
bottleneck, is there anything we could do to easily trim down computation
time (besides removing common words from the query)?
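
For concreteness, here is a rough sketch of the kind of request described
above, written so it can be timed from the client side.  The host, path,
and the five field names are placeholders rather than our real schema.
It assumes a Solr version that accepts defType=dismax and returns timing
details with debugQuery=true.  Comparing the QTime in the response header
against the wall-clock round trip should show whether the time is going
into the search itself or into everything around it.

```python
from urllib.parse import urlencode

# Hypothetical values: the host, path, and field names are placeholders,
# not taken from the actual setup described in this message.
SOLR_SELECT_URL = "http://localhost:8983/solr/select"
QUERY_FIELDS = ["title", "description", "brand", "category", "tags"]


def build_dismax_url(user_query, status, base_url=SOLR_SELECT_URL):
    """Build a DisMax request URL with a status filter and debug output."""
    params = {
        "defType": "dismax",            # select the DisMax query parser
        "q": user_query,
        "qf": " ".join(QUERY_FIELDS),   # the five searched fields
        "fq": f"status:{status}",       # filter query on the status field
        "rows": 10,
        "wt": "json",
        "debugQuery": "true",           # request explain and timing details
    }
    return base_url + "?" + urlencode(params)


def split_latency(wall_ms, qtime_ms):
    """Separate Solr's reported QTime from the rest of the round trip."""
    return {"solr_ms": qtime_ms, "other_ms": max(wall_ms - qtime_ms, 0)}


if __name__ == "__main__":
    # Print the URL only; fetching it needs a running Solr instance.
    print(build_dismax_url("red leather shoes", 1))
```

If QTime accounts for nearly all of the wall-clock time on the slow
queries, the cost is inside the search; a large "other_ms" would point
at response writing or the network instead.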

Jason

-- 
Jason Rennie
Head of Machine Learning Technologies, StyleFeeder
http://www.stylefeeder.com/
Samantha's blog & pictures: http://samanthalyrarennie.blogspot.com/
