We have a 14-million-document index that we use only for querying (optimized, read-only). Queries with a few relatively rare words return quickly. However, longer queries that use more common words (matching, say, ~1 million docs) can take seconds to return. I'd like to know: what's the bottleneck? It doesn't seem to be disk: I/O wait times on the search machine are much lower than on our database servers (e.g. 3% vs. 50%). Our search server is an 8-core machine and we regularly see CPU usage holding above 100%, so CPU seems plausible, but would it really take that long to compute scores?
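To make the slow case concrete, here is a rough sketch of the kind of request involved, as described in this message: a DisMax query over several fields with a filter query on a status field. The base URL, the field names, and the helper `build_dismax_url` are hypothetical placeholders, not our actual setup; `defType=dismax` assumes a Solr version that supports that parameter, and `debugQuery=true` is added only to get Solr's debug output back with the results.

```python
from urllib.parse import urlencode


def build_dismax_url(base_url, user_query, fields, status):
    """Build a Solr select URL for a DisMax query (illustrative helper only)."""
    params = [
        ("q", user_query),             # raw user query text
        ("defType", "dismax"),         # assumption: DisMax chosen via defType
        ("qf", " ".join(fields)),      # fields searched by DisMax
        ("fq", f"status:{status}"),    # filter query on the status field
        ("debugQuery", "true"),        # ask Solr to include debug output
    ]
    return base_url + "?" + urlencode(params)


# Hypothetical host and field names, for illustration only.
url = build_dismax_url(
    "http://localhost:8983/solr/select",
    "red leather shoes",
    ["title", "description", "brand", "tags", "category"],
    1,
)
print(url)
```

Running the same request with a short, rare-word query and with a long, common-word query, then comparing the `QTime` values in the response headers, is one way to quantify the gap.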
We're using DisMax and search over 5 different fields. We also have an fq on a single-digit status field. Does it make sense that computation time could easily exceed a second? If CPU is the bottleneck, is there anything we could do to easily trim computation time (besides removing common words from the query)?

Jason

-- 
Jason Rennie
Head of Machine Learning Technologies, StyleFeeder
http://www.stylefeeder.com/
Samantha's blog & pictures: http://samanthalyrarennie.blogspot.com/