If users don't understand it anyway... just sum up termfreq(field,term) for
all query terms. Who will know that it is only an approximation?! BUT... it
may cause queries to be significantly slower.
I mean, you COULD add a custom value source such as
sumtermfreq(field1,term1,field2,term2...) that iterates over all matched
documents and adds up termfreq(field,term).
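As a rough client-side sketch of that approximation (not the custom value
source itself): Solr 4.x should let you request termfreq(field,term) as an
aliased pseudo-field in "fl", so you can add up the values over whatever
documents come back. The field names, the term, the "tf_" alias prefix, and
the sample docs below are all made up for illustration, and the sum only
covers the rows actually fetched.

```python
def termfreq_fl(fields, terms):
    """Build an 'fl' value with one aliased termfreq(field,term) per pair."""
    funcs = []
    for field in fields:
        for term in terms:
            # "tf_N" is a hypothetical alias so the response keys are predictable.
            alias = "tf_%d" % len(funcs)
            funcs.append("%s:termfreq(%s,'%s')" % (alias, field, term))
    return ",".join(["id"] + funcs)


def sum_termfreqs(docs):
    """Add up every tf_* value across the returned documents."""
    total = 0
    for doc in docs:
        for key, value in doc.items():
            if key.startswith("tf_"):
                total += int(value)
    return total


# Hypothetical qf fields and a single query term.
fl = termfreq_fl(["title", "body"], ["solr"])

# Hand-written stand-in for response["response"]["docs"] with wt=json.
sample_docs = [
    {"id": "1", "tf_0": 1, "tf_1": 4},
    {"id": "2", "tf_0": 0, "tf_1": 2},
]
total_hits = sum_termfreqs(sample_docs)

print(fl)
print(total_hits)
```

To cover every matching document you would still have to page through the
whole result set, which is where the slowdown comes from.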
The point with Lucene and Solr is that they reduce total hits into the
compact and more interesting and more "relevant" statistic of a score in the
range of 0.0 to 1.0. Maybe your users would simply like to see that more
"modern" statistic than the useless total hits which has so little real
value anyway. And the score is essentially free. Just add to "fl":
&fl=id,field1,...,score.
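For instance, the request might be put together like this. This is only a
sketch: the query, the qf fields, and the host/path are placeholders, not
anything from your setup.

```python
from urllib.parse import urlencode

# Placeholder query and field names; "score" in fl is the only point here.
params = {
    "q": "solr",
    "defType": "edismax",
    "qf": "title body",
    "fl": "id,title,score",
    "wt": "json",
}
query_string = urlencode(params)

# Assumed local Solr URL, adjust to your own core/handler.
url = "http://localhost:8983/solr/select?" + query_string
print(url)
```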
-- Jack Krupansky
-----Original Message-----
From: ses
Sent: Wednesday, November 28, 2012 9:05 AM
To: solr-user@lucene.apache.org
Subject: Re: Total number of hits within all documents
Unfortunately a vague specification is all I have, due to the fact I am
trying to replicate the functionality in a closed-source legacy search
product. I suspect no-one at the company knows precisely how this works.
The purpose is ultimately to display to the user the entire number of 'hits'
found in all documents where a hit is any place in the text of the fields
searched (defined as 'qf' in the edismax search handler) where the search
terms appear. Essentially it should be like counting the number of
highlighted hits in a search with highlighting turned on. I could easily do
this for just the number of documents returned, specified by the 'rows'
parameter, by turning highlighting on and counting the snippets returned.
But I want this value for the entire dataset, which I have a feeling will be
too slow if I specify rows = total numFound.
I just want it to count this number for the fields specified in 'qf'. If it
could count all occurrences of terms that match wildcard queries, that would
be good but not essential. Fuzzy/span queries aren't used.
I would be fine with an approximation; for all I know, this is how the old
search product works.
I hope this clarifies things a little. I realize it is a strange requirement
that the user is unlikely to even understand, but nevertheless apparently
the user must see something along the lines of 'X documents found, Y hits
found'.
--
View this message in context:
http://lucene.472066.n3.nabble.com/Total-number-of-hits-within-all-documents-tp4022895p4022920.html