Clue us in as to what you actually want to do with this number. Perhaps an approximation would solve the problem just as well. In other words, what degree of accuracy is actually required?

Also, make sure you can actually reduce your proposed calculation to a mathematical function. As stated, it is a little too vague. For example, what about queries that combine multiple fields and OR operations, or wildcard or fuzzy queries? What about span queries? How would you count hits when OR is used? And how would some downstream process actually use the number?

A custom scorer and/or one or more custom function query value sources might be able to do exactly what you want, provided that you can express it as specific, crisp rules.
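Short of writing a custom value source, the built-in function queries can sometimes be combined to get part of the way there. The sketch below is only an illustration: it builds request parameters that sum totaltermfreq() over each term with sum(). The core name, handler and field (my_core, keyword, html) are taken from the original question below; the helper name summed_ttf_params is made up for this example. Two assumptions to keep in mind: totaltermfreq() counts occurrences across the whole index rather than only in matching documents, and each term must be given in its indexed form (after lowercasing, stemming, stop word removal and so on).

```python
from urllib.parse import urlencode


def summed_ttf_params(query, field, terms):
    """Build Solr request parameters whose fl sums totaltermfreq() per term.

    Hypothetical helper for illustration. Terms are inserted as-is, so they
    must not contain single quotes and should already be in indexed form.
    """
    parts = ["totaltermfreq({},'{}')".format(field, term) for term in terms]
    # sum() is only needed when there is more than one term.
    expr = parts[0] if len(parts) == 1 else "sum({})".format(",".join(parts))
    # The value is identical for every document, so one row is enough.
    return {"q": query, "fl": "ttf:" + expr, "rows": 1}


params = summed_ttf_params("knights of arabia", "html",
                           ["knights", "of", "arabia"])
print("/solr/my_core/keyword?" + urlencode(params))
```

This covers only the plain "add up every term" case. Phrase counts, or counts restricted to the documents that match, are not expressible this way and would still need the custom scoring or value source route.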

Without a crisp specification it is difficult to say whether Lucene/Solr can or cannot give you your desired magic number out of the box, although with custom scoring and custom value sources you should be able to do just about anything (that can be mathematically formulated).

-- Jack Krupansky

-----Original Message-----
From: ses
Sent: Wednesday, November 28, 2012 7:19 AM
To: solr-user@lucene.apache.org
Subject: Total number of hits within all documents

I'm trying to find a way to retrieve from a Solr query the total number of
hits for a query across all documents.

I'm using an edismax query handler which searches across several fields
(specified in the schema.xml).

I have tried:
/solr/my_core/keyword?q=knights of arabia&fl=ttf:totaltermfreq(html,'knights of arabia')
but the totaltermfreq function only works on individual terms.

I have also tried:
/solr/my_core/keyword?q=knights of arabia&facet=true&facet.query={!edismax}knights of arabia
which retrieves the total number of documents containing the search terms
(same as numFound).

What I want is the total number of times the search terms appear in all
documents. For a standard disjunctive query like this it would total all
occurrences of 'knights', 'of' and 'arabia'. For a query like q="knights of
arabia", it would only count all occurrences of the entire phrase, and for
q=knights AND of AND arabia the number would be the total number of times
each term appears across all documents (but results would be fewer than
q=knights of arabia as documents must have all three of these terms in them
by the nature of the query).
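To make the three cases above concrete, here is a minimal sketch that pins them down as explicit rules over a toy in-memory corpus. It is plain Python, not Solr or Lucene code: the tokenizer, the function names and the sample documents are all assumptions made for illustration, and a real index would apply its own analysis chain.

```python
import re


def tokenize(text):
    """Lowercase and split on non-alphanumerics (a stand-in for real analysis)."""
    return re.findall(r"[a-z0-9]+", text.lower())


def count_phrase(tokens, phrase):
    """Count positions where the phrase tokens appear consecutively."""
    n = len(phrase)
    return sum(1 for i in range(len(tokens) - n + 1) if tokens[i:i + n] == phrase)


def total_hits(docs, terms, mode):
    """Total occurrences under one of three rules: 'or', 'and' or 'phrase'."""
    total = 0
    for doc in docs:
        tokens = tokenize(doc)
        if mode == "phrase":
            total += count_phrase(tokens, terms)
            continue
        per_term = [tokens.count(term) for term in terms]
        # Under 'and', a document contributes only if every term is present.
        if mode == "and" and not all(per_term):
            continue
        total += sum(per_term)
    return total


docs = [
    "Knights of Arabia: the knights ride at dawn",
    "A tale of Arabia",
    "Knights and castles",
]
terms = ["knights", "of", "arabia"]

print(total_hits(docs, terms, "or"))      # 7
print(total_hits(docs, terms, "and"))     # 4
print(total_hits(docs, terms, "phrase"))  # 1
```

Even this small version has to take a position on cases the description leaves open, such as whether a term inside a phrase match also counts as a term hit, and what should happen for wildcard or fuzzy queries.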

I hope this makes sense and that there is some way I might be able to do
this that I am missing? I would also (begrudgingly) be happy if the answer
is that due to the way searching works, this is not possible and Solr/Lucene
will not easily be modified to do this.



--
View this message in context: http://lucene.472066.n3.nabble.com/Total-number-of-hits-within-all-documents-tp4022895.html Sent from the Solr - User mailing list archive at Nabble.com.
