If you want to understand and debug the scoring, you can use debugQuery=true to see how different documents score. Most of the time, docs with both terms are at the top of the result set unless norms are interfering.
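As a rough illustration of the debugQuery suggestion, here is a minimal Python sketch that builds the dismax request from the thread with debugQuery=true added. The host, port and path in SOLR_SELECT are assumptions about a local setup, and build_debug_url is a hypothetical helper, not part of any Solr client.

```python
from urllib.parse import urlencode

# Assumed location of the Solr select handler; adjust to your installation.
SOLR_SELECT = "http://localhost:8983/solr/select"


def build_debug_url(phrase, field="all_lists_text", rows=10):
    """Return a dismax phrase query URL that also asks Solr for score explanations."""
    params = {
        "defType": "dismax",
        "qf": field,
        "q": '"%s"' % phrase,   # quoted so it is searched as a phrase
        "rows": rows,
        "fl": "*,score",        # return the score alongside the stored fields
        "debugQuery": "true",   # adds the per-document explain output
    }
    return SOLR_SELECT + "?" + urlencode(params)


url = build_debug_url("indie music")
print(url)
```

The explain section of the response should show how tf, idf and the norms contribute to each document's score, which makes it easier to see why a doc with fewer occurrences can outrank one with more.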
To understand, you should check the Solr relevancy wiki, but the Lucene docs are much better, although very low level.

http://wiki.apache.org/solr/SolrRelevancyCookbook
http://lucene.apache.org/java/3_1_0/api/core/org/apache/lucene/search/Similarity.html

Your question is more a relevance question than a question about the termfreq function. In short, don't use those kinds of functions if you don't yet understand similarity as described in the Lucene docs.

> I am trying to test out and compare different sorts and scoring.
>
> When I use dismax to search for "indie music"
> with: qf=all_lists_text&q="indie+music"&defType=dismax&rows=100
> I see some stuff that seems "irrelevant", meaning in top results I see only
> 1 or 2 mentions of "indie music", but when I look further down the list I
> do see other docs that have more occurrences of "indie music".
> So I want to test by comparing the different queries versus seeing a
> list of docs ranked specifically by the count of occurrences of the phrase
> "indie music"
>
> On Mon, Aug 8, 2011 at 2:19 PM, Markus Jelsma <markus.jel...@openindex.io>wrote:
> > > Dismax queries can. But
> > >
> > > sort=termfreq(all_lists_text,'indie+music')
> > >
> > > is not using dismax. Apparently termfreq function can not? I am not
> > > familiar with the termfreq function.
> >
> > It simply returns the TF of the given _term_ as it is indexed of the
> > current document.
> >
> > Sorting on TF like this seems strange as by default queries are already
> > sorted that way since TF plays a big role in the final score.
> >
> > > To understand why you'd need to reindex, you might want to read up on
> > > how lucene actually works, to get a basic understanding of how
> > > different indexing choices affect what is possible at query time.
> > > Lucene In Action is a pretty good book.
> > >
> > > On 8/8/2011 5:02 PM, Jason Toy wrote:
> > > > Are not Dismax queries able to search for phrases using the default
> > > > index(which is what I am using?) If I can already do phrase
> > > > searches, I don't understand why I would need to reindex to be able
> > > > to access phrases from a function.
> > > >
> > > > On Mon, Aug 8, 2011 at 1:49 PM, Markus Jelsma<markus.jel...@openindex.io>wrote:
> > > >>> Alexei, thank you, that does seem to work.
> > > >>>
> > > >>> My sort results seem to be totally wrong though, I'm not sure if
> > > >>> its because of my sort function or something else.
> > > >>>
> > > >>> My query consists of:
> > > >>> sort=termfreq(all_lists_text,'indie+music')+desc&q=*:*&rows=100
> > > >>> And I get back 4571232 hits.
> > > >>
> > > >> That's normal, you issue a catch all query. Sorting should work
> > > >> but..
> > > >>
> > > >>> All the results don't have the phrase "indie music" anywhere in
> > > >>> their data.
> > > >>>
> > > >>> Does termfreq not support phrases?
> > > >>
> > > >> No, it is TERM frequency and indie music is not one term. I don't
> > > >> know how this function parses your input but it might not
> > > >> understand your + escape and think it's one term consisting of
> > > >> exactly that.
> > > >>
> > > >>> If not, how can I sort specifically by termfreq of a phrase?
> > > >>
> > > >> You cannot. What you can do is index multiple terms as one term
> > > >> using the shingle filter. Take care, it can significantly increase
> > > >> your index size and number of unique terms.
> > > >>
> > > >>> On Mon, Aug 8, 2011 at 1:08 PM, Alexei Martchenko<ale...@superdownloads.com.br> wrote:
> > > >>>> You can use the standard query parser and pass q=*:*
> > > >>>>
> > > >>>> 2011/8/8 Jason Toy<jason...@gmail.com>
> > > >>>>
> > > >>>>> I am trying to list some data based on a function I run,
> > > >>>>> specifically termfreq(post_text,'indie music') and I am unable
> > > >>>>> to do it without passing in data to the q parameter. Is it
> > > >>>>> possible to get a sorted list without searching for any terms?
> > > >>>>
> > > >>>> --
> > > >>>>
> > > >>>> *Alexei Martchenko* | *CEO* | Superdownloads
> > > >>>> ale...@superdownloads.com.br | ale...@martchenko.com.br | (11)
> > > >>>> 5083.1018/5080.3535/5080.3533
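
To make the point from the thread concrete (termfreq counts one indexed term, and shingles turn adjacent words into a single term), here is a small plain-Python sketch. It only imitates the idea on a whitespace-tokenized string; it is not how Solr's shingle filter is implemented, and tokens, shingles and term_freq are hypothetical helper names.

```python
from collections import Counter


def tokens(text):
    """Very rough stand-in for an analyzer: lowercase and split on whitespace."""
    return text.lower().split()


def shingles(words, size=2):
    """Join each run of `size` adjacent words into one term (word n-grams)."""
    return [" ".join(words[i:i + size]) for i in range(len(words) - size + 1)]


def term_freq(terms, term):
    """Count how often exactly `term` occurs among the indexed terms."""
    return Counter(terms)[term]


doc = "Indie music blogs cover indie music and indie film"
words = tokens(doc)

# With single-word terms, the phrase is never an indexed term.
print(term_freq(words, "indie music"))            # 0
# With two-word shingles, the phrase is a term and can be counted.
print(term_freq(shingles(words), "indie music"))  # 2
```

Note that the shingled list holds a different, larger set of unique terms than the single-word list, which is the index growth the thread warns about.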