Hoss, thanks for the response and confirmation. Yes, from reading the comments in the Lucene source of MoreLikeThis.java I have now realized that the field used in the TermQuery is "the top field that this word comes from". The SimilarityQuery is restricted to this field only for the specific word.
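To illustrate what I mean, here is a minimal sketch of how that "top field" selection appears to work, as I read the code: for each word, the field in which the word has the highest document frequency wins, and the TermQuery is built against that field only. Plain maps stand in for the index's document-frequency lookup. The class name, method name and sample counts are made up for illustration and are not part of Lucene.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for the per-word field selection; not Lucene code.
class TopFieldSketch {

    // docFreqs maps field name -> (word -> number of documents containing the word).
    // Returns the field where the word has the highest document frequency,
    // falling back to the first field when the word occurs in none of them.
    static String topField(String word, String[] fieldNames,
                           Map<String, Map<String, Integer>> docFreqs) {
        String top = fieldNames[0];
        int best = 0;
        for (String field : fieldNames) {
            Map<String, Integer> perField = docFreqs.get(field);
            Integer boxed = (perField == null) ? null : perField.get(word);
            int freq = (boxed == null) ? 0 : boxed;
            if (freq > best) {
                best = freq;
                top = field;
            }
        }
        return top;
    }

    public static void main(String[] args) {
        // Assumed sample counts: "yahoo" occurs in 1 title and in 2 content fields.
        Map<String, Integer> title = new HashMap<>();
        title.put("yahoo", 1);
        Map<String, Integer> content = new HashMap<>();
        content.put("yahoo", 2);

        Map<String, Map<String, Integer>> docFreqs = new HashMap<>();
        docFreqs.put("title", title);
        docFreqs.put("content", content);

        String field = topField("yahoo", new String[] {"title", "content"}, docFreqs);
        System.out.println(field + ":yahoo"); // prints content:yahoo
    }
}
```

With counts like these only "content:yahoo" would be produced, which matches the single entry I saw in interestingTerms.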
I guess I will have to make "manual" modifications to the MoreLikeThis code (lucene-queries.jar) to make this work the way I want it to.

/Clas

On Tue, Sep 16, 2008 at 8:21 PM, Chris Hostetter <[EMAIL PROTECTED]> wrote:
>
> : Document 1 is probably a better match since the word yahoo is present
> : two times. That seems fine, although I did not expect to see the
> : "content:" part in the list of interestingTerms.
> ...
> : but the response is exactly the same as for the query without the mlt.qf.
> :
> : The problem seems to me to be related to the "content:" part of the
> : interestingTerms list. I would have expected to read only "yahoo",
> : "text:yahoo" or maybe "title:yahoo content:yahoo".
>
> There's a lot about MLT I don't understand, but a quick glance at the code
> tells me a few things...
>
> 1) interestingTerms is listing out all of the TermQueries it produces ...
> so the fact that it says content:yahoo does indicate it only considers the
> word yahoo in the content field to be relevant.
>
> 2) the MLT Handler supports the debugQuery=true option, so that will give
> you more debugging info about what is going on (including the full details
> of the query being executed)
>
> -Hoss