Hey hhc,

I am new to Solr, so pardon me if this throws you off, but I think the following piece of code from MoreLikeThisHandler#handleRequestBody() is relevant to your problem:

      // Find documents MoreLikeThis - either with a reader or a query
      // --------------------------------------------------------------------------------
      if (reader != null) {
        mltDocs = mlt.getMoreLikeThis(reader, start, rows, filters,
            interesting, flags);
      } else if (q != null) {
        // Matching options
        boolean includeMatch = params.getBool(MoreLikeThisParams.MATCH_INCLUDE,
            true);
        int matchOffset = params.getInt(MoreLikeThisParams.MATCH_OFFSET, 0);
        // Find the base match
        DocList match = searcher.getDocList(query, null, null, matchOffset, 1,
            flags); // only get the first one...
        if (includeMatch) {
          rsp.add("match", match);
        }

        // This is an iterator, but we only handle the first match
        DocIterator iterator = match.iterator();
        if (iterator.hasNext()) {
          // do a MoreLikeThis query for each document in results
          int id = iterator.nextDoc();
          mltDocs = mlt.getMoreLikeThis(id, start, rows, filters, interesting,
              flags);
        }
      } else {
        throw new SolrException(SolrException.ErrorCode.BAD_REQUEST,
            "MoreLikeThis requires either a query (?q=) or text to find similar documents.");
      }
    } finally {
      if (reader != null) {
        reader.close();
      }
    }

From the getDocList(...), match.iterator(), and iterator.nextDoc() / getMoreLikeThis(id, ...) calls above, it seems like it pulls only the first document matching your query (the getDocList call asks for a single row starting at matchOffset), which is most likely your duplicate document since results seem to be ranked by score, and issues an mlt query on that.

As an experiment to verify this, you can try the following:

1. Add a *third* document, similar to "aaa", let's say it's called "ccc".
2. Issue the same query that you posted above:
   http://localhost:8983/solr/test/select?q=id:aaa&mlt=true&mlt.fl=title
3. If you see document "ccc" in the results list, that confirms the above notion of mine.

Let us know how it goes!

Best Regards,
Nishant Kelkar

On Thu, Nov 27, 2014 at 2:33 AM, hhc <hhchen1...@gmail.com> wrote:

> I have two documents with ids "aaa" and "bbb", and the titles of both
> documents are "a black fox jumps over a red flower". I imported both
> documents, along with several other testing documents, into a core "test".
>
> I want solr to return documents similar to document "aaa", so I submitted
> the following:
>
> http://localhost:8983/solr/test/select?q=id:aaa&mlt=true&mlt.fl=title
>
> Solr returned some similar documents. However, document "bbb", which
> should be the most similar document of "aaa", was not in the mlt returned
> list. Any ideas how this could happen? Thanks!
>
> --
> View this message in context:
> http://lucene.472066.n3.nabble.com/Solr-mlt-doesn-t-return-documents-with-exactly-the-same-contents-tp4171284.html
> Sent from the Solr - User mailing list archive at Nabble.com.
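
To make the handler logic quoted at the top of this message a bit more concrete, here is a toy Java sketch of the "take one seed document, then look for documents like it" flow. It is only a model under stated assumptions, not Solr code: the class FirstMatchSketch and the methods pickSeed and moreLikeThis are hypothetical names, similarity is approximated by counting shared title words (not how Lucene actually scores), and skipping the seed document in the results is also an assumption of the sketch.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Toy model only: none of these names are Solr or Lucene APIs.
class FirstMatchSketch {

    // Models getDocList(query, null, null, matchOffset, 1, flags):
    // of all ranked matches, only the single one at matchOffset is used.
    static String pickSeed(List<String> rankedMatches, int matchOffset) {
        if (matchOffset < 0 || matchOffset >= rankedMatches.size()) {
            return null;
        }
        return rankedMatches.get(matchOffset);
    }

    // Hypothetical stand-in for mlt.getMoreLikeThis(id, ...): ranks the other
    // documents by how many distinct title words they share with the seed.
    static List<String> moreLikeThis(String seedId, Map<String, String> titles, int rows) {
        List<String> ids = new ArrayList<>();
        String seedTitle = titles.get(seedId);
        if (seedTitle == null) {
            return ids;
        }
        Set<String> seedWords = new HashSet<>(Arrays.asList(seedTitle.split("\\s+")));
        Map<String, Integer> overlap = new HashMap<>();
        for (Map.Entry<String, String> e : titles.entrySet()) {
            if (e.getKey().equals(seedId)) {
                continue; // assumption of this sketch: the seed is not its own neighbour
            }
            Set<String> shared = new HashSet<>(Arrays.asList(e.getValue().split("\\s+")));
            shared.retainAll(seedWords);
            if (!shared.isEmpty()) {
                overlap.put(e.getKey(), shared.size());
                ids.add(e.getKey());
            }
        }
        // Highest overlap first; List.sort is stable, so ties keep insertion order.
        ids.sort((x, y) -> Integer.compare(overlap.get(y), overlap.get(x)));
        int limit = Math.min(Math.max(rows, 0), ids.size());
        return new ArrayList<>(ids.subList(0, limit));
    }

    public static void main(String[] args) {
        Map<String, String> titles = new LinkedHashMap<>();
        titles.put("aaa", "a black fox jumps over a red flower");
        titles.put("bbb", "a black fox jumps over a red flower");
        titles.put("zzz", "completely unrelated text");

        // Pretend the query matched both duplicates, ranked in this order.
        String seed = pickSeed(Arrays.asList("aaa", "bbb"), 0);
        System.out.println("seed = " + seed);
        System.out.println("similar = " + moreLikeThis(seed, titles, 10));
    }
}
```

In this model, whichever document sits at matchOffset in the ranked matches becomes the seed, and every other match is ignored for seeding. That is the behaviour the experiment with document "ccc" is meant to probe on a real core.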