Hi folks,

We recently started using the default implementation of MLT (org.apache.solr.handler.MoreLikeThisHandler) and found that it lacks a few things:
1. Searching for terms in the same field as the original document: the current implementation picks the top field in which to search for an interesting term based on docFreq. This can give bad results: if, say, the original product has brand:"RED Valentino", we can end up searching for "red" in the color field.

2. Phrase boosts: if the product name is "business cards", it makes sense to boost the phrase so that other products that are also business cards rank higher.

3. Support for bq, bf, fq, and multiplicative boosts: you might want to filter out out_of_stock products, or give a product a multiplicative boost based on its price similarity or launch date.

4. Support for explainOther.

We had a use case for each of these, and I ended up writing my own MLTQueryParser, which builds the MLT query for a given document. It also has a new concept called childDocs. You can think of some documents as products, and a collection of products can be thought of as a category page. You could then search for similar documents based on the products a category page contains.

I was wondering if you guys would be interested in an alternate implementation of MLT that supports all the knobs that Solr search does. I could post a patch file, maybe?

Thanks,
Gagan
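
P.S. To make point 1 concrete, here is a minimal sketch of the idea of keeping each interesting term in the field it came from. This is not the actual MLTQueryParser code: the class and method names are hypothetical, the tokenized terms are assumed to be already extracted from the source document, and it uses plain Java collections instead of the Lucene/Solr APIs.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration only; not part of Solr or of the proposed patch.
public class SameFieldMltSketch {

    /**
     * Builds one "field:term" clause per interesting term, always pairing the
     * term with the field it came from in the source document, instead of
     * picking a field by docFreq.
     */
    static List<String> buildClauses(Map<String, List<String>> termsByField) {
        List<String> clauses = new ArrayList<>();
        for (Map.Entry<String, List<String>> entry : termsByField.entrySet()) {
            for (String term : entry.getValue()) {
                clauses.add(entry.getKey() + ":" + term);
            }
        }
        return clauses;
    }

    public static void main(String[] args) {
        // Assumed, already-analyzed terms of the source product.
        Map<String, List<String>> doc = new LinkedHashMap<>();
        doc.put("brand", List.of("red", "valentino"));
        doc.put("name", List.of("dress"));

        // Prints: brand:red OR brand:valentino OR name:dress
        System.out.println(String.join(" OR ", buildClauses(doc)));
    }
}
```

With this approach "red" is only ever searched in brand, never in color, which is the behaviour described in point 1.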