It's worth adding that Lucene's BlendedTermQuery (used in Elasticsearch's cross_fields search) attempts to blend the fields' document frequencies together. So I wonder what BlendedTermQuery plus a per-field BM25 similarity would do? It might be close to true BM25F, aside from the length issue.
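
To illustrate what "blending" document frequencies means for the IDF part of the score, here is a minimal Python sketch. It is not Lucene or Solr code. It assumes the blending rule is "use the largest docFreq seen in any of the fields", which is my understanding of BlendedTermQuery but should be checked against the Lucene source. The function names, the field names and the numbers are all made up for the example, and it assumes every field has the same document count.

```python
import math


def bm25_idf(doc_freq, doc_count):
    """BM25-style IDF of the form ln(1 + (N - df + 0.5) / (df + 0.5))."""
    return math.log(1.0 + (doc_count - doc_freq + 0.5) / (doc_freq + 0.5))


def blended_doc_freq(per_field_doc_freq):
    """Assumed blending rule: take the largest docFreq across the fields."""
    return max(per_field_doc_freq.values())


# Hypothetical statistics for one query term in a 1000-document index.
doc_count = 1000
per_field_df = {"title": 5, "description": 500}

# Unblended: each field computes its own IDF from its own docFreq,
# so the term looks much rarer (and scores higher) in "title".
per_field_idf = {
    field: bm25_idf(df, doc_count) for field, df in per_field_df.items()
}

# Blended: every field scores the term with the same shared docFreq.
blended_idf = bm25_idf(blended_doc_freq(per_field_df), doc_count)

print(per_field_idf)
print(blended_idf)
```

With blending, the term gets one IDF regardless of which field it matched in, which is the "docFreq across all the fields" behaviour that BM25F wants. It does nothing about length normalization.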
(You'd have to write a QParserPlugin and build the BlendedTermQuery yourself, AFAIK there's not a direct Solr interface to it yet.)

Best
-Doug

On Mon, Apr 18, 2016 at 4:52 PM Tom Burton-West <tburt...@umich.edu> wrote:

> Hi David,
>
> It may not matter for your use case but just in case you really are
> interested in the "real BM25F" there is a difference between configuring K1
> and B for different fields in Solr and a "real" BM25F implementation. This
> has to do with Solr's model of fields being mini-documents (i.e. each field
> has its own length, idf and tf) See the discussion in
> https://issues.apache.org/jira/browse/LUCENE-2959, particularly these
> comments by Robert Muir:
>
> "Actually as far as BM25f, this one presents a few challenges (some already
> discussed on LUCENE-2091
> <https://issues.apache.org/jira/browse/LUCENE-2091>).
>
> To summarize:
>
>    - for any field, Lucene has a per-field terms dictionary that contains
>    that term's docFreq. To compute BM25f's IDF method would be challenging,
>    because it wants a docFreq "across all the fields". (its not clear to me
>    at a glance either from the original paper, if this should be across
>    only the fields in the query, across all the fields in the document, and
>    if a "static" schema is implied in this scoring system (in lucene
>    document 1 can have 3 fields and document 2 can have 40 different ones,
>    even with different properties).
>    - the same issue applies to length normalization, lucene has a "field
>    length" but really no concept of document length."
>
> Tom
>
> On Thu, Apr 14, 2016 at 12:41 PM, David Cawley <david.cawl...@mail.dcu.ie>
> wrote:
>
> > Hello,
> > I am developing an enterprise search engine for a project and I was
> > hoping to implement BM25F ranking algorithm to configure the tuning
> > parameters on a per field basis. I understand BM25 similarity is now
> > supported in Solr but I was hoping to be able to configure k1 and b for
> > different fields such as title, description, anchor etc, as they are
> > structured documents.
> > I am fairly new to Solr so any help would be appreciated. If this is
> > possible or any steps as to how I can go about implementing this it would
> > be greatly appreciated.
> >
> > Regards,
> >
> > David
> >
> > Current Solr Version 5.4.1
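
As a footnote to Tom's point quoted above, the difference between "BM25 with per-field k1 and b" and a "real" BM25F can be shown with a small Python sketch. It is not Solr code. It assumes the simple BM25F variant from the literature, where length-normalized term frequencies are weighted and summed across fields before a single saturation step. The usual (k1 + 1) factor is left out of both versions to keep them comparable, and all names and numbers are hypothetical.

```python
def normalized_tf(tf, field_len, avg_field_len, b):
    """Term frequency after BM25 length normalization for one field."""
    return tf / (1.0 - b + b * field_len / avg_field_len)


def saturate(tf, k1):
    """BM25 saturation: grows with tf but never reaches 1."""
    return tf / (k1 + tf)


def per_field_bm25_sum(stats, idf, k1, b):
    """Fields as mini-documents: each field has its own idf, k1 and b,
    is saturated on its own, and the per-field scores are added up.
    saturate(normalized_tf(...)) is algebraically the usual
    tf / (tf + k1 * (1 - b + b * len / avg_len))."""
    return sum(
        idf[f] * saturate(normalized_tf(s["tf"], s["len"], s["avg_len"], b[f]), k1[f])
        for f, s in stats.items()
    )


def bm25f(stats, idf, k1, b, weight):
    """BM25F-style: combine weighted, normalized tf across fields first,
    then apply one idf and one saturation to the combined value."""
    pseudo_tf = sum(
        weight[f] * normalized_tf(s["tf"], s["len"], s["avg_len"], b[f])
        for f, s in stats.items()
    )
    return idf * saturate(pseudo_tf, k1)


# One document where the term occurs 3 times in each of two fields,
# and each field is exactly of average length.
stats = {
    "title": {"tf": 3, "len": 10, "avg_len": 10},
    "description": {"tf": 3, "len": 100, "avg_len": 100},
}
b = {"title": 0.75, "description": 0.75}
idf = 1.0  # same idf everywhere, to isolate the saturation effect

per_field_score = per_field_bm25_sum(
    stats,
    idf={"title": idf, "description": idf},
    k1={"title": 1.2, "description": 1.2},
    b=b,
)
bm25f_score = bm25f(
    stats, idf=idf, k1=1.2, b=b, weight={"title": 1.0, "description": 1.0}
)

print(per_field_score, bm25f_score)
```

In the per-field sum every matching field contributes its own saturated score, so the total can exceed the idf. In the BM25F-style version the term is saturated once for the whole document, so the score stays below the idf however many fields match.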