Boost Strangeness

Judioo Sat, 18 Jun 2011 05:12:49 -0700

WONDERFUL!
Just reporting back.
This document is ACE for explaining what the filters are and how to
affect the analyzer:

http://wiki.apache.org/solr/AnalyzersTokenizersTokenFilters

Erik, your statement "First, boosting isn't absolute" stuck with me, so
I continued to investigate boosting.

I found this document that (at last) explains the dismax logic:

http://www.lucidimagination.com/blog/2010/05/23/whats-a-dismax/

The reason I was not getting the order I required was twofold:
a) my boost values were too close together.
b) similar IDs in a document affected the score.


It seems that when a field only partially matches, a fraction of that
field's boost still contributes to the document's score.
This meant that one type of document in the index had a higher
aggregate score, because it had all but one of the boosted fields
(it does not have parent_id) and those fields were populated with
content that was *very* similar to the requested id.

For example:

required id = b011mg62
X_id = b011mgsf

Due to the partial matching and the closeness of the boost values, this
type of document always acquired a higher score than another document
with just one matching field (i.e. the id field).

My solution was to increase the boosts of the fields I wanted to *really* count:

id^100000 parent_id^5000 brand_container_id^500 ....
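
A minimal sketch of how boosts like these could be passed to Solr as
the dismax qf parameter. The host, port and path in the URL are
assumptions for illustration only, and only the three fields named
above are included, since the rest of the list is elided.

```python
from urllib.parse import urlencode

# Boosts taken from the line above; the remaining fields are elided there
# and are deliberately not guessed at here.
BOOSTS = {"id": 100000, "parent_id": 5000, "brand_container_id": 500}


def build_dismax_params(query, boosts):
    """Return request parameters for a dismax query with per-field boosts."""
    # qf is a space-separated list of field^boost pairs.
    qf = " ".join(f"{field}^{boost}" for field, boost in boosts.items())
    return {"defType": "dismax", "q": query, "qf": qf}


params = build_dismax_params("b011mg62", BOOSTS)

# Hypothetical endpoint: adjust host, port and path to your own install.
url = "http://localhost:8983/solr/select?" + urlencode(params)
print(url)
```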

As a result, even if there are similar matches in any field, the id and
parent_id matches should always receive a higher boost.
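
To make the effect concrete, here is a toy model of the behaviour
described above. It is not Lucene's real scoring: the per-field
"similarity" numbers, the close boost values (10/9/8) and tie=1.0 are
all made up for illustration. It only uses the dismax shape (best field
plus tie times the other fields) to show why close boosts let a
lookalike document win, and why widely spaced boosts stop that.

```python
def toy_dismax_score(similarities, boosts, tie=1.0):
    """Toy score: best field contribution plus tie * the remaining ones.

    similarities maps field name -> assumed match quality in [0, 1];
    a stand-in for real relevance scoring, not what Lucene computes.
    """
    contributions = [
        boosts[field] * sim
        for field, sim in similarities.items()
        if field in boosts
    ]
    if not contributions:
        return 0.0
    best = max(contributions)
    return best + tie * (sum(contributions) - best)


# Document that matches the requested id exactly, in the id field only.
exact_doc = {"id": 1.0}
# Document with no parent_id whose other fields merely resemble the id.
lookalike_doc = {"id": 0.6, "brand_container_id": 0.6}

# Hypothetical "too close together" boosts versus the widely spaced ones.
close_boosts = {"id": 10, "parent_id": 9, "brand_container_id": 8}
wide_boosts = {"id": 100000, "parent_id": 5000, "brand_container_id": 500}

for name, boosts in (("close", close_boosts), ("wide", wide_boosts)):
    exact = toy_dismax_score(exact_doc, boosts)
    lookalike = toy_dismax_score(lookalike_doc, boosts)
    winner = "exact" if exact > lookalike else "lookalike"
    print(f"{name} boosts: winner is the {winner} document")
```

Note that in this model the problem only appears when tie is above 0;
with tie=0 only the single best field counts and the exact id match
wins either way.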


This was also useful:
http://stackoverflow.com/questions/2179497/adding-date-boosting-to-complex-solr-queries


Thanks for the help!
