Hi,

We have a use case where we want to influence the score of documents based on their document type, and I am a bit unsure what the best way to achieve this is. In essence, we have about 100,000 documents of about 15 different document types, and we more or less want to tweak the score differently for each document type (i.e. it is not just one document type that should be boosted over all the others).
How would you suggest that we do this?

First I thought that query-time boosting would be perfect for this, because that way we can tweak and fine-tune the boost levels without having to reindex everything each time. But to be honest, I really don't understand how I would put such a query together using the edismax parser. I can't find a single edismax example that uses the multiplicative boost and boosts like this:

documentType:person^1.8 documentType:publication^1.5 documentType:news^1.5 documentType:event^1.3 etc...

Can someone help me out with the syntax?

Another approach could be to use index-time boosting. That would simplify the queries, and to be honest I don't think we will need to modify the boost factors much after the initial tweaking is done. Our indexing process is also fairly quick and lightweight, so performing a full reindex isn't a big deal. But here, too, I am unsure how to set it up properly. Basically, we want to boost the documents based on document type, regardless of the query. According to the documentation, this is what happens when one uses the boost attribute on the doc element in the XML. However, the documentation also mentions that this is just "a convenience mechanism equivalent to specifying a boost attribute on each of the individual fields that support norms". This leaves me wondering:

1. If boost is defined on both the doc and field level, how is that interpreted? Are the values merged using add/multiply/max/some other math function? Or is the doc boost just used as a default value for fields that don't define their own boost?
2. What about fields that don't have norms? If a query matches such a field, wouldn't that affect the score without me being able to influence it?
3. On a general note: is the score I'm boosting really the total/outermost/final score of the document, so that a boost of 2.0 would double the final score of that document, all else being equal? Or am I simply boosting one "inner score" that in turn is used in some complex math expression, so that it might not influence the final score at all in some circumstances, and at other times might influence it only in a much smaller way?

An alternative, I guess, could be to start out with query-time boosting as above to find the appropriate boost levels, and then convert this to some kind of hybrid solution afterwards, where the boost factor is stored in a field in the document (thus being set at index time) and then used in a boost function in the query. With this solution, I guess it would also be possible to have multiple "boost fields" in the documents, each with different relative boost values based on document type, and then choose at query time which boost field to use. Do you think that would be a good solution?

But would it be possible to go from a query boost of the type "documentType:person^1.8 ..." to a function query boost that uses a document field with that value? I.e., would the resulting scores be the same for "documentType:person^1.8 ..." on the one hand, and a function boost query with a field that has the value 1.8 for documents of type person on the other? Or could the boost values from these different boost styles result in different final scores?

Regards
/Jimi
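For concreteness, the three query-time variants discussed above could be sketched as request parameters like this. It is only a sketch under assumptions, not verified against any particular Solr version: it assumes the edismax `bq` parameter (additive boost query) and `boost` parameter (multiplicative function), and the function-query functions `if`, `termfreq`, `product` and `field`. The field name `typeBoost` and the helper names are hypothetical, and the boost values are simply the ones from the example above.

```python
from urllib.parse import urlencode

# Boost levels per document type, taken from the example in the question.
TYPE_BOOSTS = {"person": 1.8, "publication": 1.5, "news": 1.5, "event": 1.3}


def additive_bq_params(user_query, boosts=TYPE_BOOSTS):
    """Variant 1: one 'bq' clause per type. A matching clause ADDS to the
    score, so this is not a multiplicative boost."""
    bq = [f"documentType:{doc_type}^{boost}" for doc_type, boost in boosts.items()]
    return {"defType": "edismax", "q": user_query, "bq": bq}


def multiplicative_boost_params(user_query, boosts=TYPE_BOOSTS):
    """Variant 2: a 'boost' function whose value MULTIPLIES the main score.
    Each if() yields the type's factor when the document has that type, else 1."""
    factors = [
        f"if(termfreq(documentType,'{doc_type}'),{boost},1)"
        for doc_type, boost in boosts.items()
    ]
    return {
        "defType": "edismax",
        "q": user_query,
        "boost": "product(" + ",".join(factors) + ")",
    }


def field_boost_params(user_query, boost_field="typeBoost"):
    """Variant 3 (hybrid): the factor is stored per document at index time in a
    numeric field (hypothetical name 'typeBoost') and multiplied in at query time."""
    return {"defType": "edismax", "q": user_query, "boost": f"field({boost_field})"}


if __name__ == "__main__":
    for params in (
        additive_bq_params("solr"),
        multiplicative_boost_params("solr"),
        field_boost_params("solr"),
    ):
        # doseq=True emits one bq=... pair per clause instead of a Python list repr.
        print(urlencode(params, doseq=True))
```

If these assumptions hold, variants 2 and 3 should be the closest to each other, since both multiply the score by a per-type factor, whereas variant 1 adds to the score and would generally give different final values.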