ahh, after rereading this about 20 times today 8-)
i think i finally "get it" (your final question below).
if i do index-time boosts, and search only "text" (default field)
the boosts will propagate into "text", but only insofar as the
document will weigh higher when a phrase is found in the "text"
field (regardless of whether that "hit" really was due to something
copyField-ed in with boost 1, boost 100, etc.)
so that solution would have the effect of making certain documents
have higher scores in the "text" field, not the effect we'd like.
[example documentA]
[description] i like to commute
[title] commuting thoughts
copyField text to:
[text] i like to commute commuting thoughts
we, the Archive, want query hits in title to boost ^100.
if we do q=commute (which searches "text")
with index-time boosting, solr/lucene won't know
the hit due to "title" should effect a much higher ranking
compared to documents with commute in "text" but
not in "title". however, the above document *will* have a higher
score, in general, because the "title" portion was nearly
half of the "text" field. Yet documentA will also have a
higher ranking even for a query like "q=like",
compared to a documentB like:
[description] i like bread
[text] i like bread
(when in reality, we'd like them to have near equal weighting).
So index boosts won't do for us. I'm learning!
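
For contrast, here is a rough sketch of what a query-time version of the "title ^100" idea might look like, using the standard Lucene `field:term^boost` query syntax. The helper name `boosted_query` and the choice of fields and weights are just assumptions for illustration, not something we have set up in our config.

```python
def boosted_query(term, field_boosts):
    """Build a Lucene-syntax query that searches each field with its own boost.

    Hypothetical helper: field names and boost values are illustrative only.
    """
    clauses = []
    for field, boost in field_boosts.items():
        # field:term^boost is the Lucene query-parser syntax for a per-clause boost
        clauses.append(f"{field}:{term}^{boost}")
    return " ".join(clauses)


# A hit in "title" is weighted 100x relative to a hit in the catch-all "text" field.
query = boosted_query("commute", {"title": 100, "text": 1})
print(query)
```

Because the boost is attached to the query clause rather than to the document, a match on "like" in the description would not inherit the title's weight.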
--tracey
I've tried some experiments, adjusting the boosts at index time and
running the std handler to see the ordering of the results change for
"fieldless queries" (eg: "q=tracey+pooh"). I have 33 fields using
<copyField dest="text" source="..."/> (where "text" is our default
field to query) to allow for checking across most of our std XML
fields. I gather that a boost applied to "title" on indexing a
document must somehow "propagate" to the "text" field?
Background: for an indexed field name there is a single boost value
per document. This is true even if the field is multi-valued... all
values for that document "share" the same boost. This is a Lucene
restriction so we can't fix it in Solr in any way.
Solr *does* propagate the index-time boost when doing copyField, but
this just ends up being multiplied into all the other boosts for
values for that document. Matches on the resulting text field will
*always* score higher, regardless of which "part" matched. Does that
make sense?
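
A toy model may help make this concrete. It is only a sketch under heavy assumptions: real Lucene scoring also involves tf, idf and length norms, and the functions `field_boost` and `toy_score` are made up for illustration. It only shows how the boosts of all copied values collapse into one number for the "text" field.

```python
def field_boost(source_boosts):
    """Multiply the boosts of every value copied into a field into one number."""
    boost = 1.0
    for b in source_boosts:
        boost *= b
    return boost


def toy_score(query_term, text_tokens, source_boosts):
    """Toy score: number of matching tokens times the single per-document field boost."""
    matches = text_tokens.count(query_term)
    return matches * field_boost(source_boosts)


# documentA: description (boost 1) and title (boost 100) both copied into "text"
doc_a_text = "i like to commute commuting thoughts".split()
# documentB: only a description (boost 1) copied into "text"
doc_b_text = "i like bread".split()

# "like" only appears in the description of both documents,
# yet documentA still carries the title's boost.
score_a = toy_score("like", doc_a_text, [1.0, 100.0])
score_b = toy_score("like", doc_b_text, [1.0])
print(score_a, score_b)
```

In this toy model documentA outscores documentB on "like" purely because of the title boost, even though the title never matched.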