Constraints obtained from facets are best added using fq, so it's worth making that switch in your client code anyway.

        Erik

On Nov 30, 2008, at 10:43 AM, Peter Wolanin wrote:

Hi Grant,

Thanks for your feedback. The major short-term downside to switching
to dismax with multiple fields would be the required rewriting of our
current PHP code, especially the code that handles adding facet
fields to the q parameter. From reading about dismax, it seems we
would instead need to use fq to limit the search results to those
matching a specific facet value.
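
To make the q-versus-fq split concrete, here is a minimal sketch of how client code might assemble the request parameters. It is written in Python rather than PHP purely for illustration; the function name and the facet fields "type" and "tid" are hypothetical, and only the q and fq parameter names come from Solr itself.

```python
from urllib.parse import urlencode


def build_params(keywords, facet_filters):
    """Return a query string with keywords in q and each facet constraint in its own fq."""
    params = [("q", keywords)]
    for field, value in facet_filters:
        # Escape backslashes and quotes, then wrap the value in quotes so a
        # multi-word facet value is matched as a single phrase.
        escaped = value.replace("\\", "\\\\").replace('"', '\\"')
        params.append(("fq", f'{field}:"{escaped}"'))
    return urlencode(params)


# Hypothetical facet fields, for illustration only.
print(build_params("solr boosting", [("type", "story"), ("tid", "42")]))
```

Each facet constraint becomes a separate fq parameter, so the user's keywords in q stay untouched no matter which query parser handles them.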

Best,

Peter


On Sun, Nov 30, 2008 at 8:43 AM, Grant Ingersoll <[EMAIL PROTECTED]> wrote:
Hi Peter,

What are the downsides to your last alternative approach below? That seems like the simplest approach and should work as long as the terms within those fields do not need to be boosted separately.

If you want to go the term-boosting route, this is handled in Lucene via payloads. Payloads are byte arrays that are attached to individual terms at indexing time through the analysis process. To do this in Solr, you would need to write your own TokenFilter that adds payloads as needed. Then, during search, you can take advantage of these payloads by using the BoostingTermQuery from Lucene. The downside to all of this is that Solr doesn't currently support it, so you would be coding it up yourself. I'm sure, though, that if you were to start a patch on it, there would be others who are interested.

One note on the payloads: the biggest sticking point, I think, is coming up with an efficient way of encoding the byte array and putting it into the XML format, such that one can send in payloads when indexing. It's not particularly hard, but no one has done it yet.
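
As a rough illustration of the encoding question, the sketch below packs a per-term boost into a 4-byte big-endian float and base64-encodes it so it can travel as plain text inside an XML document. This is only one possible scheme and an assumption on my part: the function names are hypothetical, and nothing in Solr reads such a value today, so a custom TokenFilter would still have to decode it and attach the bytes as the payload.

```python
import base64
import struct


def encode_boost(boost):
    """Pack a boost as a 4-byte big-endian float and return it as base64 text."""
    raw = struct.pack(">f", boost)
    return base64.b64encode(raw).decode("ascii")


def decode_boost(text):
    """Reverse encode_boost: base64 text back to a Python float."""
    return struct.unpack(">f", base64.b64decode(text))[0]


# Round trip for a boost that is exactly representable as a 32-bit float.
encoded = encode_boost(2.5)
print(encoded, decode_boost(encoded))
```

Base64 keeps the value safe for XML at the cost of eight characters per four-byte payload; a more compact text encoding may well be preferable in practice.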

-Grant


On Nov 29, 2008, at 10:45 PM, Peter Wolanin wrote:

I've recently started working on the Drupal integration module for
Solr, and we are looking for suggestions on how to address this
question: how do we boost the importance of a subset of terms within
a field?

For example, we are using the standard request handler for queries,
and the default field for keyword searches is a concatenation of the
title, body, taxonomy terms, etc.

One "hackish" way I can imagine is that terms we want to boost (for
example the title, or text inside h2 tags) could be concatenated
multiple times.  Would this be effective and reasonable?
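
For reference, the repetition idea could look something like the sketch below, shown in Python for illustration. The function name and the title_weight parameter are hypothetical. Repeating text raises its term frequency in the combined field, but how much each extra copy helps depends on the similarity implementation in use, so the effect should not be assumed to be linear.

```python
def build_default_field(title, body, title_weight=3):
    """Concatenate title and body, repeating the title title_weight times."""
    # Each repetition raises the term frequency of the title's terms
    # in the single combined search field.
    return " ".join([title] * title_weight + [body])


print(build_default_field("Solr tips", "How to boost terms", title_weight=2))
```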

The alternative seems to be switching to the dismax handler and
storing the terms that need different boosts in different fields,
all of which are in the list of query fields?
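
A minimal sketch of what the dismax request might look like from the client side, again in Python for illustration. The field names "title" and "body" and the boost values are hypothetical, and selecting the parser with defType=dismax is an assumption about the setup; a handler configured as dismax in solrconfig.xml would not need it.

```python
from urllib.parse import urlencode


def dismax_params(keywords, field_boosts):
    """Build dismax parameters with a per-field boost list in qf."""
    # qf is a space-separated list of field^boost pairs.
    qf = " ".join(f"{field}^{boost}" for field, boost in field_boosts.items())
    return urlencode({"defType": "dismax", "q": keywords, "qf": qf})


# Hypothetical fields and boosts: title matches count five times as much as body.
print(dismax_params("drupal", {"title": 5.0, "body": 1.0}))
```

With the boosts living in qf, the indexed text no longer needs to be duplicated, and the weights can be tuned per request without reindexing.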

Thanks in advance for your suggestions.

-Peter

--------------------------------------------------------------
Peter M. Wolanin, Ph.D.
Momentum Specialist,  Acquia. Inc.
[EMAIL PROTECTED]

--------------------------
Grant Ingersoll

Lucene Helpful Hints:
http://wiki.apache.org/lucene-java/BasicsOfPerformance
http://wiki.apache.org/lucene-java/LuceneFAQ
