: Subject: "Weighted" facet strings

First off: a terminology clarification.  what you are describing has very
little to do with facets.  it's true that your "category" field is a
"facet" of your documents, but in the context of your question, you aren't
asking about any facet-related features of Solr.
 
what you are really asking about is specifying weighted importance on
individual values indexed in the category field of your documents.

The suggestion in another reply to use multiple fields (cat_weight_1, 
cat_weight_2, etc...) and then boost those fields accordingly is a 
classic, easy to implement solution to this type of problem that works 
really well when the cardinality of "weights" is low and fixed (in your 
case 1-5).
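
A minimal sketch of that multi-field idea, in Python purely for 
illustration.  The helper names and the "boost equals weight" convention 
are assumptions on my part, not anything Solr requires; only the 
cat_weight_N field names come from the suggestion above.

```python
def doc_weight_fields(weights, prefix="cat_weight_"):
    """Split {category: weight} into one multi-valued field per weight."""
    fields = {}
    for category, weight in weights.items():
        # e.g. weight 5 -> the category value goes into "cat_weight_5"
        fields.setdefault(f"{prefix}{weight}", []).append(category)
    return fields


def weighted_qf(max_weight=5, prefix="cat_weight_"):
    """Build a qf string where field cat_weight_N is boosted by N (assumed)."""
    return " ".join(f"{prefix}{w}^{w}" for w in range(1, max_weight + 1))


fields = doc_weight_fields({"foo": 5, "bar": 3, "yak": 1})
qf = weighted_qf()
```

The dict from doc_weight_fields() would be merged into the document you 
send to Solr, and the qf string passed along with a dismax query.  The 
actual boost values are something you'd want to tune.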

Another way people have dealt with problems like this historically is to 
"keyword stuff" the category field -- so if a document has category 
weights: foo=5, bar=3, yak=1 you index "foo foo foo foo foo bar bar bar 
yak" in the category field.  As long as you use a similarity that defines 
tf() as an identity function, and doesn't use length norm, this also works 
really well.  (There are also tricks you can do using custom update 
processors or tokenizers to let you send "foo=5" over the wire and have it 
index the "foo" token with a termFreq of 5.)
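
If you do the stuffing on the client side, it can be as simple as the 
following sketch (Python for illustration; the function name is 
hypothetical, and it assumes category names are single tokens with no 
whitespace and weights are positive integers):

```python
def stuff_categories(weights):
    """Repeat each category name as many times as its weight."""
    tokens = []
    for category, weight in weights.items():
        # weight=5 -> the token appears 5 times, so termFreq is 5
        tokens.extend([category] * weight)
    return " ".join(tokens)


stuffed = stuff_categories({"foo": 5, "bar": 3, "yak": 1})
```

The resulting string is what you would send as the value of the category 
field at index time.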

Looking forward: the "best" way to solve this problem in theory is using 
Payloads, but there aren't a lot of options currently available for 
leveraging payloads in Solr's query APIs / Parsers, so you'd probably have 
to write something custom.


How you actually execute the queries depends on the approach you take at 
indexing -- let's assume you do the keyword stuffing approach...

: - filter: category=some_category_name, query: *.*  - Results should be score 
: by the above mentioned weight

        q=cat:some_category_name 
        & sort=score desc

...with a simple tf() func the default score will do exactly what you want

or you could use the same {!boost} solution as below with "*:*" ....

: - filter: category=some_category_name, query: some_keyword - Results should 
: be scored by a combination of the score of 'some_keyword' and the above 
: mentioned weight

you just have to define what you mean by "combination" in terms of solr 
query functions.  easiest is multiplicatively with the {!boost} parser...

        q={!boost b=tf(cat,'some_category_name')}some_keyword 
        & fq=cat:some_category_name 
        & sort = score desc
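
(The spaces around the "&" above are just for readability.)  If you are 
building that request from code, a rough sketch in Python might look like 
this.  The function name is hypothetical, and it assumes the keyword and 
category contain no quotes or other characters that need escaping in the 
Solr query syntax:

```python
from urllib.parse import urlencode


def boosted_query_params(keyword, category, field="cat"):
    """Params for a keyword query multiplied by the category's tf()."""
    return {
        # {!boost} multiplies the keyword score by tf(field, category)
        "q": f"{{!boost b=tf({field},'{category}')}}{keyword}",
        # restrict results to docs that have the category at all
        "fq": f"{field}:{category}",
        "sort": "score desc",
    }


params = boosted_query_params("some_keyword", "some_category_name")
query_string = urlencode(params)  # URL-encoded, ready to append to /select?
```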

: - filter: none, query: some_category_name - Documents with category 
: 'some_category_name' should be found as well as documents which contain the 
: term 'some_category_name'. Results should be scored by a combination of the 
: score of 'some_keyword' and the above mentioned weight

...you could do this by including your category field in the qf of a 
dismax search.

assuming you want a single solution that works for all of these, and your 
"query: some_keyword" example includes the possibility that some_keyword 
is also a category name (and you want its weight taken into account as 
well) then an all inclusive solution would probably be something like...

        q={!boost b=tf(cat,'some_category_name') defType=dismax}some_keyword
        & qf = cat^10 otherfields^5
        & fq=cat:some_category_name
        & sort = score desc




-Hoss
