Re: documents with known relevancy

fiedzia Fri, 16 Jul 2010 08:06:53 -0700


Peter Karich wrote:
> 
> Hi,
> 
> Why do you need the weight for the tags?
>

The only reason to include weights is to sort results by them.
If several documents contain a given tag, I want them sorted by that
tag's weight. I would also like to be able to search by multiple tags at
once: if there were a field "tags" holding all tags, the documents with
the highest sum of the matching tags' weights should come first. The sum
is just an example here; if Solr can offer something similar or more
advanced, that is fine.
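
To make the desired ordering concrete, here is a small Python sketch of
"sort by the sum of the matching tags' weights". It is not Solr code, only
a statement of the expected result; the document ids, the weights and the
function name rank_by_weight_sum are made up for illustration.

```python
# Hypothetical data: document id -> {tag: weight}. All values are invented.
DOC_TAG_WEIGHTS = {
    123: {"tag1": 0.01, "tag2": 0.3},
    456: {"tag1": 0.5, "tag2": 0.1},
    789: {"tag1": 0.2},
}


def rank_by_weight_sum(doc_tag_weights, query_tags):
    """Return ids of documents that carry every query tag,
    ordered by the sum of those tags' weights, highest first."""
    scored = []
    for doc_id, weights in doc_tag_weights.items():
        # Keep only documents that contain all requested tags.
        if all(tag in weights for tag in query_tags):
            total = sum(weights[tag] for tag in query_tags)
            scored.append((total, doc_id))
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [doc_id for _, doc_id in scored]


print(rank_by_weight_sum(DOC_TAG_WEIGHTS, ["tag1"]))
print(rank_by_weight_sum(DOC_TAG_WEIGHTS, ["tag1", "tag2"]))
```

With a single tag this is just a sort by that tag's weight; with several
tags, document 789 drops out because it lacks tag2.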

Peter Karich wrote:
> 
> you could index it this way:
> 
> {
>  id:     123
>  tag:    'tag1'
>  weight:  0.01
>  uniqueKey: combine(id, tag)
> }
> 
> {
>  id:     123
>  tag:    'tag2'
>  weight:  0.3
>  uniqueKey: combine(id, tag)
> }
> 
> and specify the query-time boost with the help of the weight.
> Retrieving the document content in a second request to another solrindex
> or using a db.
> 

Well, that would work when querying for a single tag. Do you know of a
solution to the problem of querying for multiple tags?

Perhaps I can explain the problem better by presenting the obvious
solution: create a multivalued field "tags" with all tags. This makes it
easy to ask Solr for documents matching a query (which might look like
this: tags:tag1 AND tags:tag2). Then get the list of all results, retrieve
the tag weights from the database and sort the results by weight. This is
obviously inefficient, as it requires fetching all matching documents from
Solr (possibly a large list), fetching them again from the db, calculating
the weights and then sorting. So I am trying to involve Solr in this
processing.
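
The two-step approach described above could be sketched roughly as follows.
The Solr request and the database lookup are replaced by in-memory
stand-ins, and build_tags_query and sort_matches_by_weight are hypothetical
helper names, so this only illustrates where the work ends up: every
matching id has to reach the client before any sorting can happen.

```python
def build_tags_query(tags):
    """Build a query such as 'tags:tag1 AND tags:tag2'.
    Assumes plain tag names; special characters would need escaping."""
    return " AND ".join(f"tags:{tag}" for tag in tags)


def sort_matches_by_weight(matched_ids, weight_rows, tags):
    """Order the ids returned by the search using (doc_id, tag, weight)
    rows fetched separately from the database."""
    totals = {doc_id: 0.0 for doc_id in matched_ids}
    wanted = set(tags)
    for doc_id, tag, weight in weight_rows:
        if doc_id in totals and tag in wanted:
            totals[doc_id] += weight
    return sorted(matched_ids, key=lambda doc_id: totals[doc_id], reverse=True)


# Stand-ins for the two external calls (values invented for illustration).
matched_ids = [123, 456]  # what the search would return
weight_rows = [           # what the database would return
    (123, "tag1", 0.01), (123, "tag2", 0.3),
    (456, "tag1", 0.5), (456, "tag2", 0.1),
]

query_tags = ["tag1", "tag2"]
print(build_tags_query(query_tags))
print(sort_matches_by_weight(matched_ids, weight_rows, query_tags))
```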

Another solution I think could work (though I haven't examined it fully
yet) would be to create a single text field for tags in which the number
of occurrences of each tag matches its weight (so if tag2's weight is
twice tag1's, the text contains tag1 once and tag2 twice: "tag1 tag2
tag2"), and then calculate the document score based on the number of
occurrences of the queried tags in that text. From what I know about Solr
this could be done, but maybe there is a better solution.
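
A rough Python sketch of the repeated-tag idea, under a few assumptions:
weights are turned into repetition counts by multiplying with a scale
factor and rounding (the scale parameter and both function names are
invented here), and the score is a plain occurrence count. Solr's actual
relevance score is not a raw count, since its similarity formula dampens
term frequency and takes field length into account, so real scores would
follow the weights only approximately.

```python
from collections import Counter


def weights_to_text(tag_weights, scale=1):
    """Repeat each tag round(weight * scale) times (at least once)
    and join the result into one whitespace-separated string."""
    words = []
    for tag, weight in tag_weights.items():
        repetitions = max(1, round(weight * scale))
        words.extend([tag] * repetitions)
    return " ".join(words)


def occurrence_score(text, query_tags):
    """Sum the number of occurrences of each queried tag in the text."""
    counts = Counter(text.split())
    return sum(counts[tag] for tag in query_tags)


text = weights_to_text({"tag1": 1, "tag2": 2})
print(text)
print(occurrence_score(text, ["tag1", "tag2"]))
```

Fractional weights such as 0.01 and 0.3 would need a scale of 100 to keep
their ratio, which already means 30 repetitions of one tag, so the field
can grow quickly when weights differ a lot.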

Peter Karich wrote:
> 
> there could be a different solution using dynamic fields and index-time
> boosts but I am not sure at the moment.       
> 

Can you write more about it? Any idea is welcome.

Thanks for your help anyway.
-- 
View this message in context: 
http://lucene.472066.n3.nabble.com/documents-with-known-relevancy-tp972462p972748.html
Sent from the Solr - User mailing list archive at Nabble.com.
