You understand that you are making your site extremely easy to spam, right? 
This is how Microsoft became the top hit for “evil empire” on Google.
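
If selection counts do end up feeding the ranking, one way to limit the exposure is to dampen and cap them so that a burst of scripted clicks cannot dominate. A minimal sketch, assuming a hypothetical click log of (search term, product id) pairs; the function names, the log damping, and the cap value are illustrative assumptions and not something from this thread:

```python
import math
from collections import Counter
from typing import Dict, Iterable, List, Tuple


def click_boosts(click_log: Iterable[Tuple[str, str]], term: str,
                 cap: float = 20.0) -> Dict[str, float]:
    """Turn raw selection counts for one search term into damped, capped boosts."""
    counts = Counter(product for logged_term, product in click_log
                     if logged_term == term)
    # log1p grows slowly, so 1000 clicks are not worth 1000x one click;
    # the cap bounds the influence of any single product.
    return {product: min(cap, 1.0 + math.log1p(n)) for product, n in counts.items()}


def to_bq(boosts: Dict[str, float], id_field: str = "id") -> str:
    """Render the boosts as weighted clauses that could be passed as a bq value."""
    clauses: List[str] = [f"{id_field}:{product}^{boost:.2f}"
                          for product, boost in sorted(boosts.items())]
    return " ".join(clauses)


if __name__ == "__main__":
    log = [("hitch", "P1")] * 3 + [("hitch", "P2"), ("ball", "P1")]
    print(to_bq(click_boosts(log, "hitch")))
```

Because the boosts are computed outside the index, the cap and damping can be tuned, or suspicious traffic filtered out of the log, without reindexing.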

wunder
Walter Underwood
wun...@wunderwood.org
http://observer.wunderwood.org/  (my blog)


> On Jul 7, 2016, at 11:25 AM, Mark T. Trembley <mark.tremb...@etrailer.com> 
> wrote:
> 
> I've found that it is definitely complicated!
> 
> Essentially what I am attempting to do is boost products based on the number 
> of times that particular product has been selected via historical searches 
> using the same search term or phrase.
> 
> 
> On 7/7/2016 11:55 AM, Walter Underwood wrote:
>> That is a very complicated design. What are you trying to achieve? Maybe 
>> there is a different approach that is simpler.
>> 
>> 
>>> On Jul 7, 2016, at 9:26 AM, Mark T. Trembley <mark.tremb...@etrailer.com> 
>>> wrote:
>>> 
>>> That works with static boosts based on documents matching the query 
>>> "Boost2". I want to apply a different boost to documents based on the value 
>>> assigned to Boost2 within the document.
>>> 
>>> From my sample documents, when running a query with "Boost2," I want 
>>> Document2 boosted by 20.0 and Document6 boosted by 15.0:
>>> 
>>> {
>>>   "id" : "Document2_Boost2",
>>>   "B1_s" : "Boost2",
>>>   "B1_f" : 20
>>> }
>>> {
>>>   "id" : "Document6_Boost2",
>>>   "B1_s" : "Boost2",
>>>   "B1_f" : 15
>>> }
>>> 
>>> 
>>> On 7/7/2016 10:21 AM, Walter Underwood wrote:
>>>> This looks like a job for “bq”, the boost query parameter. I used this to 
>>>> boost textbooks which were used at the student’s school. bq does not force 
>>>> documents to be included in the result set. It does affect the ranking of 
>>>> the included documents.
>>>> 
>>>> bq=B1_ss:Boost2 will boost documents that match that. You can use weights, 
>>>> like bq=B1_ss:Boost2^10
>>>> 
>>>> Here is the relationship between fq, q, and bq:
>>>> 
>>>> fq: selection, does not affect ranking
>>>> q: selection and ranking
>>>> bq: does not affect selection, affects ranking
>>>> 
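
The three roles listed above can be sketched as a set of request parameters. This is only an illustration: the user query and the fq clause are hypothetical, the bq clause is the one from the message, and defType=edismax is assumed because bq is a parameter of the DisMax/eDisMax query parsers.

```python
from typing import Dict
from urllib.parse import urlencode


def build_params(user_query: str, boost_term: str, weight: float = 10.0) -> Dict[str, str]:
    """Assemble Solr request parameters that separate selection from ranking."""
    return {
        "defType": "edismax",                    # assumed parser; bq needs dismax/edismax
        "q": user_query,                         # selection and ranking
        "fq": "name_s:[* TO *]",                 # hypothetical filter: selection only
        "bq": f"B1_ss:{boost_term}^{weight:g}",  # ranking only, as in the message
        "fl": "*,score",
    }


if __name__ == "__main__":
    # Hypothetical host and collection, borrowed from the URLs later in the thread.
    print("http://localhost:8983/solr/generic/select?"
          + urlencode(build_params("trailer hitch", "Boost2")))
```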
>>>> 
>>>>> On Jul 7, 2016, at 7:30 AM, Mark T. Trembley <mark.tremb...@etrailer.com> 
>>>>> wrote:
>>>>> 
>>>>> I have a question about the best way to rank my results based on a score 
>>>>> field that can have different values per document and where each document 
>>>>> can have different scores based on which term is queried.
>>>>> 
>>>>> Essentially, I want to attach a list of terms to each document so that, 
>>>>> when a query matches one of those terms, the corresponding score is used 
>>>>> to boost that document. So if I had a document with a multi-valued 
>>>>> field named B1_ss with terms [Boost1|10], [Boost2|20], [Boost3|100] and 
>>>>> my search query is "Boost2", I want that document's result to be boosted 
>>>>> by 20. Also note that "Boost2" can boost different documents at different 
>>>>> levels. The query to select the actual documents will select against 
>>>>> other fields in the document and could possibly return documents with any 
>>>>> combination of B1 terms.
>>>>> 
>>>>> I'm still trying to figure out how best to model this in my index, either 
>>>>> as child documents, or in another collection, or if it would make more 
>>>>> sense to figure out how to make it work via payloads or by boosting the 
>>>>> terms at index time.
>>>>> 
>>>>> I'm running Solr 5.5.1 in cloud mode. Each server has a complete replica 
>>>>> of all collections.
>>>>> 
>>>>> The document structure I've been toying with the most is to put the 
>>>>> boosts into a separate index and join them using !join syntax and 
>>>>> returning the scores, but I've not had any luck getting quality results 
>>>>> from those tests. The extra "scores" index is structured like this (I'll 
>>>>> add the json for my test collections at the end of the email):
>>>>> id:Document1_Boost1
>>>>>  B1_s:Boost1
>>>>>  B1_f:10
>>>>> id:Document1_Boost3
>>>>>  B1_s:Boost3
>>>>>  B1_f:100
>>>>> Using this structure, I get close, but the scores are not what I'm 
>>>>> expecting. If I use the following query, the explain says it's using the 
>>>>> score from Document6_Boost2 even though my query is specifying B1_s:Boost3
>>>>> http://localhost:8983/solr/generic/select?q={!join from=id to=B1_name_ss fromIndex=scores score=max}B1_s:Boost3{!func}B1_f&fl=*,score&debugQuery=true
>>>>> 
>>>>> <lst name="explain">
>>>>> <str name="Document6">
>>>>> 3.379996 = Score based on join value Document6_Boost2
>>>>> </str>
>>>>> <str name="Document1">
>>>>> 2.2533307 = Score based on join value Document1_Boost1
>>>>> </str>
>>>>> <str name="Document7">
>>>>> 0.24786638 = Score based on join value Document7_Boost333
>>>>> </str>
>>>>> <str name="Document3">
>>>>> 0.0 = Score based on join value Document3_NoBoost
>>>>> </str>
>>>>> </lst>
>>>>> 
>>>>> My guess is that it's now doing an all document query on the "scores" 
>>>>> collection to return the scores in addition to the B1_s query I've passed 
>>>>> in. I can't figure out where it's getting those scores from as a simple 
>>>>> query against the "scores" collection returns scores like I'd expect to 
>>>>> see them based on a similar query:
>>>>> http://192.168.1.194:8983/solr/scores/select?q=B1_s:Boost3 AND _val_:B1_f&fl=score,*&debugQuery=true
>>>>> 
>>>>> <lst name="explain">
>>>>> <str name="Document1_Boost3">
>>>>> 46.834885 = sum of:
>>>>>   1.7682717 = weight(B1_s:Boost3 in 1) [ClassicSimilarity], result of:
>>>>>     1.7682717 = score(doc=1,freq=1.0), product of:
>>>>>       0.8926926 = queryWeight, product of:
>>>>>         1.9808292 = idf(docFreq=2, maxDocs=8)
>>>>>         0.45066613 = queryNorm
>>>>>       1.9808292 = fieldWeight in 1, product of:
>>>>>         1.0 = tf(freq=1.0), with freq of:
>>>>>           1.0 = termFreq=1.0
>>>>>         1.9808292 = idf(docFreq=2, maxDocs=8)
>>>>>         1.0 = fieldNorm(doc=1)
>>>>>   45.066612 = FunctionQuery(float(B1_f)), product of:
>>>>>     100.0 = float(B1_f)=100.0
>>>>>     1.0 = boost
>>>>>     0.45066613 = queryNorm
>>>>> </str>
>>>>> <str name="Document6_Boost3">
>>>>> 15.288256 = sum of:
>>>>>   1.7682717 = weight(B1_s:Boost3 in 5) [ClassicSimilarity], result of:
>>>>>     1.7682717 = score(doc=5,freq=1.0), product of:
>>>>>       0.8926926 = queryWeight, product of:
>>>>>         1.9808292 = idf(docFreq=2, maxDocs=8)
>>>>>         0.45066613 = queryNorm
>>>>>       1.9808292 = fieldWeight in 5, product of:
>>>>>         1.0 = tf(freq=1.0), with freq of:
>>>>>           1.0 = termFreq=1.0
>>>>>         1.9808292 = idf(docFreq=2, maxDocs=8)
>>>>>         1.0 = fieldNorm(doc=5)
>>>>>   13.519984 = FunctionQuery(float(B1_f)), product of:
>>>>>     30.0 = float(B1_f)=30.0
>>>>>     1.0 = boost
>>>>>     0.45066613 = queryNorm
>>>>> </str>
>>>>> </lst>
>>>>> 
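
The numbers in the explain output above can be reproduced by hand. The sketch below assumes the ClassicSimilarity formulas as I understand them for this Lucene line (idf = 1 + ln(maxDocs / (docFreq + 1)), queryNorm = 1 / sqrt(sum of squared clause weights)); these formulas are an assumption on my part, but they do match the reported values to within float rounding.

```python
import math

MAX_DOCS = 8   # maxDocs from the explain output
DOC_FREQ = 2   # docFreq for B1_s:Boost3

# Assumed ClassicSimilarity idf; evaluates to roughly 1.9808292.
idf = 1.0 + math.log(MAX_DOCS / (DOC_FREQ + 1))

# Two clauses contribute to the norm: the term (weight idf) and the
# function query (boost 1.0). Evaluates to roughly 0.45066613.
query_norm = 1.0 / math.sqrt(idf ** 2 + 1.0 ** 2)


def explain_score(b1_f: float) -> float:
    """Sum of the term clause and the function clause, as in 'sum of:'."""
    query_weight = idf * query_norm        # idf * queryNorm
    field_weight = 1.0 * idf * 1.0         # tf * idf * fieldNorm
    term_score = query_weight * field_weight
    function_score = b1_f * 1.0 * query_norm   # float(B1_f) * boost * queryNorm
    return term_score + function_score


if __name__ == "__main__":
    for doc_id, b1_f in (("Document1_Boost3", 100.0), ("Document6_Boost3", 30.0)):
        print(doc_id, explain_score(b1_f))
```

Note that the B1_f value is scaled by queryNorm before it is added, so the function clause contributes about 45.07 rather than 100 to the first document.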
>>>>> I feel like I'm getting close to what I need, but it's just not clear to 
>>>>> me what I'm missing at this point.
>>>>> 
>>>>> The other option I've been toying with is using payloads, but actually 
>>>>> utilizing the payloads as part of the scoring process is beyond me at 
>>>>> this time.
>>>>> 
>>>>> Any thoughts or hints on the best way to boost the relevancy of these 
>>>>> scores would be appreciated.
>>>>> Thanks
>>>>> Mark
>>>>> 
>>>>> 
>>>>> GENERIC:
>>>>> [
>>>>>   {
>>>>>     "id" : "Document1",
>>>>>     "B1_ss" : ["Boost1|10","Boost3|100"],
>>>>>     "title_s" : "Title1",
>>>>>     "otherstuff_ss" : ["stuff1","suggestion"],
>>>>>     "B1_name_ss" : ["Document1_Boost1","Document1_Boost3"]
>>>>>   },
>>>>>   {
>>>>>     "id" : "Document2",
>>>>>     "B1_ss" : ["Boost2|20"],
>>>>>     "name_s" : "Product2",
>>>>>     "title_s" : "Title2",
>>>>>     "otherstuff_ss" : ["stuff2","recommendation"],
>>>>>     "B1_name_ss" : ["Document2_Boost1"]
>>>>>   },
>>>>>   {
>>>>>     "id" : "Document3",
>>>>>     "name_s" : "Product3",
>>>>>     "B1_ss" : ["NoBoost"],
>>>>>     "title_s" : "Title3",
>>>>>     "otherstuff_ss" : ["stuff3","new","suggestion"],
>>>>>     "B1_name_ss" : ["Document3_NoBoost"]
>>>>>   },
>>>>>   {
>>>>>     "id" : "Document4",
>>>>>     "name_s" : "Product4",
>>>>>     "title_s" : "Title4",
>>>>>     "otherstuff_ss" : ["stuff4","old","suggestion"]
>>>>>   },
>>>>>   {
>>>>>     "id" : "Document5",
>>>>>     "name_s" : "Product5",
>>>>>     "title_s" : "Title5",
>>>>>     "otherstuff_ss" : ["stuff5","recommendation"]
>>>>>   },
>>>>>   {
>>>>>     "id" : "Document6",
>>>>>     "name_s" : "Product6",
>>>>>     "B1_ss" : ["Boost2|15","Boost3|30"],
>>>>>     "title_s" : "Title6",
>>>>>     "B1_name_ss" : ["Document6_Boost2","Document6_Boost3"]
>>>>>   },
>>>>>   {
>>>>>     "id" : "Document7",
>>>>>     "name_s" : "Product7",
>>>>>     "B1_ss" : ["NoBoost","Boost333|1.1"],
>>>>>     "title_s" : "Title7",
>>>>>     "B1_name_ss" : ["Document7_NoBoost","Document7_Boost333"]
>>>>>   }
>>>>> ]
>>>>> 
>>>>> SCORES:
>>>>> [
>>>>>   {
>>>>>     "id" : "Document1_Boost1",
>>>>>     "B1_s" : "Boost1",
>>>>>     "B1_f" : 10
>>>>>   },
>>>>>   {
>>>>>     "id" : "Document1_Boost3",
>>>>>     "B1_s" : "Boost3",
>>>>>     "B1_f" : 100
>>>>>   },
>>>>>   {
>>>>>     "id" : "Document2_Boost2",
>>>>>     "B1_s" : "Boost2",
>>>>>     "B1_f" : 20
>>>>>   },
>>>>>   {
>>>>>     "id" : "Document3_NoBoost",
>>>>>     "B1_s" : "NoBoost"
>>>>>   },
>>>>>   {
>>>>>     "id" : "Document6_Boost2",
>>>>>     "B1_s" : "Boost2",
>>>>>     "B1_f" : 15
>>>>>   },
>>>>>   {
>>>>>     "id" : "Document6_Boost3",
>>>>>     "B1_s" : "Boost3",
>>>>>     "B1_f" : 30
>>>>>   },
>>>>>   {
>>>>>     "id" : "Document7_NoBoost",
>>>>>     "B1_s" : "NoBoost"
>>>>>   },
>>>>>   {
>>>>>     "id" : "Document7_Boost333",
>>>>>     "B1_s" : "Boost333",
>>>>>     "B1_f" : 1.1
>>>>>   }
>>>>> ]
>>>>> 
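
As a reference point for the join experiments, the result being asked for can be modeled in a few lines of plain Python over the sample documents above. This is not how Solr evaluates the join; it is only a sketch of the intended outcome (per product, the largest B1_f among the joined score documents whose B1_s equals the term), and the function name is hypothetical.

```python
from typing import Dict, List, Optional, Tuple

# GENERIC documents that carry join keys: id -> B1_name_ss, copied as posted.
GENERIC: Dict[str, List[str]] = {
    "Document1": ["Document1_Boost1", "Document1_Boost3"],
    "Document2": ["Document2_Boost1"],
    "Document3": ["Document3_NoBoost"],
    "Document6": ["Document6_Boost2", "Document6_Boost3"],
    "Document7": ["Document7_NoBoost", "Document7_Boost333"],
}

# SCORES documents: id -> (B1_s, B1_f); B1_f is None where the field is absent.
SCORES: Dict[str, Tuple[str, Optional[float]]] = {
    "Document1_Boost1": ("Boost1", 10.0),
    "Document1_Boost3": ("Boost3", 100.0),
    "Document2_Boost2": ("Boost2", 20.0),
    "Document3_NoBoost": ("NoBoost", None),
    "Document6_Boost2": ("Boost2", 15.0),
    "Document6_Boost3": ("Boost3", 30.0),
    "Document7_NoBoost": ("NoBoost", None),
    "Document7_Boost333": ("Boost333", 1.1),
}


def expected_boosts(term: str) -> Dict[str, float]:
    """Per product, the max B1_f over joined score docs whose B1_s equals term."""
    result: Dict[str, float] = {}
    for doc_id, join_keys in GENERIC.items():
        values = [SCORES[key][1] for key in join_keys
                  if key in SCORES
                  and SCORES[key][0] == term
                  and SCORES[key][1] is not None]
        if values:
            result[doc_id] = max(values)
    return result


if __name__ == "__main__":
    print(expected_boosts("Boost3"))
    print(expected_boosts("Boost2"))
```

One thing this model surfaces: in the data as posted, Document2 lists "Document2_Boost1" in B1_name_ss while the scores collection holds "Document2_Boost2", so a Boost2 lookup joined on those keys never reaches Document2.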
>> 
> 
