I would add a multiValued field, buying_customers, to each product. Put the 
customer ID of every customer who buys that product into the field. Then use a 
boost query (bq) to boost the products that match the current customer.
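
As a rough sketch of what that request could look like: bq is a parameter of 
the dismax and edismax query parsers, so this assumes edismax. The field name 
buying_customers comes from above; the helper name, the example query, and the 
boost value of 5 are made up for illustration.

```python
from urllib.parse import urlencode


def build_boost_params(user_query, customer_id, boost=5):
    """Build Solr request parameters that boost products this customer buys.

    Assumes the edismax parser and a multiValued field named
    buying_customers that holds customer IDs. The boost value is arbitrary.
    """
    return {
        "defType": "edismax",
        "q": user_query,
        # On/off boost: any product with this customer ID in the field
        # gets the extra score, with no notion of how often it was bought.
        "bq": f"buying_customers:{customer_id}^{boost}",
    }


params = build_boost_params("printer paper", 1234)
# Query string to append to the /select handler of your collection.
print(urlencode(params))
```

Leaving bq out of the same request gives the unboosted baseline, which makes 
the on/off comparison easy to run.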

Try that first before using the hit rate. Always try on/off control before 
going proportional. The simple approach will probably give you 80% of the 
benefit. Then you can declare victory and go on to the next idea.

If you do need hit rate, try quantizing that into high/medium/low, or deciles, 
or something. Then you have one multiValued field for each level and one bq for 
each level. The bq will include a weight: bq=customer_hi:1234^8. Logarithmic 
levels are probably your friend here.
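
A minimal sketch of the quantized version, under some assumptions: the hit 
rate is on a 0-100 scale, the cut-offs of 80 and 50 are placeholders, and the 
field names customer_med and customer_lo are invented to match customer_hi. 
The weights 8/4/2 are just one example of logarithmic levels.

```python
# (level name, minimum hit rate, bq weight). Cut-offs are hypothetical;
# the weights double per level, so they are evenly spaced on a log scale.
LEVELS = [("hi", 80, 8), ("med", 50, 4), ("lo", 0, 2)]


def level_field(hit_rate):
    """Index side: pick the multiValued field that gets the customer ID."""
    for name, minimum, _weight in LEVELS:
        if hit_rate >= minimum:
            return f"customer_{name}"
    raise ValueError("hit rate must be >= 0")


def level_boost_queries(customer_id):
    """Query side: one bq clause per level, each with its own weight."""
    return [
        f"customer_{name}:{customer_id}^{weight}"
        for name, _minimum, weight in LEVELS
    ]


# bq may be repeated, so the parameters are kept as a list of pairs.
params = [("defType", "edismax"), ("q", "printer paper")]
params += [("bq", clause) for clause in level_boost_queries(1234)]
```

The query stays at three short bq clauses per request no matter how many 
customers or products there are.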

You can get some unwanted idf scoring with bq. Customers who buy only a few 
things get a higher bq weight than customers who buy a lot of things, because 
their ID is a rarer term. You can fix that with function queries, but I’d get 
it working with a boost query first.
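
One possible shape for the function-query version, as a sketch only: it uses 
the additive bf parameter with the if(), exists(), and query() functions so 
that a match adds a fixed amount and idf plays no part. The helper name and 
the weight of 8 are illustrative, and the result would need checking against 
your Solr version.

```python
def constant_boost_function(field, customer_id, weight):
    """Build a function query that adds `weight` when the field matches.

    exists(query(...)) is true for documents matching the nested query,
    so the boost is a constant and does not depend on how rare the
    customer ID term is.
    """
    nested = f"{field}:{customer_id}"
    return f"if(exists(query({{!v='{nested}'}})),{weight},0)"


params = {
    "defType": "edismax",
    "q": "printer paper",
    # bf adds the function's value to the score of each matching document.
    "bf": constant_boost_function("buying_customers", 1234, 8),
}
print(params["bf"])
```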

wunder
Walter Underwood
wun...@wunderwood.org
http://observer.wunderwood.org/  (my blog)


> On Feb 5, 2016, at 8:13 AM, Jack Krupansky <jack.krupan...@gmail.com> wrote:
> 
> This doesn't sound like a great use case for Solr - or any other search
> engine for that matter. I'm not sure what you are really trying to
> accomplish, but you are trying to put way too many balls in the air to
> juggle efficiently. You really need to re-conceptualize your problem so
> that it has far fewer moving parts. Sure, Solr can handle many millions or
> even billions of documents, but the focus for scaling Solr is on more
> documents and more nodes, not incredibly complex or large documents. The
> key to effective and efficient use of Solr is that queries are "quite
> short", definitely not "quite long."
> 
> That said, the starting point for any data modeling effort is to look at
> the full range of desired queries and that should drive the data model. So,
> give us more info on queries, in terms of plain English descriptions of
> what the user is trying to achieve.
> 
> 
> -- Jack Krupansky
> 
> On Fri, Feb 5, 2016 at 8:20 AM, Jan Verweij - Experts in search <
> j...@searchxperts.nl> wrote:
> 
>> Hi,
>> We store 50K products in Solr. We have 10K customers and each
>> customer buys up to 10K of these products. Now we want to influence the
>> results by adding a field for every customer.
>> So we end up with 10K fields to influence the results based on the buying
>> behavior of each customer (personal results). I don't think this is the
>> way to go, so I'm looking for suggestions on how to solve this.
>> One other option would be to:
>> 1. create one multiValued field 'company_hitrate'
>> 2. store for each company their [companyID]_[hitrate]
>> 
>> During search, use boost fields [companyID]_50 …. [companyID]_100. So in
>> this case the query can become quite long (51 options) but the number of
>> fields is limited to 1. What kind of effect would this have on search
>> performance?
>> Any other suggestions?
>> Jan.
