I would add a multiValued field for buying_customers. Add the customer ID for each relevant customer to that field. Then use a boost query (“bq”) to boost those.
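A minimal sketch of the request side, assuming the edismax query parser (bq is a dismax/edismax parameter) and a multiValued field named buying_customers as above. The helper name personalized_params and the example query text are hypothetical, and how you send the parameters to Solr is up to your client.

```python
from urllib.parse import urlencode


def personalized_params(user_query, customer_id, boost=None):
    """Build Solr request parameters that boost products this customer bought.

    Assumes each product document lists its buyers' IDs in the
    multiValued field `buying_customers`.
    """
    bq = f"buying_customers:{customer_id}"
    if boost is not None:
        # Optional weight, e.g. buying_customers:1234^8
        bq += f"^{boost}"
    return {
        "defType": "edismax",  # bq is only honored by dismax/edismax
        "q": user_query,
        "bq": bq,
    }


# Hypothetical example: customer 1234 searching for a product.
params = personalized_params("cordless drill", 1234)
query_string = urlencode(params)  # append to the /select URL of your core
```

The boost is additive, so products the customer never bought still match and are only ranked lower.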
Try that first before using the hit rate. Always try on/off control before going proportional. The simple approach will probably give you 80% of the benefit. Then you can declare victory and go on to the next idea.

If you do need hit rate, try quantizing that into high/medium/low, or deciles, or something. Then you have one multiValued field for each level and one bq for each level. The bq will include a weight: bq=customer_hi:1234^8. Logarithmic levels are probably your friend here.

You can get some unwanted idf scoring with bq. Customers that only buy a few things get a higher bq weight than customers that buy a lot of things. You can fix that with function queries, but I’d get it working with a boost query first.

wunder
Walter Underwood
wun...@wunderwood.org
http://observer.wunderwood.org/ (my blog)

> On Feb 5, 2016, at 8:13 AM, Jack Krupansky <jack.krupan...@gmail.com> wrote:
>
> This doesn't sound like a great use case for Solr - or any other search
> engine for that matter. I'm not sure what you are really trying to
> accomplish, but you are trying to put way too many balls in the air to
> juggle efficiently. You really need to re-conceptualize your problem so
> that it has far fewer moving parts. Sure, Solr can handle many millions or
> even billions of documents, but the focus for scaling Solr is on more
> documents and more nodes, not incredibly complex or large documents. The
> key to effective and efficient use of Solr is that queries are "quite
> short", definitely not "quite long."
>
> That said, the starting point for any data modeling effort is to look at
> the full range of desired queries and that should drive the data model. So,
> give us more info on queries, in terms of plain English descriptions of
> what the user is trying to achieve.
>
> -- Jack Krupansky
>
> On Fri, Feb 5, 2016 at 8:20 AM, Jan Verweij - Experts in search <
> j...@searchxperts.nl> wrote:
>
>> Hi,
>> We store 50K products in Solr. We have 10K customers and each
>> customer buys up to 10K of these products. Now we want to influence the
>> results by adding a field for every customer.
>> So we end up with 10K fields to influence the results on the buying
>> behavior of each customer (personal results). Don't think this is the
>> way to go so I'm looking for suggestions how to solve this.
>>
>> One other option would be to:
>> 1. create one multivaluefield 'company_hitrate'
>> 2. store for each company their [companyID]_[hitrate]
>>
>> During search use boostfields [companyID]_50 …. [companyID]_100
>> So in this case the query can become quite long (51 options) but the
>> number of fields is limited to 1. What kind of effect would this have on
>> the search performance?
>> Any other suggestions?
>>
>> Jan.
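For the quantized variant suggested at the top of this reply (one multiValued field and one bq per level), here is a rough sketch. Only customer_hi and the ^8 weight come from the reply; the field names customer_med and customer_lo, the thresholds on a 50 to 100 hit-rate scale, and the doubling weights are all assumptions for illustration and would need tuning against real data.

```python
from urllib.parse import urlencode

# (field name, minimum hit rate, bq weight), highest level first.
# Thresholds and weights are hypothetical; the weights double per level.
LEVELS = [
    ("customer_hi", 90, 8),
    ("customer_med", 70, 4),
    ("customer_lo", 50, 2),
]


def level_field(hit_rate):
    """Index time: pick the field that should receive the customer ID."""
    for field, min_rate, _weight in LEVELS:
        if hit_rate >= min_rate:
            return field
    return None  # below the lowest level: do not index this customer


def boost_queries(customer_id):
    """Query time: one weighted bq per level for this customer."""
    return [f"{field}:{customer_id}^{weight}" for field, _min, weight in LEVELS]


# Hypothetical request for customer 1234; bq is repeated once per level.
params = [("defType", "edismax"), ("q", "cordless drill")]
params += [("bq", bq) for bq in boost_queries(1234)]
query_string = urlencode(params)
```

This keeps the query at three boost clauses instead of 51, and the number of fields at three instead of one per customer.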