Hi,

For once I might be of some help: I've had a similar configuration
(large set of products from various sources). It's very difficult to
find the right balance between all parameters and requires a lot of
tweaking, most often in the dark unfortunately.

What I've found is that omitNorms=true is a real breakthrough: without
it results tend to favor short texts, which is not what's wanted for
product names. I also added a RemoveDuplicatesTokenFilterFactory for the
name, as it's common practice for spammers to repeat keywords in order
to rank higher (note that this filter only drops tokens that share both
text and position, so check its effect in the analysis screen).
Stemming and custom stop words (e.g. "cheap", "sale", ...) are other
potential ideas.

I also ended up removing the description field, as it's often too
broad, so name is now the only searched field: brand, category and
merchant (as well as other fields) are offered as additional filters
using facets. Note that you'd have to re-index them as plain
(untokenized) strings so that facet values come back whole.
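
A minimal sketch of what such a request could look like: search on name
only, and expose the other fields as facets and filter queries. The
field names brand_s, category_s and merchant_s are hypothetical string
copies of the original fields, and mm is simply carried over from the
original question.

```python
from urllib.parse import urlencode

# Hypothetical string-typed copies of the fields, used only for faceting.
FACET_FIELDS = ["brand_s", "category_s", "merchant_s"]


def build_params(user_query, filters=None):
    """Return Solr request parameters as a list of (key, value) pairs."""
    params = [
        ("q", user_query),
        ("defType", "edismax"),
        ("qf", "name"),  # name is the only searched field
        ("mm", "100%"),
        ("facet", "true"),
    ]
    params.extend(("facet.field", field) for field in FACET_FIELDS)
    # Each facet value picked by the user becomes a filter query; the
    # term query parser matches the raw string without any analysis.
    for field, value in (filters or {}).items():
        params.append(("fq", "{!term f=%s}%s" % (field, value)))
    return params


def build_query_string(user_query, filters=None):
    """URL-encode the parameters for a request to the select handler."""
    return urlencode(build_params(user_query, filters))


if __name__ == "__main__":
    print(build_query_string("apple ipod", {"brand_s": "Apple"}))
```

Filter queries do not affect scoring, so brand or category selections
narrow the result set without disturbing the ranking on name.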

It's harder to achieve, but a popularity boost can also be useful: you
can measure popularity by sales or by number of clicks. I use a
combination of both, and store those values using partial updates.
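
The popularity idea could be sketched roughly as below: one helper
builds an atomic (partial) update document, another builds an additive
boost function for the "bf" parameter. The field names sales and clicks,
the weights and the log-based formula are all assumptions chosen for
illustration; the actual combination used is not described here.

```python
import json


def popularity_update(doc_id, sales=None, clicks_delta=0):
    """Build an atomic update document touching only popularity fields.

    'sales' and 'clicks' are hypothetical numeric fields.
    """
    doc = {"id": doc_id}
    if sales is not None:
        doc["sales"] = {"set": sales}  # overwrite with the latest total
    if clicks_delta:
        doc["clicks"] = {"inc": clicks_delta}  # increment in place
    return doc


def popularity_boost(sales_weight=2.0, clicks_weight=1.0):
    """Return a function query string meant for the additive 'bf' param.

    The log dampens large values; the leading 1 keeps the argument of
    log positive when a product has no sales or clicks yet.
    """
    return "log(sum(1,product(%s,sales),product(%s,clicks)))" % (
        sales_weight,
        clicks_weight,
    )


if __name__ == "__main__":
    # The update body is a JSON list of documents sent to /update.
    print(json.dumps([popularity_update("sku-1", sales=12, clicks_delta=3)]))
    print(popularity_boost())
```

The weights would need the same rounds of testing as the field boosts,
since a boost that is too strong pushes popular but irrelevant products
to the top.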

Hope it helps,
John


On 17/03/16 09:36, Robert Brown wrote:
> Hi,
>
> I currently have an index of ~50m docs representing shopping products:
> name, description, brand, category, etc.
>
> Our "qf" is currently setup as:
>
> name^5
> brand^2
> category^3
> merchant^2
> description^1
>
> mm: 100%
> ps: 5
>
> I'm getting complaints from the business concerning relevancy, and was
> hoping to get some constructive ideas/thoughts on whether these boosts
> look semi-sensible or not, I think they were put in place pretty much
> at random.
>
> I know it's going to be a case of rounds upon rounds of testing, but
> maybe there's a good starting point that will save me some time?
>
> My initial thoughts right now are to actually just search on the name
> field, and maybe the brand (for things like "Apple Ipod").
>
> Has anyone got a similar setup that could share some direction?
>
> Many Thanks,
> Rob
>
