Hello,

We have a website on which you can search through a large number of products from different shops.
The information describing the products is provided to us by the shops that sell them. If we sort a search result by score, many products from the same shop are clustered together. The reason for this behavior is that a shop tends to use the same 'style' to describe all of its products.

For example: shop 'foo' describes its products with 250 words and uses the searched word once. Shop 'bar' describes its products with only 25 words and also uses the searched word once. The score for the products of shop 'foo' will be much worse than for those of shop 'bar'. In a search that matches many products from both shops, the products of shop 'bar' are shown before the products of shop 'foo'.

We tried to avoid this behavior by not using the term frequency, but after that we got very strange products among the first results.

Does anybody have an idea how to avoid the clustering of products (documents) from the same shop?

Greetings,
Max
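
P.S. To make the 'foo' versus 'bar' example concrete, here is a minimal sketch of the effect. It assumes a simplified TF-IDF style score of sqrt(term frequency) multiplied by a length norm of 1/sqrt(number of words). That formula and the name `toy_score` are only assumptions for illustration; the scoring of the real search engine may differ, and the IDF factor is left out because it is the same for both documents.

```python
import math


def toy_score(term_freq: int, doc_length: int) -> float:
    """Hypothetical simplified score: sqrt(tf) * 1/sqrt(doc_length).

    This is an illustrative assumption, not the actual engine formula.
    """
    length_norm = 1.0 / math.sqrt(doc_length)
    return math.sqrt(term_freq) * length_norm


# Shop 'foo': 250 words, searched word appears once.
foo_score = toy_score(term_freq=1, doc_length=250)

# Shop 'bar': 25 words, searched word appears once.
bar_score = toy_score(term_freq=1, doc_length=25)

print(f"foo: {foo_score:.4f}")
print(f"bar: {bar_score:.4f}")
print(f"bar/foo ratio: {bar_score / foo_score:.2f}")
```

Under these assumptions every product of shop 'bar' gets roughly three times the score of a comparable product of shop 'foo', purely because of the description length, which matches the clustering we see.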