In our application, we index educational resources and let users search for
them.
We allow our customers to change some of the non-textual metadata associated
with a resource (booklevel, interestlevel, etc.) to serve their users
better.
So in theory each resource could have a different set of metadata values for
each customer, but in reality maybe 10-25% of our customers customize a
small portion of the resources.
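
To make the data model concrete, here is a minimal Python sketch of how we
think about it: each resource has default metadata, and a customer may
override a few fields. The field names are the ones mentioned above; the
resource ids, customer ids, values, and the function name are made up for
illustration only.

```python
# Default metadata per resource (ids and values are hypothetical).
DEFAULTS = {
    "res-1": {"booklevel": 3.5, "interestlevel": "MG"},
    "res-2": {"booklevel": 5.0, "interestlevel": "UG"},
}

# Sparse per-customer overrides: only the customized fields are stored.
OVERRIDES = {
    ("cust-42", "res-1"): {"booklevel": 4.0},
}


def effective_metadata(customer_id, resource_id):
    """Return a resource's defaults with the customer's overrides applied."""
    merged = dict(DEFAULTS[resource_id])  # copy so the defaults stay untouched
    merged.update(OVERRIDES.get((customer_id, resource_id), {}))
    return merged


print(effective_metadata("cust-42", "res-1"))  # customized booklevel
print(effective_metadata("cust-7", "res-1"))   # plain defaults
```

The search has to filter on these effective values for the customer running
the query, not on the defaults.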

Our current solution uses SQL Server to manage the customizations (the
database is sharded for other reasons as well) and also uses SQL Server's
Full Text index for search.
We are replacing this with Solr.

There are a few approaches we have thought about, but none of them seems ideal:

a) Duplicate the entries in Solr.  Each resource would be replicated for
each customer, so there would be one index entry per customer.
The number of index entries is a big concern, even though the text field
values are the same.
(We have about 300K resources and about 50K customers, and both will grow.)
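
For a rough sense of scale, here is the arithmetic for option (a) using the
approximate counts above (a back-of-the-envelope estimate, not a
measurement):

```python
resources = 300_000   # approximate, from the numbers above
customers = 50_000    # approximate, from the numbers above

# Option (a): one index entry per (resource, customer) pair.
duplicated_entries = resources * customers
print(f"{duplicated_entries:,}")  # 15,000,000,000
```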

b) Use a dedicated Solr core for each customer.  This wouldn't use
resources efficiently, and we would be duplicating the textual components,
which don't change from customer to customer.

c) Use a global index that holds the resources with their default values,
plus a separate index for each customer that contains only the resources
that customer has customized.
This requires managing a lot of small cores/indexes.  It would also require
merging results from multiple cores, so we don't think this will work.
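
A rough Python sketch of why the merge in option (c) is awkward: a hit from
the global index has to be dropped whenever the customer has customized that
resource, because the customer index is then the only valid source for it.
The function name, hit tuples, ids, and scores are all made up, and the
sketch assumes scores from two separate cores can be compared, which may not
be true.

```python
def merge_hits(global_hits, customer_hits, customized_ids):
    """Merge hits from a global core and a per-customer core.

    global_hits / customer_hits: lists of (resource_id, score) that already
    matched the metadata filter in their own index.
    customized_ids: ids of the resources this customer has customized.
    """
    # A customized resource must be judged only by the customer index,
    # so discard its global hit even though the default values matched.
    kept = [(rid, s) for rid, s in global_hits if rid not in customized_ids]
    merged = kept + list(customer_hits)
    # Assumption: scores are comparable across cores.
    return sorted(merged, key=lambda hit: hit[1], reverse=True)


global_hits = [("res-1", 2.0), ("res-2", 1.5), ("res-3", 1.0)]
customer_hits = [("res-4", 1.8)]   # customized values now match the filter
customized = {"res-2", "res-4"}    # res-2 was customized and no longer matches
print(merge_hits(global_hits, customer_hits, customized))
```

Every query would need the list of customized ids for the customer, plus a
second request to that customer's core.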

d) Use Solr to do the text search and filter on metadata externally in a
post-processing step.  As you can imagine, this has all the challenges
associated with post-processing (pagination support, etc.).
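
To illustrate the pagination problem in option (d), here is a minimal Python
sketch that keeps fetching ranked hits until it has enough filtered results
to fill the requested page. The `search` callable is a stand-in for the Solr
text query, and `keep` stands in for the per-customer metadata check; both
names and the batch size are assumptions for the example.

```python
def filtered_page(search, keep, page, page_size, batch_size=50):
    """Return one page of hits that pass `keep`, over-fetching as needed.

    search(start, rows) -> list of hits in rank order (stand-in for Solr).
    keep(hit) -> True if the hit passes the customer's metadata filter.
    """
    wanted_start = page * page_size
    passed = []
    start = 0
    while len(passed) < wanted_start + page_size:
        batch = search(start, batch_size)
        if not batch:
            break  # ran out of search results
        passed.extend(hit for hit in batch if keep(hit))
        start += len(batch)
    return passed[wanted_start:wanted_start + page_size]


# Stand-in for the text search: 1000 ranked hits with ids 0..999.
ALL_HITS = list(range(1000))


def fake_search(start, rows):
    return ALL_HITS[start:start + rows]


# Pretend only every third resource passes the metadata filter.
print(filtered_page(fake_search, lambda h: h % 3 == 0, page=2, page_size=5))
```

Deep pages have to re-scan everything before them, and the total number of
matching results is unknown unless the whole result set is scanned.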

e) Use Solr's advanced/post-filtering support.  Even if we can figure out a
reasonable way to cache the metadata lookup for each customer, we are not
sure this would be efficient.

Are there any other solutions you would recommend?


