Correct: use a multivalued field with 10,000 shop IDs. A typical use case is a U.S. shopping network for a big brand such as Walmart: the user provides a location implicitly (IP address) or explicitly (postal code), and we use it to find items in their neighbourhood.
You basically provide the "join" information via this 10,000-entry collection of IDs per document. It has almost no impact on index size. The user query needs to supply the list of preferred IDs (for example, if we know the user's geolocation). For this "Walmart" use case you may also need an "Available Online Only" option, etc.

From: Karl Kildén <karl.kil...@gmail.com>
Reply: <solr-user@lucene.apache.org>
Date: February 6, 2017 at 5:57:41 AM
To: <solr-user@lucene.apache.org>
Subject: Help with design choice: join or multiValued field

Hello!

I have Items and I have Shops. This is an e-commerce system with items from thousands of shops, although the inventory is often similar between shops. Some users can shop from any shop and some only from their default one.

One item can exist in about 10000 shops.

- When a user logs in they may have a shop preselected, so when they search for items we need to get all matching documents, but if an item is found in their preselected shop we should mark it in the UI.
- They need to be able to filter the results down to only items in their current shop.
- Items found in their shop should always be boosted heavily.

TLDR:

Either we just have a multiValued field on the item document with all shops. This would be a multivalued field with 10000 values.

Or

Could we have a new document, ShopItem, that has the shopId and the itemId (think join table)? Then we join on this document instead... But we still need to get the Item document back, and we need bq boosting on the item?
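
For illustration only, here is a rough Python sketch of how the three requirements (mark, filter, boost) might map onto request parameters with the multivalued-field approach. The field name `shop_ids`, the fields `id` and `name`, the helper `build_item_query`, and the boost value are all assumptions, not anything from the thread. The `fq` and `fl` parameters and edismax's `bq` are standard Solr; the `exists(query(...))` pseudo-field is one possible way to flag matches and should be checked against your Solr version. Shop IDs are assumed to be simple alphanumeric tokens that need no escaping.

```python
from urllib.parse import urlencode


def build_item_query(text, shop_id=None, only_my_shop=False, boost=100):
    """Build Solr request parameters for an item search.

    Assumes each item document has a multivalued field `shop_ids`
    (hypothetical name) listing every shop that stocks the item.
    """
    fields = ["id", "name"]  # hypothetical stored fields
    params = [("q", text), ("defType", "edismax")]

    if shop_id is not None:
        clause = f"shop_ids:{shop_id}"
        # Boost: items stocked in the user's shop rank higher.
        params.append(("bq", f"{clause}^{boost}"))
        # Mark: pseudo-field that is true when the item is in the user's shop.
        fields.append(f"in_my_shop:exists(query({{!v='{clause}'}}))")
        # Filter: optionally restrict results to the user's shop only.
        if only_my_shop:
            params.append(("fq", clause))

    params.append(("fl", ",".join(fields)))
    return params


params = build_item_query("garden hose", shop_id="4711", only_my_shop=True)
# URL-encoded query string, ready to append to a /select request.
print(urlencode(params))
```

With this shape a single request returns Item documents directly, so no join is needed; whether the boost value is strong enough to count as "heavily" would have to be tuned against real data.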