Kelly,
This is a good question, and it illustrates a limitation of Solr's flat 
schema.  I don't see how the dedup update processor will help.  I would 
continue with the SKU-based approach and use this patch:
https://issues.apache.org/jira/browse/SOLR-236
You'll collapse on the product id, so that each product shows up only once in 
the results.  My book (p. 192) describes this field collapsing component as it 
existed when I wrote it, but the patch has been updated since then.
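
To illustrate why the product-level documents cannot answer your question while 
SKU-level documents plus collapsing can, here is a minimal Python sketch.  It is 
not Solr code: the pairing of colors, sizes, and prices into SKUs, the 
"Product:8" documents, and all function names are assumptions made up for the 
example, and collapse() only imitates in memory what collapsing on the product 
id is meant to achieve.

```python
# Product-level document: SKU attributes flattened into multi-valued fields.
product_doc = {
    "id": "Product:7",
    "sku_color": ["Red", "Green", "Blue"],
    "sku_size": ["Small", "Medium", "Large"],
    "sku_price": [10.00, 10.00, 7.99],
}


def product_level_match(doc, color, size, max_price):
    """Each clause is checked against its own field independently, so the
    values that satisfy the clauses may come from different SKUs."""
    return (
        color in doc["sku_color"]
        and size in doc["sku_size"]
        and any(price <= max_price for price in doc["sku_price"])
    )


# SKU-level documents (hypothetical data): one document per purchasable SKU,
# each carrying the id of the product it belongs to.
sku_docs = [
    {"id": "SKU:7-1", "product_id": "Product:7", "color": "Red", "size": "Small", "price": 10.00},
    {"id": "SKU:7-2", "product_id": "Product:7", "color": "Green", "size": "Medium", "price": 10.00},
    {"id": "SKU:7-3", "product_id": "Product:7", "color": "Blue", "size": "Large", "price": 7.99},
    {"id": "SKU:8-1", "product_id": "Product:8", "color": "Green", "size": "Medium", "price": 9.50},
    {"id": "SKU:8-2", "product_id": "Product:8", "color": "Green", "size": "Medium", "price": 8.00},
]


def sku_level_matches(docs, color, size, max_price):
    """All criteria must hold within the same SKU document."""
    return [
        d for d in docs
        if d["color"] == color and d["size"] == size and d["price"] <= max_price
    ]


def collapse(docs, field):
    """Keep only the first document seen for each distinct value of `field`."""
    first_seen = {}
    for d in docs:
        first_seen.setdefault(d[field], d)
    return list(first_seen.values())


if __name__ == "__main__":
    # False positive: no single SKU of Product:7 is Green, Medium, and <= 9.99.
    print(product_level_match(product_doc, "Green", "Medium", 9.99))

    matches = sku_level_matches(sku_docs, "Green", "Medium", 9.99)
    print([d["id"] for d in matches])
    print([d["product_id"] for d in collapse(matches, "product_id")])
```

The product-level check matches Product:7 even though none of its SKUs meets all 
three criteria at once.  The SKU-level query avoids that, and collapsing on 
product_id reduces the near-duplicate SKU hits to one entry per product.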

A separate question you posted on this list recently suggests you're already 
going down this path.  I would grab the SOLR-236.patch file attached to that 
issue and try applying it to the 1.4 source.

~ David Smiley
Author: http://www.packtpub.com/solr-1-4-enterprise-search-server/


On Jan 11, 2010, at 5:27 PM, Kelly Taylor wrote:

> 
> I am in the process of building a Solr search solution for my application and
> have run into a roadblock with the schema design.  I am trying to match criteria
> in one multi-valued field with corresponding criteria in another
> multi-valued field.  Any advice would be greatly appreciated.
> 
> BACKGROUND:
> My RDBMS data model is such that for every one of my "Product" entities,
> there are one-to-many "SKU" entities available for purchase. Each SKU entity
> can have its own price, as well as one-to-many options, etc.  The web
> frontend displays available "Product" entities on both directory and detail
> pages.
> 
> In order to take advantage of Solr's facet count, paging, and sorting
> functionality, I decided to base the Solr schema on "Product" documents; so
> none of my documents currently contain duplicate "Product" data, and all
> "SKU" related data is denormalized as necessary, but into multi-valued
> fields.  For example, I have a document with an "id" field set to
> "Product:7," a "docType" field set to "Product," as well as multi-valued
> "SKU" related fields and data like, "sku_color" {Red | Green | Blue},
> "sku_size" {Small | Medium | Large}, "sku_price" {10.00 | 10.00 | 7.99}
> 
> I hit the roadblock when I tried to answer the question, "Which products are
> available that contain skus with color Green, size M, and a price of $9.99
> or less?"...and have now begun the switch to "SKU" level indexing.  This
> also gives me what I need for faceted browsing/navigation, and search
> refinement...leading the user to "Product" entities having purchasable "SKU"
> entities.  But this also means I now have documents which are mostly
> duplicates for each "Product," so facet counts, paging, and sorting are
> all inaccurate; it appears I need to do this myself, with multiple Solr
> requests.
> 
> Is this really the best approach; and if so, should I use the Solr
> Deduplication update processor when indexing and querying?
> 
> Thanks in advance,
> Kelly
> -- 
> View this message in context: 
> http://old.nabble.com/Encountering-a-roadblock-with-my-Solr-schema-design...use-dedupe--tp27118977p27118977.html
> Sent from the Solr - User mailing list archive at Nabble.com.
> 


