Otis,

Impressive list of possible solutions you've come up with :)
I've used Jonathan's "pattern" in several projects, but it quickly becomes unmanageable. My plan was to try to come up with a new FieldType inspired by FAST's Scope-field, which would take JSON in and be able to match hierarchical relationships with a syntax such as

q=itemType:shoes AND items_json:"item:and(color:red,size:10)"

The FieldType would make sure that the sub-tags within the and() actually exist within the scope of the same item. It's not trivial, as you implement a mini matching engine inside a field type and a new query syntax, but it should be possible for simple string type metadata. The FieldType would need to convert the JSON structure into some internal tree structure which is easily matched against the query.

I also thought about a JSON PolyField, where inserting one JSON string into the poly field would generate a bunch of sub fields _items_json_item1_color, _items_json_item1_size... to be able to re-use Lucene's matching capabilities, but I did not get it to support all use cases in my head.

Did anyone try SIREn?

--
Jan Høydahl, search solution architect
Cominvent AS - www.cominvent.com

On 17 March 2011, at 16:48, Jonathan Rochkind wrote:

> The standard answer, which is a kind of de-normalizing, is to index tokens
> like this:
>
> red_10 red_11 orange_12
>
> in another field, you could do these things with size first:
>
> 10_red 11_red 12_orange
>
> Now if you want to see what sizes of red you have, you can do a facet query
> with facet.prefix=red_ . You'll need to do a bit of parsing/interpreting
> client side to translate from the results you get ("red_10", "red_11") to
> telling the users "sizes 10 and 11 are available". The second field with
> size first lets you do the same thing to answer "what colors do we have in
> size X?".
>
> That gets unmanageable with more than 2-3 facet combinations, but with just 2
> (or, pushing it, 3), can work out okay. You'd probably ALSO want to keep the
> facets you have with plain values "red red orange" etc, to support that first
> level of user-implementing. There is a bit more work to do on client side
> with this approach, Solr isn't just giving you exactly what you want in its
> response, you've got to have logic for when to use the top-level facets and
> when to go to that second-level combo facet ("red_12"), but it's do-able.
>
> On 3/17/2011 11:21 AM, Otis Gospodnetic wrote:
>> Hi,
>>
>> ----- Original Message ----
>>> From: Yonik Seeley<yo...@lucidimagination.com>
>>> Subject: Re: Parent-child options
>>>
>>> On Thu, Mar 17, 2011 at 1:49 AM, Otis Gospodnetic
>>> <otis_gospodne...@yahoo.com> wrote:
>>>> The dreaded parent-child without denormalization question. What are one's
>>>> options for the following example:
>>>>
>>>> parent: shoes
>>>> 3 children. each with 2 attributes/fields: color and size
>>>> * color: red black orange
>>>> * size: 10 11 12
>>>>
>>>> The goal is to be able to search for:
>>>> 1) color:red AND size:10 and get 1 hit for the above
>>>> 2) color:red AND size:12 and get *no* matches because there are no red
>>>> shoes of size 12, only size 10.
>>> What if you had this instead:
>>>
>>> color: red red orange
>>> size: 10 11 12
>>>
>>> Do you need for color:red to return 1 or 2 (i.e. is the final answer
>>> in units of child hits or parent hits)?
>> The final answer is the parent, which is "shoes" in this example.
>> So:
>> if the query is color:red AND size:10 the answer is: Yes, we got red shoes
>> size 10
>> if the query is color:red AND size:11 the answer is: Yes, we got red shoes
>> size 11
>> if the query is color:red AND size:12 the answer is: No, we don't have red
>> shoes size 12
>>
>> Thanks,
>> Otis
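
PS: To make the difference between the approaches in this thread concrete, here is a rough sketch in plain Python. It is not Solr or Lucene code, and no such FieldType exists as far as this thread shows. The function names (flat_match, scoped_match, combo_tokens) and the document layout are made up for illustration. It uses Yonik's variant of the example (red/10, red/11, orange/12), keeps all values as simple strings, and compares multi-valued field matching, the proposed and() scope semantics, and Jonathan's combined tokens.

```python
import json

# Hypothetical parent document: "shoes" with three child items,
# stored as one JSON string as the proposed FieldType would receive it.
doc = {
    "itemType": "shoes",
    "items_json": json.dumps([
        {"color": "red", "size": "10"},
        {"color": "red", "size": "11"},
        {"color": "orange", "size": "12"},
    ]),
}

def flat_match(doc, **wanted):
    """Multi-valued field semantics: each value may come from a different child."""
    items = json.loads(doc["items_json"])
    return all(any(item.get(k) == v for item in items) for k, v in wanted.items())

def scoped_match(doc, **wanted):
    """and() scope semantics: all values must occur within the same child item."""
    items = json.loads(doc["items_json"])
    return any(all(item.get(k) == v for k, v in wanted.items()) for item in items)

def combo_tokens(doc, first, second):
    """Denormalized tokens such as "red_10" (first="color", second="size")."""
    items = json.loads(doc["items_json"])
    return [f"{item[first]}_{item[second]}" for item in items]

# color:red AND size:12 matches on flat fields, although no red size 12 exists.
print(flat_match(doc, color="red", size="12"))    # True (false positive)
print(scoped_match(doc, color="red", size="12"))  # False
print(scoped_match(doc, color="red", size="10"))  # True

# Emulates facet.prefix=red_ on the combined-token field.
print([t for t in combo_tokens(doc, "color", "size") if t.startswith("red_")])
# ['red_10', 'red_11']
```

The combined tokens answer the same question as the scoped match for exactly two attributes, which is why the pattern works for 2 (or, pushing it, 3) combinations and gets unmanageable beyond that.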