Hi, Long post - sorry...
I have a relatively special case of a Parent / Child relationship that I'm trying to model. I'm currently using Solr 1.4.1 and Lucene 2.9.3.

For example, my Parent documents represent "Title Information" (e.g. bibliographic information), and each Parent document can contain 0 or more Children, each child representing a physical copy of that title (e.g. think of a library with multiple branches: each child document represents a book, or other format, available at a given branch).

What I *think* is special about this setup is that the only information I need Solr / Lucene to make use of is all facet based, AND I don't need facet counts for these child facets.

So, for example, "Harry Potter and the Last Crusade" (J.K. Rowling's upcoming blockbuster novel :)) has the following copies:

Location    Format
Main        Book
Main        DVD
Branch      Book
etc.        etc.

This is a bit simplified - there are actually 5 fields involved for each child document, and each of the five fields can be used individually (easy!) or in combination (much harder!) to refine a result set.

With the relatively straightforward approach of having Location and Format as ordinary everyday facet fields on the Parent document, I certainly get results (although I ignore the facet counts). So, I search for the book and, without any facets applied, get back facets like this:

Location
    Main
    Branch
Format
    Book
    DVD

I can narrow by those, and things work - though logically there's a hole that I'm trying to fill. For example, suppose the user chooses to narrow by Location:Branch and Format:DVD. I still get a hit back - but I don't want one, because there isn't a child record that has both of those values. (The user is looking for DVDs at the Branch library, but the only DVD is at Main.)

I'm completely controlling both the indexing and searching side code - i.e. I can formulate any type of document content I want to be indexed, and I can parse the results before presenting them to the end users.
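To make the hole described above concrete, here is a minimal sketch in plain Python (not Solr code). It assumes a hypothetical in-memory model of one title and its copies; the names copies, flattened_match and per_copy_match are made up for illustration. The first function mimics what happens when Location and Format are independent facet fields on the parent; the second is the behaviour I actually want.

```python
# Hypothetical in-memory model of one title and its copies (illustration only).
copies = [
    {"Location": "Main", "Format": "Book"},
    {"Location": "Main", "Format": "DVD"},
    {"Location": "Branch", "Format": "Book"},
]


def flattened_match(copies, filters):
    """Independent facet fields: each filter may be satisfied by a different copy."""
    return all(
        any(copy[field] == value for copy in copies)
        for field, value in filters.items()
    )


def per_copy_match(copies, filters):
    """Desired behaviour: one single copy must satisfy every filter."""
    return any(
        all(copy[field] == value for field, value in filters.items())
        for copy in copies
    )


filters = {"Location": "Branch", "Format": "DVD"}
print(flattened_match(copies, filters))  # True: the unwanted hit
print(per_copy_match(copies, filters))   # False: no DVD at Branch
```

The unwanted hit comes from the Branch value of one copy and the DVD value of another copy being satisfied separately.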
One approach I've been thinking of is a brute force method of accomplishing this, using facets and the facet.prefix parameter in the query. So - I could generate 'facets' like this:

Location_facet
    Main
    Branch
Format_facet
    Book
    DVD
Location-Format_facet
    Main-Book
    Main-DVD
    Branch-Book
Format-Location_facet
    Book-Main
    Book-Branch
    DVD-Main

When narrowing by a single facet (e.g. Location:Branch), it would be a usual facet search. Something like:

&fq=Location_facet:"Branch"

and I would request back facets like this:

&facet=true&facet.mincount=1&facet.field=Location-Format_facet&facet.prefix=Branch-

and then parse the values returned in Location-Format_facet to retrieve what follows the "Branch-" prefix. Those would be the values for the 'Format' facet presented to the users (so only 'Book' remains as a value).

So - with only 2 fields, it's pretty straightforward. (It could be somewhat simplified from the above down to two facet fields instead of four: keep just the paired facets and drop the singleton facets. When no limiting is taking place, retrieve just one of the paired fields and parse out the pairs to build the Location and Format facets. When limiting on one element, use facet.prefix. When limiting on both, again just choose one of the paired facets and look for the concatenated value.)

However, it gets more complex as I ramp up to 5 fields. Generally it requires n! individual facet fields, where 'n' is the number of underlying fields. With two fields, there are two facet type fields needed in the Solr / Lucene index to support this. With three fields, I could do this with 6 facets. With 5 fields, there would be 5! = 120 required facets. That's getting a bit much... :)

Hmmm... A little scribbling (ok, a fair bit of scribbling), and I can actually reduce that to 12 facet fields to cover the 5 fields. So, maybe that's not all that bad... Interesting...
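For the two-field case, the paired-facet idea above could be sketched roughly as follows. This is plain Python on made-up data, not tested against Solr: the helper names (paired_facet_values, narrowing_params, remaining_values) are hypothetical, the "-" separator assumes no field value contains a hyphen, and the prefix is passed without quotes on the assumption that facet.prefix is matched literally.

```python
from urllib.parse import urlencode

SEP = "-"  # separator used in the paired facet values; assumes values contain no "-"

# Hypothetical copies of a single title (illustration only).
copies = [
    {"Location": "Main", "Format": "Book"},
    {"Location": "Main", "Format": "DVD"},
    {"Location": "Branch", "Format": "Book"},
]


def paired_facet_values(copies, first, second):
    """Values to index in the '<first>-<second>_facet' field, one per distinct copy combination."""
    return sorted({copy[first] + SEP + copy[second] for copy in copies})


def narrowing_params(first, second, chosen):
    """Query-string parameters for narrowing by first=chosen and faceting on the paired field."""
    return urlencode([
        ("fq", '%s_facet:"%s"' % (first, chosen)),
        ("facet", "true"),
        ("facet.mincount", "1"),
        ("facet.field", "%s%s%s_facet" % (first, SEP, second)),
        ("facet.prefix", chosen + SEP),
    ])


def remaining_values(paired_values, chosen):
    """Strip the '<chosen>-' prefix to recover the values of the other facet."""
    prefix = chosen + SEP
    return [value[len(prefix):] for value in paired_values if value.startswith(prefix)]


location_format = paired_facet_values(copies, "Location", "Format")
print(location_format)
print(narrowing_params("Location", "Format", "Branch"))
print(remaining_values(location_format, "Branch"))
```

In a real setup the prefix filtering would be done by the server via facet.prefix; remaining_values only shows the client-side parsing of whatever comes back.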
(I haven't actually coded up anything yet - this is all a paper-napkin level exercise...)

The other thing I've done is peruse various archived threads and some upcoming functionality regarding parent / child or hierarchical document strategies. But I haven't found anything that would help me out much - at least not directly. I saw the Jira issue LUCENE-2454<https://issues.apache.org/jira/browse/LUCENE-2454> (Nested Document Query Support), which looks from the slides overview to be structurally just what I would want - but it indicates that there isn't Query Parser support in place yet (i.e. how to write a Solr query that relates child level queries, either within the base query or in the fq clause).

So - my question (finally :)) is: does this logical problem seem resolvable with an approach other than the brute force one outlined above? I'm willing to dive into the Solr / Lucene code if that's what it will take - I'd just like an indication of what people think would be a good / possible approach before I get into that level... e.g. some way of providing to the Indexer a tuple of each found combination of the 5 values, and then doing something (what?) when searching with the facet queries.

Thanks!

Bob Sandiford | Lead Software Engineer | SirsiDynix
P: 800.288.8020 X6943 | bob.sandif...@sirsidynix.com
www.sirsidynix.com<http://www.sirsidynix.com/>