: faceted metadata. The way that we build the metadata list is
: post-indexing of product, we would actually build a bit sequence that
: corresponds to all possible key/value combos for each product and
: associate each variation with the product. Then when someone refines

Hmmm... you can do something similar in Solr by leveraging the Filter
caching, but you don't need to build a DocSet (bit sequence) for every
combination of values -- just one per value. When you want to know what
the result is from combining multiple facets, you intersect the DocSets.

: For the schema, I just meant the document format. the file is called
: schema.xml. I haven't tried it, but it looks like you can change that
: to affect the way solr works without actually affecting the way
: lucene handles it. Is that wrong? I guess it doesn't really matter,

well, the schema.xml lets you define the fields, and what options you
want those fields to have -- but the set of options is fixed (and tied
to the set of options available in Lucene). You could have a
"suggestable" field in every doc which contained a list of field
names -- but there's really no way to annotate fields directly.

: As for 'scanning the resultset', I can see how I was a little shy on
: the details. Sorry about that. I meant look through the results to
: see what facets apply to the resultset. So if my company sells books
: and power tools, when someone searches for 'the little engine who
: could', once we know there are no power tools in the result set, I
: don't show the refinement facets for power tool metadata (like

Ah... right. Depending on how you go about it, scanning result sets to
find out which facets apply is probably just as expensive as just
testing the facet to see if the count is positive -- especially if you
have the RAM to allocate a big Filter cache -- then you can have a
DocSet for every Filter in memory all of the time, and testing a few
thousand can be done near instantly.

: In your example file, how does the name facet know to display only
: the names that start with whatever initial was selected? Would that

Honestly, i haven't really thought about the mechanisms for dealing
with facets in that way ...
it would be trivial just to let all of the names be tested -- but that
would obviously involve a lot of unnecessary computation, so if you
could configure subgroups that were only consulted once another facet
had been used, that would obviously make more sense. Perhaps something
like...

  <group id="initial" label="Author">
    <facet id="a" label="A" query="author:a*">
      <group id="name" label="Author">
        <facet use-prefix-terms-field="author">a</facet>
      </group>
    </facet>
    <facet id="b" label="B" query="author:b*">
      <group id="name" label="Author">
        <facet use-prefix-terms-field="author">b</facet>
      </group>
    </facet>
    ...

...although obviously common group/subgroup relationships like prefix
expansion could be done as a special case to make it shorter to
express...

  <!-- 'mfg' facet starts with initial, then full names that match
       initial -->
  <group id="mfg" label="Manufacturer">
    <prefix-facet id="initial" field="mfg" prefix-chars="1">
      <term-value-facet id="mfg" />
    </prefix-facet>
  </group>

  <!-- 'name' facet starts with initial, then 2 char prefix, then full
       names that match prefix -->
  <group id="name" label="Author">
    <prefix-facet id="initial" field="name" prefix-chars="1">
      <prefix-facet id="i_prefix" field="name" prefix-chars="2">
        <term-value-facet id="name" />
      </prefix-facet>
    </prefix-facet>
  </group>

: I think what I am starting to understand is that coming from what we
: have (a rdbms based metadata gathering system), I need to rethink my
: process. I've spent so much time training myself to think in terms of
: how to make things fast in mysql that I need to re-open my mind :)

yeah ... inverted indexes are a completely different beast from
relational databases. i would definitely suggest reading up on Lucene,
and getting to know the basics -- keeping in mind that anything that
can be done in Lucene can be done in a Solr plugin, Solr just makes it
easier and gives you "wicked cool" caching :)

-Hoss
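
PS: a rough sketch of the "one DocSet per value, intersect to combine"
idea from above, written in plain Python rather than anything
Solr-specific. The toy catalog, the field names, and every function
name here are made up for illustration; plain integers stand in for the
bit sequences, and none of this is Solr's actual API.

```python
from collections import defaultdict

# Hypothetical toy catalog: doc id -> field values (not real Solr data).
DOCS = {
    0: {"type": "book", "author": "piper"},
    1: {"type": "book", "author": "potter"},
    2: {"type": "book", "author": "adams"},
    3: {"type": "tool", "mfg": "acme"},
}


def build_docsets(docs):
    """Build one bitmask per field:value pair; bit i is set if doc i matches."""
    docsets = defaultdict(int)
    for doc_id, fields in docs.items():
        for field, value in fields.items():
            docsets[f"{field}:{value}"] |= 1 << doc_id
    return dict(docsets)


def intersect(first, *rest):
    """Combine refinements by AND-ing their bitmasks."""
    result = first
    for mask in rest:
        result &= mask
    return result


def count(mask):
    """Number of documents (set bits) in a bitmask."""
    return bin(mask).count("1")


def applicable_facets(result_mask, docsets):
    """Keep only facets whose intersection with the result has a positive count."""
    counts = {key: count(result_mask & mask) for key, mask in docsets.items()}
    return {key: n for key, n in counts.items() if n > 0}


def prefix_docset(docsets, field, prefix):
    """OR together every value of `field` starting with `prefix` (like author:p*)."""
    result = 0
    for key, mask in docsets.items():
        if key.startswith(f"{field}:{prefix}"):
            result |= mask
    return result


docsets = build_docsets(DOCS)
books = docsets["type:book"]

# Facets worth showing for a result set that contains only books.
print(applicable_facets(books, docsets))

# Books whose author starts with "p": intersect two bitmasks.
print(count(intersect(books, prefix_docset(docsets, "author", "p"))))
```

With a result set of only books, the "type:tool" and "mfg:acme" facets
drop out because their intersection count is zero, which is the
"test the facet to see if the count is positive" check. Only four
DocSets per distinct value are built here, never one per combination.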