On 2/7/07, Binkley, Peter <[EMAIL PROTECTED]> wrote:
In the library subject heading context, I wonder if a layered approach would bring performance into the acceptable range. Since Library of Congress Subject Headings break into standard parts, you could have first-tier facets representing the main heading, second-tier facets with the main heading and first subdivision, etc. So to extract the subject headings from a given result set, you'd first test all the first-tier facets like "Body, Human", then where warranted test the associated second-tier facets like "Body, Human--Social aspects".
Yes... we've had discussions about hierarchical facets in the past, but more focused on organization/presentation than performance:
http://www.nabble.com/Hierarchical-Facets--tf2560327.html#a7135353

Which got me thinking... if we could use hierarchical facets to speed up faceting, then we should also be able to use the same type of strategy for non-hierarchical facets!

We could create a facet-tree, where sets at parent nodes are the union of the child sets. This should allow one to more quickly zoom into where higher counts are concentrated, without necessitating storing all the facets. One could control the space/time tradeoff with the branching factor of the tree.

-Yonik
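
To make the facet-tree idea concrete, here is a rough sketch in Python. It is not Solr code, and every name in it (Node, build_tree, top_facets, the branching parameter) is made up for illustration. It assumes doc sets are plain in-memory frozensets and that leaves are grouped in sorted term order, which is an arbitrary choice. Since each child set is a subset of its parent, the parent's intersection count with the result set is an upper bound on any child's count, so a best-first walk can skip subtrees that cannot reach the top k. For simplicity the sketch keeps every leaf set in memory; it only shows the pruning, not the space savings.

```python
import heapq
from itertools import count


class Node:
    """Facet-tree node; a parent's doc set is the union of its children's sets."""

    def __init__(self, docs=None, term=None, children=()):
        self.term = term                  # set only on leaves (a real facet value)
        self.children = list(children)
        if self.children:
            self.docs = frozenset().union(*(c.docs for c in self.children))
        else:
            self.docs = frozenset(docs)


def build_tree(term_docs, branching=4):
    """Group nodes `branching` at a time, level by level, until one root remains."""
    if branching < 2:
        raise ValueError("branching must be at least 2")
    if not term_docs:
        raise ValueError("need at least one facet term")
    # Hypothetical grouping: sorted term order. A real system might cluster
    # terms that tend to co-occur so that parent sets stay small.
    level = [Node(docs=d, term=t) for t, d in sorted(term_docs.items())]
    while len(level) > 1:
        level = [Node(children=level[i:i + branching])
                 for i in range(0, len(level), branching)]
    return level[0]


def top_facets(root, result_docs, k):
    """Return up to k (term, count) pairs with the highest non-zero counts."""
    result_docs = frozenset(result_docs)
    tie = count()  # unique tie-breaker so the heap never compares Node objects
    heap = [(-len(root.docs & result_docs), next(tie), root)]
    found = []
    while heap and len(found) < k:
        neg, _, node = heapq.heappop(heap)
        if neg == 0:
            break  # nothing left intersects the result set
        if node.term is not None:
            # A leaf's count is exact, and every entry still on the heap is
            # bounded by a value no larger, so this is the next-best facet.
            found.append((node.term, -neg))
        else:
            for child in node.children:
                n = len(child.docs & result_docs)
                if n:  # drop subtrees that cannot contribute
                    heapq.heappush(heap, (-n, next(tie), child))
    return found
```

A larger branching factor means fewer parent sets to store but more child intersections each time a parent is opened; a smaller one is the reverse, which is the space/time knob mentioned above.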