Hi all,

I have a particular data structure I'm trying to index into a Solr document so that I can query and facet on it in a particular way, and I can't quite figure out the best way to go about it.
One sample object is here: https://gist.github.com/1139065

The part that's tripping me up is the workflows. Each workflow has a name (in this case, digitizationWF and accessionWF). Each workflow is made up of a number of processes, each of which has its own current status. Every time the status of a process within a workflow changes, the object is reindexed.

What I'd like to be able to do is present several hierarchies of facets. In one, the workflow name is the top-level facet, with the second level showing each process, under which is listed each status (completed, waiting, or error) and the number of documents with that status for that process (some values omitted for brevity):

accessionWF (583)
  publish (583)
    completed (574)
    waiting (6)
    error (3)
  shelve (583)
    completed (583)

etc.

I'd also like to be able to invert that presentation:

accessionWF (583)
  completed (583)
    publish (574)
    shelve (583)
  waiting (6)
    publish (6)
  error (3)
    publish (3)

or even

completed (583)
  accessionWF (583)
    publish (574)
    shelve (583)
  digitizationWF (583)
    initiate (583)
error (3)
  accessionWF (3)
    shelve (3)

etc.

I don't think Solr 4.0's pivot/hierarchical facets are what I'm looking for, because the status values are ambiguous when not qualified by the process name -- the object itself has no "completed" status, only a "publish:completed" and a "shelve:completed" that I want to be able to group together into a count/list of objects with "completed" processes. I also don't think PathHierarchyTokenizerFactory is quite the answer either.

What kind of Solr magic, if any, am I looking for here?

Thanks in advance for any help or advice.

Michael

---
Michael B. Klein
Digitization Workflow Engineer
Stanford University Libraries
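
P.S. In case it helps pin down the counts I'm after, here is a rough sketch in plain Python (not Solr). Everything in it is made up for illustration: the dict layout is a simplification and not the actual format of the gist, and facet_tokens, facet_counts, and the "/" separator are hypothetical names and choices of mine. It only shows how each document would contribute once to every level of a hierarchy, whichever order the levels are arranged in.

```python
from collections import Counter


def facet_tokens(doc, order):
    """Return the set of prefix paths one document contributes.

    `order` names the levels from top to bottom, e.g.
    ("workflow", "process", "status") or ("status", "workflow", "process").
    A set is used so a document counts at most once per path.
    """
    tokens = set()
    for wf_name, processes in doc["workflows"].items():
        for proc_name, status in processes.items():
            levels = {"workflow": wf_name, "process": proc_name, "status": status}
            path = [levels[key] for key in order]
            # Emit every prefix so each level of the hierarchy gets its own count.
            for depth in range(1, len(path) + 1):
                tokens.add("/".join(path[:depth]))
    return tokens


def facet_counts(docs, order):
    """Count how many documents contribute each path."""
    counts = Counter()
    for doc in docs:
        counts.update(facet_tokens(doc, order))
    return counts


# Three toy documents (assumed shape: workflow -> process -> current status).
docs = [
    {"id": "obj1", "workflows": {"accessionWF": {"publish": "completed", "shelve": "completed"}}},
    {"id": "obj2", "workflows": {"accessionWF": {"publish": "waiting", "shelve": "completed"}}},
    {"id": "obj3", "workflows": {"accessionWF": {"publish": "error", "shelve": "completed"}}},
]

by_process = facet_counts(docs, ("workflow", "process", "status"))
by_status = facet_counts(docs, ("status", "workflow", "process"))

for path, count in sorted(by_status.items()):
    print(f"{path} ({count})")
```

Here a top-level "completed" counts a document once even when both publish and shelve are completed, which is the grouping I described above. Whether Solr can produce these counts for me, or whether I need to precompute something like these paths at index time, is exactly what I'm unsure about.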