Jason,

Thanks so much for your suggestion. This seems to do what I need.

-- David

On Thu, May 16, 2013 at 3:59 PM, Jason Hellman <jhell...@innoventsolutions.com> wrote:

> David,
>
> A Pivot Facet could possibly provide these results by the following syntax:
>
> facet.pivot=category,includes
>
> We would presume that includes is a tokenized field and thus a set of
> facet values would be rendered from the terms resulting from that
> tokenization. This would be nested in each category…and, of course, the
> entire set of documents considered for these facets is constrained by the
> current query.
>
> I think this maps to your requirement.
>
> Jason
>
> On May 16, 2013, at 12:29 PM, David Larochelle <
> dlaroche...@cyber.law.harvard.edu> wrote:
>
> > Is there a way to get aggregate word counts over a subset of documents?
> >
> > For example given the following data:
> >
> > {
> >   "id": "1",
> >   "category": "cat1",
> >   "includes": "The green car.",
> > },
> > {
> >   "id": "2",
> >   "category": "cat1",
> >   "includes": "The red car.",
> > },
> > {
> >   "id": "3",
> >   "category": "cat2",
> >   "includes": "The black car.",
> > }
> >
> > I'd like to be able to get total term frequency counts per category. e.g.
> >
> > <category name="cat1">
> >   <lst name="the">2</lst>
> >   <lst name="car">2</lst>
> >   <lst name="green">1</lst>
> >   <lst name="red">1</lst>
> > </category>
> > <category name="cat2">
> >   <lst name="the">1</lst>
> >   <lst name="car">1</lst>
> >   <lst name="black">1</lst>
> > </category>
> >
> > I was initially hoping to do this within Solr and I tried using the
> > TermFrequencyComponent. This gives term frequencies for individual
> > documents and term frequencies for the entire index but doesn't seem to
> > help with subsets. For example, TermFrequencyComponent would tell me that
> > car occurs 3 times over all documents in the index and 1 time in document 1
> > but not that it occurs 2 times over cat1 documents and 1 time over cat2
> > documents.
> >
> > Is there a good way to use Solr/Lucene to gather aggregate results like
> > this? I've been focusing on just using Solr with XML files but I could
> > certainly write Java code if necessary.
> >
> > Thanks,
> >
> > David
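
For anyone finding this thread later, here is a rough sketch of how the facet.pivot=category,includes suggestion might be used from Python. It only builds the request parameters and flattens a pivot response into per-category counts; it does not contact a server. The function names and the sample response are hypothetical: the response is hand-written for the three example documents, assuming "includes" is lowercased and tokenized at index time, and is not captured Solr output. Note also that facet counts are document counts, not summed term frequencies. The two agree for the sample data only because no word repeats within a single document.

```python
from urllib.parse import urlencode


def build_pivot_params(query="*:*", pivot="category,includes"):
    """Return the query string for a pivot-facet request that returns no rows."""
    return urlencode({
        "q": query,            # the query constrains which documents are faceted
        "rows": 0,             # only the facet section is of interest here
        "wt": "json",
        "facet": "true",
        "facet.pivot": pivot,  # nest 'includes' terms under each 'category'
    })


def counts_per_category(response, pivot="category,includes"):
    """Flatten a facet_pivot section into {category: {term: document_count}}."""
    result = {}
    for cat in response["facet_counts"]["facet_pivot"][pivot]:
        result[cat["value"]] = {
            child["value"]: child["count"] for child in cat.get("pivot", [])
        }
    return result


# Hand-written (hypothetical) response for the three sample documents.
sample_response = {
    "facet_counts": {
        "facet_pivot": {
            "category,includes": [
                {"field": "category", "value": "cat1", "count": 2, "pivot": [
                    {"field": "includes", "value": "car", "count": 2},
                    {"field": "includes", "value": "the", "count": 2},
                    {"field": "includes", "value": "green", "count": 1},
                    {"field": "includes", "value": "red", "count": 1},
                ]},
                {"field": "category", "value": "cat2", "count": 1, "pivot": [
                    {"field": "includes", "value": "black", "count": 1},
                    {"field": "includes", "value": "car", "count": 1},
                    {"field": "includes", "value": "the", "count": 1},
                ]},
            ]
        }
    }
}

if __name__ == "__main__":
    print(build_pivot_params())
    print(counts_per_category(sample_response))
```

If true per-category term frequencies are needed (a word occurring several times in one document counted each time), the pivot counts above would undercount, and some other approach would be required.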