Jason,

Thanks so much for your suggestion. This seems to do what I need.

--

David

On Thu, May 16, 2013 at 3:59 PM, Jason Hellman <
jhell...@innoventsolutions.com> wrote:

> David,
>
> A Pivot Facet could possibly provide these results by the following syntax:
>
> facet.pivot=category,includes
>
> We would presume that includes is a tokenized field and thus a set of
> facet values would be rendered from the terms resulting from that
> tokenization.  This would be nested in each category…and, of course, the
> entire set of documents considered for these facets is constrained by the
> current query.
>
> I think this maps to your requirement.
>
> Jason
>
> On May 16, 2013, at 12:29 PM, David Larochelle <
> dlaroche...@cyber.law.harvard.edu> wrote:
>
> > Is there a way to get aggregate word counts over a subset of documents?
> >
> > For example given the following data:
> >
> >  {
> >    "id": "1",
> >    "category": "cat1",
> >    "includes": "The green car."
> >  },
> >  {
> >    "id": "2",
> >    "category": "cat1",
> >    "includes": "The red car."
> >  },
> >  {
> >    "id": "3",
> >    "category": "cat2",
> >    "includes": "The black car."
> >  }
> >
> > I'd like to be able to get total term frequency counts per category. e.g.
> >
> > <category name="cat1">
> >   <lst name="the">2</lst>
> >   <lst name="car">2</lst>
> >   <lst name="green">1</lst>
> >   <lst name="red">1</lst>
> > </category>
> > <category name="cat2">
> >   <lst name="the">1</lst>
> >   <lst name="car">1</lst>
> >   <lst name="black">1</lst>
> > </category>
> >
> > I was initially hoping to do this within Solr and I tried using the
> > TermVectorComponent. This gives term frequencies for individual
> > documents and term frequencies for the entire index but doesn't seem to
> > help with subsets. For example, TermVectorComponent would tell me that
> > car occurs 3 times over all documents in the index and 1 time in document 1
> > but not that it occurs 2 times over cat1 documents and 1 time over cat2
> > documents.
> >
> > Is there a good way to use Solr/Lucene to gather aggregate results like
> > this? I've been focusing on just using Solr with XML files but I could
> > certainly write Java code if necessary.
> >
> > Thanks,
> >
> > David
>
>
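
For anyone trying the facet.pivot suggestion quoted above, here is a minimal Python sketch of how the request might be built and how the pivot section of a JSON response could be flattened into per-category counts. The select URL, the helper names build_pivot_url and pivot_to_counts, and the sample response are assumptions made for illustration; only the facet.pivot=category,includes parameter comes from the thread.

```python
from urllib.parse import urlencode


def build_pivot_url(select_url, query="*:*"):
    """Build a Solr select URL requesting a category -> includes pivot facet.

    select_url is an assumed endpoint such as
    http://localhost:8983/solr/collection1/select (not given in the thread).
    """
    params = {
        "q": query,            # the pivot is constrained by this query
        "rows": 0,             # only the facet counts are needed
        "wt": "json",
        "facet": "true",
        "facet.pivot": "category,includes",
    }
    return select_url + "?" + urlencode(params)


def pivot_to_counts(response, pivot_key="category,includes"):
    """Flatten facet_counts.facet_pivot into {category: {term: count}}."""
    pivots = response["facet_counts"]["facet_pivot"][pivot_key]
    return {
        outer["value"]: {
            inner["value"]: inner["count"] for inner in outer.get("pivot", [])
        }
        for outer in pivots
    }


# Hand-written sample shaped like a pivot facet response for the three
# documents in the thread (not captured from a real Solr instance).
sample_response = {
    "facet_counts": {
        "facet_pivot": {
            "category,includes": [
                {"field": "category", "value": "cat1", "count": 2, "pivot": [
                    {"field": "includes", "value": "car", "count": 2},
                    {"field": "includes", "value": "the", "count": 2},
                    {"field": "includes", "value": "green", "count": 1},
                    {"field": "includes", "value": "red", "count": 1},
                ]},
                {"field": "category", "value": "cat2", "count": 1, "pivot": [
                    {"field": "includes", "value": "black", "count": 1},
                    {"field": "includes", "value": "car", "count": 1},
                    {"field": "includes", "value": "the", "count": 1},
                ]},
            ]
        }
    }
}

counts = pivot_to_counts(sample_response)
```

Note that facet counts are document counts: each number is how many matching documents contain the term, not how many times the term occurs. For the sample data in the thread the two coincide because no term repeats within a document.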
