I need "total number of occurrences" across all documents for each term. Imagine this...
Post #1: "I think, therefore I am like you"
Reply #1: "You think too much"
Reply #2: "I think that I think much as you"

Each of those "documents" is put into 'content'. Pretending I don't have stop words, the top term query (not considering dateCreated in this example) would result in something like...

"think": 4
"I": 4
"you": 3
"much": 2
...

Thus, just a "number of documents" approach doesn't work, because if a word occurs more than once in a document it needs to be counted that many times. That seemed to rule out faceting like you mentioned as well as the TermsComponent (which as I understand also only counts "documents").

Thanks,
Andy Pickler

On Mon, Apr 1, 2013 at 4:31 PM, Tomás Fernández Löbbe <tomasflo...@gmail.com> wrote:

> So you have one document per user comment? Why not use faceting plus
> filtering on the "dateCreated" field? That would count "number of
> documents" for each term (so, in your case, if a term is used twice in one
> comment it would only count once). Is that what you are looking for?
>
> Tomás
>
>
> On Mon, Apr 1, 2013 at 6:32 PM, Andy Pickler <andy.pick...@gmail.com>
> wrote:
>
> > Our company has an application that is "Facebook-like" for usage by
> > enterprise customers. We'd like to do a report of "top 10 terms entered
> > by users over (some time period)". With that in mind I'm using the
> > DataImportHandler to put all the relevant data from our database into a
> > Solr 'content' field:
> >
> > <field name="content" type="text_general" indexed="true" stored="false"
> > multiValued="false" required="true" termVectors="true"/>
> >
> > Along with the content is the 'dateCreated' for that content:
> >
> > <field name="dateCreated" type="tdate" indexed="true" stored="false"
> > multiValued="false" required="true"/>
> >
> > I'm struggling with the TermVectorComponent documentation to understand how
> > I can put together a query that answers the 'report' mentioned above.
> > For each document I need each term counted however many times it is entered
> > (content of "I think what I think" would report 'think' as used twice).
> > Does anyone have any insight as to whether I'm headed in the right
> > direction and then what my query would be?
> >
> > Thanks,
> > Andy Pickler
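
To make the difference between "number of documents" and "total number of occurrences" concrete, here is a minimal Python sketch over the three example posts from this thread. It is not a Solr query and does not use any Solr API. The `tokenize` helper is a hypothetical stand-in for the `text_general` analyzer (it just lowercases and splits on non-letters, so "I" shows up as "i"), and the names `total_occurrences` and `document_counts` are made up for illustration.

```python
import re
from collections import Counter

# The three example "documents" from the thread.
docs = [
    "I think, therefore I am like you",
    "You think too much",
    "I think that I think much as you",
]

def tokenize(text):
    # Hypothetical stand-in for a real analyzer: lowercase, keep runs of letters.
    return re.findall(r"[a-z]+", text.lower())

total_occurrences = Counter()  # every occurrence of a term counts
document_counts = Counter()    # each term counts at most once per document

for doc in docs:
    tokens = tokenize(doc)
    total_occurrences.update(tokens)
    document_counts.update(set(tokens))

for term in ("think", "i", "you", "much"):
    print(term, total_occurrences[term], document_counts[term])
```

For "think" the two numbers differ (4 occurrences, but only 3 documents), which is the gap a document-count approach such as faceting leaves for this report. Restricting by dateCreated would amount to filtering `docs` before the loop.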