Sincere apologies for the lack of clarity! I'm probably misusing technical terms such as 'category'...
OK, let's assume we have a basic Solr engine that can search and return a list of result URLs. Now, from those pages, I would like to know which terms are mentioned most often, e.g. iPad, Samsung, Candy... The list can be long, but we could decide to output only the top 20 or so.

I'm not sure whether this is more of a 'facet', 'category', or 'cluster' job in Solr terminology.

Remi

On Thursday, February 2, 2012, Chris Hostetter <hossman_luc...@fucit.org> wrote:
>
> : > Another alternative solution would be to add a category field to the
> : > already crawled content.
>
> : > >> Let's say Solr is setup and can return relevant urls. What if I wanted
> : > to get the most cited terms from a predefined list, instead? It could be
> : > from a list of products, names, cities...
>
> You really need to explain your problem more -- i'm having a hard time
> understanding what type of usecase/situation you might be describing.
>
> based on your initial description it seems like you are just asking about
> something like using facet.query to get counts for specific terms; but
> then in your followup the idea of adding categorization to your existing
> index almost smells like a machine learning type problem.
>
> the question is just really too vague to make any guesses at.
>
> please give specific examples of the type of data you are working with,
> the types of requests you want to send, the types of results you want to
> give back from those requests, and the types of results you do *NOT*
> want to get back, so we can understand the boundaries of your problem.
>
>
> -Hoss
>
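
To make the use case more concrete, here is a rough sketch of the facet.query idea mentioned above, as I understand it: one facet.query per term in the predefined list, then keep the terms with the highest counts. The field name `content`, the helper names, and the sample counts are assumptions for illustration only, not something from my actual setup. As far as I can tell, facet.query counts the number of matching documents, not the total number of mentions.

```python
from urllib.parse import urlencode


def build_facet_params(user_query, terms, field="content"):
    """Build Solr request parameters with one facet.query per candidate term.

    `field` is a hypothetical name for the indexed page-text field.
    """
    params = [("q", user_query), ("wt", "json"), ("facet", "true")]
    for term in terms:
        # Quote the term so multi-word entries are treated as phrases.
        params.append(("facet.query", f'{field}:"{term}"'))
    return params


def top_terms(facet_queries, limit=20):
    """Rank the facet_queries section of a Solr JSON response by count."""
    ranked = sorted(facet_queries.items(), key=lambda kv: kv[1], reverse=True)
    return ranked[:limit]


# Hypothetical predefined list of terms (products, names, cities...).
params = build_facet_params("tablet", ["iPad", "Samsung", "Candy"])
query_string = urlencode(params)  # append to the select handler URL
```

The counts would then come back under facet_counts / facet_queries in the response, and `top_terms` would reduce them to the top 20 or so.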