Re: Revisiting IDF Problems and Index Slices

2009-08-12 Grant Ingersoll
Some thoughts below, sorry for the late reply... On Aug 6, 2009, at 2:27 PM, Mark Bennett wrote: I'm investigating a problem I bet some of you have hit before, and exploring several options to address it. I suspect that this specific IDF scenario is common enough that it even has a name, t

Re: Revisiting IDF Problems and Index Slices

2009-08-06 Otis Gospodnetic
As soon as I started reading your message I started thinking "common grams", so that is what I would try first, esp. since somebody already did the work of porting that from Nutch to Solr (see Solr JIRA). Otis