Re: port of Nutch CommonGrams to Solr for help with slow phrase queries

2009-03-06 Thread Tom Burton-West
reindexed nightly via DIH from MS-SQL, so we can use a separate cache layer to lower the number of hits to SOLR.
B
_ {Beto|Norberto|Numard} Meijome

Re: port of Nutch CommonGrams to Solr for help with slow phrase queries

2008-11-25 Thread Norberto Meijome
On Wed, 26 Nov 2008 10:08:03 +1100 Norberto Meijome <[EMAIL PROTECTED]> wrote:
> We didn't notice any severe performance hit but :
> - data set isn't huge ( ca 1 MM docs).
> - reindexed nightly via DIH from MS-SQL, so we can use a separate cache layer
> to lower the number of hits to SOLR.
To mak

Re: port of Nutch CommonGrams to Solr for help with slow phrase queries

2008-11-25 Thread Norberto Meijome
On Mon, 24 Nov 2008 13:31:39 -0500 "Burton-West, Tom" <[EMAIL PROTECTED]> wrote:
> The approach to this problem used by Nutch looks promising. Has anyone
> ported the Nutch CommonGrams filter to Solr?
>
> "Construct n-grams for frequently occurring terms and phrases while
> indexing. Optimize phr

Re: port of Nutch CommonGrams to Solr for help with slow phrase queries

2008-11-25 Thread Shalin Shekhar Mangar
Hi Tom,
I don't think anybody has worked on adding this to Solr yet. Do you mind opening a JIRA issue?
On Tue, Nov 25, 2008 at 12:01 AM, Burton-West, Tom <[EMAIL PROTECTED]> wrote:
> Hello all,
>
> We are having problems with extremely slow phrase queries when the
> phrase query contains a common

Re: port of Nutch CommonGrams to Solr for help with slow phrase queries

2008-11-24 Thread Walter Underwood
This technique was used at Infoseek in 1996, and is very effective. It also gives a relevance improvement, because you have an estimate of IDF for phrases (exact for two-word phrases). The terms "the" and "who" will be very common, but "the who" is quite rare and will have a big IDF.
wunder
On 1
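
A minimal sketch of the idea described above, in Python rather than as a Lucene/Solr filter. It is not the Nutch CommonGrams code: the word list, the `common_grams` and `idf` helpers, the underscore-joined bigram tokens, and the four toy documents are all assumptions made for illustration, and the IDF is the plain log(N / df) form rather than whatever a given search engine computes.

```python
import math

# Hypothetical common-word list; a real index would derive it from corpus statistics.
COMMON_WORDS = {"the", "who", "to", "be", "or", "not", "a", "of", "is"}


def common_grams(tokens, common=COMMON_WORDS):
    """Emit every token, plus a joined bigram for each adjacent pair
    in which at least one of the two words is common."""
    out = []
    for i, tok in enumerate(tokens):
        out.append(tok)
        if i + 1 < len(tokens):
            nxt = tokens[i + 1]
            if tok in common or nxt in common:
                out.append(f"{tok}_{nxt}")
    return out


def idf(term, docs):
    """Plain log(N / df) inverse document frequency over tokenized documents."""
    df = sum(1 for d in docs if term in d)
    return math.log(len(docs) / df) if df else float("inf")


# Toy corpus, indexed with the bigram tokens added.
docs = [
    common_grams("the who played live".split()),
    common_grams("who is the singer".split()),
    common_grams("the band that played".split()),
    common_grams("who wrote the song".split()),
]

for term in ("the", "who", "the_who"):
    print(term, round(idf(term, docs), 3))
```

In this toy corpus "the" appears in every document and "who" in three of four, while the single token "the_who" appears in only one, so it gets the largest IDF of the three. A phrase query for "the who" can then be answered by looking up that one rare token instead of intersecting the positions of two very frequent terms.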

port of Nutch CommonGrams to Solr for help with slow phrase queries

2008-11-24 Thread Burton-West, Tom
Hello all,
We are having problems with extremely slow phrase queries when the phrase query contains common words. We are reluctant to just use stop words due to various problems with false hits and some things becoming impossible to search with stop words turned on. (For example "to be or not to