Re: mixed index with commongrams

2017-08-03 Thread David Hastings
oices here: > >> 1> live with the differing results until you get done re-indexing > >> 2> index to an offline collection and then use, say, collection > >> aliasing to make the switch atomically. > >> > >> Best, > >> Erick > >>

Re: mixed index with commongrams

2017-08-03 Thread Walter Underwood
>> >> On Thu, Aug 3, 2017 at 8:07 AM, David Hastings >> wrote: >>> Hey all, I have yet to run an experiment to test this but was wondering >> if >>> anyone knows the answer ahead of time. >>> If i have an index built with documents before implementi

Re: mixed index with commongrams

2017-08-03 Thread David Hastings
ction and then use, say, collection > aliasing to make the switch atomically. > > Best, > Erick > > On Thu, Aug 3, 2017 at 8:07 AM, David Hastings > wrote: > > Hey all, I have yet to run an experiment to test this but was wondering > if > > anyone knows the answe

Re: mixed index with commongrams

2017-08-03 Thread Erick Erickson
ll, I have yet to run an experiment to test this but was wondering if > anyone knows the answer ahead of time. > If i have an index built with documents before implementing the commongrams > filter, then enable it, and start adding documents that have the > filter/tokenizer applied, will

mixed index with commongrams

2017-08-03 Thread David Hastings
Hey all, I have yet to run an experiment to test this but was wondering if anyone knows the answer ahead of time. If i have an index built with documents before implementing the commongrams filter, then enable it, and start adding documents that have the filter/tokenizer applied, will searches

RE: CommonGrams

2017-04-11 Thread Markus Jelsma
April 2017 22:18 > To: solr-user@lucene.apache.org > Subject: CommonGrams > > Hi, was wondering if there are any known drawbacks to using the CommonGram > factory, in regards to such features as the "more like this" >

CommonGrams

2017-04-11 Thread David Hastings
Hi, was wondering if there are any known drawbacks to using the CommonGram factory, in regards to such features as the "more like this"

Re: commongrams

2017-02-11 Thread Shawn Heisey
On 2/10/2017 2:55 PM, David Hastings wrote: > of right now has 22 million documents and sits around 360 gb. at this > rate, it would be around a TB index size. is there a common > hardware/software configuration to handle TB size indexes? Memory is the secret to Solr performance. Lots and lots of

commongrams

2017-02-10 Thread David Hastings
Hey All, I followed an old blog post about implementing the common grams, and used the 400 most popular words file on a subset of my data. original index size was 33gb with 2.2 million documents, using the 400, it grew to 96gb. I scaled it down to the 100 most common words and got to about 76gb,

RE: CommonGrams indexing very slow!

2011-04-27 Thread Burton-West, Tom
- From: Salman Akram [mailto:salman.ak...@northbaysolutions.net] Sent: Wednesday, April 27, 2011 1:43 PM To: solr-user@lucene.apache.org Subject: Re: CommonGrams indexing very slow! Thanks for the response. We got it resolved! We made small indexes in bulk using SOLR with Standard File Forma

Re: CommonGrams indexing very slow!

2011-04-27 Thread Salman Akram
g so we can see your mergeFactor and > ramBufferSizeMB settings? > > Tom > > > > All, > > > > > > We have created index with CommonGrams and the final size is around > > 370GB. > > > Everything is working fine but now when we add more documents int

RE: CommonGrams indexing very slow!

2011-04-27 Thread Burton-West, Tom
documents and committing triggers a cascading merge. (But this is a WAG without seeing what's in your indexwriter log) Can you also send your solrconfig so we can see your mergeFactor and ramBufferSizeMB settings? Tom > > All, > > > > We have created index with CommonGram

Re: CommonGrams indexing very slow!

2011-04-27 Thread Salman Akram
you by any chance optimizing? > > Best > Erick > > On Wed, Apr 27, 2011 at 11:04 AM, Salman Akram > wrote: > > All, > > > > We have created index with CommonGrams and the final size is around > 370GB. > > Everything is working fine but now when we add

Re: CommonGrams indexing very slow!

2011-04-27 Thread Erick Erickson
Are you by any chance optimizing? Best Erick On Wed, Apr 27, 2011 at 11:04 AM, Salman Akram wrote: > All, > > We have created index with CommonGrams and the final size is around 370GB. > Everything is working fine but now when we add more documents into index it > takes forever (

CommonGrams indexing very slow!

2011-04-27 Thread Salman Akram
All, We have created index with CommonGrams and the final size is around 370GB. Everything is working fine but now when we add more documents into index it takes forever (almost 12 hours)...seems to change all the segments file in a commit. The same commit used to take few mins with normal index

Re: CommonGrams and SOLR - 1604

2011-01-18 Thread Salman Akram
Anyone? On Mon, Jan 17, 2011 at 7:48 PM, Salman Akram < salman.ak...@northbaysolutions.net> wrote: > Hi, > > I am trying to use CommonGrams with SOLR - 1604 patch but doesn't seem to > work. > > If I don't add {!complexphrase} it uses CommonGramsQueryFilterFact

CommonGrams and SOLR - 1604

2011-01-17 Thread Salman Akram
Hi, I am trying to use CommonGrams with SOLR - 1604 patch but doesn't seem to work. If I don't add {!complexphrase} it uses CommonGramsQueryFilterFactory and proper bi-grams are made but of course doesn't use this patch. If I add {!complexphrase} it simply does it the old

Re: CommonGrams phrase query

2011-01-17 Thread Salman Akram
Ok sorry it was my fault. I wasn't using CommonGramsQueryFilter for query, just had Filter for indexing. The query seems fine now. On Mon, Jan 17, 2011 at 1:44 PM, Salman Akram < salman.ak...@northbaysolutions.net> wrote: > Hi, > > I have made an index using CommonGrams. N

CommonGrams phrase query

2011-01-17 Thread Salman Akram
Hi, I have made an index using CommonGrams. Now when I query "a b" and explain it, SOLR makes it +MultiPhraseQuery(Contents:"(a a_b) b"). Shouldn't it just be searching "a_b"? I am asking this coz even though I am using CommonGrams it's much slower than

Re: port of Nutch CommonGrams to Solr for help with slow phrase queries

2009-03-06 Thread Tom Burton-West
Hi Norberto, After working a bit on trying to port the Nutch CommonGrams code, I ran into lots of dependencies on Nutch and Hadoop. Would it be possible to get more information on how you use shingles (or code)? Are you creating shingles for all two word combinations or using a list of words

Re: port of Nutch CommonGrams to Solr for help with slow phrase queries

2008-11-25 Thread Norberto Meijome
On Wed, 26 Nov 2008 10:08:03 +1100 Norberto Meijome <[EMAIL PROTECTED]> wrote: > We didn't notice any severe performance hit but : > - data set isn't huge ( ca 1 MM docs). > - reindexed nightly via DIH from MS-SQL, so we can use a separate cache layer > to lower the number of hits to SOLR. To mak

Re: port of Nutch CommonGrams to Solr for help with slow phrase queries

2008-11-25 Thread Norberto Meijome
On Mon, 24 Nov 2008 13:31:39 -0500 "Burton-West, Tom" <[EMAIL PROTECTED]> wrote: > The approach to this problem used by Nutch looks promising. Has anyone > ported the Nutch CommonGrams filter to Solr? > > "Construct n-grams for frequently occuring terms and p

Re: port of Nutch CommonGrams to Solr for help with slow phrase queries

2008-11-25 Thread Shalin Shekhar Mangar
"man on the moon" etc.) > > The approach to this problem used by Nutch looks promising. Has anyone > ported the Nutch CommonGrams filter to Solr? > > "Construct n-grams for frequently occuring terms and phrases while > indexing. Optimize phrase queries to use the n-

Re: port of Nutch CommonGrams to Solr for help with slow phrase queries

2008-11-24 Thread Walter Underwood
to various problems with false hits and some things becoming > impossible to search with stop words turned on. (For example "to be or > not to be", "the who", "man in the moon" vs "man on the moon" etc.) > > The approach to this problem used

port of Nutch CommonGrams to Solr for help with slow phrase queries

2008-11-24 Thread Burton-West, Tom
r not to be", "the who", "man in the moon" vs "man on the moon" etc.) The approach to this problem used by Nutch looks promising. Has anyone ported the Nutch CommonGrams filter to Solr? "Construct n-grams for frequently occuring terms and phrases while i