Re: Snipets Solr/nutch

2008-04-15 Thread Mike Klaas
On 15-Apr-08, at 1:37 PM, khirb7 wrote: Thank you very much, you are helpful. Regarding Solr, I am using version 1.2.0, which I downloaded from the Apache download mirror http://www.apache.org/dyn/closer.cgi/lucene/solr/ . I did not fully understand what you meant when you said: you're trying to apply a p

Re: too many queries?

2008-04-15 Thread Mike Klaas
On 15-Apr-08, at 5:38 AM, Jonathan Ariel wrote: My index is 4GB on disk. My servers have 8 GB of RAM each (the OS is 32-bit). It is optimized twice a day, and each optimize takes around 15 minutes. The index is updated (commits) every two minutes. There are between 10 and 100 inserts/updates ever

Re: Snipets Solr/nutch

2008-04-15 Thread khirb7
Mike Klaas wrote: > > On 13-Apr-08, at 3:25 AM, khirb7 wrote: >> >> It doesn't work; Solr still uses the default value fragsize=100. Also, >> I am not >> able to specify the regex fragmenter, due to this version problem >> I >> suppose, or the way I am declaring ..> highlighting> >> b

Re: too many queries?

2008-04-15 Thread Otis Gospodnetic
Yeah, lots of evictions and tiny caches. Why not increase them? It looks like you have memory to spare. And since you reopen the searcher so often, you can play with increasing the warm-up time if you want to preserve more cached items from the previous searcher. Evictions are measured in th

Re: filtering search using regex

2008-04-15 Thread Chris Hostetter
Solr doesn't provide any regex based searching features out of the box. There are some regex based query classes in lucene, if you wrote a custom Solr plugin to do the query parsing, you could use them. Your question appears to be an "XY Problem" ... that is: you are dealing with "X", you are

Re: Interleaved results form different sources

2008-04-15 Thread Mike Klaas
first query: q=foo&fq=source:one&rows=5 second query: q=foo&fq=source:two&rows=5 I don't know the answer to your second question, since I don't understand the use case for interleaving two sources anyway (I would try to create scores for the sources that were comparable in some way and c
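A minimal sketch of the client-side interleaving suggested in this thread, assuming the per-source result lists have already been fetched with one filtered query per source, as in the two queries above. The names `build_params` and `interleave` are hypothetical helpers, not part of Solr. The same round-robin merge works for more than two sources: issue one query per source and pass all the lists in.

```python
from itertools import zip_longest
from urllib.parse import urlencode


def build_params(query, source, rows):
    """Build the query string for one source-filtered request (hypothetical helper)."""
    return urlencode({"q": query, "fq": f"source:{source}", "rows": rows})


def interleave(result_lists, limit):
    """Round-robin merge: rank 1 from every source, then rank 2, and so on."""
    missing = object()  # sentinel marking a source that has run out of results
    merged = []
    for rank_group in zip_longest(*result_lists, fillvalue=missing):
        for doc in rank_group:
            if doc is not missing:
                merged.append(doc)
    return merged[:limit]
```

For example, `interleave([top_from_one, top_from_two], 10)` alternates between the two lists; if one source returns fewer documents than the other, the remaining slots are filled from the longer list.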

Re: Interleaved results form different sources

2008-04-15 Thread peter360
How do you get the top N/2 results from each source? What if you have more than 2 sources? Mike Klaas wrote: > > By far the easiest way is to get the top N/2 results from each source > and interleave on the client side. > > regards, > -Mike > > -- View this message in context: http://w

Re: Interleaved results form different sources

2008-04-15 Thread Chris Hostetter
: > We have an index of documents from different sources and we want to make : > sure the results we display are interleaved from the different sources and : > not only ranked based on relevancy. Is there a way to do this? : : By far the easiest way is to get the top N/2 results from each source

Re: Fuzzy queries in dismax specs?

2008-04-15 Thread Chris Hostetter
: I've started implementing something to use fuzzy queries for selected fields : in dismax. The request handler spec looks like this: : :exact~0.7^4.0 stemmed^2.0 that's a pretty cool idea ... usually when people talk about adding support for other querytypes in dismax they mean to the quer

Re: Slow Highlighting -> CopyField maxSize property

2008-04-15 Thread Koji Sekiguchi
Hello Nicolas, Thank you for letting me know about this. Yes, your patch will solve my problem (highlighter performance with large documents). BTW, I posted a similar ticket to solve another problem of mine (hl.alternateField with a large field). https://issues.apache.org/jira/browse/SOLR-516 Thank you again, Koji

Re: too many queries?

2008-04-15 Thread Jonathan Ariel
Thanks. It should be around lookups*1.5, right? Is this measured in bytes? On Tue, Apr 15, 2008 at 11:26 AM, Erik Hatcher <[EMAIL PROTECTED]> wrote: > Filter cache evictions are a big red flag. Try bumping up the size of > your filter cache to avoid regenerating filters. > >Erik > > > >

RE: Slow Highlighting -> CopyField maxSize property

2008-04-15 Thread Nicolas DESSAIGNE
Koji, The patch is now available at https://issues.apache.org/jira/browse/SOLR-538 Tell me if it fits your needs. Nicolas -Original Message- From: Koji Sekiguchi [mailto:[EMAIL PROTECTED] Sent: Friday, March 21, 2008 16:50 To: solr-user@lucene.apache.org Subject: Re: Slow Highlighting -

Re: too many queries?

2008-04-15 Thread Erik Hatcher
Filter cache evictions are a big red flag. Try bumping up the size of your filter cache to avoid regenerating filters. Erik On Apr 15, 2008, at 8:38 AM, Jonathan Ariel wrote: filterCache autowarmCount=256 lookups : 24241 hits : 21575 hitratio : 0.89 inserts : 3708 evictions : 3155
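To see why these numbers are a red flag, here is a minimal sketch that recomputes the ratios from the filterCache statistics quoted above. The function name `cache_ratios` and the "eviction ratio" (evictions divided by inserts) are illustrative assumptions, not values that Solr reports under those names.

```python
def cache_ratios(lookups, hits, inserts, evictions):
    """Return (hit ratio, eviction ratio), each rounded to two decimals."""
    hit_ratio = hits / lookups if lookups else 0.0
    # Share of inserted entries that were later pushed out of the cache.
    eviction_ratio = evictions / inserts if inserts else 0.0
    return round(hit_ratio, 2), round(eviction_ratio, 2)


# filterCache statistics quoted in the message above
print(cache_ratios(lookups=24241, hits=21575, inserts=3708, evictions=3155))
# prints (0.89, 0.85)
```

Roughly 85% of the entries inserted into the cache were evicted again, which is consistent with the advice to increase the cache size.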

Re: too many queries?

2008-04-15 Thread Jonathan Ariel
My index is 4GB on disk. My servers have 8 GB of RAM each (the OS is 32-bit). It is optimized twice a day, and each optimize takes around 15 minutes. The index is updated (commits) every two minutes. There are between 10 and 100 inserts/updates every 2 minutes. The cache configuration is: filterCache

RE: issues with solr

2008-04-15 Thread dudes dudes
thanks for your help Erik, ak > From: [EMAIL PROTECTED] > Subject: Re: issues with solr > Date: Mon, 14 Apr 2008 14:50:34 -0400 > To: solr-user@lucene.apache.org > > There is an "Ant script" section on that mySolr page. > > But there is no need to use a