Optimize searches; business is progressing with my Solr site

2011-02-05 Thread Dennis Gearon
Thanks to LOTS of information from you guys, my site is up and working. It's only an API for now; I need to work on my OWN front end, LOL! I have my second customer. I'm finding my general-purpose repository API very useful. I will soon be in the business of optimizing the search engine part.

Re: UIMA Error

2011-02-05 Thread Tommaso Teofili
Hi Darx, The other one in the basic configuration is the AlchemyAPIAnnotator. Cheers, Tommaso 2011/2/6, Darx Oman : > Hi Tommaso > yes my server isn't connected to the internet. > what other UIMA annotators can I run that don't require an internet > connection? >

Re: UIMA Error

2011-02-05 Thread Darx Oman
Hi Tommaso, yes, my server isn't connected to the internet. What other UIMA annotators can I run that don't require an internet connection?

Re: Is there anything like MultiSearcher?

2011-02-05 Thread Roman Chyla
Unless I am wrong, sharding across two cores is done over HTTP and has the limitations as listed at: http://wiki.apache.org/solr/DistributedSearch While MultiSearcher is just a decorator over IndexSearcher - therefore the limitations there would (?) not apply and if indexes reside locally, would be

Re: How to use q.op

2011-02-05 Thread Bill Bell
That sentence would be great to add to the Wiki. I changed the Wiki to add that. On 2/5/11 5:03 PM, "Chris Hostetter" wrote: > >: Dismax uses a strategy called Min-Should-Match which emulates the binary >: operator in the Standard Handler. In a nutshell, this parameter (called mm) >: specifie

Re: How to use q.op

2011-02-05 Thread Chris Hostetter
: Dismax uses a strategy called Min-Should-Match which emulates the binary : operator in the Standard Handler. In a nutshell, this parameter (called mm) : specifies how many of the entered terms need to be present in your matched : documents. You can either specify an absolute number or a percenta

Re: keepword file with phrases

2011-02-05 Thread Bill Bell
OK that makes sense. If you double quote the synonyms file will that help for white space? Bill On 2/5/11 4:37 PM, "Chris Hostetter" wrote: > >: You need to switch the order. Do synonyms and expansion first, then >: shingles.. > >except then he would be building shingles out of all the permut

Re: keepword file with phrases

2011-02-05 Thread Chris Hostetter
: You need to switch the order. Do synonyms and expansion first, then : shingles.. except then he would be building shingles out of all the permutations of "words" in his synonyms -- including the multi-word synonyms. i don't *think* that's what he wants based on his example (but i may be wron

Re: prices

2011-02-05 Thread Lance Norskog
Jonathan- right in one! Using floats for prices will lead to madness. My mortgage UI kept changing the loan's interest rate. On Fri, Feb 4, 2011 at 12:13 PM, Dennis Gearon wrote: > That's a good idea, Yonik. So, fields that aren't stored don't get displayed, > so > the float field in the schema
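The point about floats and prices can be illustrated with a minimal Python sketch. It is not taken from the thread; the helper name `to_cents` is hypothetical, and it assumes prices with at most two decimal places.

```python
from decimal import Decimal

# Binary floats cannot represent most decimal fractions exactly,
# so sums of prices drift away from the value a person expects.
float_total = 0.10 + 0.20
float_matches = (float_total == 0.30)  # the comparison fails for floats

# Option 1: exact decimal arithmetic on string inputs.
decimal_total = Decimal("0.10") + Decimal("0.20")


def to_cents(price: str) -> int:
    """Convert a price string such as "19.99" to whole cents.

    Hypothetical helper: assumes at most two decimal places.
    """
    return int(Decimal(price) * 100)


# Option 2: keep prices as integer cents and format them only for display.
cents_total = to_cents("0.10") + to_cents("0.20")
```

Storing integer cents in an integer field is one common way to keep range queries and sorting exact; whether it suits a given schema is a design decision for the site owner.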

Re: geodist and spatial search

2011-02-05 Thread Yonik Seeley
On Sat, Feb 5, 2011 at 10:59 AM, Estrada Groups wrote: > Use the {!geofilt} param like Grant suggested. IMO, it works the best > especially on larger datasets. Right, use geofilt if you need to restrict to a radius, or bbox if a bounding box is sufficient (which is often the case if you are goin

Re: Is there anything like MultiSearcher?

2011-02-05 Thread Bill Bell
Why not just use sharding across the 2 cores? On 2/5/11 8:49 AM, "Roman Chyla" wrote: >Dear Solr experts, > >Could you recommend some strategies or perhaps tell me if I approach >my problem from a wrong side? I was hoping to use MultiSearcher to >search across multiple indexes in Solr, but there

Re: geodist and spatial search

2011-02-05 Thread Bill Bell
Sure. I just didn't understand why you would use fq={!func}geodist() sfield=store pt=49.45031,11.077721 You would normally use {!geofilt} On 2/5/11 8:59 AM, "Estrada Groups" wrote: >Use the {!geofilt} param like Grant suggested. IMO, it works the best >especially on larger datasets. > >Ada

Re: keepword file with phrases

2011-02-05 Thread Bill Bell
You need to switch the order. Do synonyms and expansion first, then shingles.. Have you tried using analysis.jsp ? On 2/5/11 10:31 AM, "lee carroll" wrote: >Just to add things are going not as expected before the keepword, the >synonym list is not be expanded for shingles I think I don't unders

Re: UIMA Error

2011-02-05 Thread Tommaso Teofili
Hi Darx, are you running it without an internet connection? As the problem seems to be that the OpenCalais service host cannot be resolved. Remember that you can select which UIMA annotators run inside the OverridingParamsAggregateAEDescriptor.xml. Hope this helps. Tommaso 2011/2/5, Darx Oman : >

Re: keepword file with phrases

2011-02-05 Thread lee carroll
Just to add: things are not going as expected before the keepword; the synonym list is not being expanded for shingles. I think I don't understand term position. On 5 February 2011 16:08, lee carroll wrote: > Hi List > I'm trying to achieve the following > > text in "this aisle contains preserves

keepword file with phrases

2011-02-05 Thread lee carroll
Hi List I'm trying to achieve the following text in "this aisle contains preserves and savoury spreads" desired index entry for a field to be used for faceting (ie strict set of normalised terms) is "jams" "savoury spreads" ie two facet terms current set up for the field is

Re: geodist and spatial search

2011-02-05 Thread Estrada Groups
Use the {!geofilt} param like Grant suggested. IMO, it works the best especially on larger datasets. Adam Sent from my iPhone On Feb 4, 2011, at 10:56 PM, Bill Bell wrote: > Why not just: > > q=*:* > fq={!bbox} > sfield=store > pt=49.45031,11.077721 > d=40 > fl=store > sort=geodist() asc >
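The request parameters quoted in this thread can be assembled programmatically. The following is a minimal Python sketch that only builds the query string; the function name `spatial_params` is hypothetical, and the values (`store`, `49.45031,11.077721`, `40`) are the ones quoted in the messages.

```python
from urllib.parse import urlencode, parse_qs


def spatial_params(filter_type: str, sfield: str, pt: str, d: float) -> dict:
    """Build request parameters for a radius (geofilt) or bounding-box (bbox) filter.

    Hypothetical helper; parameter names follow the ones quoted in the thread.
    """
    if filter_type not in ("geofilt", "bbox"):
        raise ValueError("filter_type must be 'geofilt' or 'bbox'")
    return {
        "q": "*:*",
        "fq": "{!%s}" % filter_type,  # local-params filter, e.g. {!geofilt}
        "sfield": sfield,             # the spatial field to filter on
        "pt": pt,                     # centre point as "lat,lon"
        "d": str(d),                  # distance
        "fl": sfield,
        "sort": "geodist() asc",      # nearest first
    }


params = spatial_params("geofilt", "store", "49.45031,11.077721", 40)
query_string = urlencode(params)  # percent-encodes braces, commas and spaces
round_trip = parse_qs(query_string)
```

Switching the first argument to `"bbox"` gives the bounding-box variant from the quoted message; the resulting string would be appended to whatever select URL the installation uses.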

Re: Index Not Matching

2011-02-05 Thread Erick Erickson
One other thing. After blowing away your index and doing a complete reindex, look at the Solr stats page for numDocs and maxDocs. If these numbers are not identical, you're somehow deleting records when reindexing, possibly because the <uniqueKey> in your schema is the same for some documents. Of course this
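Why the two counters diverge can be shown with a toy model. This Python sketch is only an illustration and not how Lucene stores documents: it assumes that every add takes a new slot and that a later document with the same key marks the earlier one as deleted. The function name `simulate_reindex` and the field name `id` are hypothetical.

```python
def simulate_reindex(docs, unique_key="id"):
    """Toy model of adding documents to an empty index.

    Every add consumes a slot (counted in max_doc); a document whose key
    was already seen replaces the earlier one, so it does not raise num_docs.
    """
    live = {}
    max_doc = 0
    for doc in docs:
        max_doc += 1                  # each add takes a new slot
        live[doc[unique_key]] = doc   # same key: the earlier doc is superseded
    return {"num_docs": len(live), "max_doc": max_doc}


# Two of the three documents share the key "1", so one gets overwritten.
stats = simulate_reindex([{"id": "1"}, {"id": "2"}, {"id": "1"}])
deleted = stats["max_doc"] - stats["num_docs"]
```

In this model a non-zero difference after a clean reindex means some keys collided, which is the situation the message describes.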

Is there anything like MultiSearcher?

2011-02-05 Thread Roman Chyla
Dear Solr experts, Could you recommend some strategies or perhaps tell me if I approach my problem from a wrong side? I was hoping to use MultiSearcher to search across multiple indexes in Solr, but there is no such a thing and MultiSearcher was removed according to this post: http://osdir.com/ml/

Re: Performance optimization of Proximity/Wildcard searches

2011-02-05 Thread Otis Gospodnetic
Yes, OS cache mostly remains (obviously index files that are no longer around are going to remain in the OS cache for a while, but will be useless and gradually replaced by new index files). How long warmup takes is not relevant here, but what queries you use to warm up the index and how much you a

Re: How to use q.op

2011-02-05 Thread Savvas-Andreas Moysidis
Hi Bagesh, Dismax uses a strategy called Min-Should-Match which emulates the binary operator in the Standard Handler. In a nutshell, this parameter (called mm) specifies how many of the entered terms need to be present in your matched documents. You can either specify an absolute number or a perce
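As an illustration of the mm parameter described above, here is a minimal Python sketch that builds a dismax query string in which every entered term must match. The function name `dismax_params` is hypothetical, the search string is the one from the related thread, and the choice of `mm=100%` to require all terms is this sketch's assumption about the desired behaviour.

```python
from urllib.parse import urlencode


def dismax_params(user_query: str, mm: str) -> dict:
    """Build request parameters for a dismax query with a min-should-match value.

    Hypothetical helper; mm may be an absolute number ("2") or a percentage ("100%").
    """
    return {"defType": "dismax", "q": user_query, "mm": mm}


# Requiring 100% of the terms behaves like AND-ing them together.
params = dismax_params("water treatment plant", "100%")
query_string = urlencode(params)  # "%" is encoded as %25, spaces as "+"
```

Passing `"2"` instead of `"100%"` would ask for at least two of the three terms, which sits between the OR and AND behaviour of the standard handler.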

AND operator and dismax request handler

2011-02-05 Thread Bagesh Sharma
Hi friends, Please suggest how I can set the query operator to AND for the dismax request handler. My problem is that I am searching for the string "water treatment plant" using the dismax request handler. The query formed is of this type: http://localhost:8884/solr/select/?q=water+treatment+pla

How to use q.op

2011-02-05 Thread Bagesh Sharma
Hi friends, Please tell me how to use q.op for the dismax and standard request handlers. I found that q.op=AND was not working for dismax.

Re: jndi datasource in dataimport

2011-02-05 Thread lee carroll
ah should this work or am i doing something obviously wrong in config in dataimport config what am i doing wrong ? On 5 February 2011 10:16, lee carroll wrote: > Hi list, > > It looks like you can use a jndi datasource in the data import handler. > however i can't find any syntax on this.

jndi datasource in dataimport

2011-02-05 Thread lee carroll
Hi list, It looks like you can use a JNDI datasource in the data import handler; however, I can't find any syntax for this. Where is the best place to look for this? (And can anyone confirm whether JNDI does work in DataImportHandler?)

Re: Highlighting with/without Term Vectors

2011-02-05 Thread Salman Akram
Yea I was going to reply to that thread but then it just slipped out of my mind. :) Actually we have two indexes. One that is used for searching and other for highlighting. Their structure is different too like the 1st one has all the metadata + document contents indexed (just for searching). This

TermVector query using Solr Tutorial

2011-02-05 Thread Ryan Chan
Hello all, I am following this tutorial: http://lucene.apache.org/solr/tutorial.html. I am playing with the TermVector; here are my steps: 1. Launch the example server, java -jar start.jar 2. Index the monitor.xml, java -jar post.jar monitor.xml, which contains the following 3007WFP Dell Wi

Re: Performance optimization of Proximity/Wildcard searches

2011-02-05 Thread Salman Akram
Since all queries return the total count as well, on average a query matches 10% of the total documents. The index I am talking about is around 13 million documents, so that means around 1.3 million documents match on average. Of course all of them won't be overlapping so I am guessing that around 30-50% docume

Re: Performance optimization of Proximity/Wildcard searches

2011-02-05 Thread Salman Akram
Correct me if I am wrong. A commit on the index flushes the Solr caches, but of course the OS cache would still be useful? If an index is updated every hour, then a warm-up that takes less than 5 mins should be more than enough, right? On Sat, Feb 5, 2011 at 7:42 AM, Otis Gospodnetic wrote: > Salman, > > Warm

Re: DataImportHandler: no queries when using entity=something

2011-02-05 Thread Darx Oman
Sorry, add "&clean=false" to the URL: http://solr:8983/solr/dataimport?command=full-import&entity=games&clean=false (this was sent by mistake; it was intended for somebody else)
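Building the import URL from a parameter mapping, rather than typing the separators by hand, avoids malformed query strings. This is a minimal Python sketch using the host, path and parameter values quoted in the message; it only constructs the URL and does not contact a server.

```python
from urllib.parse import urlencode

# Base address as quoted in the message; adjust for the actual installation.
base_url = "http://solr:8983/solr/dataimport"

# clean=false asks the import not to wipe the existing index first.
import_params = {
    "command": "full-import",
    "entity": "games",
    "clean": "false",
}

# urlencode inserts the "&" separators and escapes values where needed.
import_url = base_url + "?" + urlencode(import_params)
```

The resulting string can then be requested with any HTTP client.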

Re: Solr Indexing Performance

2011-02-05 Thread Darx Oman
I indexed 1000 PDF files with the same configuration; it completed in about 32 min.

Re: Spellcheck in solr-nutch integration

2011-02-05 Thread Anurag
First go through the schema.xml file. Look at the different components. On Sat, Feb 5, 2011 at 1:01 PM, 666 [via Lucene] < ml-node+2429702-1399813783-146...@n3.nabble.com > wrote: > Hello Anurag, I'm facing the same problem. Will u please elaborate on how u > solved the problem? It would be great i

Re: Spellcheck in solr-nutch integration

2011-02-05 Thread 666
Hello Anurag, I'm facing the same problem. Will u please elaborate on how u solved the problem? It would be great if u give me a step by step description as I'm new in Solr.