Implement Custom Soundex

2011-10-16 Thread Momo..Lelo ..
Dear all, Does anyone here have experience developing a custom Soundex? If you have done this and can offer some help or share your experience, I'd really appreciate it.

Re: Implement Custom Soundex

2011-10-16 Thread Gora Mohanty
2011/10/16 Momo..Lelo .. : > > Dear, > > Does anyone there has an experience of developing a custom Soundex. > >  If you have an experience doing this and can offer some help and share > experience I'd really appreciate it. I presume that this is in the context of Solr, and spell-checking. We did

Re: multiple document types in a core

2011-10-16 Thread lee carroll
Hi Chris, thanks for the response > It's an inverted index, so *terms* exist once (per segment) and those terms > "point" to the documents -- so having the same terms (in the same fields) > for multiple types of documents in one index is going to take up less > overall space than having distinct col

Re: Multi CPU Cores

2011-10-16 Thread Rob Brown
Looks like I checked the load during a quiet period, ab -n 1 -c 1000 saw a decent 40% load on each core. Still a little confused as to why 1 core stays at 100% constantly - even during the quiet periods? -- IntelCompute Web Design and Online Marketing http://www.intelcompute.com -Or

Re: Multi CPU Cores

2011-10-16 Thread Ken Krugler
On Oct 16, 2011, at 1:44pm, Rob Brown wrote: > Looks like I checked the load during a quiet period, ab -n 1 -c 1000 > saw a decent 40% load on each core. > > Still a little confused as to why 1 core stays at 100% constantly - even > during the quiet periods? Could be background GC, dependin

RE: Implement Custom Soundex

2011-10-16 Thread Momo..Lelo ..
Dear Gora, Thank you for the quick response. Actually I need to do Soundex for the Arabic language. The code is already written in Java, but I can't work out how to implement it as a Solr filter. Regards, > From: g...@mimirtech.com > Date: Sun, 16 Oct 2011 16:19:48 +0530 > Subject: Re: I

Re: Multi CPU Cores

2011-10-16 Thread Johannes Goll
Try using -XX:+UseParallelGC as a VM option. Johannes On Oct 16, 2011, at 7:51 AM, Ken Krugler wrote: > > On Oct 16, 2011, at 1:44pm, Rob Brown wrote: > >> Looks like I checked the load during a quiet period, ab -n 1 -c 1000 >> saw a decent 40% load on each core. >> >> Still a little confused

Re: Multi CPU Cores

2011-10-16 Thread Rob Brown
Thanks, Java is completely new to me (Perl/C background), so a little guidance would be great with config options like this, while I get to grips with Java... Or pointing to a useful resource to start filling in these gaps too. -Original Message- From: Johannes Goll Reply-to: solr-user

Re: Combine XML data with DIH

2011-10-16 Thread O. Klein
O. Klein wrote: > > > O. Klein wrote: >> >> I have folder with XML files >> >> 1.xml contains: >> http://www.site.com/1.html >> blacontent >> blatitle >> >> 2.xml contains: >> http://www.site.com/1.html >> blatitle2 >> >> I want to create document in Solr: >> >> http://ww

Re: Multi CPU Cores

2011-10-16 Thread Li Li
As far as I know, for indexing you can easily make use of multiple cores by calling IndexWriter.addDocument from multiple threads; for searching, if there is only one request, you can't make good use of the CPUs. On Sat, Oct 15, 2011 at 9:37 PM, Rob Brown wrote: > Hi, > > I'm running Solr on a machine with

Re: Multi CPU Cores

2011-10-16 Thread Johannes Goll
We use the following in production: java -server -XX:+UseParallelGC -XX:+AggressiveOpts -XX:+DisableExplicitGC -Xms3G -Xmx40G -Djetty.port= -Dsolr.solr.home= -jar start.jar More information: http://www.oracle.com/technetwork/java/javase/tech/vmoptions-jsp-140102.html Johannes

help with phrase query

2011-10-16 Thread Vijay Ramachandran
Hello. I have an application where I try to match longer queries (sentences) to short documents (search phrases). Typically, the documents are 3-5 terms in length. I am facing a problem where phrase match in the indicated phrase fields via "pf" doesn't seem to match in most cases, and I am stumped.

Callback on starting solr?

2011-10-16 Thread Jithin
Hi, Is it possible to have a callback after Solr starts listening on the configured port? What I have found is that there is a certain delay before Solr starts listening on the port after a restart. So if I try to reindex during this period, it fails. What I want is a notification

Solr Open File Descriptors

2011-10-16 Thread samarth s
Hi, Is it safe to assume that with a mergeFactor of 10 the open file descriptors required by Solr would be around (1 + 10) * 10 = 110? ref: *http://onjava.com/pub/a/onjava/2003/03/05/lucene.html#indexing_speed* Solr wiki: http://wiki.apache.org/solr/SolrPerformanceFactors#Optimization_Considerationss
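The arithmetic in the question above can be written out as a small helper. This is only a sketch of the poster's own estimate: reading the first 10 as the mergeFactor and the second 10 as an assumed number of files per segment is an interpretation, the names `estimated_open_files` and `files_per_segment` are hypothetical, and nothing here confirms that the estimate is a safe upper bound.

```python
def estimated_open_files(merge_factor, files_per_segment=10):
    """Reproduce the estimate from the question: (1 + mergeFactor) * files.

    files_per_segment=10 is an assumed reading of the second factor in the
    question, not a value taken from Solr or Lucene.
    """
    return (1 + merge_factor) * files_per_segment


if __name__ == "__main__":
    # The case asked about: mergeFactor of 10.
    print(estimated_open_files(10))
```

Changing `merge_factor` or `files_per_segment` shows how quickly the estimate grows, which may be useful when comparing it against the process's file descriptor limit.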

Re: Callback on starting solr?

2011-10-16 Thread Jan Høydahl
Hi, This depends on your application server and config. A very simple option is to let your client poll with a ping request http://localhost:8983/solr/admin/ping/ until it succeeds. -- Jan Høydahl, search solution architect Cominvent AS - www.cominvent.com Solr Training - www.solrtraining.com
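The polling approach suggested above might look roughly like the following client-side sketch. It assumes the ping URL from the message, treats an HTTP 200 answer as "ready", and uses hypothetical names (`wait_for_solr`, `probe`, the timeout and interval values); adjust these to your own setup.

```python
import http.client
import time
import urllib.request


def _http_probe(url):
    """Return True if the URL answers with HTTP 200, False on any failure."""
    try:
        with urllib.request.urlopen(url, timeout=5) as resp:
            return resp.status == 200
    except (OSError, http.client.HTTPException):
        # Connection refused, HTTP errors, timeouts and similar: not ready yet.
        return False


def wait_for_solr(ping_url, timeout_s=60.0, interval_s=1.0, probe=None):
    """Poll ping_url until the probe succeeds or timeout_s has elapsed."""
    probe = probe or _http_probe
    deadline = time.monotonic() + timeout_s
    while True:
        if probe(ping_url):
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval_s)


if __name__ == "__main__":
    ready = wait_for_solr("http://localhost:8983/solr/admin/ping/")
    print("Solr is ready" if ready else "Gave up waiting for Solr")
```

A reindex job could call `wait_for_solr` first and only start sending documents once it returns True. The `probe` parameter exists so the loop can be exercised without a running server.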

Re: Callback on starting solr?

2011-10-16 Thread Jithin
I am doing something similar to that: checking netstat for any connection on the port. I wanted to know if there is anything built in that Solr can do. Also, I notice that my reindex fails when I have to reindex some 7k+ docs. Solr is giving an error in the logs - Caused by: java.net.SocketException: Broken p

Re: Callback on starting solr?

2011-10-16 Thread Jan Høydahl
Your app-server will start listening to the port some time before the Solr webapp is ready, so you should check directly with Solr. You could also use JMX to check Solr's status. If you want help with your reindex failing issue, please provide more context. 25Mb is very low, please try give your

Re: Solr Open File Descriptors

2011-10-16 Thread Shawn Heisey
On 10/16/2011 12:01 PM, samarth s wrote: Hi, Is it safe to assume that with a mergeFactor of 10 the open file descriptors required by solr would be around (1+ 10) * 10 = 110 ref: *http://onjava.com/pub/a/onjava/2003/03/05/lucene.html#indexing_speed* Solr wiki: http://wiki.apache.org/solr/SolrPerf

Re: In-document highlighting DocValues?

2011-10-16 Thread Michael Sokolov
On 10/14/2011 7:20 PM, Jan Høydahl wrote: Hi, The Highlighter is way too slow for this customer's particular use case - which is very large documents. We don't need highlighted snippets for now, but we need to accurately decide what words (offsets) in the real HTML display of the resulting p

Re: Field Collapsing and Record Filtering

2011-10-16 Thread Michael Sokolov
On 10/13/2011 5:04 PM, lee carroll wrote: current: bool //for fq which searches only current versions last_current_at: date time // for date range queries or group sorting what was current for a given date sorry if i've missed a requirement lee c Lee the idea of "last_current_at" is interestin

Re: Multiple search analyzers on the same field type possible?

2011-10-16 Thread Victor van der Wolf
I don't think this will be a problem. I'll contact you tomorrow directly by email for some details. -- View this message in context: http://lucene.472066.n3.nabble.com/Multiple-search-analyzers-on-the-same-field-type-possible-tp3417898p3426678.html Sent from the Solr - User mailing list archive

Question about near query order

2011-10-16 Thread Jason, Kim
Hi all, I have a near query like "analyze term"~2. It is matched in that order, but I want to search regardless of order. So far, I have just queried "analyze term"~2 OR "term analyze"~2. Is there a better way than what I did? Thanks in advance. Jason. -- View this message in context: http://lu
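For reference, the workaround described in the question can be generated rather than typed by hand. This is a sketch of that workaround only, not a better answer to the question; the function name `unordered_near_query` is hypothetical and it handles just the two-term case shown in the message.

```python
def unordered_near_query(term_a, term_b, slop):
    """Build the two-term proximity query in both orders, joined with OR."""
    forward = '"{} {}"~{}'.format(term_a, term_b, slop)
    reverse = '"{} {}"~{}'.format(term_b, term_a, slop)
    return "{} OR {}".format(forward, reverse)


if __name__ == "__main__":
    print(unordered_near_query("analyze", "term", 2))
```

The helper does no escaping of special query characters, so it is only suitable for plain terms like the ones in the example.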

Re: Multi CPU Cores

2011-10-16 Thread Mikhail Khludnev
When I'm puzzled by the JVM's CPU consumption I use the following combo: $ top -H -p gives you the PIDs of the hottest threads; then convert them to hex and find each thread in the output of $ jstack as "nid=0x". Regards On Sun, Oct 16, 2011 at 4:25 PM, Rob Brown wrote: > Thanks, Java is completely new to
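The conversion step in the message above (decimal thread ID from top to the hex "nid=0x" form in the jstack output) can be sketched as follows. The function names are hypothetical, and the dump is assumed to be plain text already captured from jstack; the sample lines in the demo are made up for illustration and are not real output.

```python
def tid_to_nid(tid):
    """Convert a decimal thread ID (as shown by top -H) to jstack's nid form."""
    return "nid=" + hex(tid)


def find_thread_lines(jstack_output, tid):
    """Return the lines of a jstack dump whose nid matches the given thread ID."""
    marker = tid_to_nid(tid)
    return [line for line in jstack_output.splitlines() if marker in line.split()]


if __name__ == "__main__":
    # Illustrative dump fragment, invented for this example.
    sample = (
        '"worker-1" prio=10 tid=0x01 nid=0x3039 runnable\n'
        '"worker-2" prio=10 tid=0x02 nid=0x303a waiting\n'
    )
    print(tid_to_nid(12345))
    print(find_thread_lines(sample, 12345))
```

Matching on whole whitespace-separated tokens avoids confusing one nid with a longer one that merely starts with the same digits.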