Re: # open files with SolrCloud

2012-04-21 Thread Gopal Patwa
Forgot to mention: we are not using SolrCloud yet, but we do use the Lucene NRT feature. This issue is happening WITHOUT SolrCloud. On Sat, Apr 21, 2012 at 8:14 PM, Gopal Patwa wrote: > Yonik, we have this same issue in production with a Solr 4 trunk build > running on CentOS, JDK 6 64-bit > > I hav

Re: # open files with SolrCloud

2012-04-21 Thread Gopal Patwa
Yonik, we have this same issue in production with a Solr 4 trunk build running on CentOS, JDK 6 64-bit. I have reported the "java.io.IOException: Map failed" and "Too many open files" issues; it seems there is a searcher leak in Solr: searchers are not being closed and files are being kept open. It would be great

Re: Opposite to MoreLikeThis?

2012-04-21 Thread Lance Norskog
Are these documents classified already? Sounds like it would be much faster to suppress documents with the same tags as your target tags. On Fri, Apr 20, 2012 at 4:16 PM, Darren Govoni wrote: > You could run the MLT for the document in question, then gather all > those doc id's in the MLT results

Re: Storing the md5 hash of pdf files as a field in the index

2012-04-21 Thread Lance Norskog
The SignatureUpdateProcessor implements a smaller, faster cryptohash. It is used by the de-duplication feature. What's the purpose? Do you need the MD5 algorithm, or is any competent cryptohash good enough? On Sat, Apr 21, 2012 at 5:55 AM, wrote: > Hi Otis, > >  thank you very much for the quic

Analyzers and ReuseStrategy in Solr 4

2012-04-21 Thread Dominique Bejean
Hi, I developed a custom analyzer. This analyzer needs to be polymorphous according to the first 4 characters of the text to be analyzed. In order to do this, I implemented my own ReuseStrategy class (NoReuseStrategy), and in the constructor I call super(new NoReuseStrategy()); At Lucene leve

Re: # open files with SolrCloud

2012-04-21 Thread Yonik Seeley
I can reproduce some kind of searcher leak issue here, even w/o SolrCloud, and I've opened https://issues.apache.org/jira/browse/SOLR-3392 -Yonik

RE: Special characters in synonyms.txt on Solr 3.5

2012-04-21 Thread carl.nordenf...@bwinparty.com
Thanks, that worked like a charm. Should've thought about that :) / Carl From: Robert Muir [rcm...@gmail.com] Sent: 20 April 2012 18:21 To: solr-user@lucene.apache.org Subject: Re: Special characters in synonyms.txt on Solr 3.5 On Fri, Apr 20, 2012 at 12:

Re: Storing the md5 hash of pdf files as a field in the index

2012-04-21 Thread kuchenbrett
Hi Otis, thank you very much for the quick response to my question. I'll have a look at your suggested solution. Do you know if there's any documentation about writing such an Update Request Handler or how to trigger it using the Data Import/Tika combination? Thanks. Joe

Re: null pointer error with solr deduplication

2012-04-21 Thread Alexander Aristov
Hi, I might be wrong, but it's your responsibility to ensure doc IDs are unique across shards. Read this page: http://wiki.apache.org/solr/DistributedSearch#Distributed_Searching_Limitations, particularly: - Documents must have a unique key and the unique key must be stored (stored="true" in schema.xm

Re: How to index pdf's content with SolrJ?

2012-04-21 Thread vasuj
I am still not able to index my docs in Solr.