AW: AW: OutOfMemory when batchupdating from SolrJ

2016-02-22 Thread Clemens Wyss DEV
Find attached the relevant part of the batch-update: ... SolrClient solrClient = getSolrClient( coreName, true ); Collection batch = new ArrayList(); while ( elements.hasNext() ) { IIndexableElement elem = elements.next(); SolrInputDocument doc = createSolrDocForElement( elem, provider, locale
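
The snippet above is cut off, so how the original code flushes its batch is not visible. As a minimal sketch of the general idea (send the pending documents every N elements so the list never grows without bound), the following uses only the JDK. `BatchSender`, `sendInBatches` and the `sink` callback are hypothetical names for illustration; in the poster's code the sink would presumably be a call such as `solrClient.add( batch )`.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;
import java.util.function.Consumer;

/** Sketch: drain an iterator in fixed-size batches. Not the poster's actual code. */
final class BatchSender {

    /**
     * Hands elements to the sink in batches of at most batchSize
     * and returns the number of batches handed over.
     * The sink is a stand-in (assumption) for something like solrClient.add(batch).
     */
    static <T> int sendInBatches(Iterator<T> elements, int batchSize, Consumer<List<T>> sink) {
        if (batchSize <= 0) {
            throw new IllegalArgumentException("batchSize must be positive");
        }
        List<T> batch = new ArrayList<>(batchSize);
        int sent = 0;
        while (elements.hasNext()) {
            batch.add(elements.next());
            if (batch.size() == batchSize) {
                sink.accept(new ArrayList<>(batch)); // pass a copy so the sink may keep it
                batch.clear();                       // release references before continuing
                sent++;
            }
        }
        if (!batch.isEmpty()) {                      // flush the final partial batch
            sink.accept(new ArrayList<>(batch));
            sent++;
        }
        return sent;
    }

    public static void main(String[] args) {
        List<Integer> sizes = new ArrayList<>();
        int n = sendInBatches(Arrays.asList(1, 2, 3, 4, 5, 6, 7).iterator(), 3,
                b -> sizes.add(b.size()));
        System.out.println(n + " batches, sizes " + sizes);
    }
}
```

Clearing the list after each hand-over is what keeps memory bounded by the batch size rather than by the total number of elements.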

Re: Boost exact search

2016-02-22 Thread elisabeth benoit
Hello, There was a discussion on this thread about exact match http://www.mail-archive.com/solr-user%40lucene.apache.org/msg118115.html they mention an example on this page https://github.com/cominvent/exactmatch Best regards, Elisabeth 2016-02-19 18:01 GMT+01:00 Loïc Stéphan : > Hello, >

RE: Slow commits

2016-02-22 Thread Adam Neal [Extranet]
Well, I got the numbers wrong: there are actually around 66000 fields on the index. I have restructured the index and there are now around 1500 fields. This has resulted in the commit taking 34 seconds, which is acceptable for my usage; however, it is still significantly slower than the 4.10.2 comm

Re: both way synonyms with ManagedSynonymFilterFactory

2016-02-22 Thread Jan Høydahl
Hi, did you get any further with this? I reproduced your situation with Solr 5.5. I think the issue here is that when the SynonymFilter is created based on the managed map, option “expand” is always set to “false”, while the default for file-based synonym dictionary is “true”. So with expand=fals
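
To make the difference concrete, here is a toy model of what the "expand" option does to one comma-separated line of equivalent synonyms, following the documented behaviour of the synonym filter: with expand=true every term maps to all terms in the list, with expand=false every term is reduced to the first one. This is an illustration only, not Solr's implementation; `SynonymExpandSketch` and `build` are made-up names.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Toy model of the "expand" option for one line of equivalent synonyms. */
final class SynonymExpandSketch {

    /** Maps each term to the terms it is rewritten to under the given expand setting. */
    static Map<String, List<String>> build(List<String> equivalents, boolean expand) {
        List<String> targets = expand
                ? new ArrayList<>(equivalents)                    // every term -> all terms
                : Collections.singletonList(equivalents.get(0));  // every term -> first term only
        Map<String, List<String>> map = new LinkedHashMap<>();
        for (String term : equivalents) {
            map.put(term, targets);
        }
        return map;
    }

    public static void main(String[] args) {
        List<String> line = Arrays.asList("tv", "television");
        System.out.println("expand=true:  " + build(line, true));
        System.out.println("expand=false: " + build(line, false));
    }
}
```

Under this model, "both way" matching falls out of expand=true, while expand=false only works in both directions if the same reduction is applied at index and query time.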

Re: Facet Filter

2016-02-22 Thread Toke Eskildsen
On Mon, 2016-02-22 at 11:48 +0530, Anil wrote: > solr Documentation says docValues=true/false works for only few fields. > will that work on Text field ? No. It might at some point, but so far it is just a feature request: https://issues.apache.org/jira/browse/SOLR-8362 - Toke Eskildsen, State

Frequent connection reset in AbstractFullDistribZkTestBase

2016-02-22 Thread Markus Jelsma
q

RE: Frequent connection reset in AbstractFullDistribZkTestBase

2016-02-22 Thread Markus Jelsma
Hi - we have quite a few unit tests implementing AbstractFullDistribZkTestBase. Since the upgrade to 5.4.1 we frequently see tests failing due to connection reset problems. Is there an issue connected to this problem? Is there something else I can do? Thanks, Markus -Original message---

Sort vs boost

2016-02-22 Thread Anil
Hi, we would like to display recent records on top. There are two ways: 1. boost by create time desc 2. sort by create time desc. I tried both, and both seem to work well. Which one is better in terms of performance? I noticed that sort is better than boost in terms of performance. Please correct me if I am wrong

Re: Facet Filter

2016-02-22 Thread Anil
Thank you. That means we need to create two fields with the same content to support faceting and case-insensitive term search on a field. Agree? Thanks again. Regards, Anil On 22 February 2016 at 16:07, Toke Eskildsen wrote: > On Mon, 2016-02-22 at 11:48 +0530, Anil wrote: > > solr Documentation s

Re: Slow commits

2016-02-22 Thread Susheel Kumar
Adam - how many documents do you have in your index? Thanks, Susheel On Mon, Feb 22, 2016 at 4:37 AM, Adam Neal [Extranet] wrote: > Well I got the numbers wrong, there are actually around 66000 fields on > the index. I have restructured the index and there are now around 1500 > fields. This has r

Re: Slow commits

2016-02-22 Thread Susheel Kumar
Sorry, I see now you mentioned 56K docs which is pretty small. On Mon, Feb 22, 2016 at 8:30 AM, Susheel Kumar wrote: > Adam - how many documents you have in your index? > > Thanks, > Susheel > > On Mon, Feb 22, 2016 at 4:37 AM, Adam Neal [Extranet] > wrote: > >> Well I got the numbers wrong, th

RE: Slow commits

2016-02-22 Thread Adam Neal [Extranet]
Yup, that's correct. Not talking massive amounts of data really. The commit performance difference between 4.10.2 and 5.3.1 is huge in this case. From: Susheel Kumar [susheel2...@gmail.com] Sent: 22 February 2016 13:31 To: solr-user@lucene.apache.org Subjec

AW: AW: OutOfMemory when batchupdating from SolrJ

2016-02-22 Thread Clemens Wyss DEV
> solrClient.add( documents ); // [2] is of course: solrClient.add( batch ); // [2] -Original Message- From: Clemens Wyss DEV [mailto:clemens...@mysign.ch] Sent: Monday, 22 February 2016 09:55 To: solr-user@lucene.apache.org Subject: AW: AW: OutOfMemory when batchupdating from S

Re: Sort vs boost

2016-02-22 Thread Emir Arnautovic
Hi Anil, Decision also depends on your usecase - if you are sure that there will be no cases where documents matches are of different score or you don't care about how well document match query (e.g. all queries will be single term query) then sorting by time is way to go. But, if there is cha

Re: Slow commits

2016-02-22 Thread Yonik Seeley
What are the types of the fields with the highest count? I assume they are indexed. Are they stored, and do they have docValues? -Yonik On Mon, Feb 22, 2016 at 4:37 AM, Adam Neal [Extranet] wrote: > Well I got the numbers wrong, there are actually around 66000 fields on the > index. I have r

Re: AW: AW: OutOfMemory when batchupdating from SolrJ

2016-02-22 Thread Shawn Heisey
On 2/22/2016 1:55 AM, Clemens Wyss DEV wrote: > SolrClient solrClient = getSolrClient( coreName, true ); > Collection batch = new ArrayList(); > while ( elements.hasNext() ) > { > IIndexableElement elem = elements.next(); > SolrInputDocument doc = createSolrDocForElement( elem, provider, locale

RE: Slow commits

2016-02-22 Thread Adam Neal [Extranet]
Highest count is fairly equal between string and text. They are not indexed but stored and no docvalues used From: Yonik Seeley [ysee...@gmail.com] Sent: 22 February 2016 14:40 To: solr-user@lucene.apache.org Subject: Re: Slow commits What are the types o

Re: Slow commits

2016-02-22 Thread Yonik Seeley
On Mon, Feb 22, 2016 at 10:22 AM, Adam Neal [Extranet] wrote: > Highest count is fairly equal between string and text. They are not indexed > but stored and no docvalues used Ah, that's a big clue... I wonder if it's related to stored fields compression. In Solr 5.5 there is a way to tune comp

Solr InputFormat Exist?

2016-02-22 Thread Jamie Johnson
Is there an equivalent of the ESInputFormat ( https://github.com/elastic/elasticsearch-hadoop/blob/03c056142a5ab7422b81bb1f519fd67a9581405f/mr/src/main/java/org/elasticsearch/hadoop/mr/EsInputFormat.java) in Solr or is there any work that is planned in this regard? -Jamie

Re: Sort vs boost

2016-02-22 Thread Anil
Thanks Emir On Feb 22, 2016 7:31 PM, "Emir Arnautovic" wrote: > Hi Anil, > Decision also depends on your usecase - if you are sure that there will be > no cases where documents matches are of different score or you don't care > about how well document match query (e.g. all queries will be single

RE: Slow commits

2016-02-22 Thread Adam Neal [Extranet]
I have also tried it with the fields just indexed and not stored, the performance is the same so I doubt it is related to stored field compression. I can file a JIRA, unfortunately I wont be able to provide example files for this system but I will try and reproduce it with some test data and inc

Re: Facet Filter

2016-02-22 Thread Toke Eskildsen
Anil wrote: > it means to we need to create two fields of same content to support > facet and case insensitive , term search on a field. Agree? As things are now, yes. - Toke Eskildsen

AW: AW: AW: OutOfMemory when batchupdating from SolrJ

2016-02-22 Thread Clemens Wyss DEV
Hi Shawn, important note ahead: I appreciate your help very much! > it's too proprietary for public eyes That's not the reason for not posting all the code. I just tried to extract the "relevant parts" in order to prevent not seeing the forest for the trees. And yes "addMultipleDocuments" ends

Exception SolrServerException: No live SolrServers available to handle this request:

2016-02-22 Thread Mugeesh Husain
I am getting a "no live node" exception and I don't know why. In my schema I define the ro field such as ... { "responseHeader":{ "status":500, "QTime":9, "params":{ "q":"_id:(1 3 2)", "indent":"true", "fl":"ro", "group.ngroups":"true",

very slow frequent updates

2016-02-22 Thread Roland Szűcs
Hi folks, We use SOLR 5.2.1. We have ebooks stored in SOLR. The majority of the fields, like content, author and publisher, do not change at all. Only the price field changes frequently. We let customers run full-text searches, so we indexed the content field. Due to the frequency of the price

SolrCloud, Best performance directly from C

2016-02-22 Thread Robert Brown
Hi, As a pure C user, without wishing to use Java, what's my best approach for managing the SolrCloud environment? I operate a FastCGI environment, so I have the persistence to cache the state of the "cloud". So far I see good utilisation of the collections API being my best bet? Any other

words with spaces within

2016-02-22 Thread Francisco Andrés Fernández
Hi all, I'm extracting some text from PDF. As a result, some important words end up with spaces between characters. I know they are words but I don't know how to make Solr detect and index them. For example, I could have the word "Subtitle" that I want to detect, written like "S u b t i t l e". If I woul

Re: SolrCloud, Best performance directly from C

2016-02-22 Thread Shawn Heisey
On 2/22/2016 1:40 PM, Robert Brown wrote: > As a pure C user, without wishing to use Java, what's my best approach > for managing the SolrCloud environment? The most responsive client you would be able to write would use the C binding for zookeeper, to keep track of clusterstate like CloudSolrClie

Slow HTTP responses Solr on CDH 5.3 release , solr server behind NAT

2016-02-22 Thread Wyatt Rivers
Hey all, I am using OpenStack Sahara to launch a CDH 5.3 cluster. The cluster has an internal network and each node has a floating IP attached to it. Essentially it is sitting behind a router. Here is my problem: any HTTP request coming from outside the internal network takes abo

Re: Exception SolrServerException: No live SolrServers available to handle this request:

2016-02-22 Thread Binoy Dalal
Are you sure all your solr servers are up and listening? If you're using zookeeper, check if zookeeper has your nodes listed in the cluster state. On Tue, 23 Feb 2016, 00:45 Mugeesh Husain wrote: > I am getting no live node exception i dont know why > > In my schema define ro field such as > >

Re: words with spaces within

2016-02-22 Thread Binoy Dalal
Is there some set pattern to how these words occur or do they occur randomly in the text, i.e., somewhere it'll be "subtitle" and somewhere "s u b t i t l e"? On Tue, 23 Feb 2016, 05:01 Francisco Andrés Fernández wrote: > Hi all, > I'm extracting some text from pdf. As result, some important wor

Re: words with spaces within

2016-02-22 Thread Walter Underwood
This happens for fonts where Tika does not have font metrics. Open the document in Adobe Reader, then use document info to find the list of fonts. Then post this question to the Tika list. Fix it in Tika, don’t patch it in Solr. wunder Walter Underwood wun...@wunderwood.org http://observer.wund
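
The advice above is to fix the extraction in Tika rather than patch around it in Solr. Purely as an illustration of what a stopgap pre-processing step could look like before indexing, the sketch below collapses runs of three or more single letters separated by single spaces. `SpacedWordJoiner` and `join` are made-up names, the three-letter threshold is an arbitrary assumption, and the heuristic cannot tell where one spaced-out word ends and the next begins.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Heuristic sketch: rejoin letters that were extracted as "S u b t i t l e". */
final class SpacedWordJoiner {

    // A run of at least three single letters separated by single spaces,
    // bounded on both sides by whitespace or the start/end of the text.
    private static final Pattern SPACED =
            Pattern.compile("(?<!\\S)\\p{L}(?: \\p{L}){2,}(?!\\S)");

    /** Returns the text with each spaced-out run collapsed into one token. */
    static String join(String text) {
        Matcher m = SPACED.matcher(text);
        StringBuffer out = new StringBuffer();
        while (m.find()) {
            String joined = m.group().replace(" ", "");
            m.appendReplacement(out, Matcher.quoteReplacement(joined));
        }
        m.appendTail(out);
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(join("The S u b t i t l e here"));
    }
}
```

Two adjacent spaced-out words separated by a single space would be merged into one token, which is one reason fixing the font metrics at extraction time is the more robust route.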

Fwd: Best Practice to Design Solr Schema in the case of Multistore market place and frequent update

2016-02-22 Thread Sumit Agarwal
Dear All, Please help by providing input on the design requirement sent in the previous mail. Is there any way to search past months of the "lucene-solr-user mailing list archives"? Is it possible to achieve the solution below, merging results from Solr and MySQL? How to search if one field exists in Solr and

SOLR cloud startup pointing to zookeeper ensemble

2016-02-22 Thread bbarani
I downloaded the latest version of SOLR (5.5.0) and also installed ZooKeeper on ports 2181, 2182 and 2183, and it's running fine. Now when I try to start the SOLR instance using the command below, it just shows the help content rather than executing the command. bin/solr start -e cloud -z localhost:2181,l

Re: Exception SolrServerException: No live SolrServers available to handle this request:

2016-02-22 Thread Mugeesh Husain
Hi, the Solr servers are up and listening, and I also checked ZooKeeper's clusterstate.json as below: get /clusterstate.json {} cZxid = 0x10013 ctime = Fri Jan 15 01:40:37 IST 2016 mZxid = 0x11799 mtime = Fri Feb 19 18:59:37 IST 2016 pZxid = 0x10013 cversion = 0 dataVersion = 58 aclVersion

Re: SOLR cloud startup - zookeeper ensemble

2016-02-22 Thread bbarani
OK, when I run the command below it looks like it's ignoring the double quotes. solr start -c -z "localhost:2181,localhost:2182,localhost:2183" -e cloud This interactive session will help you launch a SolrCloud cluster on your local workstation. To begin, how many Solr nodes would you like to run

Re: Exception SolrServerException: No live SolrServers available to handle this request:

2016-02-22 Thread Binoy Dalal
In the cloud section in the admin console, do you see all your shards in a live state? On Tue, 23 Feb 2016, 10:25 Mugeesh Husain wrote: > Hi, > solr servers are up and listening. and i also check zookeeper > clusterstate.json as below > > get /clusterstate.json > {} > cZxid = 0x10013 >

CLOSE_WAIT and high search latency

2016-02-22 Thread Niraj Aswani
Hi, I am on solr 4.8.1 and running master-slave setup with lots of cores (>3K). Internally I maintain an instance of HTTPSolrServer for each core that is reused for querying the respective cores. A request is received by an intermediary tomcat and forwarded to another tomcat running Solr. Over th

Bug? Cannot use rule with free disk because solr is getting wrong free disk size

2016-02-22 Thread Robert C Delorimier
Environment: Centos 6 Solr version: 5.2.1 Java Version: 7 Adding rules to collection creation did not work because solr does not return the correct value for freedisk Example: http://server2:18983/solr/admin/collections?action=CREATE&rule=replica:*,shard:*,freedisk:%3E24&name=search_create_tes

Index time or query time boost, and help with boost syntax

2016-02-22 Thread jimi.hullegard
Hi, We have a use case where we want to influence the score of the documents based on the document type, and I am a bit unsure what is the best way to achieve this. In essence we have about 100.000 documents, of about 15 different document types. And we more or less want to tweak the score diff

Possible reasons for multiple searchers for same core in Solr 4.6.1

2016-02-22 Thread Dhritiman Das
Hi, We are using Solr 4.6.1 in our application and have written some custom plugins/components in many places. We are facing an issue and would like your views on debugging it. Issue: After the application starts up, after some time, we see multiple searchers opened for the same core ( we have seen

Re: Best practices for Solr (how to update jar files safely)

2016-02-22 Thread Ramkumar R. Aiyengar
I side with Toke on this. Enterprise bare metal machines often have hundreds of gigs of memory and tens of CPU cores -- you would have to fit multiple instances in a machine to make use of them to circumvent huge heaps. If this is not a common case now, it could well be in the future the way hardw