Re: Java heap space error

2014-07-24 Thread Marcello Lorenzi
make any other change than this. The rest of the settings are default. Do I need to set a garbage collection strategy? On Thu, Jul 24, 2014 at 9:49 AM, Marcello Lorenzi <mlore...@sorint.it> wrote: Hi, did you set a garbage collection strategy on your JVM? Marcello

Re: Java heap space error

2014-07-24 Thread Marcello Lorenzi
Hi, did you set a garbage collection strategy on your JVM? Marcello On 07/24/2014 03:32 PM, Ameya Aware wrote: Hi, I am in the process of indexing around 2,00,000 documents. I have increased the Java heap space to 4 GB using the command below: java -Xmx4096M -Xms4096M -jar start.jar Still after indexin
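
A minimal sketch, not taken from the thread, of how one might confirm which heap limit and garbage collectors a JVM actually ends up with for a given set of flags. The class name HeapCheck and its method names are hypothetical; the calls used (Runtime.maxMemory and ManagementFactory.getGarbageCollectorMXBeans) are standard JDK APIs.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper class; it is not part of Solr or of the original thread.
public class HeapCheck {

    /** Maximum heap the JVM will attempt to use, in MiB (reflects -Xmx when set). */
    static long maxHeapMiB() {
        return Runtime.getRuntime().maxMemory() / (1024L * 1024L);
    }

    /** Heap currently in use, in MiB. */
    static long usedHeapMiB() {
        Runtime rt = Runtime.getRuntime();
        return (rt.totalMemory() - rt.freeMemory()) / (1024L * 1024L);
    }

    /** Names of the garbage collectors active in this JVM. */
    static List<String> gcNames() {
        List<String> names = new ArrayList<>();
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            names.add(gc.getName());
        }
        return names;
    }

    public static void main(String[] args) {
        System.out.println("max heap (MiB):  " + maxHeapMiB());
        System.out.println("used heap (MiB): " + usedHeapMiB());
        System.out.println("collectors:      " + gcNames());
    }
}
```

Running it with the same flags as the server, for example java -Xmx4096M -Xms4096M HeapCheck, shows what those flags yield on that JVM; it does not inspect the running Solr process itself.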

Heap size and Solr 4.3

2013-12-16 Thread Marcello Lorenzi
Hi all, we have deployed a new Solr 4.3 instance (2 nodes with SolrCloud) in our production environment, but this morning one node went into an out-of-memory state, and we have noticed that the JVM uses a lot of Old Gen space during its normal lifecycle. What are the items that improve this high usag

Re: SolR vs large PDF

2013-11-27 Thread Marcello Lorenzi
anyway) to offload the PDF parsing amongst as many clients as you can afford. Here's a way to get started: http://searchhub.org/2012/02/14/indexing-with-solrj/ Best, Erick On Wed, Nov 27, 2013 at 10:00 AM, Marcello Lorenzi wrote: Hi All, on our test environment we have implemented a ne

SolR vs large PDF

2013-11-27 Thread Marcello Lorenzi
Hi all, in our test environment we have implemented a new search engine based on Solr 4.3, with 2 instances hosted on different servers and 1 shard present on each servlet container. During some stress tests we noticed a bottleneck in the crawling of large PDF files that blocks the serving of resu

Re: PDF indexing issues

2013-11-18 Thread Marcello Lorenzi
You should check the Apache PDFBox project. A similar question: https://issues.apache.org/jira/browse/PDFBOX-940 2013/11/15 Marcello Lorenzi Hi, during our testing of Apache Solr 4.3, we noticed some errors during PDF indexing: ERROR - 2013-11-15 15:14:2

PDF indexing issues

2013-11-15 Thread Marcello Lorenzi
Hi, during our testing of Apache Solr 4.3, we noticed some errors during PDF indexing: ERROR - 2013-11-15 15:14:26.248; org.apache.pdfbox.pdmodel.font.PDCIDFont; Error: Could not parse predefined CMAP file for 'PDFXC30-Indentity0-UCS2' ERROR - 2013-11-15 15:14:36.108; org.apache.p

Re: Solr xml img parsing exception

2013-11-15 Thread Marcello Lorenzi
there is no matching . -- Jack Krupansky -Original Message- From: Marcello Lorenzi Sent: Thursday, November 14, 2013 9:26 AM To: solr-user@lucene.apache.org Subject: Solr xml img parsing exception Hi, I have installed a Solr 4.3 instance and we have configured manifoldcf to pass web content to

Re: Solr xml img parsing exception

2013-11-14 Thread Marcello Lorenzi
ckson wrote: It looks like bad data. The XML you're sending to Solr looks malformed, so I suspect this is completely outside of Solr's purview. Best, Erick On Thu, Nov 14, 2013 at 9:26 AM, Marcello Lorenzi wrote: Hi, I have installed a Solr 4.3 instance and we have configured man
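
One way to test the "bad data" diagnosis before documents reach Solr is to check each payload for XML well-formedness on the client side. The following is a rough sketch using only the JDK's built-in parser; the class name XmlCheck and the sample strings are hypothetical and are not taken from the thread or from ManifoldCF.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical helper class; it only checks well-formedness, not any Solr schema.
public class XmlCheck {

    /** Returns true if the string parses as well-formed XML, false otherwise. */
    static boolean isWellFormed(String xml) {
        try {
            DocumentBuilder builder =
                    DocumentBuilderFactory.newInstance().newDocumentBuilder();
            // DefaultHandler rethrows fatal errors without printing them to stderr.
            builder.setErrorHandler(new DefaultHandler());
            builder.parse(new InputSource(new StringReader(xml)));
            return true;
        } catch (SAXException e) {
            // Thrown for malformed input, e.g. an element with no matching end tag.
            return false;
        } catch (IOException | ParserConfigurationException e) {
            throw new IllegalStateException("XML parser could not be used", e);
        }
    }

    public static void main(String[] args) {
        // Illustrative inputs only: an unclosed <img> is valid HTML but not XML.
        System.out.println(isWellFormed("<doc><img/></doc>"));
        System.out.println(isWellFormed("<doc><img></doc>"));
    }
}
```

Web pages often contain tags such as img without a closing tag, which is acceptable in HTML but not in XML, so a check of this kind can help narrow down whether the crawler output or Solr is at fault.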

Solr xml img parsing exception

2013-11-14 Thread Marcello Lorenzi
Hi, I have installed a Solr 4.3 instance and we have configured manifoldcf to pass web content to the shard collection, but during the crawling we have noticed many occurrences of this exception: ERROR - 2013-11-14 15:13:57.954; org.apache.solr.common.SolrException; org.apache.solr.common.SolrException: