Re: How to search for phrase "IAE_UPC_0001"

2014-08-03 Thread Harald Kirsch
; do not work (returning zero documents) when the doc ref is in the format IAE_UPC_0001 (i.e. using the underscore character as the delimiter). I'm assuming the underscore is a special character, and I have looked at the Solr wiki but can't find anything to say what the problem is
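A minimal sketch of how such a phrase query could be sent, assuming a hypothetical field name `docref` and a hypothetical core URL (neither appears in the thread). It only shows how to quote and URL-encode the phrase; whether the query then matches still depends on how the field's analyzer tokenizes the underscore at index and query time.

```python
from urllib.parse import urlencode


def phrase_query_url(base_url, field, phrase):
    """Build a Solr select URL that searches `field` for an exact phrase."""
    # Escape backslashes and double quotes so the phrase stays one quoted unit.
    escaped = phrase.replace("\\", "\\\\").replace('"', '\\"')
    params = {"q": f'{field}:"{escaped}"', "wt": "json"}
    return f"{base_url}/select?{urlencode(params)}"


# Hypothetical core URL and field name, for illustration only.
url = phrase_query_url(
    "http://localhost:8983/solr/collection1", "docref", "IAE_UPC_0001"
)
print(url)
```

The underscore itself needs no escaping in the query string, so if this returns zero documents the field's analysis chain is the next thing to check.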

Re: Solr vs ElasticSearch

2014-08-03 Thread Harald Kirsch
Unless I missed it, nobody has yet pointed to http://solr-vs-elasticsearch.com/ which seems to be fairly up-to-date. As for performance, I would expect that it is very hard to find one of the two technologies to be generally ahead. Except for plain blunders that may be lurking in the code, I w

Re: SolrCloud without NRT and indexing only on the master

2014-07-30 Thread Harald Kirsch
done at some point? If you don't have NRT, and you set your commit frequency to something reasonably large, then I don't see the "cost" of SolrCloud, but I guess it depends on the frequency of your updates. On 30 July 2014 08:22, Harald Kirsch wrote: Thanks Erick, for the c

Re: SolrCloud without NRT and indexing only on the master

2014-07-30 Thread Harald Kirsch
like. http://lucene.472066.n3.nabble.com/Best-practice-for-rebuild-index-in-SolrCloud-td4054574.html From time to time such recipes are mentioned in the list. On Tue, Jul 29, 2014 at 12:39 PM, Harald Kirsch wrote: Hi all, from the Solr documentation I find two options how replication

SolrCloud without NRT and indexing only on the master

2014-07-29 Thread Harald Kirsch
Hi all, from the Solr documentation I find two options for how replication of an index is handled: a) SolrCloud indexes on master and all slaves in parallel to support NRT (near realtime search) b) Legacy replication where only the master does the indexing and slaves receive index copies onc

Re: java.lang.OutOfMemoryError: Requested array size exceeds VM limit

2014-07-28 Thread Harald Kirsch
Hi, the stack trace points to Tika, which is likely in the process of extracting indexable plain text from some document. Tika's job is one of the dirtiest you can think of in the whole indexing business. We throw all kinds of more or less documented/broken/misguided/ill-designed/cruft/trunc

Re: Java heap Space error

2014-07-23 Thread Harald Kirsch
org.eclipse.jetty.server.session.SessionHandler.doHandle(SessionHandler.java:231) at org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1075) How can I fix this? Thanks, Ameya -- Harald Kirsch Raytion GmbH Kaiser-Friedrich-Ring 74 40547 Duesseldorf Fon +49 211 53883-216 Fax

Re: Solr irregularly having QTime > 50000ms, stracing solr cures the problem

2014-07-20 Thread Harald Kirsch
View this message in context: http://lucene.472066.n3.nabble.com/Solr-irregularly-having-QTime-5ms-stracing-solr-cures-the-problem-tp4146047p4147512.html

Re: Solr irregularly having QTime > 50000ms, stracing solr cures the problem

2014-07-14 Thread Harald Kirsch
This problem seems to disappear completely under load. I started running load tests despite fearing they would be useless. It turns out that there are no more 50000 ms delays under load. Harald. On 09.07.2014 09:50, Harald Kirsch wrote: Good point. I will see if I can get the necessary access

Re: Reference numbers for major page fauls per seconds, index size, query throughput

2014-07-14 Thread Harald Kirsch
fl=id). That should reduce the disk seeks due to assembling the docs. But 4 qps for simple term queries seems very slow at first blush. FWIW, Erick On Thu, Jul 10, 2014 at 7:30 AM, Harald Kirsch wrote: Hi everyone, currently I am taking some performance measurements on a Solr installation and

Re: Solr irregularly having QTime > 50000ms, stracing solr cures the problem

2014-07-14 Thread Harald Kirsch
Thanks IJ for the link. I am not sure this can solve my problem, because I have only one machine in play anyway. Harald. On 12.07.2014 20:49, IJ wrote: Guess - I had the same issues as you. Was resolved http://lucene.472066.n3.nabble.com/Slow-QTimes-5-seconds-for-Small-sized-Collections-td4143

Reference numbers for major page fauls per seconds, index size, query throughput

2014-07-10 Thread Harald Kirsch
Hi everyone, currently I am taking some performance measurements on a Solr installation and I am trying to figure out if what I see mostly fits expectations. The data is as follows: - solr 4.8.1 - 8 million documents - mostly office documents with real text content, stored - index size on dis

Re: Solr irregularly having QTime > 50000ms, stracing solr cures the problem

2014-07-09 Thread Harald Kirsch
d way. Knowing exactly what's happening at the transport level is worth a month of guessing and poking. On Jul 8, 2014, at 3:53 AM, Harald Kirsch wrote: Hi all, This is what happens when I run a regular wget query to log the current number of documents indexed: 2014-07-08:07:23:2

Re: Solr irregularly having QTime > 50000ms, stracing solr cures the problem

2014-07-08 Thread Harald Kirsch
, "Harald Kirsch" wrote: Hi all, This is what happens when I run a regular wget query to log the current number of documents indexed: 2014-07-08:07:23:28 QTime=20 numFound="5720168" 2014-07-08:07:24:28 QTime=12 numFound="5721126" 2014-07-08:07:25:28 QTime=19 numFo

Solr irregularly having QTime > 50000ms, stracing solr cures the problem

2014-07-08 Thread Harald Kirsch
Hi all, This is what happens when I run a regular wget query to log the current number of documents indexed: 2014-07-08:07:23:28 QTime=20 numFound="5720168" 2014-07-08:07:24:28 QTime=12 numFound="5721126" 2014-07-08:07:25:28 QTime=19 numFound="5721126" 2014-07-08:07:27:18 QTime=50071 numFound=
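A small sketch of how a log in this format could be scanned for outliers, assuming one entry per line in the form shown above. The function name `slow_queries` and the 1000 ms threshold are illustrative choices, not something taken from the thread. The last sample line is truncated in the archive snippet, so the pattern treats `numFound` as optional instead of guessing the missing value.

```python
import re

# Timestamp, QTime in ms, and an optional quoted numFound value.
LINE_RE = re.compile(
    r'^(?P<ts>\S+)\s+QTime=(?P<qtime>\d+)(?:\s+numFound="(?P<found>\d+)")?'
)


def slow_queries(lines, threshold_ms=1000):
    """Return (timestamp, qtime_ms, num_found) for entries above the threshold."""
    slow = []
    for line in lines:
        match = LINE_RE.match(line.strip())
        if match is None:
            continue  # skip lines that do not look like log entries
        qtime = int(match.group("qtime"))
        if qtime > threshold_ms:
            found = match.group("found")
            slow.append(
                (match.group("ts"), qtime, int(found) if found is not None else None)
            )
    return slow


sample = [
    '2014-07-08:07:23:28 QTime=20 numFound="5720168"',
    '2014-07-08:07:24:28 QTime=12 numFound="5721126"',
    '2014-07-08:07:25:28 QTime=19 numFound="5721126"',
    # Truncated in the archive snippet; kept exactly as it appears there.
    "2014-07-08:07:27:18 QTime=50071 numFound=",
]

for entry in slow_queries(sample):
    print(entry)
```

Collecting the timestamps of the slow entries makes it easier to line them up with other events on the machine, such as commits or garbage collection pauses.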

Re: SolrCloud shard splitting keeps failing

2013-10-08 Thread Harald Kirsch
Hello Kalle, we noticed the same problem some weeks ago: http://lucene.472066.n3.nabble.com/Share-splitting-at-23-million-documents-gt-OOM-td4085064.html It would be interesting to hear if there is more positive feedback this time. We finally concluded that it may be worth starting with many shar

Share splitting at 23 million documents -> OOM

2013-08-16 Thread Harald Kirsch
Regards, Harald. -- Harald Kirsch Raytion GmbH Kaiser-Friedrich-Ring 74 40547 Duesseldorf Fon +49-211-550266-0 Fax +49-211-550266-19 http://www.raytion.com