Re: Tuning solr for large index with rapid writes

2016-05-02 Thread Stephen Lewis
Thanks for the good suggestions on read traffic. I have been simulating reads through parsing our ELB logs and replaying them from a fleet of test servers acting as frontends using Siege. We are hoping to tune mostly based on exact use case, and so this seems th

Re: Tuning solr for large index with rapid writes

2016-05-02 Thread Erick Erickson
Bram: That works. I try to monitor the number of 0-hit queries when I generate a test set on the theory that those are _usually_ groups of random terms I've selected that aren't a good model. So it's often a sequence like "generate my list, see which ones give 0 results and remove them". Rinse, re
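
The "generate my list, see which ones give 0 results and remove them" step could be sketched roughly as follows. This is only an illustration, not code from the thread: `count_hits` is a hypothetical callable standing in for whatever returns a query's hit count (for example, a request against your own Solr instance), and the sample data is invented.

```python
from typing import Callable, Iterable, List


def drop_zero_hit_queries(
    queries: Iterable[str],
    count_hits: Callable[[str], int],
) -> List[str]:
    """Return the queries that match at least one document, in input order.

    count_hits is a hypothetical hook: it should return the number of
    documents the search engine reports for the given query string.
    """
    return [q for q in queries if count_hits(q) > 0]


if __name__ == "__main__":
    # Invented stand-in for a real index: query -> number of hits.
    fake_index = {"solr tuning": 12, "gc pauses": 3}
    candidates = ["solr tuning", "zxqv plorf", "gc pauses"]
    kept = drop_zero_hit_queries(candidates, lambda q: fake_index.get(q, 0))
    print(kept)
```

Passing the hit counter in as a function keeps the filter independent of any particular client library, so the same loop can be rerun after each regeneration of the list.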

Re: Tuning solr for large index with rapid writes

2016-04-30 Thread Bram Van Dam
> If I'm reading this right, you have 420M docs on a single shard? > Yep, you were reading it right. As Erick mentioned, it's hard to give concrete sizing advice, but we've found 120M to be the magic number. When a shard contains more than 120M documents, performance goes down rapidly & GC pauses
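
As a rough worked example of the arithmetic behind that rule of thumb, the sketch below estimates a shard count from a document count. The 120M default is simply the figure Bram reports for his own indexes, not a general Solr limit, and `shards_needed` is a hypothetical helper name; real sizing still needs the prototyping discussed elsewhere in the thread.

```python
def shards_needed(total_docs: int, docs_per_shard: int = 120_000_000) -> int:
    """Estimate how many shards keep each shard at or below docs_per_shard.

    The default of 120M documents is an empirical number from one poster's
    workload, used here only as an assumption.
    """
    if total_docs < 0 or docs_per_shard <= 0:
        raise ValueError("total_docs must be >= 0 and docs_per_shard > 0")
    # Integer ceiling division; always report at least one shard.
    return max(1, -(-total_docs // docs_per_shard))


if __name__ == "__main__":
    # The 420M-document index from the original question.
    print(shards_needed(420_000_000))
```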

Re: Tuning solr for large index with rapid writes

2016-04-30 Thread Bram Van Dam
On 29/04/16 16:33, Erick Erickson wrote: > You have one huge advantage when doing prototyping, you can > mine your current logs for real user queries. It's actually > surprisingly difficult to generate, say, 10,000 "realistic" queries. And > IMO you need something approaching that number to ensure

Re: Tuning solr for large index with rapid writes

2016-04-29 Thread Erick Erickson
Good luck! You have one huge advantage when doing prototyping, you can mine your current logs for real user queries. It's actually surprisingly difficult to generate, say, 10,000 "realistic" queries. And IMO you need something approaching that number to ensure that your queries don't hit the cac
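
Mining logs for a set of distinct user queries might look something like the sketch below. It assumes the log lines have already been reduced to request URLs (the log format itself is not shown in the thread) and that the query text is carried in Solr's `q` parameter; `distinct_queries` and the sample URLs are hypothetical.

```python
from typing import Iterable, List
from urllib.parse import parse_qs, urlsplit


def distinct_queries(request_urls: Iterable[str], limit: int = 10_000) -> List[str]:
    """Collect up to `limit` distinct values of the q parameter, in first-seen order."""
    seen = set()
    result: List[str] = []
    for url in request_urls:
        # parse_qs decodes percent-escapes and '+' and drops blank values.
        params = parse_qs(urlsplit(url).query)
        for q in params.get("q", []):
            if q not in seen:
                seen.add(q)
                result.append(q)
                if len(result) >= limit:
                    return result
    return result


if __name__ == "__main__":
    sample = [
        "/solr/select?q=solr+tuning&rows=10",
        "/solr/select?q=solr+tuning&rows=20",
        "/solr/select?q=gc+pauses",
    ]
    print(distinct_queries(sample))
```

Deduplicating matters here because replaying the same query repeatedly mostly exercises the caches rather than the index.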

Re: Tuning solr for large index with rapid writes

2016-04-27 Thread Stephen Lewis
> If I'm reading this right, you have 420M docs on a single shard? Yep, you were reading it right. Thanks for your guidance. We will do various prototyping following "the sizing exercise". Best, Stephen On Tue, Apr 26, 2016 at 6:17 PM, Erick Erickson wrote: > > If I'm reading this right, yo

Re: Tuning solr for large index with rapid writes

2016-04-26 Thread Erick Erickson
If I'm reading this right, you have 420M docs on a single shard? If that's true you are pushing the envelope of what I've seen work and be performant. Your OOM errors are the proverbial 'smoking gun' that you're putting too many docs on too few nodes. You say that the document count is "growing qu

Tuning solr for large index with rapid writes

2016-04-26 Thread Stephen Lewis
Hello, I'm looking for some guidance on the best steps for tuning a solr cloud cluster which is heavy on writes. We are currently running a solr cloud fleet composed of one core, one shard, and three nodes. The cloud is hosted in AWS, and each solr node is on its own linux r3.2xl instance with 8 c

Re: Tuning Solr caches with high commit rates (NRT)

2010-12-02 Thread Peter Sturge
> searching. > > When i perform an update. the search-instance dont get the new documents. > when i start a commit on searcher he found it. how can i say the searcher > that he alwas look not only the "old" index. automatic refresh ? XD > -- > View this message in context

Re: Tuning Solr caches with high commit rates (NRT)

2010-12-02 Thread stockii
only the "old" index. automatic refresh ? XD -- View this message in context: http://lucene.472066.n3.nabble.com/Tuning-Solr-caches-with-high-commit-rates-NRT-tp1461275p2005738.html Sent from the Solr - User mailing list archive at Nabble.com.

Re: Tuning Solr caches with high commit rates (NRT)

2010-11-16 Thread Peter Sturge
Many thanks, Peter K. for posting up on the wiki - great! Yes, fc = field cache. Field Collapsing is something very nice indeed, but is entirely different. As Erik mentions in the wiki post, using per-segment faceting can be a huge boon to performance. It does require the latest Solr trunk build

Re: Tuning Solr caches with high commit rates (NRT)

2010-11-15 Thread Koji Sekiguchi
(10/11/16 8:36), Jonathan Rochkind wrote: In Solr 1.4, facet.method=enum DOES work on multi-valued fields, I'm pretty certain. Correct, and I didn't say that facet.method=enum doesn't work for multiValued/tokenized field in my previous mail. I think Koji's explanation is based on before So

Re: Tuning Solr caches with high commit rates (NRT)

2010-11-15 Thread Jonathan Rochkind
Koji Sekiguchi wrote: Usually, you do not need to set facet.method because Solr automatically uses most appropriate facet method for each field type: boolean: TermEnum multiValued/tokenized: UnInvertedField other than those above: FieldCache As I understand it, in Solr 1.4, (and I may NOT un

Re: Tuning Solr caches with high commit rates (NRT)

2010-11-15 Thread Koji Sekiguchi
(10/11/16 6:43), Dennis Gearon wrote: fc='field collapsing'? fc of facet.method=fc stands for Lucene's FieldCache. enum of facet.method=enum stands for Lucene's TermEnum. Usually, you do not need to set facet.method because Solr automatically uses most appropriate facet method for each field t
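
For readers who do want to set the method explicitly, a request's query string could be assembled along these lines. `facet`, `facet.field`, and `facet.method` are Solr request parameters; the helper name `facet_params` and the field name `category` are made up for illustration, and only the two values discussed in this message (`enum` and `fc`) are accepted by the sketch.

```python
from typing import Optional
from urllib.parse import urlencode


def facet_params(query: str, field: str, method: Optional[str] = None) -> str:
    """Build a URL-encoded query string that facets on one field.

    If method is None, facet.method is omitted so Solr picks its default
    for the field type, as described above.
    """
    if method not in (None, "enum", "fc"):
        raise ValueError("method must be None, 'enum', or 'fc'")
    params = [("q", query), ("facet", "true"), ("facet.field", field)]
    if method is not None:
        params.append(("facet.method", method))
    return urlencode(params)


if __name__ == "__main__":
    # 'category' is a hypothetical field name.
    print(facet_params("solr", "category", "enum"))
```

Leaving `method` unset is the usual choice, and overriding it is mainly useful for comparing the two methods under load.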

Re: Tuning Solr caches with high commit rates (NRT)

2010-11-15 Thread Peter Karich
Mon, November 15, 2010 1:37:00 PM Subject: Re: Tuning Solr caches with high commit rates (NRT) Hi Jonathan, I am too using fc because it simply was faster. Not sure if this can be applied in general. I will add this info to the wiki. Regards, Peter. Awesome. I'm not sure his point 1 abo

Re: Tuning Solr caches with high commit rates (NRT)

2010-11-15 Thread Jonathan Rochkind
you do not have to make them yourself. from 'http://blogs.techrepublic.com.com/security/?p=4501&tag=nl.e036' EARTH has a Right To Life, otherwise we all die. - Original Message From: Peter Karich To: solr-user@lucene.apache.org Sent: Mon, November 15, 2010 1:37:00 PM Subject: Re: Tuning

Re: Tuning Solr caches with high commit rates (NRT)

2010-11-15 Thread Dennis Gearon
ecurity/?p=4501&tag=nl.e036' EARTH has a Right To Life, otherwise we all die. - Original Message From: Peter Karich To: solr-user@lucene.apache.org Sent: Mon, November 15, 2010 1:37:00 PM Subject: Re: Tuning Solr caches with high commit rates (NRT) Hi Jonathan, I am too using

Re: Tuning Solr caches with high commit rates (NRT)

2010-11-15 Thread Peter Karich
Hi Jonathan, I am too using fc because it simply was faster. Not sure if this can be applied in general. I will add this info to the wiki. Regards, Peter. Awesome. I'm not sure his point 1 about facet.method=enum is still valid in Solr 1.4+. The "fc" facet.method was changed significantly

Re: Tuning Solr caches with high commit rates (NRT)

2010-11-15 Thread Jonathan Rochkind
Awesome. I'm not sure his point 1 about facet.method=enum is still valid in Solr 1.4+. The "fc" facet.method was changed significantly in 1.4, and generally no longer takes a lot of memory -- for facets with "many" unique values, method fc in fact should take less than enum, I think? Peter Ka

Re: Tuning Solr caches with high commit rates (NRT)

2010-11-15 Thread Peter Karich
Just in case someone is interested: I put the emails of Peter Sturge with some minor edits in the wiki: http://wiki.apache.org/solr/NearRealtimeSearchTuning I found myself searching the thread again and again ;-) Feel free to add and edit content! Regards, Peter. Hi Erik, I thought this woul

Re: Tuning Solr caches with high commit rates (NRT)

2010-10-11 Thread Anders Melchiorsen
Hi, why do you need to change the lockType? Does a readonly instance need locks at all? thanks, Anders. On Tue, 14 Sep 2010 15:00:54 +0200, Peter Karich wrote: > Peter Sturge, > > this was a nice hint, thanks again! If you are here in Germany anytime I > can invite you to a beer or an apfels

Re: Tuning Solr

2010-10-05 Thread Jay Hill
Removing those components is not likely to impact performance very much, if at all. I would focus on other areas when tuning performance, such as looking at memory usage and configuration, query design, etc. But there isn't any harm in removing them either. Why not do some load tests with the componen

Tuning Solr

2010-10-04 Thread Floyd Wu
Hi there, if I don't need MoreLikeThis, spellcheck, or highlighting, can I remove those configuration sections from solrconfig.xml? In other words, does Solr load and use these SearchComponents on startup and during runtime? Will removing this configuration speed up queries? Thanks

Re: Tuning solr

2010-10-01 Thread Gora Mohanty
On Sat, Oct 2, 2010 at 5:21 AM, Stavros Korokithakis wrote: [...] > Is there a guide for tuning solr somewhere? We have about a million > documents (the documents are 8 fields, one of which is the full text of > webpages) and we'd like to give solr a bit more memory and generally

Tuning solr

2010-10-01 Thread Stavros Korokithakis
Hello all, I'm sure this has been asked many times before, but I couldn't find anything from browsing a few months: Is there a guide for tuning solr somewhere? We have about a million documents (the documents are 8 fields, one of which is the full text of webpages) and we'd like

RE: Tuning Solr caches with high commit rates (NRT)

2010-09-30 Thread Bruce Ritchie
> One strategy that I like, but haven't found in discussion lists is > auto-limiting cache size/warming based on available resources (similar > to the way file system caches use free memory). This would allow > caches to adjust to their memory environment as indexes grow. I've written such a cache

Re: Tuning Solr caches with high commit rates (NRT)

2010-09-17 Thread Peter Sturge
> From: Erick Erickson >> Subject: Re: Tuning Solr caches with high commit rates (NRT) >> To: solr-user@lucene.apache.org >> Date: Friday, September 17, 2010, 1:05 PM >> Near Real Time... >> >> Erick >> >> On Fri, Sep 17, 2010 at 12:55 PM, Dennis Gearon wrote

Re: Tuning Solr caches with high commit rates (NRT)

2010-09-17 Thread Andy
Does Solr use Lucene NRT? --- On Fri, 9/17/10, Erick Erickson wrote: > From: Erick Erickson > Subject: Re: Tuning Solr caches with high commit rates (NRT) > To: solr-user@lucene.apache.org > Date: Friday, September 17, 2010, 1:05 PM > Near Real Time... > > Erick > >

Re: Tuning Solr caches with high commit rates (NRT)

2010-09-17 Thread Dennis Gearon
rick Erickson > Subject: Re: Tuning Solr caches with high commit rates (NRT) > To: solr-user@lucene.apache.org > Date: Friday, September 17, 2010, 10:05 AM > Near Real Time... > > Erick > > On Fri, Sep 17, 2010 at 12:55 PM, Dennis Gearon wrote: > > > BTW, what i

Re: Tuning Solr caches with high commit rates (NRT)

2010-09-17 Thread Erick Erickson
> Laugh at http://www.yert.com/film.php > > > --- On Fri, 9/17/10, Peter Sturge wrote: > > > From: Peter Sturge > > Subject: Re: Tuning Solr caches with high commit rates (NRT) > > To: solr-user@lucene.apache.org > > Date: Friday, September 17, 2010, 2:18 AM > >

Re: Tuning Solr caches with high commit rates (NRT)

2010-09-17 Thread Dennis Gearon
BTW, what is NRT? Dennis Gearon Signature Warning EARTH has a Right To Life, otherwise we all die. Read 'Hot, Flat, and Crowded' Laugh at http://www.yert.com/film.php --- On Fri, 9/17/10, Peter Sturge wrote: > From: Peter Sturge > Subject: Re: Tuning

Re: Tuning Solr caches with high commit rates (NRT)

2010-09-17 Thread Peter Sturge
Hi, It's great to see such a fantastic response to this thread - NRT is alive and well! I'm hoping to collate this information and add it to the wiki when I get a few free cycles (thanks Erik for the heads up). In the meantime, I thought I'd add a few tidbits of additional information that might

Re: Tuning Solr caches with high commit rates (NRT)

2010-09-14 Thread Peter Karich
Peter Sturge, this was a nice hint, thanks again! If you are here in Germany anytime I can invite you to a beer or an apfelschorle ! :-) I only needed to change the lockType to none in the solrconfig.xml, disable the replication and set the data dir to the master data dir! Regards, Peter Karich.

Re: Tuning Solr caches with high commit rates (NRT)

2010-09-14 Thread Peter Karich
Hi Peter, this scenario would be really great for us - I didn't know that this is possible and works, so: thanks! At the moment we are doing similar with replicating to the readonly instance but the replication is somewhat lengthy and resource-intensive at this datavolume ;-) Regards, Peter. > 1

Re: Tuning Solr caches with high commit rates (NRT)

2010-09-13 Thread Dennis Gearon
our question > > simon > > > > I've only heard about them in the last 2 weeks here on > the list. > > Dennis Gearon > > > > Signature Warning > > > > EARTH has a Right To Life, > >  otherwise we all die. > > > > Re

Re: Tuning Solr caches with high commit rates (NRT)

2010-09-13 Thread Simon Willnauer
- On Sun, 9/12/10, Jason Rutherglen wrote: > >> From: Jason Rutherglen >> Subject: Re: Tuning Solr caches with high commit rates (NRT) >> To: solr-user@lucene.apache.org >> Date: Sunday, September 12, 2010, 7:52 PM >> Yeah there's no patch... I think >>

Re: Tuning Solr caches with high commit rates (NRT)

2010-09-13 Thread Peter Sturge
therglen wrote: > >> From: Jason Rutherglen >> Subject: Re: Tuning Solr caches with high commit rates (NRT) >> To: solr-user@lucene.apache.org >> Date: Sunday, September 12, 2010, 7:52 PM >> Yeah there's no patch... I think >> Yonik can write it. :-)  Ya

Re: Tuning Solr caches with high commit rates (NRT)

2010-09-13 Thread Peter Sturge
Hi Erik, I thought this would be good for the wiki, but I've not submitted to the wiki before, so I thought I'd put this info out there first, then add it if it was deemed useful. If you could let me know the procedure for submitting, it probably would be worth getting it into the wiki (couldn't d

Re: Tuning Solr caches with high commit rates (NRT)

2010-09-13 Thread Peter Sturge
1. You can run multiple Solr instances in separate JVMs, with both having their solr.xml configured to use the same index folder. You need to be careful that one and only one of these instances will ever update the index at a time. The best way to ensure this is to use one for writing only, and the

Re: Tuning Solr caches with high commit rates (NRT)

2010-09-13 Thread Peter Sturge
The balanced segment merging is a really cool idea. I'll definitely have a look at this, thanks! One thing I forgot to mention in the original post is we use a mergeFactor of 25. Somewhat on the high side, so that incoming commits aren't trying to merge new data into large segments. 25 is a good b

Re: Tuning Solr caches with high commit rates (NRT)

2010-09-12 Thread Dennis Gearon
9/12/10, Jason Rutherglen wrote: > From: Jason Rutherglen > Subject: Re: Tuning Solr caches with high commit rates (NRT) > To: solr-user@lucene.apache.org > Date: Sunday, September 12, 2010, 7:52 PM > Yeah there's no patch... I think > Yonik can write it. :-) Yah... The >

Re: Tuning Solr caches with high commit rates (NRT)

2010-09-12 Thread Jason Rutherglen
Yeah there's no patch... I think Yonik can write it. :-) Yah... The Lucene version shouldn't matter. The distributed faceting theoretically can easily be applied to multiple segments, however the way it's written for me is a challenge to untangle and apply successfully to a working patch. Also I

Re: Tuning Solr caches with high commit rates (NRT)

2010-09-12 Thread Chris Haggstrom
Thanks, Peter. This is really great info. One setting I've found to be very useful for the problem of overlapping onDeckSearchers is to reduce the value of maxWarmingSearchers in solrconfig.xml. I've reduced this to 1, so if a slave is already busy doing pre-warming, it won't try to also pre-

Re: Tuning Solr caches with high commit rates (NRT)

2010-09-12 Thread Lance Norskog
Bravo! Other tricks: here is a policy for deciding when to merge segments that attempts to balance merging with performance. It was contributed by LinkedIn- they also run index&search in the same instance (not Solr, a different Lucene app). lucene/contrib/misc/src/java/org/apache/lucene/inde

Re: Tuning Solr caches with high commit rates (NRT)

2010-09-12 Thread Peter Sturge
Hi Jason, I've tried some limited testing with the 4.x trunk using fcs, and I must say, I really like the idea of per-segment faceting. I was hoping to see it in 3.x, but I don't see this option in the branch_3x trunk. Is your SOLR-1606 patch referred to in SOLR-1617 the one to use with 3.1? There

Re: Tuning Solr caches with high commit rates (NRT)

2010-09-12 Thread Jason Rutherglen
Peter, Are you using per-segment faceting, eg, SOLR-1617? That could help your situation. On Sun, Sep 12, 2010 at 12:26 PM, Peter Sturge wrote: > Hi, > > Below are some notes regarding Solr cache tuning that should prove > useful for anyone who uses Solr with frequent commits (e.g. <5min). > >

Re: Tuning Solr caches with high commit rates (NRT)

2010-09-12 Thread Peter Karich
Peter, thanks a lot for your in-depth explanations! Your findings will be definitely helpful for my next performance improvement tests :-) Two questions: 1. How would I do that: > or a local read-only instance that reads the same core as the indexing > instance (for the latter, you'll need som

Re: Tuning Solr caches with high commit rates (NRT)

2010-09-12 Thread Dennis Gearon
Peter Sturge > Subject: Tuning Solr caches with high commit rates (NRT) > To: solr-user@lucene.apache.org > Date: Sunday, September 12, 2010, 9:26 AM > Hi, > > Below are some notes regarding Solr cache tuning that > should prove > useful for anyone who uses Sol

Re: Tuning Solr caches with high commit rates (NRT)

2010-09-12 Thread Erick Erickson
Peter: This kind of information is extremely useful to document, thanks! Do you have the time/energy to put it up on the Wiki? Anyone can edit it by creating a logon. If you don't, would it be OK if someone else did it (with attribution, of course)? I guess that by bringing it up I'm volunteering

Tuning Solr caches with high commit rates (NRT)

2010-09-12 Thread Peter Sturge
Hi, Below are some notes regarding Solr cache tuning that should prove useful for anyone who uses Solr with frequent commits (e.g. <5min). Environment: Solr 1.4.1 or branch_3x trunk. Note the 4.x trunk has lots of neat new features, so the notes here are likely less relevant to the 4.x environmen

Re: Help with tuning solr

2007-02-13 Thread Mike Klaas
On 2/13/07, Ian Meyer <[EMAIL PROTECTED]> wrote: On 2/13/07, Yonik Seeley <[EMAIL PROTECTED]> wrote: > Yes, sorting by fields does take up memory (the fieldcache). > 256M is pretty small for a 5M doc index. > If you have any more memory slots, spring for some more memory (a > little over $100 for

Re: Help with tuning solr

2007-02-13 Thread Ian Meyer
On 2/13/07, Yonik Seeley <[EMAIL PROTECTED]> wrote: Yes, sorting by fields does take up memory (the fieldcache). 256M is pretty small for a 5M doc index. If you have any more memory slots, spring for some more memory (a little over $100 for 1GB). Yeah, I'll see if I can give solr a bit more.

Re: Help with tuning solr

2007-02-13 Thread Yonik Seeley
Yes, sorting by fields does take up memory (the fieldcache). 256M is pretty small for a 5M doc index. If you have any more memory slots, spring for some more memory (a little over $100 for 1GB). Lucene also likes to have free memory left over available for OS cache - otherwise searches start to

Help with tuning solr

2007-02-13 Thread Ian Meyer
All, I'm having some performance issues with solr. I will give some background on our setup and implementation of solr. I'm completely open to reworking everything if the way we are currently doing things is not optimal. I'll try to be as verbose as I can in explaining all of this, but feel free