Hi Erick.

What I meant to say is that we have enough memory to store the shards and,
on top of that, the JVM heaps.

The machine has 400 GB of RAM. I think we have enough.

We have 10 JVMs running on the machine, each of them using 16 GB.

Shard size is about 8 GB.

When we have query or indexing peaks our problems are CPU usage and disk
I/O, but we have a lot of unused memory.
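
To put rough numbers on that (assuming each of the 10 JVMs hosts one ~8 GB
shard replica, which is my understanding of our layout):

    JVM heaps:            10 x 16 GB = 160 GB
    Index data on disk:   10 x  8 GB = ~80 GB
    Left for the OS:      400 - 160  = ~240 GB

so even with every heap fully used, the index data on the box fits in the
OS page cache roughly three times over.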


On 5/7/2017 19:04, "Erick Erickson" <erickerick...@gmail.com> wrote:

> bq: We have enough physical RAM to store full collection and 16Gb for each
> JVM.
>
> That's not quite what I was asking for. Lucene uses MMapDirectory to
> map part of the index into the OS memory space. If you've
> over-allocated the JVM space relative to your physical memory that
> space can start swapping. Frankly I'd expect your query performance to
> die if that was happening so this is a sanity check.
>
> How much physical memory does the machine have and how much memory is
> allocated to _all_ of the JVMs running on that machine?
>
> see: http://blog.thetaphi.de/2012/07/use-lucenes-mmapdirectory-on-64bit.html
>
> Best,
> Erick
>
>
> On Wed, Jul 5, 2017 at 9:41 AM, Antonio De Miguel <deveto...@gmail.com>
> wrote:
> > Hi Erick! Thanks for your response!
> >
> > Our soft commit is 5 seconds. Why does a soft commit generate I/O? That's
> > the first I've heard of it.
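
For reference, a 5 second soft commit is typically configured like the sketch
below in solrconfig.xml; the hard commit interval shown is just an example
value, not necessarily what we run:

    <autoCommit>
      <maxTime>60000</maxTime>          <!-- hard commit: illustrative value -->
      <openSearcher>false</openSearcher>
    </autoCommit>
    <autoSoftCommit>
      <maxTime>5000</maxTime>           <!-- the 5 second soft commit -->
    </autoSoftCommit>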
> >
> >
> > We have enough physical RAM to store the full collection plus 16 GB for
> > each JVM. The collection is relatively small.
> >
> > I've tried (for testing purposes) disabling the transaction log (commenting
> > out <updateLog>)... but the cluster does not come up. I'll try writing it
> > to a separate drive, nice idea...
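
The separate-drive test, as I understand the updateLog options, would be
something like pointing the tlog dir at another disk; the path below is just
a made-up example:

    <updateLog>
      <!-- hypothetical mount point on a separate physical drive -->
      <str name="dir">/mnt/fastdisk/tlog</str>
    </updateLog>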
> >
> >
> >
> >
> >
> >
> >
> >
> > 2017-07-05 18:04 GMT+02:00 Erick Erickson <erickerick...@gmail.com>:
> >
> >> What is your soft commit interval? That'll cause I/O as well.
> >>
> >> How much physical RAM and how much is dedicated to _all_ the JVMs on a
> >> machine? One cause here is that Lucene uses MMapDirectory which can be
> >> starved for OS memory if you use too much JVM, my rule of thumb is
> >> that _at least_ half of the physical memory should be reserved for the
> >> OS.
> >>
> >> Your transaction logs should fluctuate but even out. By that I mean
> >> they should increase in size but every hard commit should truncate
> >> some of them so I wouldn't expect them to grow indefinitely.
> >>
> >> One strategy is to put your tlogs on a separate drive exactly to
> >> reduce contention. You could disable them too at a cost of risking
> >> your data. That might be a quick experiment you could run though,
> >> disable tlogs and see what that changes. Of course I'd do this on my
> >> test system ;).
> >>
> >> But yeah, Solr will use a lot of I/O in the scenario you are outlining
> >> I'm afraid.
> >>
> >> Best,
> >> Erick
> >>
> >> On Wed, Jul 5, 2017 at 8:08 AM, Antonio De Miguel <deveto...@gmail.com>
> >> wrote:
> >> > Thanks, Markus!
> >> >
> >> > We already have SSDs.
> >> >
> >> > About changing topology... we tried 10 shards yesterday, but the system
> >> > became more inconsistent than with the current topology (5x10). I don't
> >> > know why... too much traffic perhaps?
> >> >
> >> > About merge factor... we ran with the default configuration for some
> >> > days, but when a merge occurs the system gets overloaded. We tried a
> >> > mergeFactor of 4 to improve query times and to have smaller merges.
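
For reference, the mergeFactor 4 experiment is configured roughly like the
sketch below (on recent Solr versions mergeFactor maps onto the two
TieredMergePolicy settings; the exact syntax depends on the Solr version):

    <indexConfig>
      <mergePolicyFactory class="org.apache.solr.index.TieredMergePolicyFactory">
        <int name="maxMergeAtOnce">4</int>
        <int name="segmentsPerTier">4</int>
      </mergePolicyFactory>
    </indexConfig>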
> >> >
> >> > 2017-07-05 16:51 GMT+02:00 Markus Jelsma <markus.jel...@openindex.io>:
> >> >
> >> >> Try a mergeFactor of 10 (the default), which should be fine in most
> >> >> cases. If you have an extreme case, either create more shards or
> >> >> consider better hardware (SSDs).
> >> >>
> >> >> -----Original message-----
> >> >> > From:Antonio De Miguel <deveto...@gmail.com>
> >> >> > Sent: Wednesday 5th July 2017 16:48
> >> >> > To: solr-user@lucene.apache.org
> >> >> > Subject: Re: High disk write usage
> >> >> >
> >> >> > Thanks a lot, Alessandro!
> >> >> >
> >> >> > Yes, we have very big dedicated physical machines, with a topology of
> >> >> > 5 shards and 10 replicas per shard.
> >> >> >
> >> >> >
> >> >> > 1. The transaction log files are growing, but not at this rate.
> >> >> >
> >> >> > 2. We've tried values between 300 and 2000 MB... without any visible
> >> >> > results.
> >> >> >
> >> >> > 3. We don't use those features.
> >> >> >
> >> >> > 4. No.
> >> >> >
> >> >> > 5. I've tried both low and high mergeFactors and I think that is the
> >> >> > key point.
> >> >> >
> >> >> > With a low merge factor (around 4) we have a high disk write rate, as
> >> >> > I said previously.
> >> >> >
> >> >> > With a merge factor of 20 the disk write rate decreases, but now,
> >> >> > under high query load (over 1000 qps), the system is overloaded.
> >> >> >
> >> >> > I think that's the expected behaviour :(
> >> >> >
> >> >> >
> >> >> >
> >> >> >
> >> >> > 2017-07-05 15:49 GMT+02:00 alessandro.benedetti <a.benede...@sease.io>:
> >> >> >
> >> >> > > Point 2 was the RAM buffer size:
> >> >> > >
> >> >> > > *ramBufferSizeMB* sets the amount of RAM that may be used by Lucene
> >> >> > > indexing for buffering added documents and deletions before they are
> >> >> > > flushed to the Directory. maxBufferedDocs sets a limit on the number
> >> >> > > of documents buffered before flushing. If both ramBufferSizeMB and
> >> >> > > maxBufferedDocs are set, Lucene will flush based on whichever limit
> >> >> > > is hit first.
> >> >> > >
> >> >> > > <ramBufferSizeMB>100</ramBufferSizeMB>
> >> >> > > <maxBufferedDocs>1000</maxBufferedDocs>
> >> >> > >
> >> >> > >
> >> >> > >
> >> >> > >
> >> >> > > -----
> >> >> > > ---------------
> >> >> > > Alessandro Benedetti
> >> >> > > Search Consultant, R&D Software Engineer, Director
> >> >> > > Sease Ltd. - www.sease.io
> >> >> > > --
> >> >> > > View this message in context:
> >> >> > > http://lucene.472066.n3.nabble.com/High-disk-write-usage-tp4344356p4344386.html
> >> >> > > Sent from the Solr - User mailing list archive at Nabble.com.
> >> >> > >
> >> >> >
> >> >>
> >>
>
