I have another question: does the number of segments affect the speed of
index updates?

2012/10/10 jame vaalet <jamevaa...@gmail.com>

> Guys,
> thanks for all the inputs. I have been continuing my research to learn more
> about segments in Lucene. Below are my conclusions; please correct me if I
> am wrong.
>
>    1. Segments are independent sub-indexes stored in separate files. While
>    indexing, it is better to create a new segment, as that doesn't require
>    modifying an existing file. Whereas while searching, the fewer the
>    segments the better, since you open x (not exactly x but xn, a value
>    proportional to x) physical files to search if you have x segments in
>    the index.
>    2. Since Lucene has the memory-map concept, for each file/segment in the
>    index a new m-map file is created and mapped to the physical file on
>    disk. Can someone explain or correct this in detail? I am sure there are
>    many people wondering how m-map works while you merge or optimize index
>    segments.
>
>
>
> On 6 October 2012 07:41, Otis Gospodnetic <otis.gospodne...@gmail.com> wrote:
>
> > If I were you.... and not knowing all your details...
> >
> > I would optimize indices that are static (not being modified) and
> > would optimize down to 1 segment.
> > I would do it when search traffic is low.
> >
> > Otis
> > --
> > Search Analytics - http://sematext.com/search-analytics/index.html
> > Performance Monitoring - http://sematext.com/spm/index.html
> >
> >
> > On Fri, Oct 5, 2012 at 4:27 PM, jame vaalet <jamevaa...@gmail.com> wrote:
> > > Hi Erick,
> > > I am in a major dilemma with my index now. I have 8 cores, each around
> > > 300 GB in size; about half of each is deleted documents, and on top of
> > > that each has around 100 segments as well. Do I issue an expungeDeletes
> > > and allow the merge policy to take care of the segments, or optimize
> > > them into a single segment? Search performance is not on par with the
> > > usual Solr speed.
> > > If I have to optimize, what segment count should I choose? My RAM size
> > > is around 120 GB and the JVM heap is around 45 GB (oldGen being 30 GB).
> > > Please advise!
> > >
> > > thanks.
> > >
> > >
> > > On 6 October 2012 00:00, Erick Erickson <erickerick...@gmail.com> wrote:
> > >
> > >> Because eventually you'd run out of file handles. Imagine a
> > >> long-running server with 100,000 segments. Totally
> > >> unmanageable.
> > >>
> > >> I think Shawn was emphasizing that RAM requirements don't
> > >> depend on the number of segments. There are other
> > >> resources that files consume, however.
> > >>
> > >> Best
> > >> Erick
> > >>
> > >> On Fri, Oct 5, 2012 at 1:08 PM, jame vaalet <jamevaa...@gmail.com> wrote:
> > >> > Hi Shawn,
> > >> > thanks for the detailed explanation.
> > >> > I have one doubt: you said it doesn't matter how many segments the
> > >> > index has, but then why does Solr have this merge policy which merges
> > >> > segments frequently? Why can't it leave the segments as they are
> > >> > rather than merging smaller ones into bigger ones?
> > >> >
> > >> > thanks
> > >> > .
> > >> >
> > >> > On 5 October 2012 05:46, Shawn Heisey <s...@elyograg.org> wrote:
> > >> >
> > >> >> On 10/4/2012 3:22 PM, jame vaalet wrote:
> > >> >>
> > >> >>> So imagine I have merged the 150 GB index into a single segment;
> > >> >>> this would make a single segment of 150 GB in memory. When new docs
> > >> >>> are indexed, it wouldn't alter this 150 GB index unless I update or
> > >> >>> delete the older docs, right? Will a 150 GB single segment have
> > >> >>> problems with memory swapping at the OS level?
> > >> >>>
> > >> >>>
> > >> >>
> > >> >> Supplement to my previous reply: the real memory mentioned in the
> > >> >> last paragraph does not include the memory that the OS uses to cache
> > >> >> disk access. If more memory is needed and all the free memory is
> > >> >> being used by the disk cache, the OS will throw away part of the disk
> > >> >> cache (a near-instantaneous operation that should never involve disk
> > >> >> I/O) and give that memory to the application that requests it.
> > >> >>
> > >> >> Here's a very good breakdown of how memory gets used with
> > >> >> MMapDirectory in Solr. It's applicable to any program that uses
> > >> >> memory mapping, not just Solr:
> > >> >>
> > >> >> http://java.dzone.com/articles/use-lucene%E2%80%99s-mmapdirectory
> > >> >>
> > >> >> Thanks,
> > >> >> Shawn
> > >> >>
> > >> >>
> > >> >
> > >> >
> > >> > --
> > >> >
> > >> > -JAME
> > >>
> > >
> > >
> > >
> > > --
> > >
> > > -JAME
> >
>
>
>
> --
>
> -JAME
>
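
To make point 1 in the quoted message a bit more concrete, here is a rough
sketch in plain Python (it does not use any Lucene or Solr API). It groups the
files in an index directory by the segment they appear to belong to, so you can
see how the file count grows with the segment count. It assumes per-segment
files are named "_<segment>.<ext>" or "_<segment>_<suffix>.<ext>"; the helper
names and the extensions in the demo are made up for illustration, and the
placeholder files are empty, not a real index.

```python
import re
import tempfile
from collections import defaultdict
from pathlib import Path

# Assumption: per-segment files look like "_0.fdt" or "_a_1.del".
# Files such as "segments_1" or "write.lock" are not tied to one segment.
SEGMENT_FILE = re.compile(r"^(_[0-9a-z]+)(?:_[^.]*)?\.[^.]+$")


def files_per_segment(index_dir):
    """Group the file names in index_dir by their apparent segment name."""
    groups = defaultdict(list)
    for path in sorted(Path(index_dir).iterdir()):
        match = SEGMENT_FILE.match(path.name)
        if match and path.is_file():
            groups[match.group(1)].append(path.name)
    return dict(groups)


def make_fake_index(index_dir, segment_names, extensions):
    """Create empty placeholder files that only mimic an index layout."""
    for name in segment_names:
        for ext in extensions:
            (Path(index_dir) / f"{name}.{ext}").touch()
    # Commit point file: present in the directory but not a segment file.
    (Path(index_dir) / "segments_1").touch()


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        # Hypothetical layout: 3 segments with 4 files each.
        make_fake_index(tmp, ["_0", "_1", "_2"], ["fdt", "fdx", "tis", "frq"])
        groups = files_per_segment(tmp)
        total = sum(len(names) for names in groups.values())
        print(f"{len(groups)} segments -> {total} per-segment files")
```

Pointing files_per_segment at a real index directory should give a rough idea
of how many files a searcher has to keep open. With the compound file format
the number of files per segment would be much smaller, so treat the result as
an estimate only.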



-- 
from Jun Wang
