Guys, thanks for all the input. I have been continuing my research to learn more about segments in Lucene. Below are my conclusions; please correct me if I am wrong.

1. Segments are independent sub-indexes stored in separate files. While indexing, it is better to create a new segment, since that does not require modifying an existing file. While searching, on the other hand, the fewer the segments the better, since with x segments in the index you open a number of physical files proportional to x (not exactly x, but x*n).

2. Since Lucene has the memory-map concept (MMapDirectory), a new memory-mapped file is created for each file/segment in the index and mapped to the physical file on disk.

Can someone explain or correct this in detail? I am sure there are many people wondering how mmap works while you merge or optimize index segments.

On 6 October 2012 07:41, Otis Gospodnetic <otis.gospodne...@gmail.com> wrote:

> If I were you.... and not knowing all your details...
>
> I would optimize indices that are static (not being modified) and
> would optimize down to 1 segment.
> I would do it when search traffic is low.
>
> Otis
> --
> Search Analytics - http://sematext.com/search-analytics/index.html
> Performance Monitoring - http://sematext.com/spm/index.html
>
>
> On Fri, Oct 5, 2012 at 4:27 PM, jame vaalet <jamevaa...@gmail.com> wrote:
> > Hi Eric,
> > I am in a major dilemma with my index now. I have got 8 cores, each around
> > 300 GB in size; half of the documents in them are deleted, and on top of
> > that each has got around 100 segments as well. Do I issue an expungeDelete
> > and allow the merge policy to take care of the segments, or optimize them
> > into a single segment? Search performance is not on par with the usual Solr
> > speed.
> > If I have to optimize, what segment number should I choose? My RAM size is
> > around 120 GB and the JVM heap is around 45 GB (oldGen being 30 GB). Please
> > advise!
> >
> > thanks.
> >
> >
> > On 6 October 2012 00:00, Erick Erickson <erickerick...@gmail.com> wrote:
> >
> >> because eventually you'd run out of file handles. Imagine a
> >> long-running server with 100,000 segments. Totally
> >> unmanageable.
> >>
> >> I think Shawn was emphasizing that RAM requirements don't
> >> depend on the number of segments. There are other
> >> resources that files consume, however.
> >>
> >> Best
> >> Erick
> >>
> >> On Fri, Oct 5, 2012 at 1:08 PM, jame vaalet <jamevaa...@gmail.com> wrote:
> >> > hi Shawn,
> >> > thanks for the detailed explanation.
> >> > I have got one doubt: you said it doesn't matter how many segments the
> >> > index has, but then why does Solr have this merge policy which merges
> >> > segments frequently? Why can't it leave the segments as they are rather
> >> > than merging smaller ones into bigger ones?
> >> >
> >> > thanks
> >> >
> >> > On 5 October 2012 05:46, Shawn Heisey <s...@elyograg.org> wrote:
> >> >
> >> >> On 10/4/2012 3:22 PM, jame vaalet wrote:
> >> >>
> >> >>> so imagine I have merged the 150 GB index into a single segment; this
> >> >>> would make a single segment of 150 GB in memory. When new docs are
> >> >>> indexed it wouldn't alter this 150 GB index unless I update or delete
> >> >>> the older docs, right? Will a 150 GB single segment have problems with
> >> >>> memory swapping at the OS level?
> >> >>>
> >> >>
> >> >> Supplement to my previous reply: the real memory mentioned in the last
> >> >> paragraph does not include the memory that the OS uses to cache disk
> >> >> access. If more memory is needed and all the free memory is being used by
> >> >> the disk cache, the OS will throw away part of the disk cache (a
> >> >> near-instantaneous operation that should never involve disk I/O) and give
> >> >> that memory to the application that requests it.
> >> >>
> >> >> Here's a very good breakdown of how memory gets used with MMapDirectory in
> >> >> Solr. It's applicable to any program that uses memory mapping, not just
> >> >> Solr:
> >> >>
> >> >> http://java.dzone.com/articles/use-lucene%E2%80%99s-mmapdirectory
> >> >>
> >> >> Thanks,
> >> >> Shawn
> >> >>
> >> >
> >> > --
> >> >
> >> > -JAME
> >
> > --
> >
> > -JAME

--
-JAME
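
P.S. To make point 2 above a bit more concrete, here is a minimal Java sketch of what "one memory mapping per file" looks like at the plain JDK level, as far as I understand it. This is not Lucene code: the class name SegmentMmapSketch, both helper methods and the "_0.dat"-style file names are made up for illustration, and the files only stand in for segment files. The one real API involved is the JDK's FileChannel.map, which I believe is the facility MMapDirectory builds on.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration only; not part of Lucene or Solr.
final class SegmentMmapSketch {

    // Map one file read-only. The mapping reserves virtual address space;
    // the OS pages the file contents in lazily as they are first touched.
    static MappedByteBuffer mapReadOnly(Path file) {
        try (FileChannel ch = FileChannel.open(file, StandardOpenOption.READ)) {
            // Per the JDK docs, the mapping stays valid after the channel is closed.
            return ch.map(FileChannel.MapMode.READ_ONLY, 0, ch.size());
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Create `count` small stand-in "segment" files in a temp directory
    // and map each of them separately: one mapping per physical file.
    static List<MappedByteBuffer> mapFakeSegments(int count) {
        try {
            Path dir = Files.createTempDirectory("fake-index");
            List<MappedByteBuffer> maps = new ArrayList<>();
            for (int i = 0; i < count; i++) {
                Path seg = dir.resolve("_" + i + ".dat"); // made-up file name
                Files.write(seg, ("segment " + i).getBytes(StandardCharsets.US_ASCII));
                maps.add(mapReadOnly(seg));
            }
            return maps;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // Three files, three independent mappings.
        for (MappedByteBuffer m : mapFakeSegments(3)) {
            byte[] bytes = new byte[m.remaining()];
            m.get(bytes); // reading touches the mapped pages
            System.out.println(new String(bytes, StandardCharsets.US_ASCII));
        }
    }
}
```

If this picture is right, the number of mappings grows with the number of files, while the physical RAM actually used depends on which pages get touched, which would fit Shawn's point that RAM requirements don't depend on the number of segments. What happens to the old mappings when segments are merged is exactly the part I would like someone to confirm.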