Hi Dave,

Try 'ant usage' from the solr/ directory.
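As a concrete sketch of that suggestion (assuming an svn/ant toolchain and the branch_3x URL given further down this thread; the `dist` target name is an assumption to verify against the `ant usage` output, not something confirmed in the thread):

```shell
# Check out the 3.x branch and move into the Solr build directory.
svn checkout http://svn.apache.org/repos/asf/lucene/dev/branches/branch_3x/ branch_3x
cd branch_3x/solr

# List the available build targets (Steve's suggestion).
ant usage

# Assumption: on the 3.x branch, 'ant dist' is the target that packages
# solr.war into dist/ -- confirm against the 'ant usage' output.
ant dist
ls dist/*.war
```

Because the checkout and build depend on network access and an installed toolchain, this is only an illustration of the workflow, not a verified transcript.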
Steve

> -----Original Message-----
> From: Dave [mailto:dla...@gmail.com]
> Sent: Wednesday, January 18, 2012 2:11 PM
> To: solr-user@lucene.apache.org
> Subject: Re: Trying to understand SOLR memory requirements
>
> Ok, I've been able to pull the code from SVN, build it, and compile my
> SpellingQueryConverter against it. However, I'm at a loss as to where to
> find / how to build the solr.war file?
>
> On Tue, Jan 17, 2012 at 8:59 AM, Robert Muir <rcm...@gmail.com> wrote:
>
> > I committed it already, so you can try out branch_3x if you want.
> >
> > You can either wait for a nightly build or compile from svn
> > (http://svn.apache.org/repos/asf/lucene/dev/branches/branch_3x/).
> >
> > On Tue, Jan 17, 2012 at 8:35 AM, Dave <dla...@gmail.com> wrote:
> >
> > > Thank you Robert, I'd appreciate that. Any idea how long it will take
> > > to get a fix? Would I be better off switching to trunk? Is trunk
> > > stable enough for someone who's very much a SOLR novice?
> > >
> > > Thanks,
> > > Dave
> > >
> > > On Mon, Jan 16, 2012 at 10:08 PM, Robert Muir <rcm...@gmail.com> wrote:
> > >
> > >> Looks like https://issues.apache.org/jira/browse/SOLR-2888.
> > >>
> > >> Previously, FST construction needed to hold all the terms in RAM, but
> > >> with the patch it uses offline sorts / temporary files.
> > >> I'll reopen the issue to backport this to the 3.x branch.
> > >>
> > >> On Mon, Jan 16, 2012 at 8:31 PM, Dave <dla...@gmail.com> wrote:
> > >>
> > >> > I'm trying to figure out what my memory needs are for a rather
> > >> > large dataset. I'm trying to build an auto-complete system for
> > >> > every city/state/country in the world. I've got a geographic
> > >> > database, and have set up the DIH to pull the proper data in.
> > >> > There are 2,784,937 documents, which I've formatted into JSON-like
> > >> > output, so there's a bit of data associated with each one.
> > >> > Here is an example record:
> > >> >
> > >> > Brooklyn, New York, United States?{ |id|: |2620829|,
> > >> > |timezone|: |America/New_York|, |type|: |3|, |country|: { |id|: |229| },
> > >> > |region|: { |id|: |3608| }, |city|: { |id|: |2616971|, |plainname|:
> > >> > |Brooklyn|, |name|: |Brooklyn, New York, United States| }, |hint|:
> > >> > |2300664|, |label|: |Brooklyn, New York, United States|, |value|:
> > >> > |Brooklyn, New York, United States|, |title|: |Brooklyn, New York,
> > >> > United States| }
> > >> >
> > >> > I've got the spellchecker / suggester module set up, and I can
> > >> > confirm that everything works properly with a smaller dataset (i.e.
> > >> > just a couple of countries' worth of cities/states). However, I'm
> > >> > running into a big problem when I try to index the entire dataset.
> > >> > The dataimport?command=full-import works and the system comes to an
> > >> > idle state. It generates the following data/index/ directory (I'm
> > >> > including it in case it gives any indication of memory requirements):
> > >> >
> > >> > -rw-rw---- 1 root root 2.2G Jan 17 00:13 _2w.fdt
> > >> > -rw-rw---- 1 root root  22M Jan 17 00:13 _2w.fdx
> > >> > -rw-rw---- 1 root root  131 Jan 17 00:13 _2w.fnm
> > >> > -rw-rw---- 1 root root 134M Jan 17 00:13 _2w.frq
> > >> > -rw-rw---- 1 root root  16M Jan 17 00:13 _2w.nrm
> > >> > -rw-rw---- 1 root root 130M Jan 17 00:13 _2w.prx
> > >> > -rw-rw---- 1 root root 9.2M Jan 17 00:13 _2w.tii
> > >> > -rw-rw---- 1 root root 1.1G Jan 17 00:13 _2w.tis
> > >> > -rw-rw---- 1 root root   20 Jan 17 00:13 segments.gen
> > >> > -rw-rw---- 1 root root  291 Jan 17 00:13 segments_2
> > >> >
> > >> > Next I try to run the suggest?spellcheck.build=true command, and I
> > >> > get the following error:
> > >> >
> > >> > Jan 16, 2012 4:01:47 PM org.apache.solr.spelling.suggest.Suggester build
> > >> > INFO: build()
> > >> > Jan 16, 2012 4:03:27 PM org.apache.solr.common.SolrException log
> > >> > SEVERE: java.lang.OutOfMemoryError: GC overhead limit exceeded
> > >> >   at java.util.Arrays.copyOfRange(Arrays.java:3209)
> > >> >   at java.lang.String.<init>(String.java:215)
> > >> >   at org.apache.lucene.index.TermBuffer.toTerm(TermBuffer.java:122)
> > >> >   at org.apache.lucene.index.SegmentTermEnum.term(SegmentTermEnum.java:184)
> > >> >   at org.apache.lucene.index.TermInfosReader.get(TermInfosReader.java:203)
> > >> >   at org.apache.lucene.index.TermInfosReader.get(TermInfosReader.java:172)
> > >> >   at org.apache.lucene.index.SegmentReader.docFreq(SegmentReader.java:509)
> > >> >   at org.apache.lucene.index.DirectoryReader.docFreq(DirectoryReader.java:719)
> > >> >   at org.apache.solr.search.SolrIndexReader.docFreq(SolrIndexReader.java:309)
> > >> >   at org.apache.lucene.search.spell.HighFrequencyDictionary$HighFrequencyIterator.isFrequent(HighFrequencyDictionary.java:75)
> > >> >   at org.apache.lucene.search.spell.HighFrequencyDictionary$HighFrequencyIterator.hasNext(HighFrequencyDictionary.java:125)
> > >> >   at org.apache.lucene.search.suggest.fst.FSTLookup.build(FSTLookup.java:157)
> > >> >   at org.apache.lucene.search.suggest.Lookup.build(Lookup.java:70)
> > >> >   at org.apache.solr.spelling.suggest.Suggester.build(Suggester.java:133)
> > >> >   at org.apache.solr.handler.component.SpellCheckComponent.prepare(SpellCheckComponent.java:109)
> > >> >   at org.apache.solr.handler.component.SearchHandler.handleRequestBody(SearchHandler.java:173)
> > >> >   at org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:129)
> > >> >   at org.apache.solr.core.SolrCore.execute(SolrCore.java:1372)
> > >> >   at org.apache.solr.servlet.SolrDispatchFilter.execute(SolrDispatchFilter.java:356)
> > >> >   at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:252)
> > >> >   at org.mortbay.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1212)
> > >> >   at org.mortbay.jetty.servlet.ServletHandler.handle(ServletHandler.java:399)
> > >> >   at org.mortbay.jetty.security.SecurityHandler.handle(SecurityHandler.java:216)
> > >> >   at org.mortbay.jetty.servlet.SessionHandler.handle(SessionHandler.java:182)
> > >> >   at org.mortbay.jetty.handler.ContextHandler.handle(ContextHandler.java:766)
> > >> >   at org.mortbay.jetty.webapp.WebAppContext.handle(WebAppContext.java:450)
> > >> >   at org.mortbay.jetty.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:230)
> > >> >   at org.mortbay.jetty.handler.HandlerCollection.handle(HandlerCollection.java:114)
> > >> >   at org.mortbay.jetty.handler.HandlerWrapper.handle(HandlerWrapper.java:152)
> > >> >   at org.mortbay.jetty.Server.handle(Server.java:326)
> > >> >   at org.mortbay.jetty.HttpConnection.handleRequest(HttpConnection.java:542)
> > >> >   at org.mortbay.jetty.HttpConnection$RequestHandler.headerComplete(HttpConnection.java:928)
> > >> >
> > >> > I also get an error if, after the dataimport command completes, I
> > >> > just exit the SOLR process and restart it:
> > >> >
> > >> > Jan 16, 2012 4:06:15 PM org.apache.solr.common.SolrException log
> > >> > SEVERE: java.lang.OutOfMemoryError: Java heap space
> > >> >   at org.apache.lucene.util.fst.NodeHash.rehash(NodeHash.java:158)
> > >> >   at org.apache.lucene.util.fst.NodeHash.add(NodeHash.java:128)
> > >> >   at org.apache.lucene.util.fst.Builder.compileNode(Builder.java:161)
> > >> >   at org.apache.lucene.util.fst.Builder.compilePrevTail(Builder.java:247)
> > >> >   at org.apache.lucene.util.fst.Builder.add(Builder.java:364)
> > >> >   at org.apache.lucene.search.suggest.fst.FSTLookup.buildAutomaton(FSTLookup.java:486)
> > >> >   at org.apache.lucene.search.suggest.fst.FSTLookup.build(FSTLookup.java:179)
> > >> >   at org.apache.lucene.search.suggest.Lookup.build(Lookup.java:70)
> > >> >   at org.apache.solr.spelling.suggest.Suggester.build(Suggester.java:133)
> > >> >   at org.apache.solr.spelling.suggest.Suggester.reload(Suggester.java:153)
> > >> >   at org.apache.solr.handler.component.SpellCheckComponent$SpellCheckerListener.newSearcher(SpellCheckComponent.java:675)
> > >> >   at org.apache.solr.core.SolrCore$3.call(SolrCore.java:1181)
> > >> >   at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
> > >> >   at java.util.concurrent.FutureTask.run(FutureTask.java:138)
> > >> >   at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
> > >> >   at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
> > >> >   at java.lang.Thread.run(Thread.java:662)
> > >> >
> > >> > Jan 16, 2012 4:06:15 PM org.apache.solr.core.SolrCore registerSearcher
> > >> > INFO: [places] Registered new searcher Searcher@34b0ede5 main
> > >> >
> > >> > Basically this means that once I've run a full-import, I cannot
> > >> > exit the SOLR process, because I receive this error no matter what
> > >> > when I restart the process. I've tried different -Xmx arguments,
> > >> > and I'm really at a loss at this point. Is there any guideline for
> > >> > how much RAM I need? I've got 8GB on this machine, although that
> > >> > could be increased if necessary. However, I can't understand why it
> > >> > would need so much memory. Could I have something configured
> > >> > incorrectly? I've been over the configs several times, trying to
> > >> > get them down to the bare minimum.
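For concreteness, the two handler calls referenced in that message can be issued like this (a sketch: it assumes the default example port 8983, a core named "places" as suggested by the "[places] Registered new searcher" log line, and handlers mapped at /dataimport and /suggest, none of which is spelled out in the thread):

```shell
# Assumption: Solr listening on localhost:8983 with a core named "places"
# (taken from the "[places]" log line above); DIH and suggester handlers
# mapped at /dataimport and /suggest respectively.
curl 'http://localhost:8983/solr/places/dataimport?command=full-import'

# Build the suggester's dictionary/FST from the indexed terms -- this is
# the step that triggers the OutOfMemoryError in the traces above.
curl 'http://localhost:8983/solr/places/suggest?spellcheck.build=true'
```

Both requests require a running Solr instance, so they are shown only to make the handler paths explicit.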
> > >> >
> > >> > Thanks for any assistance!
> > >> >
> > >> > Dave
> > >>
> > >> --
> > >> lucidimagination.com
> >
> > --
> > lucidimagination.com
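The -Xmx experiments Dave mentions look roughly like this with the stock Jetty example launcher (a sketch; the heap values are arbitrary illustrations, not recommendations, and the start.jar launcher is an assumption about how the instance is being run):

```shell
# Assumption: running the example Jetty launcher shipped in solr/example.
# 2g/6g are illustrative values only; on Dave's 8GB machine the heap must
# still leave room for the OS and the filesystem cache.
cd example
java -Xms2g -Xmx6g -jar start.jar
```

This launches a long-running server process, so it is shown purely to make the flag placement concrete.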