Hi Dave,

Try 'ant usage' from the solr/ directory.
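As a concrete sketch of that suggestion (assuming an svn/ant toolchain and the branch_3x URL given further down this thread; the `dist` target name is an assumption to verify against the `ant usage` output, not something confirmed in the thread):

```shell
# Check out the 3.x branch and move into the Solr build directory.
svn checkout http://svn.apache.org/repos/asf/lucene/dev/branches/branch_3x/ branch_3x
cd branch_3x/solr

# List the available build targets (Steve's suggestion).
ant usage

# Assumption: on the 3.x branch, 'ant dist' is the target that packages
# solr.war into dist/ -- confirm against the 'ant usage' output.
ant dist
ls dist/*.war
```

Because the checkout and build depend on network access and an installed toolchain, this is only an illustration of the workflow, not a verified transcript.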
Steve

> -----Original Message-----
> From: Dave [mailto:dla...@gmail.com]
> Sent: Wednesday, January 18, 2012 2:11 PM
> To: solr-user@lucene.apache.org
> Subject: Re: Trying to understand SOLR memory requirements
>
> Ok, I've been able to pull the code from SVN, build it, and compile my
> SpellingQueryConverter against it. However, I'm at a loss as to where to
> find / how to build the solr.war file?
>
> On Tue, Jan 17, 2012 at 8:59 AM, Robert Muir <rcm...@gmail.com> wrote:
>
> > I committed it already, so you can try out branch_3x if you want.
> >
> > You can either wait for a nightly build or compile from svn
> > (http://svn.apache.org/repos/asf/lucene/dev/branches/branch_3x/).
> >
> > On Tue, Jan 17, 2012 at 8:35 AM, Dave <dla...@gmail.com> wrote:
> >
> > > Thank you Robert, I'd appreciate that. Any idea how long it will take
> > > to get a fix? Would I be better off switching to trunk? Is trunk
> > > stable enough for someone who's very much a SOLR novice?
> > >
> > > Thanks,
> > > Dave
> > >
> > > On Mon, Jan 16, 2012 at 10:08 PM, Robert Muir <rcm...@gmail.com> wrote:
> > >
> > >> Looks like https://issues.apache.org/jira/browse/SOLR-2888.
> > >>
> > >> Previously, FST construction needed to hold all the terms in RAM, but
> > >> with the patch it uses offline sorts / temporary files.
> > >> I'll reopen the issue to backport this to the 3.x branch.
> > >>
> > >> On Mon, Jan 16, 2012 at 8:31 PM, Dave <dla...@gmail.com> wrote:
> > >>
> > >> > I'm trying to figure out what my memory needs are for a rather
> > >> > large dataset. I'm trying to build an auto-complete system for
> > >> > every city/state/country in the world. I've got a geographic
> > >> > database, and have set up the DIH to pull the proper data in.
> > >> > There are 2,784,937 documents, which I've formatted into JSON-like
> > >> > output, so there's a bit of data associated with each one.
> > >> > Here is an example record:
> > >> >
> > >> > Brooklyn, New York, United States?{ |id|: |2620829|,
> > >> > |timezone|: |America/New_York|, |type|: |3|, |country|: { |id|: |229| },
> > >> > |region|: { |id|: |3608| }, |city|: { |id|: |2616971|, |plainname|:
> > >> > |Brooklyn|, |name|: |Brooklyn, New York, United States| }, |hint|:
> > >> > |2300664|, |label|: |Brooklyn, New York, United States|, |value|:
> > >> > |Brooklyn, New York, United States|, |title|: |Brooklyn, New York,
> > >> > United States| }
> > >> >
> > >> > I've got the spellchecker / suggester module set up, and I can
> > >> > confirm that everything works properly with a smaller dataset (i.e.
> > >> > just a couple of countries' worth of cities/states). However, I'm
> > >> > running into a big problem when I try to index the entire dataset.
> > >> > The dataimport?command=full-import works and the system comes to an
> > >> > idle state. It generates the following data/index/ directory (I'm
> > >> > including it in case it gives any indication of memory requirements):
> > >> >
> > >> > -rw-rw---- 1 root root 2.2G Jan 17 00:13 _2w.fdt
> > >> > -rw-rw---- 1 root root  22M Jan 17 00:13 _2w.fdx
> > >> > -rw-rw---- 1 root root  131 Jan 17 00:13 _2w.fnm
> > >> > -rw-rw---- 1 root root 134M Jan 17 00:13 _2w.frq
> > >> > -rw-rw---- 1 root root  16M Jan 17 00:13 _2w.nrm
> > >> > -rw-rw---- 1 root root 130M Jan 17 00:13 _2w.prx
> > >> > -rw-rw---- 1 root root 9.2M Jan 17 00:13 _2w.tii
> > >> > -rw-rw---- 1 root root 1.1G Jan 17 00:13 _2w.tis
> > >> > -rw-rw---- 1 root root   20 Jan 17 00:13 segments.gen
> > >> > -rw-rw---- 1 root root  291 Jan 17 00:13 segments_2
> > >> >
> > >> > Next I try to run the suggest?spellcheck.build=true command, and I
> > >> > get the following error:
> > >> >
> > >> > Jan 16, 2012 4:01:47 PM org.apache.solr.spelling.suggest.Suggester build
> > >> > INFO: build()
> > >> > Jan 16, 2012 4:03:27 PM org.apache.solr.common.SolrException log
> > >> > SEVERE: java.lang.OutOfMemoryError: GC overhead limit exceeded
> > >> >   at java.util.Arrays.copyOfRange(Arrays.java:3209)
> > >> >   at java.lang.String.<init>(String.java:215)
> > >> >   at org.apache.lucene.index.TermBuffer.toTerm(TermBuffer.java:122)
> > >> >   at org.apache.lucene.index.SegmentTermEnum.term(SegmentTermEnum.java:184)
> > >> >   at org.apache.lucene.index.TermInfosReader.get(TermInfosReader.java:203)
> > >> >   at org.apache.lucene.index.TermInfosReader.get(TermInfosReader.java:172)
> > >> >   at org.apache.lucene.index.SegmentReader.docFreq(SegmentReader.java:509)
> > >> >   at org.apache.lucene.index.DirectoryReader.docFreq(DirectoryReader.java:719)
> > >> >   at org.apache.solr.search.SolrIndexReader.docFreq(SolrIndexReader.java:309)
> > >> >   at org.apache.lucene.search.spell.HighFrequencyDictionary$HighFrequencyIterator.isFrequent(HighFrequencyDictionary.java:75)
> > >> >   at org.apache.lucene.search.spell.HighFrequencyDictionary$HighFrequencyIterator.hasNext(HighFrequencyDictionary.java:125)
> > >> >   at org.apache.lucene.search.suggest.fst.FSTLookup.build(FSTLookup.java:157)
> > >> >   at org.apache.lucene.search.suggest.Lookup.build(Lookup.java:70)
> > >> >   at org.apache.solr.spelling.suggest.Suggester.build(Suggester.java:133)
> > >> >   at org.apache.solr.handler.component.SpellCheckComponent.prepare(SpellCheckComponent.java:109)
> > >> >   at org.apache.solr.handler.component.SearchHandler.handleRequestBody(SearchHandler.java:173)
> > >> >   at org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:129)
> > >> >   at org.apache.solr.core.SolrCore.execute(SolrCore.java:1372)
> > >> >   at org.apache.solr.servlet.SolrDispatchFilter.execute(SolrDispatchFilter.java:356)
> > >> >   at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:252)
> > >> >   at org.mortbay.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1212)
> > >> >   at org.mortbay.jetty.servlet.ServletHandler.handle(ServletHandler.java:399)
> > >> >   at org.mortbay.jetty.security.SecurityHandler.handle(SecurityHandler.java:216)
> > >> >   at org.mortbay.jetty.servlet.SessionHandler.handle(SessionHandler.java:182)
> > >> >   at org.mortbay.jetty.handler.ContextHandler.handle(ContextHandler.java:766)
> > >> >   at org.mortbay.jetty.webapp.WebAppContext.handle(WebAppContext.java:450)
> > >> >   at org.mortbay.jetty.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:230)
> > >> >   at org.mortbay.jetty.handler.HandlerCollection.handle(HandlerCollection.java:114)
> > >> >   at org.mortbay.jetty.handler.HandlerWrapper.handle(HandlerWrapper.java:152)
> > >> >   at org.mortbay.jetty.Server.handle(Server.java:326)
> > >> >   at org.mortbay.jetty.HttpConnection.handleRequest(HttpConnection.java:542)
> > >> >   at org.mortbay.jetty.HttpConnection$RequestHandler.headerComplete(HttpConnection.java:928)
> > >> >
> > >> > I also get an error if, after the dataimport command completes, I
> > >> > just exit the SOLR process and restart it:
> > >> >
> > >> > Jan 16, 2012 4:06:15 PM org.apache.solr.common.SolrException log
> > >> > SEVERE: java.lang.OutOfMemoryError: Java heap space
> > >> >   at org.apache.lucene.util.fst.NodeHash.rehash(NodeHash.java:158)
> > >> >   at org.apache.lucene.util.fst.NodeHash.add(NodeHash.java:128)
> > >> >   at org.apache.lucene.util.fst.Builder.compileNode(Builder.java:161)
> > >> >   at org.apache.lucene.util.fst.Builder.compilePrevTail(Builder.java:247)
> > >> >   at org.apache.lucene.util.fst.Builder.add(Builder.java:364)
> > >> >   at org.apache.lucene.search.suggest.fst.FSTLookup.buildAutomaton(FSTLookup.java:486)
> > >> >   at org.apache.lucene.search.suggest.fst.FSTLookup.build(FSTLookup.java:179)
> > >> >   at org.apache.lucene.search.suggest.Lookup.build(Lookup.java:70)
> > >> >   at org.apache.solr.spelling.suggest.Suggester.build(Suggester.java:133)
> > >> >   at org.apache.solr.spelling.suggest.Suggester.reload(Suggester.java:153)
> > >> >   at org.apache.solr.handler.component.SpellCheckComponent$SpellCheckerListener.newSearcher(SpellCheckComponent.java:675)
> > >> >   at org.apache.solr.core.SolrCore$3.call(SolrCore.java:1181)
> > >> >   at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
> > >> >   at java.util.concurrent.FutureTask.run(FutureTask.java:138)
> > >> >   at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
> > >> >   at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
> > >> >   at java.lang.Thread.run(Thread.java:662)
> > >> >
> > >> > Jan 16, 2012 4:06:15 PM org.apache.solr.core.SolrCore registerSearcher
> > >> > INFO: [places] Registered new searcher Searcher@34b0ede5 main
> > >> >
> > >> > Basically this means that once I've run a full-import, I cannot
> > >> > exit the SOLR process, because I receive this error no matter what
> > >> > when I restart the process. I've tried different -Xmx arguments,
> > >> > and I'm really at a loss at this point. Is there any guideline for
> > >> > how much RAM I need? I've got 8GB on this machine, although that
> > >> > could be increased if necessary. However, I can't understand why it
> > >> > would need so much memory. Could I have something configured
> > >> > incorrectly? I've been over the configs several times, trying to
> > >> > get them down to the bare minimum.
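For concreteness, the two handler calls referenced in that message can be issued like this (a sketch: it assumes the default example port 8983, a core named "places" as suggested by the "[places] Registered new searcher" log line, and handlers mapped at /dataimport and /suggest, none of which is spelled out in the thread):

```shell
# Assumption: Solr listening on localhost:8983 with a core named "places"
# (taken from the "[places]" log line above); DIH and suggester handlers
# mapped at /dataimport and /suggest respectively.
curl 'http://localhost:8983/solr/places/dataimport?command=full-import'

# Build the suggester's dictionary/FST from the indexed terms -- this is
# the step that triggers the OutOfMemoryError in the traces above.
curl 'http://localhost:8983/solr/places/suggest?spellcheck.build=true'
```

Both requests require a running Solr instance, so they are shown only to make the handler paths explicit.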
> > >> >
> > >> > Thanks for any assistance!
> > >> >
> > >> > Dave
> > >>
> > >> --
> > >> lucidimagination.com
> >
> > --
> > lucidimagination.com
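The -Xmx experiments Dave mentions look roughly like this with the stock Jetty example launcher (a sketch; the heap values are arbitrary illustrations, not recommendations, and the start.jar launcher is an assumption about how the instance is being run):

```shell
# Assumption: running the example Jetty launcher shipped in solr/example.
# 2g/6g are illustrative values only; on Dave's 8GB machine the heap must
# still leave room for the OS and the filesystem cache.
cd example
java -Xms2g -Xmx6g -jar start.jar
```

This launches a long-running server process, so it is shown purely to make the flag placement concrete.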