Hi Erick,

Thanks for your advice. My mergeFactor is set to 10, so it should be impossible to have so many segments, especially since some of the .fdx and .fdt files are simply empty. Sometimes indexing works fine and ends with 200+ files in the data dir. My deployment has two cores with two shards per core and uses autocommit. DIH is used to pull data from the DB, and the merge policy is TieredMergePolicy. Nothing is customized.
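To put numbers on "so many segments" and "empty .fdx/.fdt files", a small script can tally the files in an index directory by extension and count how many of them are zero bytes. This is only a sketch: the function name summarize_index_dir is made up for illustration, and it only looks at file names and sizes, not at Lucene's own view of the index.

```python
import os
from collections import Counter


def summarize_index_dir(index_dir):
    """Return (total_files, files per extension, zero-byte files per extension)."""
    by_ext = Counter()
    empty_by_ext = Counter()
    total = 0
    with os.scandir(index_dir) as entries:
        for entry in entries:
            if not entry.is_file():
                continue
            total += 1
            # "_k3n.fdx" -> ".fdx"; names without a dot (e.g. "segments_1")
            # are grouped under "(none)".
            ext = os.path.splitext(entry.name)[1] or "(none)"
            by_ext[ext] += 1
            if entry.stat().st_size == 0:
                empty_by_ext[ext] += 1
    return total, by_ext, empty_by_ext
```

Calling it on a shard's data/index directory and printing the two counters shows whether the empty files are limited to .fdt and .fdx, and roughly how many segments they represent.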
I am wondering how empty .fdx files could be generated. Maybe some setting in indexConfig is wrong. My final index is about 20G, with 40m+ docs. Here is part of my solrconfig.xml:

---------------------
<ramBufferSizeMB>32</ramBufferSizeMB>
<maxBufferedDocs>1000000</maxBufferedDocs>
<mergeFactor>10</mergeFactor>

<updateHandler class="solr.DirectUpdateHandler2">
  <autoCommit>
    <maxTime>15000</maxTime>
    <openSearcher>false</openSearcher>
  </autoCommit>
</updateHandler>
-----------------------------

PS: I found another kind of log entry, but I am not sure whether it is the cause or the consequence. I am planning to turn on debug logging tomorrow to gather more information.

2012-10-14 10:13:19,854 ERROR update.CommitTracker - auto commit error...:java.io.FileNotFoundException: _cwj.fdt
        at org.apache.lucene.store.FSDirectory.fileLength(FSDirectory.java:266)
        at org.apache.lucene.store.NRTCachingDirectory.fileLength(NRTCachingDirectory.java:177)
        at org.apache.lucene.index.SegmentInfo.sizeInBytes(SegmentInfo.java:103)
        at org.apache.lucene.index.IndexWriter.prepareFlushedSegment(IndexWriter.java:2126)
        at org.apache.lucene.index.DocumentsWriter.publishFlushedSegment(DocumentsWriter.java:495)
        at org.apache.lucene.index.DocumentsWriter.finishFlush(DocumentsWriter.java:474)
        at org.apache.lucene.index.DocumentsWriterFlushQueue$SegmentFlushTicket.publish(DocumentsWriterFlushQueue.java:201)
        at org.apache.lucene.index.DocumentsWriterFlushQueue.innerPurge(DocumentsWriterFlushQueue.java:119)
        at org.apache.lucene.index.DocumentsWriterFlushQueue.tryPurge(DocumentsWriterFlushQueue.java:148)
        at org.apache.lucene.index.DocumentsWriter.doFlush(DocumentsWriter.java:435)
        at org.apache.lucene.index.DocumentsWriter.flushAllThreads(DocumentsWriter.java:551)
        at org.apache.lucene.index.IndexWriter.prepareCommit(IndexWriter.java:2657)
        at org.apache.lucene.index.IndexWriter.commitInternal(IndexWriter.java:2793)
        at org.apache.lucene.index.IndexWriter.commit(IndexWriter.java:2773)
        at org.apache.solr.update.DirectUpdateHandler2.commit(DirectUpdateHandler2.java:531)
        at org.apache.solr.update.CommitTracker.run(CommitTracker.java:214)
        at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:441)
        at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
        at java.util.concurrent.FutureTask.run(FutureTask.java:138)
        at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:98)
        at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:206)
        at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
        at java.lang.Thread.run(Thread.java:662)

2012/10/15 Erick Erickson <erickerick...@gmail.com>

> I have no idea how you managed to get so many files in
> your index directory, but that's definitely weird. How it
> relates to your "file not found", I'm not quite sure, but it
> could be something as simple as you've run out of file
> handles.
>
> So you could try upping the number of
> file handles as a _temporary_ fix just to see if that's
> the problem. See your op-system's manuals for
> how.
>
> If it does work, then I'd run an optimize
> down to one segment and remove all the segment
> files _other_ than that one segment. NOTE: this
> means things like .fdt, .fdx, .tii files etc. NOT things
> like segments.gen and segments_1. Make a
> backup of course before you try this.
>
> But I think that's secondary. To generate this many
> files I suspect you've started a lot of indexing
> jobs that you then abort (hard kill?). To get this
> many files I'd guess it's something programmatic,
> but that's a guess.
>
> How are you committing? Autocommit? From a SolrJ
> (or equivalent) program? Have you implemented any
> custom merge policies?
>
> But to your immediate problem.
> You can try running
> CheckIndex (here's a tutorial from 2.9, but I think
> it's still good):
> http://java.dzone.com/news/lucene-and-solrs-checkindex
>
> If that doesn't help (and you can run it in diagnostic mode,
> without the --fix flag to see what it _would_ do) then I'm
> afraid you'll probably have to re-index.
>
> And you've got to get to the root of why you have so
> many segment files. That number is just crazy....
>
> Best
> Erick
>
> On Sun, Oct 14, 2012 at 11:20 PM, Jun Wang <wangjun...@gmail.com> wrote:
>
> > PS: I have found that there are lots of segments in the index directory, and most of
> > them are empty, like the ones below. The total file number is 35314 in the index directory.
> > -rw-rw-r-- 1 admin systems 0 Oct 14 11:37 _k3n.fdx
> > -rw-rw-r-- 1 admin systems 0 Oct 14 11:37 _k3o.fdt
> > -rw-rw-r-- 1 admin systems 0 Oct 14 11:37 _k3o.fdx
> > -rw-rw-r-- 1 admin systems 0 Oct 14 11:37 _k3p.fdt
> > -rw-rw-r-- 1 admin systems 0 Oct 14 11:37 _k3p.fdx
> > -rw-rw-r-- 1 admin systems 0 Oct 14 11:37 _k3q.fdt
> > -rw-rw-r-- 1 admin systems 0 Oct 14 11:37 _k3q.fdx
> > -rw-rw-r-- 1 admin systems 0 Oct 14 11:37 _k3r.fdt
> > -rw-rw-r-- 1 admin systems 0 Oct 14 11:37 _k3r.fdx
> > -rw-rw-r-- 1 admin systems 0 Oct 14 11:37 _k3s.fdt
> > -rw-rw-r-- 1 admin systems 0 Oct 14 11:37 _k3s.fdx
> > -rw-rw-r-- 1 admin systems 0 Oct 14 11:37 _k3t.fdt
> > -rw-rw-r-- 1 admin systems 0 Oct 14 11:37 _k3t.fdx
> > -rw-rw-r-- 1 admin systems 0 Oct 14 11:37 _k3u.fdt
> > -rw-rw-r-- 1 admin systems 0 Oct 14 11:37 _k3u.fdx
> > -rw-rw-r-- 1 admin systems 0 Oct 14 11:37 _k3v.fdt
> > -rw-rw-r-- 1 admin systems 0 Oct 14 11:37 _k3v.fdx
> > -rw-rw-r-- 1 admin systems 0 Oct 14 11:37 _k3w.fdt
> > -rw-rw-r-- 1 admin systems 0 Oct 14 11:37 _k3w.fdx
> > -rw-rw-r-- 1 admin systems 0 Oct 14 11:37 _k3x.fdt
> > -rw-rw-r-- 1 admin systems 0 Oct 14 11:37 _k3x.fdx
> > -rw-rw-r-- 1 admin systems 0 Oct 14 11:37 _k3y.fdt
> > -rw-rw-r-- 1 admin systems 0 Oct 14 11:37 _k3y.fdx
> > -rw-rw-r-- 1 admin systems 0 Oct 14 11:37 _k3z.fdt
> > -rw-rw-r-- 1 admin systems 0 Oct 14 11:37 _k3z.fdx
> > -rw-rw-r-- 1 admin systems 0 Oct 14 11:37 _k40.fdt
> > -rw-rw-r-- 1 admin systems 0 Oct 14 11:37 _k40.fdx
> > -rw-rw-r-- 1 admin systems 0 Oct 14 11:37 _k41.fdt
> > -rw-rw-r-- 1 admin systems 0 Oct 14 11:37 _k41.fdx
> > -rw-rw-r-- 1 admin systems 0 Oct 14 11:37 _k42.fdt
> > -rw-rw-r-- 1 admin systems 0 Oct 14 11:37 _k42.fdx
> >
> > 2012/10/15 Jun Wang <wangjun...@gmail.com>
> >
> >> I have encountered a FileNotFoundException occasionally when
> >> indexing; it does not occur every time. Does anyone have a clue? Here is
> >> the traceback:
> >>
> >> 2012-10-14 11:37:28,105 ERROR core.SolrCore -
> >> java.io.FileNotFoundException:
> >> /home/admin/run/deploy/solr/core_p_shard2/data/index/_cwo.fnm (No such file or directory)
> >>         at java.io.RandomAccessFile.open(Native Method)
> >>         at java.io.RandomAccessFile.<init>(RandomAccessFile.java:216)
> >>         at org.apache.lucene.store.MMapDirectory.openInput(MMapDirectory.java:218)
> >>         at org.apache.lucene.store.NRTCachingDirectory.openInput(NRTCachingDirectory.java:232)
> >>         at org.apache.lucene.codecs.lucene40.Lucene40FieldInfosReader.read(Lucene40FieldInfosReader.java:47)
> >>         at org.apache.lucene.index.SegmentCoreReaders.<init>(SegmentCoreReaders.java:101)
> >>         at org.apache.lucene.index.SegmentReader.<init>(SegmentReader.java:55)
> >>         at org.apache.lucene.index.ReadersAndLiveDocs.getReader(ReadersAndLiveDocs.java:120)
> >>         at org.apache.lucene.index.BufferedDeletesStream.applyDeletes(BufferedDeletesStream.java:267)
> >>         at org.apache.lucene.index.IndexWriter.applyAllDeletes(IndexWriter.java:2928)
> >>         at org.apache.lucene.index.DocumentsWriter.applyAllDeletes(DocumentsWriter.java:180)
> >>         at org.apache.lucene.index.DocumentsWriter.postUpdate(DocumentsWriter.java:310)
> >>         at org.apache.lucene.index.DocumentsWriter.updateDocument(DocumentsWriter.java:386)
> >>         at org.apache.lucene.index.IndexWriter.updateDocument(IndexWriter.java:1430)
> >>         at org.apache.solr.update.DirectUpdateHandler2.addDoc(DirectUpdateHandler2.java:210)
> >>         at org.apache.solr.update.processor.RunUpdateProcessor.processAdd(RunUpdateProcessorFactory.java:61)
> >>         at org.apache.solr.update.processor.UpdateRequestProcessor.processAdd(UpdateRequestProcessor.java:51)
> >>         at org.apache.solr.update.processor.DistributedUpdateProcessor.versionAdd(DistributedUpdateProcessor.java:432)
> >>         at org.apache.solr.update.processor.DistributedUpdateProcessor.processAdd(DistributedUpdateProcessor.java:315)
> >>         at org.apache.solr.handler.loader.XMLLoader.processUpdate(XMLLoader.java:230)
> >>         at org.apache.solr.handler.loader.XMLLoader.load(XMLLoader.java:157)
> >>         at org.apache.solr.handler.UpdateRequestHandler$1.load(UpdateRequestHandler.java:92)
> >>         at org.apache.solr.handler.ContentStreamHandlerBase.handleRequestBody(ContentStreamHandlerBase.java:74)
> >>         at org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:129)
> >>         at org.apache.solr.core.SolrCore.execute(SolrCore.java:1656)
> >>         at org.apache.solr.servlet.SolrDispatchFilter.execute(SolrDispatchFilter.java:454)
> >>         at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:275)
> >>         at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:202)
> >>         at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:173)
> >>         at org.jboss.web.tomcat.filters.ReplyHeaderFilter.doFilter(ReplyHeaderFilter.java:96)
> >>
> >
> > --
> > from Jun Wang
>
--
from Jun Wang
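
Erick's point about running out of file handles can be checked roughly before changing any OS settings. The sketch below compares the open-file limit with the number of files in an index directory. It rests on several assumptions: it is Unix-only, the function name open_file_headroom is invented for illustration, and it reports the limit of the Python process running it, which matches the Solr JVM's limit only if both are started under the same user and shell settings. Lucene also does not necessarily hold every file in the directory open at once, so treat the result as a hint, not a diagnosis.

```python
import os
import resource  # Unix-only standard library module


def open_file_headroom(index_dir):
    """Compare this process's open-file limits with the file count in index_dir."""
    soft, hard = resource.getrlimit(resource.RLIMIT_NOFILE)
    with os.scandir(index_dir) as entries:
        n_files = sum(1 for entry in entries if entry.is_file())
    # RLIM_INFINITY means "no limit", so it can never be exceeded.
    exceeds = soft != resource.RLIM_INFINITY and n_files > soft
    return {
        "soft_limit": soft,
        "hard_limit": hard,
        "files_in_dir": n_files,
        "exceeds_soft_limit": exceeds,
    }
```

With 35314 files in one index directory and a typical default soft limit in the low thousands, exceeds_soft_limit would come back True, which is consistent with the file-handle theory without proving it.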