One of the hard-core Lucene guys is going to have to help you out. Or you may have to write some custom code to fix the index for any such shard. If you have deleted any documents, it may be sufficient to simply optimize the index, since merging purges deleted documents and may bring the count back under the limit.

-- Jack Krupansky

-----Original Message----- From: yamazaki
Sent: Wednesday, May 7, 2014 8:15 PM
To: solr-user@lucene.apache.org
Subject: Re: Too many documents Exception

Thanks, Jack.

Is there a setting that would prevent this exception from occurring?

For example,
<maxMergeDocs>2147483647</maxMergeDocs> ?


Once this exception occurs, the index can no longer be opened.
With SolrCloud, part of the data then becomes unreadable. For example:

shard1: more than 2^31-1 documents
shard2: no more than 2^31-1 documents

In that case shard1 goes down and its index is effectively dead.

-- yamazaki


2014-05-07 11:01 GMT+09:00 Jack Krupansky <j...@basetechnology.com>:
Lucene only supports 2^31-1 documents in an index, so Solr can only support
2^31-1 documents in a single shard.
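
As a rough illustration of what that limit means for sizing, here is a minimal Python sketch that estimates the smallest number of shards a collection would need. The helper name min_shards is hypothetical, and the sketch assumes documents are spread perfectly evenly across shards, which real routing will not guarantee.

```python
# Per-index document limit in Lucene (Java's Integer.MAX_VALUE).
LUCENE_MAX_DOCS = 2**31 - 1


def min_shards(total_docs, max_docs_per_shard=LUCENE_MAX_DOCS):
    """Smallest shard count that keeps every shard at or under the limit.

    Assumes a perfectly even spread of documents (a best case, not a
    guarantee of how documents are actually routed).
    """
    if total_docs < 0:
        raise ValueError("total_docs must be non-negative")
    # Integer ceiling division; report at least one shard for an empty collection.
    return max(1, -(-total_docs // max_docs_per_shard))
```

For the 22 * 10**8 documents mentioned later in this thread, min_shards(22 * 10**8) gives 2. In practice it is probably wise to plan for more, because deleted documents also count toward the limit until they are merged away.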

I think it's a bug that Lucene doesn't throw an exception when more than
that number of documents have been inserted. Instead, you get this error
when Solr tries to read such an overstuffed index.

-- Jack Krupansky

-----Original Message----- From: [Tech Fun]山崎
Sent: Tuesday, May 6, 2014 8:54 PM
To: solr-user@lucene.apache.org
Subject: Too many documents Exception


Hello everybody,

With Solr 4.3.1 (and also 4.7.1), once Num Docs + Deleted Docs exceeds
2147483647 (Integer.MAX_VALUE), the following error occurs:
Caused by: java.lang.IllegalArgumentException: Too many documents,
composite IndexReaders cannot exceed 2147483647

This looks similar to the problem in an earlier e-mail that was never resolved:
http://mail-archives.apache.org/mod_mbox/lucene-solr-user/201307.mbox/browser

How can I fix this?
Is this an intended limitation of Solr?


Log:

ERROR org.apache.solr.core.CoreContainer  – Unable to create core:
collection1
org.apache.solr.common.SolrException: Error opening new searcher
   at org.apache.solr.core.SolrCore.<init>(SolrCore.java:821)
   at org.apache.solr.core.SolrCore.<init>(SolrCore.java:618)
   at
org.apache.solr.core.CoreContainer.createFromLocal(CoreContainer.java:949)
   at org.apache.solr.core.CoreContainer.create(CoreContainer.java:984)
   at org.apache.solr.core.CoreContainer$2.call(CoreContainer.java:597)
   at org.apache.solr.core.CoreContainer$2.call(CoreContainer.java:592)
   at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
   at java.util.concurrent.FutureTask.run(FutureTask.java:138)
   at
java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:439)
   at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
   at java.util.concurrent.FutureTask.run(FutureTask.java:138)
   at
java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:895)
   at
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:918)
   at java.lang.Thread.run(Thread.java:662)
Caused by: org.apache.solr.common.SolrException: Error opening new searcher
   at org.apache.solr.core.SolrCore.openNewSearcher(SolrCore.java:1438)
   at org.apache.solr.core.SolrCore.getSearcher(SolrCore.java:1550)
   at org.apache.solr.core.SolrCore.<init>(SolrCore.java:796)
   ... 13 more
Caused by: org.apache.solr.common.SolrException: Error opening Reader
   at
org.apache.solr.search.SolrIndexSearcher.getReader(SolrIndexSearcher.java:172)
   at
org.apache.solr.search.SolrIndexSearcher.<init>(SolrIndexSearcher.java:183)
   at
org.apache.solr.search.SolrIndexSearcher.<init>(SolrIndexSearcher.java:179)
   at org.apache.solr.core.SolrCore.openNewSearcher(SolrCore.java:1414)
   ... 15 more
Caused by: java.lang.IllegalArgumentException: Too many documents,
composite IndexReaders cannot exceed 2147483647
   at
org.apache.lucene.index.BaseCompositeReader.<init>(BaseCompositeReader.java:77)
   at
org.apache.lucene.index.DirectoryReader.<init>(DirectoryReader.java:368)
   at
org.apache.lucene.index.StandardDirectoryReader.<init>(StandardDirectoryReader.java:42)
   at
org.apache.lucene.index.StandardDirectoryReader$1.doBody(StandardDirectoryReader.java:71)
   at
org.apache.lucene.index.SegmentInfos$FindSegmentsFile.run(SegmentInfos.java:783)
   at
org.apache.lucene.index.StandardDirectoryReader.open(StandardDirectoryReader.java:52)
   at org.apache.lucene.index.DirectoryReader.open(DirectoryReader.java:88)
   at
org.apache.solr.core.StandardIndexReaderFactory.newReader(StandardIndexReaderFactory.java:34)
   at
org.apache.solr.search.SolrIndexSearcher.getReader(SolrIndexSearcher.java:169)
   ... 18 more
ERROR org.apache.solr.core.CoreContainer  –
null:org.apache.solr.common.SolrException: Unable to create core:
collection1
   at
org.apache.solr.core.CoreContainer.recordAndThrow(CoreContainer.java:1450)
   at org.apache.solr.core.CoreContainer.create(CoreContainer.java:993)
   at org.apache.solr.core.CoreContainer$2.call(CoreContainer.java:597)
   at org.apache.solr.core.CoreContainer$2.call(CoreContainer.java:592)
   at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
   at java.util.concurrent.FutureTask.run(FutureTask.java:138)
   at
java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:439)
   at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
   at java.util.concurrent.FutureTask.run(FutureTask.java:138)
   at
java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:895)
   at
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:918)
   at java.lang.Thread.run(Thread.java:662)
Caused by: org.apache.solr.common.SolrException: Error opening new searcher
   at org.apache.solr.core.SolrCore.<init>(SolrCore.java:821)
   at org.apache.solr.core.SolrCore.<init>(SolrCore.java:618)
   at
org.apache.solr.core.CoreContainer.createFromLocal(CoreContainer.java:949)
   at org.apache.solr.core.CoreContainer.create(CoreContainer.java:984)
   ... 10 more
Caused by: org.apache.solr.common.SolrException: Error opening new searcher
   at org.apache.solr.core.SolrCore.openNewSearcher(SolrCore.java:1438)
   at org.apache.solr.core.SolrCore.getSearcher(SolrCore.java:1550)
   at org.apache.solr.core.SolrCore.<init>(SolrCore.java:796)
   ... 13 more
Caused by: org.apache.solr.common.SolrException: Error opening Reader
   at
org.apache.solr.search.SolrIndexSearcher.getReader(SolrIndexSearcher.java:172)
   at
org.apache.solr.search.SolrIndexSearcher.<init>(SolrIndexSearcher.java:183)
   at
org.apache.solr.search.SolrIndexSearcher.<init>(SolrIndexSearcher.java:179)
   at org.apache.solr.core.SolrCore.openNewSearcher(SolrCore.java:1414)
   ... 15 more
Caused by: java.lang.IllegalArgumentException: Too many documents,
composite IndexReaders cannot exceed 2147483647
   at
org.apache.lucene.index.BaseCompositeReader.<init>(BaseCompositeReader.java:77)
   at
org.apache.lucene.index.DirectoryReader.<init>(DirectoryReader.java:368)
   at
org.apache.lucene.index.StandardDirectoryReader.<init>(StandardDirectoryReader.java:42)
   at
org.apache.lucene.index.StandardDirectoryReader$1.doBody(StandardDirectoryReader.java:71)
   at
org.apache.lucene.index.SegmentInfos$FindSegmentsFile.run(SegmentInfos.java:783)
   at
org.apache.lucene.index.StandardDirectoryReader.open(StandardDirectoryReader.java:52)
   at org.apache.lucene.index.DirectoryReader.open(DirectoryReader.java:88)
   at
org.apache.solr.core.StandardIndexReaderFactory.newReader(StandardIndexReaderFactory.java:34)
   at
org.apache.solr.search.SolrIndexSearcher.getReader(SolrIndexSearcher.java:169)
   ... 18 more


Sample solrconfig.xml:

<?xml version="1.0" encoding="UTF-8" ?>
<config>
 <luceneMatchVersion>LUCENE_43</luceneMatchVersion>

 <lib dir="/opt/solr/dist" regex="solr-cell-\d.*\.jar" />
 <lib dir="/opt/solr/contrib/extraction/lib" regex=".*\.jar" />

 <lib dir="/opt/solr/dist" regex="solr-clustering-\d.*\.jar" />
 <lib dir="/opt/solr/contrib/clustering/lib" regex=".*\.jar" />

 <lib dir="/opt/solr/dist" regex="solr-langid-\d.*\.jar" />
 <lib dir="/opt/solr/contrib/langid/lib" regex=".*\.jar" />

 <lib dir="/opt/solr/dist" regex="solr-velocity-\d.*\.jar" />
 <lib dir="/opt/solr/contrib/velocity/lib" regex=".*\.jar" />

 <dataDir>${solr.data.dir:}</dataDir>

 <directoryFactory name="DirectoryFactory"
                   class="${solr.directoryFactory:solr.NRTCachingDirectoryFactory}"/>

 <codecFactory class="solr.SchemaCodecFactory"/>

 <indexConfig>
   <ramBufferSizeMB>256</ramBufferSizeMB>
   <lockType>${solr.lock.type:native}</lockType>
 </indexConfig>

 <jmx />

 <updateHandler class="solr.DirectUpdateHandler2">
   <updateLog>
     <str name="dir">${solr.ulog.dir:}</str>
   </updateLog>
   <autoCommit>
     <maxDocs>10000</maxDocs>
     <maxTime>60000</maxTime>
     <openSearcher>false</openSearcher>
   </autoCommit>
   <autoSoftCommit>
     <maxDocs>10</maxDocs>
     <maxTime>1000</maxTime>
   </autoSoftCommit>
 </updateHandler>

 <query>
   <maxBooleanClauses>1024</maxBooleanClauses>
   <filterCache class="solr.FastLRUCache"
                size="16384"
                initialSize="4096"
                autowarmCount="1024"/>
   <queryResultCache class="solr.FastLRUCache"
                    size="16384"
                    initialSize="4096"
                    autowarmCount="1024"/>
   <documentCache class="solr.FastLRUCache"
                  size="16384"
                  initialSize="4096"
                  autowarmCount="1024"/>
   <enableLazyFieldLoading>true</enableLazyFieldLoading>
   <queryResultWindowSize>20</queryResultWindowSize>
   <queryResultMaxDocsCached>200</queryResultMaxDocsCached>
   <useColdSearcher>false</useColdSearcher>
   <maxWarmingSearchers>2</maxWarmingSearchers>
 </query>

 <requestDispatcher handleSelect="false" >
   <requestParsers enableRemoteStreaming="true"
                   multipartUploadLimitInKB="2048000"
                   formdataUploadLimitInKB="2048"/>
   <httpCaching never304="true" />
 </requestDispatcher>

 <requestHandler name="/select" class="solr.SearchHandler">
   <lst name="defaults">
      <str name="echoParams">explicit</str>
      <int name="rows">10</int>
      <str name="df">text</str>
   </lst>
 </requestHandler>

 <requestHandler name="/update" class="solr.UpdateRequestHandler">
 </requestHandler>

 <requestHandler name="/update/json" class="solr.JsonUpdateRequestHandler">
   <lst name="defaults">
     <str name="stream.contentType">application/json</str>
   </lst>
 </requestHandler>

 <requestHandler name="/admin/" class="solr.admin.AdminHandlers" />

 <requestHandler name="/admin/ping" class="solr.PingRequestHandler">
   <lst name="invariants">
     <str name="q">solrpingquery</str>
   </lst>
   <lst name="defaults">
     <str name="echoParams">all</str>
   </lst>
 </requestHandler>

 <queryResponseWriter name="json" class="solr.JSONResponseWriter">
   <str name="content-type">text/plain; charset=UTF-8</str>
 </queryResponseWriter>
</config>


Sample schema.xml:

<?xml version="1.0" encoding="UTF-8" ?>
<schema name="twitter" version="1.5">
 <!-- types -->
 <types>
   <fieldType name="string" class="solr.StrField" sortMissingLast="true" />
   <fieldType name="long" class="solr.TrieLongField"
precisionStep="0" positionIncrementGap="0"/>
   <fieldType name="tlong" class="solr.TrieLongField"
precisionStep="8" positionIncrementGap="0"/>
   <fieldType name="tdate" class="solr.TrieDateField"
precisionStep="6" positionIncrementGap="0"/>
   <fieldType name="text_cjk" class="solr.TextField"
positionIncrementGap="100">
     <analyzer>
       <charFilter class="solr.MappingCharFilterFactory"/>
       <tokenizer class="solr.StandardTokenizerFactory"/>
       <filter class="solr.CJKWidthFilterFactory"/>
       <filter class="solr.LowerCaseFilterFactory"/>
       <filter class="solr.CJKBigramFilterFactory" outputUnigrams="true"/>
     </analyzer>
   </fieldType>
 </types>

 <!-- fields -->
 <fields>
   <field name="key" type="string" indexed="true" stored="true"
required="true" />
   <field name="status_id" type="tlong" indexed="true" stored="true"
required="true"/>
   <field name="text" type="text_cjk" indexed="true" stored="true"
required="true"/>
   <field name="from_user_id_str" type="string" indexed="true"
stored="true" required="true"/>
   <field name="created_at" type="tdate" indexed="true" stored="true"
required="true"/>
   <field name="_version_" type="long" indexed="true" stored="true"
multiValued="false"/>
 </fields>
 <uniqueKey>key</uniqueKey>
 <defaultSearchField>text</defaultSearchField>
 <solrQueryParser defaultOperator="AND"/>
</schema>



Sample Python script used to add the data:

#!/usr/bin/env python
# -*- coding: utf-8 -*-
import datetime
# use https://github.com/toastdriven/pysolr
from pysolr import(
   Solr,
)


def main():
   s_time = datetime.datetime.utcnow()
   print 'start.: ({})'.format(str(s_time))

   solr = Solr('http://localhost:8983/solr/collection1', timeout=60)

   docs = []

   max_range = 22 * (10 ** 8)  # 2.2 billion, deliberately above Java's Integer.MAX_VALUE
   for x in xrange(1, max_range):
       docs.append(
           {
               'key': '{}'.format(x),
               'status_id': x,
               # The text reads "article number <x>" in Japanese.
               'text': '{} 番目の記事'.format(x).decode('utf-8'),
               'from_user_id_str': '1',
               'created_at': '2014-05-01T20:06:53Z',
           }
       )

       if x % (10 ** 4) == 0:
           solr.add(docs)
           solr.commit()
           docs = []

           e_time = datetime.datetime.utcnow()

           print '{} end.: ({})'.format(x, str(e_time - s_time))

   solr.add(docs)
   solr.commit()
   docs = []

   e_time = datetime.datetime.utcnow()

   print 'end.: ({})'.format(str(e_time - s_time))

if __name__ == '__main__':
   main()
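
Since the overflow only shows up when the index is reopened, a client-side guard before each batch might help avoid it. The following is a minimal sketch, not part of the original script: batch_fits is a hypothetical helper, and it assumes the dict passed in is the parsed JSON from Solr's Luke request handler (for example /admin/luke?numTerms=0&wt=json, assuming that handler is enabled), whose index.maxDoc value includes deleted documents that have not yet been merged away.

```python
LUCENE_MAX_DOCS = 2**31 - 1  # Java Integer.MAX_VALUE


def batch_fits(luke_response, batch_size, limit=LUCENE_MAX_DOCS):
    """Return True if adding batch_size documents keeps maxDoc within limit.

    luke_response is assumed to be the parsed JSON dict from the Luke
    request handler; only its index.maxDoc value is used.
    """
    max_doc = int(luke_response["index"]["maxDoc"])
    # Every added document consumes a new document ID, even when it
    # overwrites an existing key, so the whole batch counts.
    return max_doc + batch_size <= limit
```

In the script above, such a check could run just before solr.add(docs), stopping the loop when it returns False. The value reported by Luke may lag behind uncommitted updates, so leaving a safety margin well below the limit seems prudent.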



--
----
山崎 一大 Tech Fun 株式会社
mailto:yamaz...@techfun.jp
〒110-0015 東京都台東区東上野1-7-15 野村不動産東上野ビル3階
TEL: 03-5816-0331 (main)  FAX: 03-5816-0332
Company web: http://techfun.co.jp/
Education web: http://techfun.jp/
