One of the hard-core Lucene guys is going to have to help you out. Or you
may have to write some custom code to fix the index for any such shard. If
you have deleted any documents, it may be sufficient to simply optimize the
index.
-- Jack Krupansky
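The optimize suggestion rests on how the limit is counted: deleted documents keep their slots until segments are merged, so the total that must stay within 2^31-1 is live plus deleted documents. Below is a minimal sketch of that arithmetic in Python. The function name and the sample counts are made up for illustration, and whether an optimize can actually run against a core that already fails to load is not settled in this thread.

```python
MAX_DOCS = 2**31 - 1  # 2147483647, Java's Integer.MAX_VALUE


def optimize_could_help(num_docs, deleted_docs):
    """Return True if the index is over the limit only because of deleted docs.

    num_docs is the live document count, deleted_docs the count of documents
    marked deleted but not yet merged away.
    """
    max_doc = num_docs + deleted_docs
    return max_doc > MAX_DOCS and num_docs <= MAX_DOCS


# Hypothetical counts: over the limit in total, but the live documents fit.
print(optimize_could_help(2_100_000_000, 100_000_000))
# Hypothetical counts: the live documents alone already exceed the limit.
print(optimize_could_help(2_200_000_000, 0))
```

If the function returns False for an overfull index, removing deleted documents cannot bring it back under the limit, which is the case where custom repair code would be needed.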
-----Original Message-----
From: yamazaki
Sent: Wednesday, May 7, 2014 8:15 PM
To: solr-user@lucene.apache.org
Subject: Re: Too many documents Exception
Thanks, Jack.
Is there a setting that would prevent this exception?
For example,
<maxMergeDocs>2147483647</maxMergeDocs> ?
When this exception occurs, the index can no longer be read.
With SolrCloud, this means part of the data cannot be read:
shard1: has exceeded 2^31-1 documents
shard2: has not exceeded 2^31-1 documents
shard1 goes down, and its index is dead.
-- yamazaki
2014-05-07 11:01 GMT+09:00 Jack Krupansky <j...@basetechnology.com>:
Lucene only supports 2^31-1 documents in an index, so Solr can only
support 2^31-1 documents in a single shard.
I think it's a bug that Lucene doesn't throw an exception when more than
that number of documents have been inserted. Instead, you get this error
when Solr tries to read such an overstuffed index.
-- Jack Krupansky
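Since the limit applies per shard, the practical consequence is a minimum shard count for a given collection size. The following Python sketch works that out. It assumes documents are spread evenly across shards, which real routing only approximates, and the function name and the per_shard_budget parameter are illustrative rather than anything Solr provides.

```python
MAX_DOCS_PER_SHARD = 2**31 - 1  # 2147483647, Java's Integer.MAX_VALUE


def min_shards(total_docs, per_shard_budget=MAX_DOCS_PER_SHARD):
    """Smallest shard count that keeps an even split within the budget.

    per_shard_budget can be set below the hard limit to leave room for
    deleted documents, which also count until they are merged away.
    """
    if total_docs < 0:
        raise ValueError("total_docs must not be negative")
    if not 1 <= per_shard_budget <= MAX_DOCS_PER_SHARD:
        raise ValueError("per_shard_budget must be between 1 and 2^31-1")
    # Ceiling division without floating point; always at least one shard.
    return max(1, -(-total_docs // per_shard_budget))


# 2,199,999,999 documents do not fit in one shard.
print(min_shards(2_199_999_999))
# The same collection with a budget of one billion documents per shard.
print(min_shards(2_199_999_999, per_shard_budget=1_000_000_000))
```

Choosing a budget well below the hard limit is a design choice, not a requirement: it trades extra shards for headroom against deletes and uneven distribution.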
-----Original Message-----
From: [Tech Fun]山崎
Sent: Tuesday, May 6, 2014 8:54 PM
To: solr-user@lucene.apache.org
Subject: Too many documents Exception
Hello everybody,
With Solr 4.3.1 (and 4.7.1), when Num Docs + Deleted Docs exceeds
2147483647 (Integer.MAX_VALUE), I get the following error:
Caused by: java.lang.IllegalArgumentException: Too many documents,
composite IndexReaders cannot exceed 2147483647
This looks similar to the problem in this unresolved e-mail thread:
http://mail-archives.apache.org/mod_mbox/lucene-solr-user/201307.mbox/browser
How can I fix this?
Is this the specified behavior of Solr?
Log:
ERROR org.apache.solr.core.CoreContainer – Unable to create core: collection1
org.apache.solr.common.SolrException: Error opening new searcher
    at org.apache.solr.core.SolrCore.<init>(SolrCore.java:821)
    at org.apache.solr.core.SolrCore.<init>(SolrCore.java:618)
    at org.apache.solr.core.CoreContainer.createFromLocal(CoreContainer.java:949)
    at org.apache.solr.core.CoreContainer.create(CoreContainer.java:984)
    at org.apache.solr.core.CoreContainer$2.call(CoreContainer.java:597)
    at org.apache.solr.core.CoreContainer$2.call(CoreContainer.java:592)
    at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
    at java.util.concurrent.FutureTask.run(FutureTask.java:138)
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:439)
    at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
    at java.util.concurrent.FutureTask.run(FutureTask.java:138)
    at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:895)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:918)
    at java.lang.Thread.run(Thread.java:662)
Caused by: org.apache.solr.common.SolrException: Error opening new searcher
    at org.apache.solr.core.SolrCore.openNewSearcher(SolrCore.java:1438)
    at org.apache.solr.core.SolrCore.getSearcher(SolrCore.java:1550)
    at org.apache.solr.core.SolrCore.<init>(SolrCore.java:796)
    ... 13 more
Caused by: org.apache.solr.common.SolrException: Error opening Reader
    at org.apache.solr.search.SolrIndexSearcher.getReader(SolrIndexSearcher.java:172)
    at org.apache.solr.search.SolrIndexSearcher.<init>(SolrIndexSearcher.java:183)
    at org.apache.solr.search.SolrIndexSearcher.<init>(SolrIndexSearcher.java:179)
    at org.apache.solr.core.SolrCore.openNewSearcher(SolrCore.java:1414)
    ... 15 more
Caused by: java.lang.IllegalArgumentException: Too many documents, composite IndexReaders cannot exceed 2147483647
    at org.apache.lucene.index.BaseCompositeReader.<init>(BaseCompositeReader.java:77)
    at org.apache.lucene.index.DirectoryReader.<init>(DirectoryReader.java:368)
    at org.apache.lucene.index.StandardDirectoryReader.<init>(StandardDirectoryReader.java:42)
    at org.apache.lucene.index.StandardDirectoryReader$1.doBody(StandardDirectoryReader.java:71)
    at org.apache.lucene.index.SegmentInfos$FindSegmentsFile.run(SegmentInfos.java:783)
    at org.apache.lucene.index.StandardDirectoryReader.open(StandardDirectoryReader.java:52)
    at org.apache.lucene.index.DirectoryReader.open(DirectoryReader.java:88)
    at org.apache.solr.core.StandardIndexReaderFactory.newReader(StandardIndexReaderFactory.java:34)
    at org.apache.solr.search.SolrIndexSearcher.getReader(SolrIndexSearcher.java:169)
    ... 18 more
ERROR org.apache.solr.core.CoreContainer – null:org.apache.solr.common.SolrException: Unable to create core: collection1
    at org.apache.solr.core.CoreContainer.recordAndThrow(CoreContainer.java:1450)
    at org.apache.solr.core.CoreContainer.create(CoreContainer.java:993)
    at org.apache.solr.core.CoreContainer$2.call(CoreContainer.java:597)
    at org.apache.solr.core.CoreContainer$2.call(CoreContainer.java:592)
    at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
    at java.util.concurrent.FutureTask.run(FutureTask.java:138)
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:439)
    at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
    at java.util.concurrent.FutureTask.run(FutureTask.java:138)
    at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:895)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:918)
    at java.lang.Thread.run(Thread.java:662)
Caused by: org.apache.solr.common.SolrException: Error opening new searcher
    at org.apache.solr.core.SolrCore.<init>(SolrCore.java:821)
    at org.apache.solr.core.SolrCore.<init>(SolrCore.java:618)
    at org.apache.solr.core.CoreContainer.createFromLocal(CoreContainer.java:949)
    at org.apache.solr.core.CoreContainer.create(CoreContainer.java:984)
    ... 10 more
Caused by: org.apache.solr.common.SolrException: Error opening new searcher
    at org.apache.solr.core.SolrCore.openNewSearcher(SolrCore.java:1438)
    at org.apache.solr.core.SolrCore.getSearcher(SolrCore.java:1550)
    at org.apache.solr.core.SolrCore.<init>(SolrCore.java:796)
    ... 13 more
Caused by: org.apache.solr.common.SolrException: Error opening Reader
    at org.apache.solr.search.SolrIndexSearcher.getReader(SolrIndexSearcher.java:172)
    at org.apache.solr.search.SolrIndexSearcher.<init>(SolrIndexSearcher.java:183)
    at org.apache.solr.search.SolrIndexSearcher.<init>(SolrIndexSearcher.java:179)
    at org.apache.solr.core.SolrCore.openNewSearcher(SolrCore.java:1414)
    ... 15 more
Caused by: java.lang.IllegalArgumentException: Too many documents, composite IndexReaders cannot exceed 2147483647
    at org.apache.lucene.index.BaseCompositeReader.<init>(BaseCompositeReader.java:77)
    at org.apache.lucene.index.DirectoryReader.<init>(DirectoryReader.java:368)
    at org.apache.lucene.index.StandardDirectoryReader.<init>(StandardDirectoryReader.java:42)
    at org.apache.lucene.index.StandardDirectoryReader$1.doBody(StandardDirectoryReader.java:71)
    at org.apache.lucene.index.SegmentInfos$FindSegmentsFile.run(SegmentInfos.java:783)
    at org.apache.lucene.index.StandardDirectoryReader.open(StandardDirectoryReader.java:52)
    at org.apache.lucene.index.DirectoryReader.open(DirectoryReader.java:88)
    at org.apache.solr.core.StandardIndexReaderFactory.newReader(StandardIndexReaderFactory.java:34)
    at org.apache.solr.search.SolrIndexSearcher.getReader(SolrIndexSearcher.java:169)
    ... 18 more
sample solrconfig.xml
<?xml version="1.0" encoding="UTF-8" ?>
<config>
<luceneMatchVersion>LUCENE_43</luceneMatchVersion>
<lib dir="/opt/solr/dist" regex="solr-cell-\d.*\.jar" />
<lib dir="/opt/solr/contrib/extraction/lib" regex=".*\.jar" />
<lib dir="/opt/solr/dist" regex="solr-clustering-\d.*\.jar" />
<lib dir="/opt/solr/contrib/clustering/lib" regex=".*\.jar" />
<lib dir="/opt/solr/dist" regex="solr-langid-\d.*\.jar" />
<lib dir="/opt/solr/contrib/langid/lib" regex=".*\.jar" />
<lib dir="/opt/solr/dist" regex="solr-velocity-\d.*\.jar" />
<lib dir="/opt/solr/contrib/velocity/lib" regex=".*\.jar" />
<dataDir>${solr.data.dir:}</dataDir>
<directoryFactory name="DirectoryFactory"
class="${solr.directoryFactory:solr.NRTCachingDirectoryFactory}"/>
<codecFactory class="solr.SchemaCodecFactory"/>
<indexConfig>
<ramBufferSizeMB>256</ramBufferSizeMB>
<lockType>${solr.lock.type:native}</lockType>
</indexConfig>
<jmx />
<updateHandler class="solr.DirectUpdateHandler2">
<updateLog>
<str name="dir">${solr.ulog.dir:}</str>
</updateLog>
<autoCommit>
<maxDocs>10000</maxDocs>
<maxTime>60000</maxTime>
<openSearcher>false</openSearcher>
</autoCommit>
<autoSoftCommit>
<maxDocs>10</maxDocs>
<maxTime>1000</maxTime>
</autoSoftCommit>
</updateHandler>
<query>
<maxBooleanClauses>1024</maxBooleanClauses>
<filterCache class="solr.FastLRUCache"
size="16384"
initialSize="4096"
autowarmCount="1024"/>
<queryResultCache class="solr.FastLRUCache"
size="16384"
initialSize="4096"
autowarmCount="1024"/>
<documentCache class="solr.FastLRUCache"
size="16384"
initialSize="4096"
autowarmCount="1024"/>
<enableLazyFieldLoading>true</enableLazyFieldLoading>
<queryResultWindowSize>20</queryResultWindowSize>
<queryResultMaxDocsCached>200</queryResultMaxDocsCached>
<useColdSearcher>false</useColdSearcher>
<maxWarmingSearchers>2</maxWarmingSearchers>
</query>
<requestDispatcher handleSelect="false" >
<requestParsers enableRemoteStreaming="true"
multipartUploadLimitInKB="2048000"
formdataUploadLimitInKB="2048"/>
<httpCaching never304="true" />
</requestDispatcher>
<requestHandler name="/select" class="solr.SearchHandler">
<lst name="defaults">
<str name="echoParams">explicit</str>
<int name="rows">10</int>
<str name="df">text</str>
</lst>
</requestHandler>
<requestHandler name="/update" class="solr.UpdateRequestHandler">
</requestHandler>
<requestHandler name="/update/json"
class="solr.JsonUpdateRequestHandler">
<lst name="defaults">
<str name="stream.contentType">application/json</str>
</lst>
</requestHandler>
<requestHandler name="/admin/" class="solr.admin.AdminHandlers" />
<requestHandler name="/admin/ping" class="solr.PingRequestHandler">
<lst name="invariants">
<str name="q">solrpingquery</str>
</lst>
<lst name="defaults">
<str name="echoParams">all</str>
</lst>
</requestHandler>
<queryResponseWriter name="json" class="solr.JSONResponseWriter">
<str name="content-type">text/plain; charset=UTF-8</str>
</queryResponseWriter>
</config>
sample schema.xml
<?xml version="1.0" encoding="UTF-8" ?>
<schema name="twitter" version="1.5">
<!-- types -->
<types>
<fieldType name="string" class="solr.StrField" sortMissingLast="true"
/>
<fieldType name="long" class="solr.TrieLongField"
precisionStep="0" positionIncrementGap="0"/>
<fieldType name="tlong" class="solr.TrieLongField"
precisionStep="8" positionIncrementGap="0"/>
<fieldType name="tdate" class="solr.TrieDateField"
precisionStep="6" positionIncrementGap="0"/>
<fieldType name="text_cjk" class="solr.TextField"
positionIncrementGap="100">
<analyzer>
<charFilter class="solr.MappingCharFilterFactory"/>
<tokenizer class="solr.StandardTokenizerFactory"/>
<filter class="solr.CJKWidthFilterFactory"/>
<filter class="solr.LowerCaseFilterFactory"/>
<filter class="solr.CJKBigramFilterFactory" outputUnigrams="true"/>
</analyzer>
</fieldType>
</types>
<!-- fields -->
<fields>
<field name="key" type="string" indexed="true" stored="true"
required="true" />
<field name="status_id" type="tlong" indexed="true" stored="true"
required="true"/>
<field name="text" type="text_cjk" indexed="true" stored="true"
required="true"/>
<field name="from_user_id_str" type="string" indexed="true"
stored="true" required="true"/>
<field name="created_at" type="tdate" indexed="true" stored="true"
required="true"/>
<field name="_version_" type="long" indexed="true" stored="true"
multiValued="false"/>
</fields>
<uniqueKey>key</uniqueKey>
<defaultSearchField>text</defaultSearchField>
<solrQueryParser defaultOperator="AND"/>
</schema>
Sample Python source code for adding the data:
#!/usr/bin/env python
# -*- coding: utf-8 -*-
import datetime

# Uses pysolr: https://github.com/toastdriven/pysolr
from pysolr import Solr


def main():
    s_time = datetime.datetime.utcnow()
    print('start.: ({})'.format(str(s_time)))
    solr = Solr('http://localhost:8983/solr/collection1', timeout=60)
    docs = []
    # 2.2 billion documents, which exceeds Java's Integer.MAX_VALUE (2147483647)
    max_range = 22 * (10 ** 8)
    for x in range(1, max_range):
        docs.append(
            {
                'key': '{}'.format(x),
                'status_id': x,
                'text': '{} 番目の記事'.format(x),  # "article number x"
                'from_user_id_str': '1',
                'created_at': '2014-05-01T20:06:53Z',
            }
        )
        # Send and commit in batches of 10,000 documents
        if x % (10 ** 4) == 0:
            solr.add(docs)
            solr.commit()
            docs = []
            e_time = datetime.datetime.utcnow()
            print('{} end.: ({})'.format(x, str(e_time - s_time)))
    # Send whatever is left over from the last partial batch
    solr.add(docs)
    solr.commit()
    docs = []
    e_time = datetime.datetime.utcnow()
    print('end.: ({})'.format(str(e_time - s_time)))


if __name__ == '__main__':
    main()
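Because the error only appears when the index is read back, one client-side workaround is to stop sending documents before the shard reaches the limit. The sketch below is one possible way to do that, not something Solr or pysolr provides. The function guarded_batches and its parameters are hypothetical, and docs_in_index stands for the shard's current maxDoc (live plus deleted documents), which the caller has to obtain separately, for instance from Solr's Luke request handler if it is enabled.

```python
MAX_DOCS_PER_INDEX = 2**31 - 1  # per-index limit quoted in this thread


def guarded_batches(first_id, last_id, batch_size, docs_in_index,
                    limit=MAX_DOCS_PER_INDEX):
    """Yield inclusive (start, end) ID ranges that fit under the limit.

    Every added document is counted as using one slot, which is also true
    for overwrites until the old copy is merged away.
    """
    room = limit - docs_in_index
    start = first_id
    while start <= last_id and room > 0:
        size = min(batch_size, last_id - start + 1, room)
        yield start, start + size - 1
        start += size
        room -= size


# Small illustrative numbers: 85 slots are already used out of a limit of 100,
# so only 15 of the 25 requested documents are scheduled.
for start, end in guarded_batches(1, 25, 10, docs_in_index=85, limit=100):
    print(start, end)
```

Each yielded range could then be turned into documents and passed to solr.add() as in the script above; documents beyond the last range would need to go to another shard or collection.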
--
----
山崎 一大 Tech Fun 株式会社
mailto:yamaz...@techfun.jp
〒110-0015 東京都台東区東上野1-7-15 野村不動産東上野ビル3階
TEL: 03-5816-0331 (main) FAX: 03-5816-0332
Company Web: http://techfun.co.jp/
Education Web: http://techfun.jp/