rahulgoswami commented on issue #12356:
URL: https://github.com/apache/lucene/issues/12356#issuecomment-1583204847
Based on the idea that @romseygeek proposed above, I tried the change below
in readByte(long pos). It works sometimes, but when testing on a 5 million+
dataset I get a CorruptIndexException some time into the indexing. I'm not sure
why that should happen, and it only happens after making this change. Thoughts?
    @Override
    public final byte readByte(long pos) throws IOException {
      long index = pos - bufferStart;
      if (index < 0 || index >= buffer.limit()) {
        if (index < 0) {
          bufferStart = Math.min(pos, (bufferStart - bufferSize) < 0 ? 0 : (bufferStart - bufferSize));
        } else {
          bufferStart = pos;
        }
        buffer.limit(0); // trigger refill() on read
        seekInternal(pos);
        refill();
        index = 0;
      }
      return buffer.get((int) index);
    }
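For reference, here is a minimal standalone sketch of the backward-seek idea, written against a plain byte array rather than Lucene. The class `BackwardBufferSketch`, its fields, and its `refill()` are hypothetical stand-ins, not Lucene's `BufferedIndexInput`. It assumes a refill always loads the window starting at `bufferStart`. Under that assumption the requested byte is at `pos - bufferStart` after a refill, which equals 0 only when the window starts exactly at `pos`.

```java
/** Toy model of a buffered random-access reader. NOT Lucene's BufferedIndexInput. */
final class BackwardBufferSketch {
  private final byte[] file;     // stands in for the underlying file
  private final int bufferSize;  // capacity of the read window
  private final byte[] buffer;   // current window contents
  private long bufferStart = 0;  // file offset of buffer[0]
  private int bufferLength = 0;  // number of valid bytes in the window
  int refills = 0;               // counts refills, for illustration only

  BackwardBufferSketch(byte[] file, int bufferSize) {
    this.file = file;
    this.bufferSize = bufferSize;
    this.buffer = new byte[bufferSize];
  }

  byte readByte(long pos) {
    if (pos < 0 || pos >= file.length) {
      throw new IndexOutOfBoundsException("read past EOF: pos=" + pos);
    }
    long index = pos - bufferStart;
    if (index < 0 || index >= bufferLength) {
      if (index < 0) {
        // Backward miss: move the window one buffer earlier, clamped to [0, pos],
        // so that bytes just before the old window are also loaded.
        bufferStart = Math.min(pos, Math.max(0L, bufferStart - bufferSize));
      } else {
        // Forward miss: start the window at the requested position.
        bufferStart = pos;
      }
      refill();
      // The window may now start before pos, so recompute the offset
      // instead of assuming the requested byte is at index 0.
      index = pos - bufferStart;
    }
    return buffer[(int) index];
  }

  /** Loads up to bufferSize bytes starting at bufferStart (assumption of this sketch). */
  private void refill() {
    int n = (int) Math.min(bufferSize, file.length - bufferStart);
    System.arraycopy(file, (int) bufferStart, buffer, 0, n);
    bufferLength = n;
    refills++;
  }
}
```

In this toy model, reading position 45 after position 50 with a 16-byte buffer moves the window to offset 34, so the requested byte sits at index 11 rather than 0. Whether the same reasoning applies to the real method depends on how `refill()` and `seekInternal()` position the underlying read, which this sketch does not model.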
Exception:
2023-06-08 03:14:58.040 ERROR (qtp391506011-24) [ x:techproducts] o.a.s.h.RequestHandlerBase org.apache.solr.common.SolrException: Server error writing document id c55ea05bded8d478d1942535f30c5791001 to the index => org.apache.solr.common.SolrException: Server error writing document id c55ea05bded8d478d1942535f30c5791001 to the index
    at org.apache.solr.update.DirectUpdateHandler2.addDoc(DirectUpdateHandler2.java:246)
org.apache.solr.common.SolrException: Server error writing document id c55ea05bded8d478d1942535f30c5791001 to the index
    at org.apache.solr.update.DirectUpdateHandler2.addDoc(DirectUpdateHandler2.java:246) ~[?:?]
    .....
Caused by: org.apache.lucene.store.AlreadyClosedException: this IndexWriter is closed
    at org.apache.lucene.index.IndexWriter.ensureOpen(IndexWriter.java:886) ~[?:?]
    at org.apache.lucene.index.IndexWriter.ensureOpen(IndexWriter.java:900) ~[?:?]
    at org.apache.lucene.index.IndexWriter.updateDocuments(IndexWriter.java:1477) ~[?:?]
    at org.apache.lucene.index.IndexWriter.updateDocuments(IndexWriter.java:1473) ~[?:?]
    at org.apache.solr.update.DirectUpdateHandler2.updateDocOrDocValues(DirectUpdateHandler2.java:973) ~[?:?]
    at org.apache.solr.update.DirectUpdateHandler2.doNormalUpdate(DirectUpdateHandler2.java:342) ~[?:?]
    at org.apache.solr.update.DirectUpdateHandler2.addDoc0(DirectUpdateHandler2.java:294) ~[?:?]
    at org.apache.solr.update.DirectUpdateHandler2.addDoc(DirectUpdateHandler2.java:241) ~[?:?]
    ... 68 more
Caused by: org.apache.lucene.index.CorruptIndexException: invalid state: base=0, docID=93773 (resource=SimpleFSIndexInput(path="C:\Work\Solr\solr-8.11.1\example\techproducts\solr\techproducts\data\index\_13xx.fdt"))
    at org.apache.lucene.codecs.compressing.CompressingStoredFieldsWriter.copyChunks(CompressingStoredFieldsWriter.java:560) ~[?:?]
    at org.apache.lucene.codecs.compressing.CompressingStoredFieldsWriter.merge(CompressingStoredFieldsWriter.java:633) ~[?:?]
    at org.apache.lucene.index.SegmentMerger.mergeFields(SegmentMerger.java:228) ~[?:?]
    at org.apache.lucene.index.SegmentMerger.merge(SegmentMerger.java:105) ~[?:?]
    at org.apache.lucene.index.IndexWriter.mergeMiddle(IndexWriter.java:4788) ~[?:?]
    at org.apache.lucene.index.IndexWriter.merge(IndexWriter.java:4392) ~[?:?]
    at org.apache.solr.update.SolrIndexWriter.merge(SolrIndexWriter.java:201) ~[?:?]
    at org.apache.lucene.index.IndexWriter$IndexWriterMergeSource.merge(IndexWriter.java:5951) ~[?:?]
    at org.apache.lucene.index.ConcurrentMergeScheduler.doMerge(ConcurrentMergeScheduler.java:626) ~[?:?]
    at org.apache.lucene.index.ConcurrentMergeScheduler$MergeThread.run(ConcurrentMergeScheduler.java:684) ~[?:?]