MarcusSorealheis commented on issue #12167: URL: https://github.com/apache/lucene/issues/12167#issuecomment-1533798450
Well, this has been one hell of a treasure hunt. Based on the stack trace, the addressable problem starts here: https://github.com/apache/lucene/blob/caeabf39309a91997d361b4104bda105d16ae720/lucene/core/src/java/org/apache/lucene/codecs/lucene90/Lucene90DocValuesProducer.java#L1235

After adding a bunch of log statements and blowing up the GC, what we learned from the logs, and what is also evident in the error message, is that we are trying to read term bytes longer than the array.length. At first glance this does not seem possible. Could there be some data corruption issue in `input.readBytes(term.bytes, prefixLength, suffixLength);` from https://github.com/apache/lucene/blob/caeabf39309a91997d361b4104bda105d16ae720/lucene/core/src/java/org/apache/lucene/codecs/lucene90/Lucene90DocValuesProducer.java#L1105?

I'm not sure what could be going on, but it seems like a potentially serious issue. Have we previously observed data corruption issues on input streams? I'm really stumped as to where the 813 (or any number `> 80`) is coming from. If it's important or helpful, I could look into it more.
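To make the suspected failure mode concrete, here is a minimal, Lucene-free sketch. It assumes the term buffer is sized to a maximum term length of 80 and that a decoded prefix/suffix pair adds up to 813 (the 3 + 810 split is made up for illustration). `TermDecodeSketch`, `TermBuffer`, and `readSuffix` are hypothetical names, not Lucene APIs, and a plain `byte[]` stands in for the real input stream.

```java
final class TermDecodeSketch {

    /** Hypothetical stand-in for the reusable term buffer; not Lucene's BytesRef. */
    static final class TermBuffer {
        final byte[] bytes;
        int length;

        TermBuffer(int maxTermLength) {
            this.bytes = new byte[maxTermLength];
        }
    }

    /**
     * Copies suffixLength bytes from src into term.bytes starting at prefixLength,
     * validating the decoded lengths first so a bad value fails with a clear message
     * instead of an out-of-bounds error deep inside the copy.
     */
    static void readSuffix(byte[] src, int srcOffset, TermBuffer term,
                           int prefixLength, int suffixLength) {
        // Written as a subtraction so that prefixLength + suffixLength cannot overflow int.
        if (prefixLength < 0 || suffixLength < 0
                || suffixLength > term.bytes.length - prefixLength) {
            throw new IllegalStateException(
                "decoded term length " + ((long) prefixLength + suffixLength)
                    + " exceeds buffer length " + term.bytes.length
                    + " (prefix=" + prefixLength + ", suffix=" + suffixLength + ")");
        }
        System.arraycopy(src, srcOffset, term.bytes, prefixLength, suffixLength);
        term.length = prefixLength + suffixLength;
    }

    public static void main(String[] args) {
        TermBuffer term = new TermBuffer(80); // assumed maximum term length
        byte[] src = new byte[1024];          // stands in for the on-disk bytes

        readSuffix(src, 0, term, 3, 10);      // well-formed: 13 bytes fit in 80

        try {
            readSuffix(src, 0, term, 3, 810); // 3 + 810 = 813, larger than the buffer
        } catch (IllegalStateException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

A check like this would not explain where the oversized length comes from, but logging the prefix and suffix separately at the point of failure might show whether one of the two decoded values is garbage or whether the buffer itself was sized too small.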