rahulgoswami commented on code in PR #14607:
URL: https://github.com/apache/lucene/pull/14607#discussion_r2509109896
##########
lucene/core/src/java/org/apache/lucene/index/SegmentInfos.java:
##########
@@ -402,11 +389,38 @@ private static void parseSegmentInfos(
}
long totalDocs = 0;
+
for (int seg = 0; seg < numSegments; seg++) {
String segName = input.readString();
byte[] segmentID = new byte[StringHelper.ID_LENGTH];
input.readBytes(segmentID, 0, segmentID.length);
- Codec codec = readCodec(input);
+ Codec codec = null;
+ try {
+ codec = readCodec(input);
Review Comment:
Please disregard my previous comment since my understanding of the
VERSION_74 and VERSION_86 variables was incorrect. They are just versions
assigned to the "segments" codec, and the variable names just represent the
Lucene version in which the "segments" file version was assigned. I now fully
understand what's happening and why the assertion fails when you changed to
`format = CodecUtil.checkHeaderNoMagic(input, "segments", VERSION_86,
VERSION_CURRENT);`
Moving to VERSION_86 means that CodecUtil.checkHeaderNoMagic() now throws
IndexFormatTooOldException for indexes before Lucene 8.6.0. The test fails the
assertion for version 8.0.0 on the line you mentioned since it now finds
indexStatus.clean=false. That is because the below call in
checker.checkIndex() fails:
` infos = SegmentInfos.readCommit(dir, fileName, 0);`
And the reason it passes for >=8.6.0 is because the last param (min
supported version) is 0, which means it proceeds to do an individual segment
check down the line, which passes because 8.x codecs are shipped and available
in class path!
**Essentially today if someone just removes the indexCreatedVersionMajor
check in SegmentInfos, they should be able to read indexes created in 8.x even
with Lucene 11 (!!)**
By similar reasoning with VERSION_74, the test was only failing for indexes
with version 7.4 to < 8.0.0 previously because the codec is not present. Maybe
a message in IndexFormatTooOldException representing both possibilities (too
old version or absent Lucene-backward-codecs) is a reasonable solution?
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]