dungba88 opened a new issue, #12697: URL: https://github.com/apache/lucene/issues/12697
### Description

After writing an FSTStore-backed FST to a DataOutput while specifying a different DataOutput for the metadata, reading it back (using the public FST constructor) throws the following exception:

```
java.lang.ArrayIndexOutOfBoundsException: Index 17 out of bounds for length 17
    at __randomizedtesting.SeedInfo.seed([CBCB30F6D2F8FEA1:821F24747AC56DDD]:0)
    at org.apache.lucene.store.ByteArrayDataInput.readVLong(ByteArrayDataInput.java:133)
    at org.apache.lucene.util.fst.FST.<init>(FST.java:494)
    at org.apache.lucene.util.fst.FST.<init>(FST.java:443)
```

The reason is that, when writing the metadata, an FST backed by an FSTStore does not write numBytes: https://github.com/apache/lucene/blob/main/lucene/core/src/java/org/apache/lucene/util/fst/FST.java#L555-L562

numBytes is instead written by the FSTStore to the main DataOutput: https://github.com/apache/lucene/blob/main/lucene/core/src/java/org/apache/lucene/util/fst/OnHeapFSTStore.java

Thus, if metaOut and dataOut are the same DataOutput, numBytes ends up immediately after the metadata and everything is read back correctly. However, if they are different DataOutputs, metaOut lacks numBytes, which causes the index-out-of-bounds exception.

To illustrate:

When writing to the same DataOutput

```
[ HEADER ] [ EMPTY_OUTPUT_FLAG ] [ EMPTY_OUTPUT ] [ INPUT_TYPE ] [ START_NODE ] [ NUM_BYTES ] [ MAIN ]
```

When writing to different DataOutputs

```
metaOut: [ HEADER ] [ EMPTY_OUTPUT_FLAG ] [ EMPTY_OUTPUT ] [ INPUT_TYPE ] [ START_NODE ]
dataOut: [ NUM_BYTES ] [ MAIN ]
```

I can put up a fix for this.

### Version and environment details

_No response_
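
The layout mismatch described above can be reproduced without Lucene. The following is a minimal sketch using only `java.io` streams, not the actual FST code: the class `FstLayoutSketch`, its methods, and the fixed-width field encodings are all hypothetical stand-ins, and it assumes (as the stack trace suggests) that the reader looks for NUM_BYTES in the metadata input.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.EOFException;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.Arrays;

public class FstLayoutSketch {

  // Writer side: metadata fields go to metaOut, while the "store"
  // writes NUM_BYTES and the body to dataOut (hypothetical encodings).
  static void save(DataOutputStream metaOut, DataOutputStream dataOut, byte[] body)
      throws IOException {
    metaOut.writeInt(0xF57);         // stand-in for HEADER
    metaOut.writeByte(0);            // stand-in for EMPTY_OUTPUT_FLAG (no empty output)
    metaOut.writeByte(0);            // stand-in for INPUT_TYPE
    metaOut.writeLong(42L);          // stand-in for START_NODE
    dataOut.writeLong(body.length);  // NUM_BYTES, written by the store
    dataOut.write(body);             // MAIN
  }

  // Reader side: assumes NUM_BYTES follows START_NODE in metaIn.
  static byte[] load(DataInputStream metaIn, DataInputStream dataIn) throws IOException {
    metaIn.readInt();                // HEADER
    metaIn.readByte();               // EMPTY_OUTPUT_FLAG
    metaIn.readByte();               // INPUT_TYPE
    metaIn.readLong();               // START_NODE
    long numBytes = metaIn.readLong(); // fails if metaIn ends after START_NODE
    byte[] body = new byte[(int) numBytes];
    dataIn.readFully(body);
    return body;
  }

  // Same stream for meta and data: NUM_BYTES directly follows START_NODE.
  static boolean roundTripSameOutput(byte[] body) {
    try {
      ByteArrayOutputStream bytes = new ByteArrayOutputStream();
      DataOutputStream out = new DataOutputStream(bytes);
      save(out, out, body);
      DataInputStream in = new DataInputStream(new ByteArrayInputStream(bytes.toByteArray()));
      return Arrays.equals(body, load(in, in));
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
  }

  // Separate streams: the metadata stream never receives NUM_BYTES.
  static boolean roundTripSeparateOutputs(byte[] body) {
    try {
      ByteArrayOutputStream metaBytes = new ByteArrayOutputStream();
      ByteArrayOutputStream dataBytes = new ByteArrayOutputStream();
      save(new DataOutputStream(metaBytes), new DataOutputStream(dataBytes), body);
      DataInputStream metaIn = new DataInputStream(new ByteArrayInputStream(metaBytes.toByteArray()));
      DataInputStream dataIn = new DataInputStream(new ByteArrayInputStream(dataBytes.toByteArray()));
      return Arrays.equals(body, load(metaIn, dataIn));
    } catch (EOFException e) {
      return false; // metadata ended before NUM_BYTES could be read
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
  }

  public static void main(String[] args) {
    byte[] body = {1, 2, 3};
    System.out.println("same output: " + roundTripSameOutput(body));           // true
    System.out.println("separate outputs: " + roundTripSeparateOutputs(body)); // false
  }
}
```

In this sketch the separate-stream case fails with an `EOFException` rather than the `ArrayIndexOutOfBoundsException` seen in Lucene, but the cause is the same: the reader runs off the end of the metadata while looking for NUM_BYTES.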