mikemccand commented on pull request #128:
URL: https://github.com/apache/lucene/pull/128#issuecomment-849677698


   OK, I have good news and bad news.
   
   Good news first!  I wrote a [simple little Python tool](https://github.com/mikemccand/luceneutil/commit/77ef7e6708ccaed7077ef83e009da8e3b91f45ad) to randomly flip a random bit in a random file in a provided directory.
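
For readers who do not want to open the commit, here is a minimal sketch of what such a bit flipper might look like. It is not the linked script: the function name `flip_random_bit`, the `dry_run` parameter, the choice to skip empty files, and the least-significant-bit-first numbering within each byte are all assumptions made for illustration.

```python
import os
import random


def flip_random_bit(directory, seed, dry_run=True):
    """Pick a random non-empty file in `directory` and flip one random bit.

    Hypothetical helper, not the actual luceneutil tool.
    Returns (path, bit_index, total_bits). With dry_run=True nothing is written.
    """
    rng = random.Random(seed)
    # Sort the names so a given seed always selects the same file and bit.
    names = sorted(
        name
        for name in os.listdir(directory)
        if os.path.isfile(os.path.join(directory, name))
        and os.path.getsize(os.path.join(directory, name)) > 0
    )
    if not names:
        raise ValueError(f"no non-empty files in {directory}")

    path = os.path.join(directory, rng.choice(names))
    total_bits = os.path.getsize(path) * 8
    bit_index = rng.randrange(total_bits)

    if not dry_run:
        # Assumption: bit 0 is the least significant bit of byte 0.
        byte_offset, bit_in_byte = divmod(bit_index, 8)
        with open(path, "r+b") as f:
            f.seek(byte_offset)
            original = f.read(1)[0]
            f.seek(byte_offset)
            f.write(bytes([original ^ (1 << bit_in_byte)]))

    return path, bit_index, total_bits
```

Defaulting to a dry run mirrors the spirit of the `-real` switch in the command shown later: the destructive path has to be requested explicitly.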
   
   Bad news!  I ran the tool, confirmed it seems to flip just the one bit, then 
ran the `CheckIndex` here, and no corruption was detected!!  Then I also ran 
`CheckIndex` from a clean `main` checkout, and we still fail to detect the 
corruption.  WTF?  Surely the bit flip would alter the checksum and we should 
have detected that in `CheckIndex`?  Or is it possible that `CheckIndex` does not 
actually run a full `checkIntegrity`?
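
To make the expectation concrete: a CRC32 over a block of bytes changes whenever exactly one bit of that block is flipped, so a checksum comparison that covers the flipped byte cannot miss it. The sketch below demonstrates only that property, using Python's `zlib.crc32` as a stand-in. It does not model Lucene's file footer layout or which code paths in `CheckIndex` actually verify it, and the helper names are hypothetical.

```python
import zlib


def flip_bit(data: bytes, bit_index: int) -> bytes:
    """Return a copy of `data` with one bit inverted (LSB-first within a byte)."""
    byte_offset, bit_in_byte = divmod(bit_index, 8)
    corrupted = bytearray(data)
    corrupted[byte_offset] ^= 1 << bit_in_byte
    return bytes(corrupted)


def crc32_detects_flip(data: bytes, bit_index: int) -> bool:
    """True if the CRC32 of the corrupted copy differs from the original's."""
    return zlib.crc32(data) != zlib.crc32(flip_bit(data, bit_index))


if __name__ == "__main__":
    payload = bytes(range(256)) * 4
    # Try every possible single-bit flip in the payload.
    missed = [i for i in range(len(payload) * 8) if not crc32_detects_flip(payload, i)]
    print("undetected single-bit flips:", len(missed))
```

So if a flip goes unnoticed, the likelier explanation is that the checksum was never recomputed over the damaged file, not that the checksum happened to match.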
   
   For the record, this is how I ran the new bit-flipper tool:
   
   ```
    python3 -u /l/util/src/python/flip_random_bit.py /l/indices/trunk.nightly.index.prev.broken/index -seed 7 -real
   ```
   
   and this is its output:
   
   ```
   python3 -u /l/util.nightly/src/python/flip_random_bit.py /l/indices/trunk.nightly.index.prev.broken/index -seed 7 -real
   
   RANDOM SEED: 0x7
   
   Directory has 302 files:
     _32.fdm
     _32.fdt
     _32.fdx
     _32.fnm
     _32.kdd
     _32.kdi
     _32.kdm
     _32.nvd
     ...
     _h2.fdm
     _h2.fdt
     _h2.fdx
     _h2.fnm
     _h2.kdd
     _h2.kdi
     _h2.kdm
     _h2.nvd
     _h2.nvm
     _h2.si
     _h2_Lucene90HnswVectorFormat_0.vec
     _h2_Lucene90HnswVectorFormat_0.vem
     _h2_Lucene90HnswVectorFormat_0.vex
     _h2_Lucene90_0.doc
     _h2_Lucene90_0.dvd
     _h2_Lucene90_0.dvm
     _h2_Lucene90_0.pos
     _h2_Lucene90_0.tim
     _h2_Lucene90_0.tip
     _h2_Lucene90_0.tmd
     segments_2
     write.lock
   
   **WARNING**: this tool will soon corrupt bit 39544 (of 152368 bits) in /l/indices/trunk.nightly.index.prev.broken/index/_gm.kdi!!!
   
   Be really certain this is what you want... you have 5 seconds to change your mind!
   
   5...
   4...
   3...
   2...
   1...
   
   **BOOOOOOOM**
   ```
   
   And then `cmp` and `diff` confirm the file is indeed changed, yet 
`CheckIndex` (with or without this PR) doesn't catch it.  I'll try a few more 
bit flips.
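
`cmp` and `diff` only say that the files differ. To confirm that exactly one bit changed, something like the following sketch could be run against the corrupted file and a pristine copy. It assumes such a copy exists, reads both files fully into memory, and numbers bits least-significant-first within each byte, which may not match the numbering the real tool prints. The function name is hypothetical.

```python
def differing_bits(path_a, path_b):
    """Return the indices of all bits that differ between two equal-length files."""
    with open(path_a, "rb") as fa, open(path_b, "rb") as fb:
        a, b = fa.read(), fb.read()
    if len(a) != len(b):
        raise ValueError("files differ in length, so this is not a pure bit flip")

    bits = []
    for offset, (x, y) in enumerate(zip(a, b)):
        delta = x ^ y  # set bits mark the positions where the two bytes disagree
        for bit in range(8):
            if delta & (1 << bit):
                bits.append(offset * 8 + bit)
    return bits
```

A single-element result means the corruption is exactly one flipped bit; an empty list means the files are identical.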

