I think we are massively overengineering this. We shouldn't need to future-proof tests against changes that *might* be added in the future, and a 32x32 table of hashes at the start of the test is not a big deal (it took less than a second to generate, and it'll take the same amount of time if we extend to 64 bytes in the future), but it does illustrate that we are perhaps taking a bad approach to testing here.
I wrote these tests begrudgingly because I believed that this would be the quickest way to get the code included. If we're going into a back and forth from here then let's go from the top:

=== New change: computes CRC32 in 8-byte blocks where the data is long enough and is aligned
New codepaths we need to hit: reading 8-byte blocks, and reading unaligned data ===

Let's look at some of the scenarios for how the code could potentially be broken, and how we could detect this:

=== Bug: code incorrectly generates the checksum in the slice-by-8 code
Detection: test with a string at least 8 bytes long (more if we take alignment into account), which hits the slice-by-8 codepath
Detection: test with a string fewer than 8 bytes long, which hits the byte-by-byte codepath ===

=== Bug: code triggers a CPU fault when reading an unaligned word
Detection: test with a string that starts on an unaligned word (the string cannot end on an unaligned word), which hits the slice-by-8 codepath and confirms it can handle unaligned data correctly ===

It doesn't matter how large words are: reading from an aligned word address +1 or -1 will always be unaligned.

If there are other potential bugs that we've missed then please let me know. If the proposed detections would not detect these bugs then please let me know. I can't imagine a bug that would be caught by testing a 32x32 loop that wouldn't be caught by the three detections mentioned here. A minimal sketch of those three detections follows below the quoted message.

On Tue, 22 Oct 2024 at 14:26, Bruno Haible <br...@clisp.org> wrote:
> Hi Simon,
>
> > How about an aggregated trick, incrementally hashing each output crc
> > value and at the end just compare the hash against an expected value?
> > Instead of comparing each incremental crc value against a table.
>
> While this needs fewer expected values, it makes debugging harder
> in case of a failure.
>
> Unit tests frequently fail during development, and I find it very useful
> to be able to debug the particular failing step directly, using gdb.
> Delaying a failure until the end of the computation makes debugging hard.
>
> Bruno
>
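For concreteness, here is a rough sketch of the three detections. I'm assuming the crc32 (const char *, size_t) entry point from crc.h here; crc32_reference() is a throwaway bitwise implementation written just for the test, and the buffer sizes and offsets are only illustrative:

#include <assert.h>
#include <stdint.h>
#include <string.h>

#include "crc.h"

/* Bitwise CRC-32 (reflected polynomial 0xEDB88320), used only as a
   reference to check the optimized implementation against.  */
static uint32_t
crc32_reference (const char *buf, size_t len)
{
  uint32_t crc = 0xFFFFFFFF;
  for (size_t i = 0; i < len; i++)
    {
      crc ^= (unsigned char) buf[i];
      for (int bit = 0; bit < 8; bit++)
        crc = (crc >> 1) ^ (0xEDB88320 & (0 - (crc & 1)));
    }
  return crc ^ 0xFFFFFFFF;
}

int
main (void)
{
  /* A union forces 8-byte alignment without compiler extensions.  */
  static union { uint64_t align; char bytes[64]; } u;
  char *buf = u.bytes;
  memcpy (buf, "The quick brown fox jumps over the lazy dog.....", 48);

  /* Detection 1: >= 8 aligned bytes, hits the slice-by-8 codepath.  */
  assert (crc32 (buf, 32) == crc32_reference (buf, 32));

  /* Detection 2: < 8 bytes, hits only the byte-by-byte codepath.  */
  assert (crc32 (buf, 5) == crc32_reference (buf, 5));

  /* Detection 3: start one byte past an aligned word; whatever the
     word size, this start is unaligned, so a CPU fault or a broken
     realignment prologue shows up here.  */
  assert (crc32 (buf + 1, 32) == crc32_reference (buf + 1, 32));

  return 0;
}

Each assert fails at the specific step that is broken, so this also keeps the direct-debugging-in-gdb workflow you describe intact.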