I have implemented a tool for translation validation (similar to Alive2
for LLVM). The tool takes GIMPLE IR for two functions and checks that the
second is a refinement of the first. That is,
* The returned value is the same for both functions for all input that
does not invoke undefined behavior.
* The second does not have any undefined behavior that the first does not
have.
* The global memory is the same after invoking both functions with input
that does not invoke undefined behavior.
The implementation is rather limited, but I have already found a few bugs
in GCC (PR 106513, 106523, 106744, 106883, and 106884) by running the tool
on the c-c++-common, gcc.c-torture, gcc.dg, and gcc.misc-tests test
suites.
The tool is available at https://github.com/kristerw/pysmtgcc and there
is some additional information in the blog post
https://kristerw.github.io/2022/09/13/translation-validation/
/Krister