I am currently working on file system reliability issues. I have a disk driver that simulates the on-disk states ("crash sites") left behind by injected power failures. It was inspired by two OSDI'14 papers about crash sites, which found interesting bugs in many production systems such as databases. The simulated disk is compatible with the Linux block driver semantics (refer to https://www.kernel.org/doc/Documentation/block/writeback_cache_control.txt), and can produce many crash sites in which the pending blocks are only partially flushed to the disk.
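
To make "partially flushed" concrete, the following is a minimal Python sketch of a simplified crash model. It is not the actual driver. It assumes that every write issued before the last cache flush is durable, that any subset of the writes issued after it may have reached the disk, and that individual blocks are never torn. The function name crash_states and the ("write", block, data) / ("flush",) operation format are hypothetical, chosen only for illustration.

```python
from itertools import combinations


def crash_states(ops):
    """Enumerate possible disk images if power fails right after the last op.

    ops is a list of ("write", block, data) or ("flush",) tuples.
    Simplifying assumptions (not the real driver): writes before the last
    flush are durable; any subset of later writes may have been persisted;
    a single block write is atomic.
    """
    durable = {}   # block -> data known to be on stable storage
    pending = []   # writes issued since the last flush, in program order
    for op in ops:
        if op[0] == "flush":
            for block, data in pending:
                durable[block] = data
            pending = []
        else:
            _, block, data = op
            pending.append((block, data))

    states = set()
    for k in range(len(pending) + 1):
        # combinations() yields indices in increasing order, so the chosen
        # writes are applied in program order.
        for subset in combinations(range(len(pending)), k):
            image = dict(durable)
            for i in subset:
                block, data = pending[i]
                image[block] = data
            states.add(tuple(sorted(image.items())))
    return states


example_ops = [
    ("write", 0, "A"),
    ("flush",),
    ("write", 1, "B"),
    ("write", 2, "C"),
]
example_states = crash_states(example_ops)
for state in sorted(example_states):
    print(dict(state))
```

In this example block 0 is always present, while blocks 1 and 2 may each be present or missing independently, including the state where the later write (block 2) reached the disk but the earlier one (block 1) did not.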

Our tool finds that a typical compiler (e.g., gcc) can suffer from crash inconsistency. Specifically, after a crash there is a chance that, for a binary output file (e.g., a .o file):

1. its timestamp has been updated, so gmake considers the file up to date; and
2. its actual data has not been persisted to the disk.

On an ext4 filesystem (default settings) of a typical Linux distribution, we observed that a crash can leave a 0-byte output file whose timestamp is updated. In more relaxed settings (e.g., older filesystems), a system crash can leave a partially corrupted file with an updated timestamp (e.g., the header is correct but several blocks are missing).

Note that this is NOT a defect in gcc or gmake, as neither has anything to do with crash semantics. However, if the user resumes the incremental build after the system crash, gmake will consider the generated .o file up to date and proceed to the next stages, finally producing incorrect outputs.

This is not a software defect, and it is expected to be very rare in practice. Nevertheless, gmake is supposed to be general and to run on any platform. I am wondering whether we should make users aware of this phenomenon (e.g., by adding a section to the documentation).

Regards,
Yanyan Jiang 蒋炎岩
Institute of Computer Software, Dept. of Computer Science, Nanjing University

_______________________________________________
Bug-make mailing list
Bug-make@gnu.org
https://lists.gnu.org/mailman/listinfo/bug-make