On Mon, Mar 16, 2015 at 09:42:11PM -0700, Keith Lofstrom wrote:
> The most recent closed source Memtest86 v6.0 has a ramhammer test.
> Available for gratis download to a USB key.

Proprietary Memtest86 v6.0 will only load with UEFI boot, which I only have on my desktop compute server, not my older laptops. After 10 hours of testing with ramhammer on the server with 12GB of Patriot Viper 3 DDR3 non-ECC memory, no errors yet.

Many problems can make a memory more sensitive to the ramhammer exploit. DRAM stores a "1" or "0" as charge on a tiny capacitor in a bit cell. Address lines turn on a bit cell transistor that connects this capacitor to a precharged bit line (with much higher capacitance). This makes a small voltage change on the bit line, which a sense amplifier connected directly to the bit line amplifies back to a full 1 or 0. That provides a signal suitable for readout, and also refreshes the charge on the bit cell capacitor.

Throughout this analog amplification process, there are many opportunities for signal error. If the bit lines are not given enough time to settle from the previous operation to a perfectly neutral precharge level, then some of the previous signal will mix into the current read operation. If the power supplies feeding the RAM chip are out of specification (worn-out electrolytic capacitors can cause this), similar errors can occur. A computer system that is marginally functional under gentler conditions may generate many errors under extreme stress, and the first symptoms might be this bit leakage problem.

Since the bit cell transistors are slightly leaky, refresh is necessary, and most RAM designs specify refresh within 64 milliseconds or sooner. Random access refreshes most bit cells often, but an additional refresh counter in the RAM controller circuit makes sure of it by cycling through all the row addresses every 64 milliseconds, delaying operations if necessary. These refresh cycles can reduce performance by a small fraction of a percent, but mostly they happen when memory access is idle. Cache means more idle time, multicore access means less.

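To put rough numbers on the bit line charge sharing described above, here is a minimal Python sketch. The capacitor values and the half-supply precharge level are my own illustrative assumptions, not figures from any datasheet; only the 1.5 V nominal DDR3 supply is standard. It just applies charge conservation when the cell capacitor is connected to the bit line.

```python
def bitline_swing(v_cell, v_precharge, c_cell, c_bitline):
    """Voltage change on the bit line after the cell capacitor is connected.

    Charge is conserved, so the final voltage is the capacitance-weighted
    average of the cell voltage and the bit line precharge voltage.
    """
    v_final = (c_cell * v_cell + c_bitline * v_precharge) / (c_cell + c_bitline)
    return v_final - v_precharge


VDD = 1.5             # volts, nominal DDR3 supply
V_PRECHARGE = VDD / 2 # assumed "neutral" precharge level
C_CELL = 25e-15       # farads, assumed bit cell capacitance
C_BITLINE = 250e-15   # farads, assumed bit line capacitance

swing_one = bitline_swing(VDD, V_PRECHARGE, C_CELL, C_BITLINE)   # cell holds a full "1"
swing_zero = bitline_swing(0.0, V_PRECHARGE, C_CELL, C_BITLINE)  # cell holds a full "0"

print(f"reading a 1: {swing_one * 1000:+.1f} mV")
print(f"reading a 0: {swing_zero * 1000:+.1f} mV")
```

With these assumed values the sense amplifier has to resolve a swing of only a few tens of millivolts out of a 1.5 V supply, so any leftover precharge error, supply noise, or charge that has leaked off the cell comes straight out of that margin.
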
Multicore multithreaded operation makes ramhammer patterns difficult to create.

Bit cell leakage approximately doubles for every 10C of temperature increase. More airflow and cooler operation increase reliability, durability, and security. Noctua fans (from an Austrian company) move a lot of air relatively quietly.

Ramhammer vulnerability is probably rare and difficult to exploit. On the other hand, all so-called digital systems are actually analog systems with low but not zero levels of error. Stressing them (overclocking, heat, bad power, abnormal operation) increases errors exponentially. Error correction is cryptographically insecure.

Hard disks store analog signals as well, and use probabilistic Viterbi decoding filters to convert these (again tiny) signals into mostly reliable binary bits. But not perfectly reliable.

So, some entropy leaks into most "binary" processes, and with it physical and cryptographic vulnerability. As long as we insist on clocking von Neumann (shared memory) architectures at the ragged edge of failure, rather than using Harvard architectures (with separate memories for instructions and data) in low stress conditions, failure is a fact of life. Von Neumann is cheaper, as is using the same communication channels for data and control, but some of the money saved on hardware will be spent cleaning up after security breaches.

Cheap, fast, good. Pick two.

Keith

-- 
Keith Lofstrom [email protected]
