On 2016-10-17, Karel Gardas <[email protected]> wrote:
> 1) use machine with proper ECC support
> 2) man sendbug -- and following it report your OpenBSD kernel misbehavior
This can be a hard thing to report. When the machine totally locks up, it is very difficult to get the information needed to make a bug report; often it is not known exactly how to trigger it, or whether it's a software bug, a bit flip, or a hardware fault.

Sometimes you can get useful information from monitoring the machine in the run-up to a failure - symon (in ports) can be useful for logging things to a remote machine at an interval which is often fast enough to give clues into what might be happening. But unless you have a reproducible case, or something which happens randomly but fairly often, you can be watching for a long time and not really know exactly what to be monitoring.

On the other hand if you do have a *reproducible* way to trigger such a bug, that's of great interest.

> On Mon, Oct 17, 2016 at 3:48 PM, Tinker <[email protected]> wrote:
>> Sometimes a machine goes unresponsive. In this case, a non-ECC RAM machine.
>>
>> The reason could be that something in the hardware or kernel failed, e.g. a
>> bit flip error [1].
>>
>> In this case (for a non-kernel developer), tough luck, and the proper thing
>> would be to reboot, and keep statistics over failures on that machine and
>> replace the hardware should the crashes go above some frequency threshold.

If you're not running an up-to-date release, please do so: stefan@'s work on amap in the 5.9-6.0 timeframe certainly helps some cases - one of the post-6.0 errata might also apply with very large allocations, so 6.0-stable or -current would be advisable.
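
To illustrate the remote-logging idea mentioned above (sampling at a short interval and shipping the samples off the box, so the last few survive a lockup), here is a minimal sketch. It is not symon and does not use symon's configuration or protocol. The collector address, port, interval, line format and function names are all made up for the example, and it only samples the load average; a real setup would likely want memory, interrupt and disk figures as well.

```python
import os
import socket
import time

# Hypothetical collector address (TEST-NET range); replace with a real log host.
COLLECTOR = ("192.0.2.10", 5140)
INTERVAL = 5  # seconds between samples; an assumption, tune as needed


def format_sample(now=None):
    """Return one text line: UNIX timestamp plus 1/5/15-minute load averages."""
    if now is None:
        now = time.time()
    load1, load5, load15 = os.getloadavg()
    return f"{int(now)} load={load1:.2f},{load5:.2f},{load15:.2f}"


def send_sample(sock, addr, line):
    """Send one sample as a single UDP datagram; returns the bytes sent."""
    return sock.sendto(line.encode("ascii"), addr)


def monitor(addr=COLLECTOR, interval=INTERVAL, samples=None):
    """Send a sample every `interval` seconds; loop forever if samples is None."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sent = 0
    try:
        while samples is None or sent < samples:
            send_sample(sock, addr, format_sample())
            sent += 1
            time.sleep(interval)
    finally:
        sock.close()
    return sent
```

Calling `monitor()` on the suspect machine starts the loop; anything that records UDP datagrams on the collector side will do. UDP is used here so that a dead or slow collector never blocks the sender, at the cost of silently losing samples.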

