I am seeing the same behavior with the following setup while running quantum computations that generate large (50 GB+) scratch files:
Linux <nodename> 3.2.61-030261-generic #201407112035 SMP Sat Jul 12 00:36:16 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux
The problem occurs at random times, with similar kernel error messages, usually while writing (sometimes while reading) the large scratch files on the RAID array. I set the array up using the Intel Rapid Storage Technology ROM. I'm using Ubuntu 14.04 (I saw the same thing with 12.04), but with the 3.2 kernel. I'm running RAID10 with four 1 TB drives. The system runs off a separate disk; the RAID array is used only for the large scratch files. Are we running out of address space or buffers? I hope this info is useful to someone more knowledgeable about I/O than I am.
Jonathan