Looking at a test case provided by Bud Davis, I see that we're writing out (and reading back) each element of the array individually, which then gets blocked and written by the library in page-sized chunks:

write(3, "\0\0\0\0\0\0\0\0\1\0\0\0\2\0\0\0\3\0\0\0\4\0\0\0\5\0\0"..., 8192) = 8192
write(3, "\377\7\0\0\0\10\0\0\1\10\0\0\2\10\0\0\3\10\0\0\4\10\0\0"..., 8192) = 8192
...
mmap(NULL, 16384, PROT_READ|PROT_WRITE, MAP_SHARED, 3, 0) = 0x2000001a000
munmap(0x2000001a000, 16384) = 0
mmap(NULL, 16384, PROT_READ|PROT_WRITE, MAP_SHARED, 3, 0x4000) = 0x2000001a000
munmap(0x2000001a000, 16384) = 0

Surely we should be writing out (and reading back) the entire array (or at least columns) at a time.

I will also note in passing that mapping and unmapping memory is a relatively expensive operation. You only want to do it if you (1) actually require the sharing semantics or (2) can map enough to make the overhead pay off. The 16k sliding window seen here does not meet either requirement.

In the case of these unformatted reads, the most efficient thing is a read directly into the array and not fiddling with mmap at all.

--
Summary: Unformatted i/o on large arrays inefficient
Product: gcc
Version: 3.5.0
Status: UNCONFIRMED
Severity: enhancement
Priority: P2
Component: libfortran
AssignedTo: unassigned at gcc dot gnu dot org
ReportedBy: rth at gcc dot gnu dot org
CC: gcc-bugs at gcc dot gnu dot org

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=16339
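
As a rough illustration of the suggestion in the report above (transfer the whole array with plain read/write calls rather than through a small sliding mmap window), here is a minimal C sketch. It is not libgfortran code: `write_fully` and `read_fully` are hypothetical helper names, and the sketch ignores Fortran record markers and any buffering the library may need for other reasons.

```c
#include <errno.h>
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: write all `len` bytes of `buf` to `fd`.
   Retries on short writes and EINTR. Returns 0 on success, -1 on error. */
int write_fully(int fd, const void *buf, size_t len)
{
    const char *p = buf;

    while (len > 0) {
        ssize_t n = write(fd, p, len);
        if (n < 0) {
            if (errno == EINTR)
                continue;       /* interrupted before any data moved; retry */
            return -1;
        }
        p += n;
        len -= (size_t)n;
    }
    return 0;
}

/* Hypothetical helper: read exactly `len` bytes from `fd` into `buf`.
   The kernel copies straight into the caller's array, with no mmap
   window and no intermediate page-sized staging buffer.
   Returns 0 on success, -1 on error or premature end of file. */
int read_fully(int fd, void *buf, size_t len)
{
    char *p = buf;

    while (len > 0) {
        ssize_t n = read(fd, p, len);
        if (n < 0) {
            if (errno == EINTR)
                continue;
            return -1;
        }
        if (n == 0)
            return -1;          /* end of file before `len` bytes arrived */
        p += n;
        len -= (size_t)n;
    }
    return 0;
}
```

With helpers like these, an array of `n` elements could be transferred in one call, for example `read_fully(fd, array, n * sizeof array[0])`, so a large array typically costs a handful of system calls instead of one mmap/munmap pair per 16k.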