Alexander Puchmayr wrote:
> On Wednesday, 03 June 2009, Florian Philipp wrote:
>> Do you have a spare network adapter, maybe an older 100MBit PCI card?
>> Maybe we should rule out a hardware fault on your ethernet chipset first.
>>
> I already thought about this, but the results of my tests don't indicate a
> hardware fault on the ethernet chipset, because:
>
> * I can run a ping -f to the machine; it runs for hours without the
>   slightest problem.
> * As long as the files transferred are small enough (i.e. they fit in the
>   cache buffer on the server) and the server has enough time to write them
>   back to the disk, there is no problem.
> * If I explicitly force the ethernet link to be 100FD instead of gigabit,
>   there is also no problem. So I don't expect any error using another
>   100MBit card.

I would cross-check that anyway, just to be sure: another NIC, another kernel module, etc.

> To me it looks as if the following is happening:
>
> * Memory gets filled up with cached files, no problem so far.
> * If no more physical RAM is available, the system tries to free some
>   memory internally, e.g. by flushing the caches.
> * If releasing cache entries and writing data back to their respective
>   files is not fast enough, an internal memory allocation may not
>   succeed, and I see the "page allocation failure" messages, with different
>   processes/kernel threads in the first line.
> * I assume that most of the internal kernel threads don't have a problem in
>   this situation, but there may be some critical parts where they do. Hence,
>   it might just be a matter of probability whether it encounters such a
>   critical part, and the probability increases with the MB/s at which data
>   is written to the NFS server.

Hmm, I don't know ... but how would smaller and slower NFS servers run fine then? Sounds unlikely to me.

Any special network settings in use? Buffer sizes, MTU, jumbo frames? Switch problems? (You seem to have tried turning auto-negotiation off already.)

Stefan
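
One way to check the cache-pressure theory quoted above is to watch the page-cache counters on the server while a large NFS transfer runs. The following is only a minimal Python sketch, assuming a Linux server that exposes /proc/meminfo; the helper names parse_meminfo, cache_pressure, and watch are made up for illustration and are not part of any tool mentioned in this thread.

```python
import time


def parse_meminfo(text):
    """Parse /proc/meminfo-style text into a dict of field name -> integer value.

    Most fields are reported in kB; a few (e.g. HugePages_Total) are plain counts.
    """
    fields = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        if not sep:
            continue  # not a "Name: value" line
        parts = rest.split()
        if parts and parts[0].isdigit():
            fields[name.strip()] = int(parts[0])
    return fields


def cache_pressure(fields):
    """Pick out the counters relevant to the hypothesis (0 if a field is missing)."""
    keys = ("MemFree", "Cached", "Dirty", "Writeback")
    return {key: fields.get(key, 0) for key in keys}


def watch(samples=10, interval=1.0, path="/proc/meminfo"):
    """Print the selected counters once per interval (hypothetical helper)."""
    for _ in range(samples):
        with open(path) as handle:
            print(cache_pressure(parse_meminfo(handle.read())))
        time.sleep(interval)
```

Calling watch() on the server during a transfer would show whether MemFree drops toward zero while Dirty and Writeback climb shortly before the "page allocation failure" messages appear. If they do not, the theory becomes less likely and the network side deserves a closer look.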