Alexander Puchmayr wrote:
> On Wednesday, 3 June 2009, Florian Philipp wrote:
>> Do you have a spare network adapter, maybe an older 100MBit PCI card?
>> Maybe we should rule out a hardware fault on your ethernet chipset first.
>>
> I already thought of this, but the results of my tests don't indicate a 
> hardware fault on the ethernet chipset, because:
> 
> * I can run a ping -f to the machine, it runs for hours without the 
> slightest problem
> * As long as the files transferred are small enough (i.e. they fit in the 
> cache buffer on the server) and the server has enough time to write them 
> back to the disk, there is no problem
> * If I explicitly force the ethernet link to be 100FD instead of gigabit, 
> there is also no problem. So I don't expect any error using another 100MBit 
> card.

I would cross-check that anyway just to be sure. Other nic, other
kernel-module ... etc

> To me it looks as if the following is happening:
> 
> * Memory gets filled up with cached files, no problem so far
> * If no more physical ram is available, the system tries to free some memory 
> internally, e.g. by flushing the caches. 
> * If releasing cache entries and writing back data to their respective 
> files is not fast enough, an internal memory allocation may not 
> succeed, and I see the "page allocation failure" messages, with different 
> processes/kernel threads in the first line.
> * I assume that most of the internal kernel threads don't run into a problem 
> in this situation, but there may be some critical parts that do. Hence, it 
> might just be a matter of probability whether it hits such a critical 
> part, and the probability increases with the rate (MB/s) at which data is 
> written to the NFS server.

Errm, I dunno ... but then how would smaller and slower NFS servers run
fine? Sounds unlikely to me.
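
One way to check that theory might be to watch free memory and the
dirty/writeback counters on the server while a large transfer is running.
A rough sketch, assuming a Linux /proc/meminfo; the function names
(parse_meminfo, watch) are made up for illustration:

```python
import time


def parse_meminfo(text, keys=("MemFree", "Dirty", "Writeback")):
    """Extract selected counters (in kB) from /proc/meminfo-style text."""
    values = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        if sep and name in keys:
            fields = rest.split()
            if fields:
                # The first field is the number, the optional second is "kB".
                values[name] = int(fields[0])
    return values


def watch(count=60, interval=1.0, path="/proc/meminfo"):
    """Print the selected counters `count` times, `interval` seconds apart."""
    for _ in range(count):
        with open(path) as f:
            print(parse_meminfo(f.read()))
        time.sleep(interval)
```

Calling watch() on the server during a copy would show whether MemFree
really drops close to zero while Dirty and Writeback pile up right before
the "page allocation failure" messages appear.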

Any special network settings in use? Buffer sizes, MTU, jumbo frames?
Switch problems (you seem to have tried turning auto-negotiation off already)?
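
To collect those values in one go, something like the following could be
run on both client and server. It is only a sketch, assuming the usual
Linux locations under /sys/class/net and /proc/sys/net/core; the function
name read_net_settings and the interface name "eth0" are placeholders:

```python
from pathlib import Path


def read_net_settings(iface, sys_root="/sys", proc_root="/proc"):
    """Return the MTU and core socket buffer limits; None where unreadable."""
    paths = {
        "mtu": Path(sys_root) / "class/net" / iface / "mtu",
        "rmem_max": Path(proc_root) / "sys/net/core/rmem_max",
        "wmem_max": Path(proc_root) / "sys/net/core/wmem_max",
    }
    settings = {}
    for name, path in paths.items():
        try:
            settings[name] = int(path.read_text().strip())
        except (OSError, ValueError):
            # Missing file or unexpected content: report it as unknown.
            settings[name] = None
    return settings


if __name__ == "__main__":
    print(read_net_settings("eth0"))  # "eth0" is an assumed interface name
```

An MTU above 1500 would mean jumbo frames are enabled on that interface,
in which case the switch and the other end have to agree on it.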

Stefan
