Hi folks, this message is rather lengthy. If you don't feel like reading all of it, please don't bother to answer: you'll need the whole lot to get the picture. ;-)

I have run into a weird network problem with 1Gb NICs. It involves these two boxes:

Box A: P4 2.8GHz HT, 512MB RAM, Tigon Gb NIC (module tg3), IDE drives
Box B: Xeon 2.6GHz HT, 512MB RAM, Intel Pro/1000 Gb NIC (module e1000), SCSI RAID5

The two of them are connected by a cross-over cable, so nothing else is on that network. It is effectively a point-to-point connection.

Both boxes are running *exactly* the same gentoo software. I emerged it on one box, tarred it up, copied it over to the other one and made the config changes like IP addresses, names and such. Kernel is 2.6.12-gentoo-r6. Of course, box B loads the SCSI modules.

All file transfers I am talking about are done with a file "all.tar.bz2" of 1088MB. Both boxes are otherwise idle. Neither box runs services like FTP or HTTP, so I have to resort to other protocols to transfer files. Both do run NFS and SSH.

Case 1: I log into A, NFS mount B's /tmp on A's /mnt/floppy and cd to /tmp. "cp /mnt/floppy/all.tar.bz2 ." (receiving on A) as well as "cp all.tar.bz2 /mnt/floppy" (sending from A) result in a sustained transfer rate of 2xMB/s. That's to be expected because it involves an IDE drive on A, and that's about the limit of current IDE drives (though 1Gb NICs can transfer data at about 4 to 5 times that rate). It also confirms that both Gb NICs are working, though it doesn't confirm they are getting near their theoretical limits (which is unimportant in this case).

Case 2: I log into A and sftp into B. "get all.tar.bz2" (receiving on A) transfers the file at 2xMB/s, same as in case 1. CPU utilisation is up to 40-50% due to encryption. Still, encryption does not slow down the transfer by any significant amount. This can be expected with the CPUs involved.

Case 3: I log into A and sftp into B. "put all.tar.bz2" (sending from A) transfers the file at 3.7MB/s!!!!! This is far slower than on a 100baseT network, where I get transfer rates of about 10MB/s with the network being the bottleneck rather than the hard disks.
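
To put the case 3 figure in perspective, here is a rough back-of-the-envelope sketch of how long the 1088MB file takes at the rates mentioned. The 125MB/s entry is an assumption: it is simply the raw 1Gb/s line rate divided by 8, ignoring all protocol overhead. The "2xMB/s" rate is left out because no exact value is given.

```python
# Back-of-the-envelope transfer times for the 1088MB test file.
FILE_SIZE_MB = 1088  # size of all.tar.bz2, as stated in the post

# Sustained rates in MB/s. The first two are figures reported in the post;
# the third is an assumed upper bound (1000 Mbit/s / 8, no protocol overhead).
RATES_MB_PER_S = {
    "case 3, sftp put over the Gb link": 3.7,
    "100baseT network (reported)": 10.0,
    "raw 1Gb/s line rate (assumed upper bound)": 125.0,
}


def transfer_seconds(size_mb: float, rate_mb_per_s: float) -> float:
    """Return the time in seconds needed to move size_mb at a sustained rate."""
    return size_mb / rate_mb_per_s


for label, rate in RATES_MB_PER_S.items():
    secs = transfer_seconds(FILE_SIZE_MB, rate)
    print(f"{label}: {secs:.0f} s ({secs / 60:.1f} min)")
```

In other words, at 3.7MB/s the file needs close to five minutes, while the same file at 10MB/s needs under two.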
CPU utilisation is down to about 10%, indicating that something other than encryption is throttling the transfer. This is odd!

Case 4: I log into B and try to NFS mount A's /tmp on B's /mnt/floppy. It returns with an RPC timeout, so I can't do the "cp" test from B.

Case 5: I log into B and sftp into A. It sits there for about 10 seconds before presenting me with a password prompt. ???? After that, I get transfer rates close to those in case 2 and case 3, just the other way round.

I am puzzled. At first I thought that the Gb NIC on box A was somehow kaput, but case 1 surely shows it is working. What the heck is going on here?

I would be deeply indebted to any person on this list who could shed some light on this. Any hint about what to investigate would be highly appreciated. Really. This has troubled me for the last three days and I would go as far as shipping you a Windhoek Lager. ;-)

Uwe

--
95% of all programmers rate themselves among the top 5% of all software developers. - Linus Torvalds

http://www.uwix.iway.na (last updated: 20.06.2004)
--
gentoo-user@gentoo.org mailing list