Joe Landman wrote:

(responses embedded)
Jeff Johnson wrote:
Joe,

   I think you may be dealing with a PCIe fifo issue.

Hi Jeff:

  Possibly.  I had thought about that.  I was thinking more along the
lines of "it is a motherboard NIC, so we don't need no steenkeen high
performance things like 64 bit buffers ..."

The controller in question is a LAN-on-motherboard (LOM) component, but not quite a "desktop grade" chip; it is a server class controller. IMHO, Intel and the other silicon spinners haven't quite grasped the difference between "enterprise grade" and "HPC grade". I think it is highly likely that this chip can cut your storage cluster mustard, but the driver and I/O options may not run well for your application without some Kentucky windage.
   I have seen issues with the Intel PCIe gigabit ethernet onboard parts
when compared to PCIe slot cards and PCIX cards like the ones you are
testing. Specifically, issues with the partitioning of the controller's
buffers between rcv and xmit operations (internal to the controller chip
itself), and with the controller's relationship to the PCIe buffer on
the northbridge. PCIe, being serial, has different challenges when
reaching the top end of a device's performance capabilities. In this
case you are suffering some buffer throttling.

I played with some (OS/NIC) buffer settings, txqueuelen, and a few other
tunables.  Nothing seemed to make a difference.
The way the Intel controller and the e1000 driver interact is that the driver sets up the rcv buffer at initialization time and the *remainder* is left for xmit. This is not something that can be adjusted with ethtool or a module load option. You have to get into the e1000 driver source, find the rcv buffer size definition, and change it to suit your evil needs. Recompile and enjoy. Here is where the Kentucky windage comes in, as you may have to try a few values. Lather, rinse, repeat until you get it right.
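
For what it's worth, here is a minimal sketch of the kind of edit I mean, assuming the e1000_reset() path in e1000_main.c where the driver programs the controller's PBA (Packet Buffer Allocation) register. The constant names, stock values, and exact location vary by driver version and MAC type, so treat this as illustrative rather than gospel:

    /* Sketch of the relevant spot in e1000_main.c -- names, values,
     * and the exact location vary by driver version and MAC type. */
    u32 pba;                        /* Packet Buffer Allocation, in KB */

    switch (hw->mac_type) {
    /* ... other MAC types elided ... */
    case e1000_82573:
        pba = E1000_PBA_12K;        /* stock rcv share; check your
                                       driver's table for the real value */
        break;
    }

    /* PBA is the slice of the on-chip packet buffer handed to the rcv
     * FIFO; the controller gives whatever is left to xmit.  Raising it
     * favors receive-heavy traffic, lowering it favors transmit.  A
     * hypothetical retune favoring rcv: pba = E1000_PBA_16K; */
    E1000_WRITE_REG(hw, PBA, pba);  /* program the partition */

Bump it one step at a time and rerun your benchmark between builds; big jumps make it hard to tell which direction is actually helping.
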
   By default the buffers are partitioned for a "one size fits most"
scenario. If you know your I/O profile you can use ethtool (or modify
the e1000 driver source) to repartition the controller's FIFO to favor
rcv or xmit operations. This results in better performance in
situations where you know you will have heavier writes than reads, or
vice versa.

Yeah ...

Of course, without knowing your workload in advance you can't really
tune this.

Aside from that, I can't say I have seen many people tune their storage
clusters for workloads of one particular type.  You basically never know
what users will throw your way, and you really don't want one "corner
case" test driving down overall performance.
Unless you are building a generic-use resource, it is possible to figure out whether the environment favors reads over writes, etc. You don't have to be exact. Right now you are dealing with a 50/50 split of your ethernet and PCIe rcv/xmit buffer resources. Moving to 60/40 in favor of one direction can be enough to stop you from exhausting your buffer resources and hitting the slowdown.
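
To put rough numbers on it (the 32 KB total below is a figure I picked for illustration, not this controller's actual spec), the repartition math is just:

    /* Back-of-the-envelope repartition math.  The 32 KB total is a
     * number made up for illustration, not any controller's spec. */
    #include <stdio.h>

    int main(void)
    {
        unsigned total_kb = 32;                /* assumed on-chip buffer */
        unsigned rx_kb = total_kb * 60 / 100;  /* 60% to rcv  -> 19 KB */
        unsigned tx_kb = total_kb - rx_kb;     /* rest to xmit -> 13 KB */

        printf("rcv %u KB / xmit %u KB\n", rx_kb, tx_kb);
        return 0;
    }

The absolute sizes matter less than where the headroom goes: the direction that is dropping packets is the one that needs the extra few KB.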

I could bury an 8 node mpich run of Pallas on the 82573 (first gen Intel gigabit PCIe LOM) until I monkeyed with the buffer settings. Running Pallas, even at very small message sizes, the buffers were getting buried so badly that it wasn't a matter of slowdown but of rapidly incrementing dropped packet counts.

Any chance you are running jumbo frames? If so, turn them off and retest. A single jumbo frame eats a much bigger chunk of a small rcv partition than a standard frame does, so jumbo frames will exhaust the buffers that much faster.

Also, use the e1000.sourceforge.net driver. If you are using a driver from Intel or a distro, ditch it.

One of the comments in your original message is key: PCIX works, PCIe is slower. With PCIe being serial, you have both the ethernet buffering and the PCIe buffering to contend with. A first generation x1 PCIe link carries 2.5 Gb/s per direction, about 250 MB/s after 8b/10b encoding, so a gigabit NIC running flat out in both directions is using a healthy fraction of that once you add descriptor and protocol overhead.

   *OR* it is because you are using a Supermicro motherboard..  =)

Owie ... that left a mark ...
Try deploying a 256 node cluster with a motherboard defect that the vendor wouldn't acknowledge. That leaves a mark too, along with some hefty bar tabs.
I thought it was that I hadn't given the appropriate HPC deity their
burnt (processor) offering ...
The gods prefer FBDIMMs these days ...


--
Best Regards,

Jeff Johnson
Vice President
Engineering/Technology
Western Scientific, Inc
[EMAIL PROTECTED]
http://www.wsm.com

5444 Napa Street - San Diego, CA 92110
Tel 800.443.6699  +001.619.220.6580
Fax +001.619.220.6590

"Braccae tuae aperiuntur"
