http://www.hpcwire.com/hpc/1805360.html
http://www.pnl.gov/topstory.asp?id=275
4620 2.2 GHz quad-core Barcelona, presumably 2-socket nodes, 2GB/core
with a pretty aggressive I/O setup.
if anyone involved in this cluster is reading the list,
it would be most appreciated to see some comments re:
On Sat, 29 Sep 2007, Ivan Paganini wrote:
> I sniffed the network on the store nodes' interface, and I got lots
> of TCP lost fragment, previous lost fragments, ack lost fragments
> and TCP window size full.
Some suggestions would be to check that all network interfaces are
negotiating gigabit bac
and ran using
mpich.ch_mx -v -machinefile list -np 4 ./program
This still involves ethernet?
I think that would work fine. you can simply run tcpdump
on the eth interface of one of the target machines to test, though.
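
If tcpdump is not convenient, a rougher check is to compare the kernel's per-interface packet counters before and after the MPI run. The following is only a sketch, assuming Linux nodes that expose /proc/net/dev; the function names and the interface name eth0 are hypothetical, not taken from the thread.

```python
def parse_net_dev(text):
    """Return {iface: (rx_packets, tx_packets)} from /proc/net/dev-style text."""
    counters = {}
    for line in text.splitlines():
        if ":" not in line:
            continue  # the two header lines contain no colon
        name, data = line.split(":", 1)
        fields = data.split()
        if len(fields) < 10:
            continue  # not a counter line we understand
        # Field 1 is received packets, field 9 is transmitted packets.
        counters[name.strip()] = (int(fields[1]), int(fields[9]))
    return counters


def read_counters(path="/proc/net/dev"):
    """Snapshot the counters on a live node (assumes a Linux /proc)."""
    with open(path) as handle:
        return parse_net_dev(handle.read())


def packet_delta(before, after, iface):
    """Return (rx, tx) packets seen on iface between two snapshots."""
    rx0, tx0 = before[iface]
    rx1, tx1 = after[iface]
    return rx1 - rx0, tx1 - tx0
```

Take one snapshot with read_counters(), run the job, take a second, and call packet_delta(before, after, "eth0"). A small delta is normal background traffic; a delta that scales with message volume would suggest the run is still going over ethernet.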
my experience is that it's naive to assume a vendor has a clue:
_someone_ at t
Hello Mark!
2007/9/29, Mark Hahn <[EMAIL PROTECTED]>:
> > I sniffed the network on the store nodes' interface, and I got lots of
> > TCP lost fragment, previous lost fragments, ack lost fragments and TCP
> > window size full. The GPFS is now heavily used.
>
> so this indicates that you have a seriou
I sniffed the network on the store nodes' interface, and I got lots of
TCP lost fragment, previous lost fragments, ack lost fragments and TCP
window size full. The GPFS is now heavily used.
so this indicates that you have a serious ethernet problem, no?
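
One way to put a number on this, sketched under the assumption that the store nodes run Linux and expose /proc/net/snmp, is to look at the fraction of TCP segments that were retransmitted. The function names below are hypothetical, and what counts as "too high" is a judgment call rather than a fixed threshold.

```python
def parse_tcp_counters(text):
    """Return the Tcp counters from /proc/net/snmp-style text as a dict."""
    rows = [line.split() for line in text.splitlines()
            if line.startswith("Tcp:")]
    if len(rows) < 2:
        raise ValueError("no Tcp section found")
    # First Tcp: line holds the field names, second holds the values.
    names, values = rows[0][1:], rows[1][1:]
    return {name: int(value) for name, value in zip(names, values)}


def retransmit_ratio(counters):
    """Fraction of sent segments that were retransmissions."""
    sent = counters["OutSegs"]
    return counters["RetransSegs"] / sent if sent else 0.0
```

Reading /proc/net/snmp on a store node, passing the text to parse_tcp_counters(), and comparing retransmit_ratio() across nodes or across time should show whether the losses seen in the capture are isolated or systemic.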
The myrinet connection was working right,
Thank you, Bruce, I will try as soon as I have access to the cluster.
I already contacted Myricom support, John, and they are working to try
to solve this, but still no solution to the problem. mx_counters on
the two nodes where I am running the test mpich programs doesn't show
anything unusual:
1 ports
On Fri, 2007-09-28 at 17:43 -0300, Ivan Paganini wrote:
> Hello everybody,
>
> I am beginning to take care of an IBM JS21. The cluster consists of
> The myrinet connection was working right, but sometimes a user program
> just got stuck - one of the processes was sleeping, and all others
> were