Re: [Beowulf] Correct networking solution for 16-core nodes

2006-08-01 Thread Mark Hahn
With your previous suggestions 8 months ago we bought a Tyan S4881 server with 8 dual-core Opteron CPUs and 64GB RAM. Now we will buy new ones (2 more for the time being), and we are eventually planning to form a cluster from these servers, which will have at most 8 boxes. Now, as you guess, the

Re: [Beowulf] Re: Feedback on large pages in Linux ([EMAIL PROTECTED])

2006-08-01 Thread Eric Thibodeau
On Sunday, 30 July 2006 at 00:53, Robert G. Brown wrote: [...snip...] > Y'know, except for actually going to a classroom and teaching, I could > do most of the rest of what I do sitting right where I'm sitting now, in > between trips down to the lake in search of dinner... > > rgb We all k

Re: [Beowulf] scheduler and perl

2006-08-01 Thread Chris Dagdigian
As Joe mentioned, the way we handle this is by using cluster schedulers sitting on robust hardware platforms that are capable of handling large numbers of job submissions without problems. Grid Engine and Platform LSF are two capable products that come to mind and scale well. The fact that

Re: [Beowulf] scheduler and perl

2006-08-01 Thread Stuart Midgley
Rather than banning them or publishing behaviour, we simply disable their access to the queue and won't reinstate it until they have contacted us to indicate they have rectified their ways. E.g., rather than enforce hard disk quotas, which may result in a job losing data, we simply detect w

Re: [Beowulf] scheduler and perl

2006-08-01 Thread Joe Landman
Diego M. Vadell wrote: Maybe you can collect some logs and make a list of misbehaving users. Then you can warn the users that you are collecting that information, that bypassing the system will degrade the performance for everybody, and that you may post that information. Sometimes the perspec

Re: [Beowulf] scheduler and perl

2006-08-01 Thread Diego M. Vadell
> Hi Jerry: >> the other example is that use system call and ssh >> to each node and run stuff and bypass the scheduler... Torque 2.1.2 has just been released. It comes with a PAM module that, if I understood it right, makes it harder (though not impossible) for users to bypass the batch system. The

RE: [Beowulf] scheduler and perl

2006-08-01 Thread Xu, Jerry
Hi, Thanks, Joe. I don't mean to "ban" anything immediately; I am just curious how often this happens in the HPC community. Perl/shell is a really strong tool; one example is to use a loop to submit a huge number of jobs, which puts a burden on the scheduler server; the other example is to have one job sit idl

[Beowulf] what is the "best" file system

2006-08-01 Thread Xu, Jerry
Hi, all: We are going to build a new cluster and we are thinking about various solutions for file I/O. We have lots of embarrassingly parallel applications that read huge amounts of data (gigabytes) only at the beginning and write only at the end, concurrently, and each node runs its job independently. What is the bes

Re: [Beowulf] scheduler and perl

2006-08-01 Thread Joe Landman
Hi Jerry: Xu, Jerry wrote: Hi, Thanks, Joe. I don't mean to "ban" anything immediately; I am just curious how often this happens in the HPC community. Perl/shell is a really strong tool; one example is to use a loop to submit a huge number of jobs, which puts a burden on the scheduler server, That's what t

Re: [Beowulf] scheduler and perl

2006-08-01 Thread Joe Landman
Hi Jerry: It's generally a good idea to talk to your users, understand what it is they are doing, and see if you can help them, rather than simply "banning" things. Bans of deeply embedded practices usually result in some ... exciting ... meetings, emails, and telephone calls

[Beowulf] scheduler and perl

2006-08-01 Thread Xu, Jerry
Hi, I am maintaining a cluster where lots of users use Perl to submit tons of jobs, which seems to me like abusing the system. Has anybody else met the same situation? Many users use a system call in Perl to do "qsub"; shall I ban this? I don't know exactly why it is bad, but it looks to me really bad

[Beowulf] Oscar & MPICH2

2006-08-01 Thread Reza Mirani
Dear Sir, I want to use a version of the Oscar software that includes MPICH2, or upgrade my Oscar MPICH to MPICH2, but I don't know how. Please give me some help with upgrading the Oscar libraries, or point me to an Oscar version that includes MPICH2. Best regards, Reza Mirani, Electronic department, Iran A

[Beowulf] Correct networking solution for 16-core nodes

2006-08-01 Thread Tahir Malas
Hi All, With your previous suggestions 8 months ago we bought a Tyan S4881 server with 8 dual-core Opteron CPUs and 64GB RAM. Now we will buy new ones (2 more for the time being), and we are eventually planning to form a cluster from these servers, which will have at most 8 boxes. Now, as you gues