Hi Jerry:

Xu, Jerry wrote:
Hi, thanks, Joe.
I am not meaning to "ban" anything immediately; I am just curious how often
this happens in the HPC community.
Perl/shell is a really strong tool. One example is using a loop to submit a
huge number of jobs, which puts a burden on the scheduler server;

That's what the scheduler is for, though. Some can't handle large loads of jobs very well. We have had no trouble with users/customers dumping thousands of jobs into LSF and SGE; other schedulers may or may not handle this well. I have had conversations with some folks who believe that one should never have more than 50 or so jobs in the queue at any one time. I don't agree with that, but they indicated that their queuing system breaks if they try.
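If a site's scheduler really does struggle under a tight submit loop, one way to soften the load without banning Perl/shell is to throttle submissions in batches. Below is a minimal Python sketch; `submit_fn` is a hypothetical placeholder for whatever actually queues a job (e.g., a wrapper around qsub or bsub), not a real scheduler API.

```python
import time

def submit_in_batches(job_cmds, submit_fn, batch_size=50, pause_s=2.0):
    """Submit jobs in batches, pausing between batches so the
    scheduler daemon isn't hammered by one tight loop.

    submit_fn is hypothetical: whatever queues one job at your site,
    e.g. a wrapper around qsub or bsub."""
    submitted = 0
    for i in range(0, len(job_cmds), batch_size):
        for cmd in job_cmds[i:i + batch_size]:
            submit_fn(cmd)
            submitted += 1
        if i + batch_size < len(job_cmds):
            time.sleep(pause_s)  # give the scheduler a breather
    return submitted
```

The batch size and pause are knobs to tune against what your particular queuing system tolerates.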

the other example is to have
one job sit idle and frequently use system calls to detect the job status and
resubmit jobs again and again;

Depends upon whether or not it runs in a scheduler bubble. That is, under your scheduler, do your node allocations; then have your server thread handle distribution to client threads on the allocated nodes. This is fine if they implement it well. mpiBLAST is a variant of this, using MPI and an internal scheduler. You can run it nicely inside an existing, larger resource manager.
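Where a workflow genuinely must poll job status from the outside, the load on the scheduler can be cut sharply by backing off exponentially instead of polling in a tight loop. A minimal sketch, assuming a hypothetical `check_fn` (e.g., something that parses qstat output and returns True when the job is done):

```python
import time

def wait_for_job(check_fn, initial_s=1.0, max_s=300.0, factor=2.0,
                 sleep_fn=time.sleep):
    """Poll check_fn with exponential backoff until it returns True.

    check_fn is hypothetical: e.g. a wrapper that asks the scheduler
    whether the job finished. Returns the number of polls made."""
    polls = 0
    delay = initial_s
    while True:
        polls += 1
        if check_fn():
            return polls
        sleep_fn(delay)
        delay = min(delay * factor, max_s)  # back off, capped at max_s
```

With a one-second initial delay and a five-minute cap, a long-running job costs a handful of status queries instead of thousands.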

the other example is using system calls and ssh
into each node to run stuff, bypassing the scheduler...

Ok, this one isn't good. You should see if they can be persuaded to work within the job scheduler via some method. Otherwise it can be painful.
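One way to nudge such users back inside the scheduler is to show that each "ssh node command" can become a submission that targets that node, so the resource manager still tracks the work. A minimal sketch assuming SGE-style qsub flags (the hostname resource request syntax varies by scheduler, so treat this as illustrative only):

```python
def scheduler_cmd_for(node, command):
    """Build a qsub argv that runs `command` on a specific node
    through the scheduler, instead of bypassing it with ssh.

    -b y marks the command as a binary job; the -l hostname=
    resource request is SGE-style and may differ elsewhere."""
    return ["qsub", "-b", "y", "-l", f"hostname={node}", command]
```

Once the work flows through the scheduler, it shows up in accounting and can't collide with other users' allocations on that node.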

It just drives me crazy
sometimes.

Understood.

--
Joseph Landman, Ph.D
Founder and CEO
Scalable Informatics LLC,
email: [EMAIL PROTECTED]
web  : http://www.scalableinformatics.com
phone: +1 734 786 8423
fax  : +1 734 786 8452
cell : +1 734 612 4615
_______________________________________________
Beowulf mailing list, Beowulf@beowulf.org
To change your subscription (digest mode or unsubscribe) visit 
http://www.beowulf.org/mailman/listinfo/beowulf
