Hi Jerry:
Xu, Jerry wrote:
> Hi, thanks, Joe.
> I am not meaning to "ban" anything immediately; I am just curious how
> often this happens in the HPC community.
> Perl/shell are really strong tools. One example is using a loop to
> submit a huge number of jobs, which puts a burden on the scheduler
> server.
That's what the scheduler is for, though. Some can't handle large loads
of jobs very well. We have had no trouble with users/customers dumping
thousands of jobs into LSF and SGE. Other schedulers may or may not be
able to handle this well. I have had conversations with some folks who
believe that one should never have more than 50 or so jobs in queue at
any one time. I don't agree with that, but they indicated that their
queuing system broke when they tried.
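If the scheduler does struggle, the cheapest fix is usually its own
batching mechanism. As a minimal sketch, assuming an SGE-style qsub and
a hypothetical worker script run_task.sh, an array job replaces a
thousand separate submissions with one:

    # Naive loop: a thousand separate submissions, each one a round
    # trip to the scheduler server
    for i in $(seq 1 1000); do
        qsub run_task.sh "$i"
    done

    # Array job: one submission; the scheduler tracks the 1000 tasks
    # itself
    qsub -t 1-1000 run_task.sh

Inside run_task.sh, $SGE_TASK_ID tells each task which piece of the
work to take; LSF has the same idea with bsub job arrays and
$LSB_JOBINDEX.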
> The other example is to have one job sit idle, frequently use system
> calls to detect the job status, and resubmit jobs again and again.
Depends upon whether or not it runs in a scheduler bubble. That is, do
your node allocations under the scheduler, then have your server thread
handle distribution to client threads on the allocated nodes. This is
fine if they implement it well. mpiBLAST is a variant of this, using
MPI and an internal scheduler. You can run it nicely inside an existing
larger resource manager.
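Even when the watch-and-resubmit logic has to stay outside the
allocation, the polling itself can be made polite. A minimal sketch,
assuming an SGE-style qstat (which returns nonzero once the job has
left the system) and a hypothetical job id in $JOBID:

    # Poll job status with exponential backoff rather than a tight loop
    delay=5
    while qstat -j "$JOBID" > /dev/null 2>&1; do
        sleep "$delay"
        delay=$(( delay * 2 ))               # back off each iteration
        [ "$delay" -gt 600 ] && delay=600    # capped at ten minutes
    done
    echo "job $JOBID has left the queue"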
> The other example is to use system calls and ssh to reach each node
> and run stuff, bypassing the scheduler...
Ok, this one isn't good. You should see if they can be persuaded to
work within the job scheduler via some method. Otherwise it can be painful.
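The persuasion is often easier than it sounds, since the change is
mostly mechanical: wherever the script sshes into a hand-picked node,
submit the same command as a job instead. A minimal sketch, again
assuming an SGE-style qsub; work.sh and the node names are stand-ins:

    # Before: hand-picked nodes, invisible to the scheduler
    for host in node01 node02 node03; do
        ssh "$host" /path/to/work.sh &
    done
    wait

    # After: same work submitted as jobs; the scheduler picks the
    # nodes, enforces limits, and accounts for the usage
    for i in 1 2 3; do
        qsub -b y /path/to/work.sh
    done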
> It just drives me crazy sometimes.
Understood.
--
Joseph Landman, Ph.D
Founder and CEO
Scalable Informatics LLC,
email: [EMAIL PROTECTED]
web : http://www.scalableinformatics.com
phone: +1 734 786 8423
fax : +1 734 786 8452
cell : +1 734 612 4615
_______________________________________________
Beowulf mailing list, Beowulf@beowulf.org
To change your subscription (digest mode or unsubscribe) visit
http://www.beowulf.org/mailman/listinfo/beowulf