Management sometimes reads these and says “why do we keep getting these
requests, just install everything.”
Sounds like you need to sit down with management and talk to them, too.
In general "just install everything" is not a good idea.
Our node images are in RAM, generally, so putting a bunch of extra stuff in
them for no good reason does have an impact, although it’s less than it was
when installed memory was lower, and it also all needs to be transferred on a
cold start.
All the more reason to restrict what software is installed on the
compute nodes. The Blue Gene/L system I maintained when I was at RU
(like all Blue Genes) used a minimal OS that was resident in RAM only.
As a result, all applications had to be cross-compiled on the login node
and linked statically, since the CNK (Compute Node Kernel, the name of
the compute node OS) didn't include all the dynamic libraries (I also
think the OS didn't even provide a dynamic linker!). For the Blue
Gene/Q, they did start supporting dynamically linked executables, but I
don't know what changed in the OS to allow that.
If you're going with a diskless OS like that, I think you need to be
very sparing in what you include in your image. If management wants you
to 'install everything' on the compute nodes, I think you'll need to
switch to a disk-based OS to keep your sanity.
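
To put rough numbers on that trade-off, here is a back-of-the-envelope
sketch. Everything in it is an assumption for illustration: the function
name image_cost, the image sizes, node RAM, node count and link speed are
all made up, and it models the worst case of a single boot server pushing
a full copy of the image to every node over one link, with no multicast
or caching.

```python
def image_cost(image_gib, node_ram_gib, nodes, link_gbit_per_s):
    """Estimate the cost of a RAM-resident node image.

    Returns a tuple of:
      - the fraction of each node's RAM occupied by the image
      - the seconds needed to send one full copy to every node over a
        single shared link (worst case: no multicast, no caching)
    """
    ram_fraction = image_gib / node_ram_gib
    # GiB -> bits, multiplied by the number of copies to transfer
    total_bits = image_gib * 2**30 * 8 * nodes
    seconds = total_bits / (link_gbit_per_s * 1e9)
    return ram_fraction, seconds


if __name__ == "__main__":
    # Hypothetical figures: a 2 GiB minimal image vs. a 20 GiB
    # "install everything" image, 128 GiB nodes, 500 nodes, 10 Gbit/s.
    for label, size in (("minimal", 2), ("everything", 20)):
        frac, secs = image_cost(size, node_ram_gib=128, nodes=500,
                                link_gbit_per_s=10)
        print(f"{label:>10}: {frac:.1%} of node RAM, "
              f"~{secs / 60:.0f} min to push to all nodes")
```

Both costs scale linearly with image size in this model, so a tenfold
larger image takes ten times the share of each node's RAM and ten times
as long to push out on a cold start. Plug in your own numbers.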
Prentice
On 10/23/2018 02:35 PM, Ryan Novosielski wrote:
On Oct 23, 2018, at 2:10 PM, Greg Lindahl <lind...@pbm.com> wrote:
On Tue, Oct 23, 2018 at 05:48:00PM +0000, Ryan Novosielski wrote:
We’re getting some complaints that there’s not enough stuff in the
compute node images, and that we should just boot compute nodes to
the login node image
It's probably worth your while sitting down with your users and
learning how they want to use the tool, instead of telling them.
In general, good advice; not totally applicable here. An example would be that
we’re frequently asked by users to install things that are already installed,
just in a different way (via modules, or whatever else) than the user is used to
or than the steps in their documentation describe (e.g., “Can you please run the
following: yum install whatever. I tried but I don’t have root access.”).
Management sometimes reads these and says “why do we keep getting these
requests, just install everything.” We also have another group that works more
closely with users; they have identified cases where “it works on the login
node” and often extrapolate a solution without relaying the problem (or
understanding the architecture). We’ll get to the bottom of that, but I wanted
to know more generally what sites are doing.
Our node images are in RAM, generally, so putting a bunch of extra stuff in
them for no good reason does have an impact, although it’s less than it was
when installed memory was lower, and it also all needs to be transferred on a
cold start.
--
____
|| \\UTGERS, |---------------------------*O*---------------------------
||_// the State | Ryan Novosielski - novos...@rutgers.edu
|| \\ University | Sr. Technologist - 973/972.0922 (2x0922) ~*~ RBHS Campus
|| \\ of NJ | Office of Advanced Research Computing - MSB C630, Newark
`'
_______________________________________________
Beowulf mailing list, Beowulf@beowulf.org sponsored by Penguin Computing
To change your subscription (digest mode or unsubscribe) visit
http://www.beowulf.org/mailman/listinfo/beowulf