Management sometimes reads these and says “why do we keep getting these 
requests, just install everything.”

Sounds like you need to sit down with management and talk to them, too. In general, "just install everything" is not a good idea.

Our node images are in RAM, generally, so putting a bunch of extra stuff in 
them for no good reason does have an impact, although it’s less than it was 
when installed memory was lower, and it also all needs to be transferred on a 
cold start.

All the more reason to restrict what software is installed on the compute nodes. The Blue Gene/L system I maintained when I was at RU (like all Blue Genes) used a minimal OS that was resident in RAM only. As a result, all applications had to be cross-compiled on the login node and linked statically, since the CNK (compute node kernel - the name of the compute node OS) didn't include all the dynamic libraries (I also think the OS didn't even provide a dynamic linker!). For the Blue Gene/Q, they did start supporting dynamically linked executables, but I don't know what changed in the OS to allow that.

If you're going with a diskless OS like that, I think you need to be very sparing in what you include in your image. If management wants you to 'install everything' on the compute nodes, I think you'll need to switch to a disk-based OS to keep your sanity.

Prentice

On 10/23/2018 02:35 PM, Ryan Novosielski wrote:
On Oct 23, 2018, at 2:10 PM, Greg Lindahl <lind...@pbm.com> wrote:

On Tue, Oct 23, 2018 at 05:48:00PM +0000, Ryan Novosielski wrote:

We’re getting some complaints that there’s not enough stuff in the
compute node images, and that we should just boot compute nodes to
the login node image

It's probably worth your while sitting down with your users and
learning how they want to use the tool, instead of telling them.

In general, good advice; not totally applicable here. An example would be that
we’re frequently asked by users to install things that are already installed,
just in a different way (by modules, or whatever else) than the user is used to or
than the steps in their documentation said (e.g. “Can you please run the
following: yum install whatever. I tried but I don’t have root access.”).
Management sometimes reads these and says “why do we keep getting these
requests, just install everything.” We also have another group that works more
closely with users; they have identified cases where “it works on the login node”
and often extrapolate the solution without relaying the problem (or
understanding the architecture). We’ll get to the bottom of that, but I wanted
to know more generally what sites are doing.

Our node images are in RAM, generally, so putting a bunch of extra stuff in 
them for no good reason does have an impact, although it’s less than it was 
when installed memory was lower, and it also all needs to be transferred on a 
cold start.

--
____
|| \\UTGERS,     |---------------------------*O*---------------------------
||_// the State  |         Ryan Novosielski - novos...@rutgers.edu
|| \\ University | Sr. Technologist - 973/972.0922 (2x0922) ~*~ RBHS Campus
||  \\    of NJ  | Office of Advanced Research Computing - MSB C630, Newark
      `'

_______________________________________________
Beowulf mailing list, Beowulf@beowulf.org sponsored by Penguin Computing
To change your subscription (digest mode or unsubscribe) visit 
http://www.beowulf.org/mailman/listinfo/beowulf
