On 09/16/2012 02:52 PM, Jeffrey Rossiter wrote:
> The intention is for the system to be
> used for scientific computation.
That doesn't narrow it down much.

> I am trying to decide on a linux
> distribution to use.

I suggest doing it yourself based on whatever popular Linux distro you have experience with. Assuming general Linux systems administrator proficiency, it's not particularly hard.

I'd suggest starting with Scientific Linux (especially if your applications assume it) or Debian/Ubuntu (which seem to have larger repositories). I'd lean towards Ubuntu if you are running new hardware, since Sandy Bridge (new Intel) and Bulldozer (new AMD) seem to benefit from the latest kernels.

Then add:
* Cobbler (or functionally similar software) for PXE installing, network configuration, DHCP, DNS, MAC addresses, IP addresses, etc.
* Puppet/Chef for configuration management (everything post-install)
* Torque/Slurm for the batch queue
* Environment Modules or similar to let users easily load the libraries/apps/environment they need in a reproducible way.
* Ganglia/Cacti/Munin for graphing resource utilization.
* /share/apps/<application name>-<version number> for anything you install that's not in the repositories.

Get nodes to netboot, netinstall, and mount a shared /home. Once users start using it, listen to their needs and adapt accordingly.

Some suggestions:
* If your campus has a standard username for each user, use it.
* Use ssh certs for user authentication; you really don't want your users' passwords, nor do they want to type them often.
* Start a wiki for documentation, and allow users to edit it.
* Have Environment Modules output the name/version on module load. It is much easier to figure out what a user has done when the exact info needed to reproduce a run is in the run's output.
* Set hardware physically to always netboot, then depend on the central server to decide whether it should boot from local disk or do a new install.
* Have compute nodes use host-based ssh keys for auth (not user ssh keys)
* Have the head node use user-based keys for login; do not allow ~/.ssh/authorized_keys
* Allow exactly one ssh key per user.
* Keep your configuration files in git or similar version control. Or, if they are managed by puppet/chef, keep the puppet/chef files in version control.
* Strongly encourage any users writing source code to use a distributed version control system like git.
* Be very, very clear on the status/lack of backups. Be clear that loss of files will happen and it's only a matter of time.
* Use software RAID.

> Does it matter all that much?

Not particularly. Random commercial software seems to assume RHEL-based distros. Ubuntu/Debian seems to have the largest repositories (read that as the most likely to have a user request handled by apt-get install).

> Any advice would be
> greatly appreciated.

You didn't mention your current experience; if the above sounds daunting, then start with Warewulf.

_______________________________________________
Beowulf mailing list, Beowulf@beowulf.org sponsored by Penguin Computing
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf
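
For what it's worth, here is a rough Python sketch of the "always netboot, let the central server decide" suggestion above. It assumes PXELINUX is the boot loader, which the message does not specify; Cobbler and similar tools can manage these files for you. PXELINUX looks for a per-host file named 01-<mac> under pxelinux.cfg/, so the server can flip a node between local boot and reinstall by rewriting that one file. The function names and the installer/vmlinuz and installer/initrd.img paths are made up for illustration.

```python
import re


def pxe_config_name(mac: str) -> str:
    """Return the per-host PXELINUX config filename for a MAC address.

    PXELINUX searches pxelinux.cfg/ for "01-" (the ARP hardware type for
    Ethernet) followed by the MAC in lowercase hex, separated by hyphens.
    """
    octets = re.split(r"[:-]", mac.strip().lower())
    if len(octets) != 6 or not all(re.fullmatch(r"[0-9a-f]{2}", o) for o in octets):
        raise ValueError(f"not a MAC address: {mac!r}")
    return "01-" + "-".join(octets)


def pxe_config_body(reinstall: bool) -> str:
    """Return the config text: run the installer, or boot from local disk."""
    if reinstall:
        # Hypothetical kernel/initrd paths, relative to the TFTP root.
        return (
            "DEFAULT install\n"
            "LABEL install\n"
            "  KERNEL installer/vmlinuz\n"
            "  APPEND initrd=installer/initrd.img\n"
        )
    # LOCALBOOT 0 tells PXELINUX to fall through to the local disk.
    return "DEFAULT local\nLABEL local\n  LOCALBOOT 0\n"


if __name__ == "__main__":
    print(pxe_config_name("00:1A:2B:3C:4D:5E"))
    print(pxe_config_body(reinstall=False))
```

The nodes never need touching after the BIOS is set to netboot: a reinstall is just a matter of writing the "install" body to that node's file and power cycling it, then writing the "local" body back once the install finishes.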