* if the /var filesystem is shared, race conditions happen (all nodes
want to write to the same files). I had this problem and moved to a
local /var filesystem.

indeed, shared /var is simply a bug.  non-shared NFS /var is viable,
but generally pointless.

* if /var is local (which it may be, because the disks do exist), the
whole point of a central point for easy admin vanishes, because I would

eh?

have to create all the /var structure that packages need to work, on
each node (it would be easier to do: "for $node; ssh $install_cmd; done",
than to guess which dirs I need to create or files to copy).

but if your nodes are nfs-root, you won't be installing anything on them:
you'll be installing on the nfs-root.
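if a node still ends up with a local or tmpfs /var, there is no need to guess
which dirs the packages want: the /var inside the nfs-root image already has
them. a minimal sketch, assuming the image's /var is visible at some template
path (the function name and both paths are made up for illustration):

```shell
# Recreate the directory skeleton of a template /var under a target directory.
# Only directories are created; files, ownership and modes are NOT copied.
make_var_skeleton() {
    template=$1   # e.g. the /var that lives in the nfs-root image (assumed path)
    target=$2     # e.g. a freshly mounted tmpfs or local /var
    mkdir -p "$target"
    (cd "$template" && find . -type d) | while IFS= read -r dir; do
        mkdir -p "$target/${dir#./}"
    done
}

# Hypothetical usage early in boot:
#   make_var_skeleton /var.template /var
```

ownership and modes matter for some dirs (spool, mail), so a real boot script
would probably want to copy those as well, which this sketch does not do.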

* if /var is tmpfs all forensics are certainly gone after failure
(Murphy told me this one ;).

syslog is very happy to log over the network.
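for illustration, the classic syslog.conf forwarding rule is a selector
followed by @host. a small sketch that emits such a rule; "loghost" is a
placeholder name, not something from this thread:

```shell
# Emit a classic syslog forwarding rule for the given log host.
# "*.*" selects every facility and priority; "@host" means "send to host".
forward_rule() {
    printf '*.*\t@%s\n' "$1"
}

# Hypothetical usage, appending to the config inside the nfs-root image:
#   forward_rule loghost >> /etc/syslog.conf
```

the receiving side has to be told to accept remote messages (how depends on
the syslog daemon in use), otherwise the forwarded lines are silently dropped.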

Everything I read on the subject underlines the advantages of
diskless approaches but fails to warn about this problem and/or to solve
it. On the other side, the distributed-approach tools (where every
node is autonomous) seem to be stalled (as SystemImager, which is used
in the OSCAR project) or discontinued, or truly overblown for my
reference scale (IBM's xCAT); so it really seems that I'm missing

there's also OneSIS.

something.

The question is: what do you do about this?

setting up your own nfs-root cluster is a simple exercise.  if you're not
very familiar with *nix booting/daemons/init scripts, it will take a few
tries to get the config right, but the end result is pretty simple and
robust.  remote syslog, preferably with console-over-net (ipmi sol,
netconsole) means that there's nothing interesting on the local /var.
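
as a rough sketch of the netconsole side: the module takes a single
netconsole= parameter of the form
[src-port]@[src-ip]/[dev],[tgt-port]@<tgt-ip>/[tgt-mac], where empty fields
fall back to defaults. the helper name, interface and address below are
placeholders:

```shell
# Build the value of the netconsole= module parameter.
# Source port/ip and target MAC are left empty so the kernel defaults apply;
# the target port defaults to 6666 here.
netconsole_param() {
    dev=$1
    tgt_ip=$2
    tgt_port=${3:-6666}
    printf '@/%s,%s@%s/\n' "$dev" "$tgt_port" "$tgt_ip"
}

# Hypothetical usage: interface eth0, log host at 10.0.0.1
#   modprobe netconsole netconsole="$(netconsole_param eth0 10.0.0.1)"
```

an empty target MAC means broadcast, which is fine on a small private cluster
network but worth pinning down on anything larger.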
_______________________________________________
Beowulf mailing list, Beowulf@beowulf.org sponsored by Penguin Computing
To change your subscription (digest mode or unsubscribe) visit 
http://www.beowulf.org/mailman/listinfo/beowulf
