Hi All,
Quick question: is there a way to limit the runtime on a partition for
salloc only? I would like batch jobs to keep the partition's default maximum
runtime, but interactive jobs to have a shorter allowed runtime.
Thanks!
Something I have been impressed with is Netdata.
It is in the standard repositories and will auto-detect quite a lot
on a node. It is great for real-time monitoring of a node/job.
I also use Prometheus and Grafana for historic data (anything over 5
minutes).
Brian Andrus
On 5/5/20
At a place I worked before, we used XDMoD several years ago. It was a bit
tricky to set up correctly, and data collection was not exactly intuitive
to get started with as a user (managers, allocation specialists and
other not-super-technical people were most of our users). But when
familiarized with it,
Hello Everyone,
We would like to improve our visibility on our cluster usage.
We have Ganglia and currently use sacct, but I was wondering whether there
is a recommended web tool that provides both monitoring and accounting
(user and admin friendly)?
Thanks in advance
Christine
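
For the sacct side of the question above, a minimal sketch (not a
replacement for a web tool) of how per-user usage could be summarized from
sacct's pipe-separated output. It assumes the text was produced with
something like "sacct -a -X -n -P -o User,CPUTimeRAW"; the function name
and the sample data are hypothetical, made up for illustration.

```python
from collections import defaultdict


def cpu_hours_by_user(sacct_text):
    """Sum CPU hours per user from 'User|CPUTimeRAW' lines.

    Assumes pipe-separated sacct output with exactly two fields per line
    (user name, CPU time in seconds) and no header row.
    """
    totals = defaultdict(float)
    for line in sacct_text.splitlines():
        line = line.strip()
        if not line:
            continue
        user, cpu_seconds = line.split("|")
        if not user or not cpu_seconds:
            # Skip records with a missing user or CPU time.
            continue
        totals[user] += int(cpu_seconds) / 3600.0
    return dict(totals)


# Hypothetical sample data standing in for real sacct output.
sample = "alice|7200\nbob|3600\nalice|1800\n"
report = cpu_hours_by_user(sample)
for user, hours in sorted(report.items()):
    print(f"{user}: {hours:.2f} CPU hours")
```

In practice the input text would come from running sacct over the period
of interest, and the totals could be fed to whatever dashboard is in use.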
Hi,
I have been trying to find out whether there is a way to restrict disk usage
with Slurm using cgroups.
I am aware of the TmpFS mechanism, but it seems it is only used to define the
required space and does not restrict usage if a file grows beyond a certain
size.
Any suggestion would be appreciated.