In the topic on avoiding fragmentation, Chris Samuel wrote:

> Our trick in Slurm is to use the slurmd prolog script to set an XFS project
> quota for that job ID on the per-job directory (created by a plugin which
> also makes subdirectories there that it maps to /tmp and /var/tmp for the
> job) on the XFS partition used for local scratch on the node.
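For anyone else tempted to try this, I imagine the prolog end looks roughly like the below - an untested sketch of my understanding, not Chris's actual code. It assumes /scratch is XFS mounted with the prjquota option, that SLURM_JOB_ID and SLURM_JOB_USER are present in the prolog environment, and it reuses the job ID as the XFS project ID; the 100g cap is an invented figure:

    #!/bin/bash
    # Sketch of a slurmd prolog: per-job scratch dir with an XFS
    # project quota keyed on the job ID.
    SCRATCH=/scratch
    JOBDIR=$SCRATCH/job.$SLURM_JOB_ID

    # Chris's plugin also maps these over /tmp and /var/tmp for the job
    mkdir -p $JOBDIR/tmp $JOBDIR/var_tmp
    chown -R $SLURM_JOB_USER $JOBDIR

    # Tag the directory tree with the project ID, then cap its block usage
    xfs_quota -x -c "project -s -p $JOBDIR $SLURM_JOB_ID" $SCRATCH
    xfs_quota -x -c "limit -p bhard=100g $SLURM_JOB_ID" $SCRATCH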
I had never thought of that, and it is a very neat thing to do.

What I would like to discuss is the more general topic of clearing files from 'fast' storage. Many sites I have seen have dedicated fast/parallel storage which is referred to as scratch space. The intention is that this scratch space is used only for the duration of a project, as it is expensive. However, I have often seen the scratch space used as permanent storage, contrary to the intentions of whoever sized it, paid for it and installed it.

I feel that the simplistic 'run a cron job and delete files older than N days' approach is outdated. My personal take is that hierarchical storage is the answer, automatically pushing files to slower and cheaper tiers.

But a thought struck me - in the Slurm prolog script, create a file called THESE-FILES-WILL-SELF-DESTRUCT-IN-14-DAYS, then run a cron job to decrement the figure 14. I guess that doesn't cope with running multiple jobs on the same data set - but then again, running a job marks that data as 'hot' and you reset the timer to 14 days. (A rough sketch of the cron half follows below.)

What do most sites do for scratch space?
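To make the countdown concrete, the cron half might be something like this - also untested. I have put the days-remaining count in the marker file's contents rather than its name, to save renaming it every night; the prolog would just echo 14 into the marker to re-arm the timer whenever a job touches the directory:

    #!/bin/bash
    # Nightly cron sketch: tick down each job directory's marker
    # and remove the directory once the count reaches zero.
    for marker in /scratch/*/THESE-FILES-WILL-SELF-DESTRUCT; do
        [ -f "$marker" ] || continue          # glob may not match anything
        days=$(cat "$marker")
        if [ "$days" -le 0 ]; then
            rm -rf "$(dirname "$marker")"     # time's up
        else
            echo $((days - 1)) > "$marker"    # one day closer
        fi
    done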