Re: [slurm-users] enabling job script archival

Paul Edmon Thu, 28 Sep 2023 11:04:45 -0700

Yes, it was later than that. If you are on 23.02 you are good. We've been running with storing job_scripts on for years at this point, and that part of the database only uses up 8.4G. Our entire database takes up 29G on disk, so it's about 1/3 of the database. We also have database compression, which helps with the on-disk size. Raw uncompressed, our database is about 90G. We keep 6 months of data in our active database.


-Paul Edmon-


On 9/28/2023 1:57 PM, Ryan Novosielski wrote:

Sorry for the duplicate e-mail in a short time: do you know (or anyone) when the hashing was added? Was planning to enable this on 21.08, but we then had to delay our upgrade to it. I’m assuming later than that, as I believe that’s when the feature was added.
On Sep 28, 2023, at 13:55, Ryan Novosielski <novos...@rutgers.edu> wrote:
Thank you; we’ll put in a feature request for improvements in that area, and also thanks for the warning. I thought of that in passing, but the real-world experience is really useful. I could easily see wanting that stuff to be retained for less time than the main records, which is what I’d ask for.
I assume that archiving, in general, would also remove this stuff, since old jobs themselves will be removed?
--
#BlackLivesMatter
____
|| \\UTGERS, |---------------------------*O*---------------------------
||_// the State |         Ryan Novosielski - novos...@rutgers.edu
|| \\ University | Sr. Technologist - 973/972.0922 (2x0922) ~*~ RBHS Campus
|| \\ of NJ | Office of Advanced Research Computing - MSBA555B, Newark
     `'
On Sep 28, 2023, at 13:48, Paul Edmon <ped...@cfa.harvard.edu> wrote:

Slurm should take care of it when you add it.
So far as horror stories, under previous versions our database size ballooned to be so massive that it actually prevented us from upgrading, and we had to drop the columns containing the job_script and job_env. This was back before Slurm started hashing the scripts so that it would only store one copy of duplicate scripts. After this point we found that the job_script database stayed at a fairly reasonable size, as most users use functionally the same script each time. However, the job_env continued to grow like crazy, as there are variables in our environment that change fairly consistently depending on where the user is. Thus job_envs ended up being too massive to keep around, and so we had to drop them. Frankly, we never really used them for debugging. The job_scripts, though, are super useful and not that much overhead.
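To make the size difference concrete, here is a toy sketch of hash-based deduplication. It is not Slurm's actual schema or hash function; the HashedStore class, the use of SHA-256, and the SESSION_ID variable are all assumptions made up for illustration. It only shows why identical scripts collapse to one stored copy while environments that differ by a single variable do not.

```python
import hashlib


class HashedStore:
    """Toy content-addressed store (hypothetical, not Slurm's real tables)."""

    def __init__(self):
        self.blobs = {}     # digest -> stored text, one copy per distinct text
        self.job_refs = {}  # job_id -> digest of the text that job used

    def add(self, job_id, text):
        # SHA-256 is an assumption here; the thread does not say which hash Slurm uses.
        digest = hashlib.sha256(text.encode("utf-8")).hexdigest()
        self.blobs.setdefault(digest, text)  # store the text only if it is new
        self.job_refs[job_id] = digest
        return digest

    def stored_bytes(self):
        return sum(len(t.encode("utf-8")) for t in self.blobs.values())


script = "#!/bin/bash\n#SBATCH -n 1\nsrun ./my_app\n"
scripts = HashedStore()
envs = HashedStore()

for job_id in range(1000):
    # Same script every time, but one environment variable changes per job.
    scripts.add(job_id, script)
    envs.add(job_id, f"HOME=/home/user\nSESSION_ID={job_id}\n")

print(len(scripts.blobs), len(envs.blobs))  # prints: 1 1000
```

With 1000 jobs, the script store holds a single copy, while the environment store holds 1000 distinct entries, which matches the growth pattern described above.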
In summary, my recommendation is to only store job_scripts. job_envs add too much storage for little gain, unless your job_envs are basically the same for each user in each location.
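As a minimal sketch of that recommendation in slurm.conf, assuming the existing job_comment flag from the original question is kept: AccountingStoreFlags takes a single comma-separated list, so the flags go on one line rather than on separate AccountingStoreFlags lines.

```
# slurm.conf (sketch): keep job comments, add job scripts, leave job_env off
AccountingStoreFlags=job_comment,job_script
```

max_script_size is left unset here, so the default stays in effect.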
Also, it should be noted that there is no way to prune out job_scripts or job_envs right now. So the only way to get rid of them if they get large is to 0 out the column in the table. You can ask SchedMD for the mysql command to do this, as we had to do it here to our job_envs.
-Paul Edmon-

On 9/28/2023 1:40 PM, Davide DelVento wrote:
In my current Slurm installation (recently upgraded to Slurm v23.02.3), I only have
AccountingStoreFlags=job_comment

I now intend to add both

AccountingStoreFlags=job_script
AccountingStoreFlags=job_env

leaving the default 4MB value for max_script_size
Do I need to do anything on the DB myself, or will Slurm take care of the additional tables if needed?
Any comments/suggestions/gotcha/pitfalls/horror_stories to share? I know about the additional disk space and potential load needed, and with our resources and typical workload I should be okay with that.
Thanks!
