That’s kind of what I’m looking for, but I’d like to modify the partition limit for an account rather than for a user. Something like:
sacctmgr modify account name=gbstest partition=batch grpjobs=1

Using sacctmgr to add a partition for a user works fine; unfortunately, partition isn't one of the options for modifying an account. Any ideas for limiting at the account+partition level rather than account+user+partition?

Setting things for users seems to work as expected, unless I submit a job with multiple partitions. I have two partitions, batch and burst. I set a limit of grpjobs=1 for batch. I submit jobs to partition "batch,burst" and it starts more than one job in the batch partition. I thought the others would need to go into the "burst" partition. Here's an example of what I'm seeing.

[gbs35@sltest ~]$ grep PartitionName=b /etc/slurm/slurm.conf
PartitionName=batch Nodes=ALL Default=no MaxTime=INFINITE State=UP CpuBind=core OverSubscribe=no DenyAccounts=open PriorityTier=100
PartitionName=burst Nodes=ALL Default=no MaxTime=INFINITE State=UP CpuBind=core OverSubscribe=no DenyAccounts=open PriorityTier=50

[gbs35@sltest ~]$ sacctmgr show account name=gbstest withass format=account,cluster,partition,user,grpcpus,grpjobs
   Account    Cluster  Partition       User  GrpCPUs GrpJobs
---------- ---------- ---------- ---------- -------- -------
   gbstest     sltest
   gbstest     sltest      burst      gbs35
   gbstest     sltest      batch      gbs35                1

[gbs35@sltest ~]$ squeue -a
JOBID   USER   PARTITION  NODES CPUS ST  TIME_LEFT          START_TIME NODELIST(R
   91  gbs35 batch,burst      1    1 PD      30:00 2019-01-16T11:19:00 (Nodes req
   90  gbs35 batch,burst      1    1 PD      30:00 2019-01-16T10:49:00 (Nodes req
   89  gbs35 batch,burst      1    1 PD      30:00 2019-01-16T10:19:22 (Nodes req
   88  gbs35       batch      1    1  R      27:37 2019-01-16T09:56:36 sltest
   83  gbs35       batch      1    1  R      20:23 2019-01-16T09:49:22 sltest
   84  gbs35       batch      1    1  R      20:23 2019-01-16T09:49:22 sltest
   85  gbs35       batch      1    1  R      20:23 2019-01-16T09:49:22 sltest
   86  gbs35       batch      1    1  R      20:23 2019-01-16T09:49:22 sltest
   87  gbs35       batch      1    1  R      20:23 2019-01-16T09:49:22 sltest
   81  gbs35       batch      1    1  R      20:20 2019-01-16T09:49:19 sltest
   82  gbs35       batch      1    1  R      20:20 2019-01-16T09:49:19 sltest
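In case it helps with reproducing this, the pending and running jobs above were submitted with something along these lines (the job itself is just a placeholder sleep here; the account, partition list, CPU count and time limit match the squeue output, the rest is an approximation):

sbatch --account=gbstest --partition=batch,burst --time=30:00 -n 1 --wrap="sleep 1800"

The important part is the comma-separated --partition list, which lets the scheduler start the job in whichever of the listed partitions can run it first.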
Taking a look at another job, it appears that the "limit" info is getting applied to the association for the wrong partition for this job. From slurmctld.log I see:

[2019-01-16T10:41:22.883] debug:  sched: Running job scheduler
[2019-01-16T10:41:22.883] debug2: found 1 usable nodes from config containing sltest
[2019-01-16T10:41:22.883] debug3: _pick_best_nodes: JobId=184 idle_nodes 1 share_nodes 1
[2019-01-16T10:41:22.883] debug2: select_p_job_test for JobId=184
[2019-01-16T10:41:22.883] debug5: powercapping: checking JobId=184 : skipped, capping disabled
[2019-01-16T10:41:22.883] debug3: select/cons_res: _add_job_to_res: JobId=184 act 0
[2019-01-16T10:41:22.883] debug3: select/cons_res: adding JobId=184 to part batch row 0
[2019-01-16T10:41:22.884] debug2: acct_policy_job_begin: after adding JobId=184, qos normal grp_used_tres_run_secs(cpu) is 1800
[2019-01-16T10:41:22.884] debug2: acct_policy_job_begin: after adding JobId=184, qos normal grp_used_tres_run_secs(mem) is 1800
[2019-01-16T10:41:22.884] debug2: acct_policy_job_begin: after adding JobId=184, qos normal grp_used_tres_run_secs(node) is 1800
[2019-01-16T10:41:22.884] debug2: acct_policy_job_begin: after adding JobId=184, qos normal grp_used_tres_run_secs(billing) is 1800
[2019-01-16T10:41:22.884] debug2: acct_policy_job_begin: after adding JobId=184, qos normal grp_used_tres_run_secs(fs/disk) is 0
[2019-01-16T10:41:22.884] debug2: acct_policy_job_begin: after adding JobId=184, qos normal grp_used_tres_run_secs(vmem) is 0
[2019-01-16T10:41:22.884] debug2: acct_policy_job_begin: after adding JobId=184, qos normal grp_used_tres_run_secs(pages) is 0
[2019-01-16T10:41:22.884] debug2: acct_policy_job_begin: after adding JobId=184, assoc 30(gbstest/gbs35/burst) grp_used_tres_run_secs(cpu) is 1800
[2019-01-16T10:41:22.884] debug2: acct_policy_job_begin: after adding JobId=184, assoc 30(gbstest/gbs35/burst) grp_used_tres_run_secs(mem) is 1800
[2019-01-16T10:41:22.884] debug2: acct_policy_job_begin: after adding JobId=184, assoc 30(gbstest/gbs35/burst) grp_used_tres_run_secs(node) is 1800
[2019-01-16T10:41:22.884] debug2: acct_policy_job_begin: after adding JobId=184, assoc 30(gbstest/gbs35/burst) grp_used_tres_run_secs(billing) is 1800
[2019-01-16T10:41:22.884] debug2: acct_policy_job_begin: after adding JobId=184, assoc 30(gbstest/gbs35/burst) grp_used_tres_run_secs(fs/disk) is 0
[2019-01-16T10:41:22.884] debug2: acct_policy_job_begin: after adding JobId=184, assoc 30(gbstest/gbs35/burst) grp_used_tres_run_secs(vmem) is 0
[2019-01-16T10:41:22.884] debug2: acct_policy_job_begin: after adding JobId=184, assoc 30(gbstest/gbs35/burst) grp_used_tres_run_secs(pages) is 0
[2019-01-16T10:41:22.884] debug2: acct_policy_job_begin: after adding JobId=184, assoc 27(gbstest/(null)/(null)) grp_used_tres_run_secs(cpu) is 1800
[2019-01-16T10:41:22.884] debug2: acct_policy_job_begin: after adding JobId=184, assoc 27(gbstest/(null)/(null)) grp_used_tres_run_secs(mem) is 1800
[2019-01-16T10:41:22.884] debug2: acct_policy_job_begin: after adding JobId=184, assoc 27(gbstest/(null)/(null)) grp_used_tres_run_secs(node) is 1800
[2019-01-16T10:41:22.884] debug2: acct_policy_job_begin: after adding JobId=184, assoc 27(gbstest/(null)/(null)) grp_used_tres_run_secs(billing) is 1800
[2019-01-16T10:41:22.884] debug2: acct_policy_job_begin: after adding JobId=184, assoc 27(gbstest/(null)/(null)) grp_used_tres_run_secs(fs/disk) is 0
[2019-01-16T10:41:22.884] debug2: acct_policy_job_begin: after adding JobId=184, assoc 27(gbstest/(null)/(null)) grp_used_tres_run_secs(vmem) is 0
[2019-01-16T10:41:22.884] debug2: acct_policy_job_begin: after adding JobId=184, assoc 27(gbstest/(null)/(null)) grp_used_tres_run_secs(pages) is 0
[2019-01-16T10:41:22.884] debug2: acct_policy_job_begin: after adding JobId=184, assoc 1(root/(null)/(null)) grp_used_tres_run_secs(cpu) is 1800
[2019-01-16T10:41:22.884] debug2: acct_policy_job_begin: after adding JobId=184, assoc 1(root/(null)/(null)) grp_used_tres_run_secs(mem) is 1800
[2019-01-16T10:41:22.884] debug2: acct_policy_job_begin: after adding JobId=184, assoc 1(root/(null)/(null)) grp_used_tres_run_secs(node) is 1800
[2019-01-16T10:41:22.884] debug2: acct_policy_job_begin: after adding JobId=184, assoc 1(root/(null)/(null)) grp_used_tres_run_secs(billing) is 1800
[2019-01-16T10:41:22.884] debug2: acct_policy_job_begin: after adding JobId=184, assoc 1(root/(null)/(null)) grp_used_tres_run_secs(fs/disk) is 0
[2019-01-16T10:41:22.884] debug2: acct_policy_job_begin: after adding JobId=184, assoc 1(root/(null)/(null)) grp_used_tres_run_secs(vmem) is 0
[2019-01-16T10:41:22.884] debug2: acct_policy_job_begin: after adding JobId=184, assoc 1(root/(null)/(null)) grp_used_tres_run_secs(pages) is 0
[2019-01-16T10:41:22.884] debug3: sched: JobId=184 initiated
[2019-01-16T10:41:22.884] sched: Allocate JobId=184 NodeList=sltest #CPUs=1 Partition=batch

You can see that the job is in Partition=batch, but the "acct_policy_job_begin" entries use the association (gbstest/gbs35/burst). I would have thought (gbstest/gbs35/batch) would make more sense. Somewhere the pointer to the correct association isn't making it through.
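If it helps to map the assoc IDs in the log back to the database records, a query along these lines should list the ID of each association under the account (the format fields are just the ones relevant here):

sacctmgr show assoc where account=gbstest format=id,cluster,account,user,partition,grpjobs

In this case assoc 30 is the gbstest/gbs35/burst association and assoc 27 is the account-level gbstest association, matching the labels in the log above.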
-----
Gary Skouson

From: slurm-users <slurm-users-boun...@lists.schedmd.com> On Behalf Of Thomas M. Payerle
Sent: Tuesday, January 15, 2019 12:57 PM
To: Slurm User Community List <slurm-users@lists.schedmd.com>
Subject: Re: [slurm-users] Accounting configuration

Generally, the add, modify, etc. sacctmgr commands want a "user" or "account" entity, but can modify associations through this. E.g., if user baduser should have GrpTRESmin of cpu=1000 set on partition special, use something like

sacctmgr add user name=baduser partition=special account=testacct grptresmin=cpu=1000

if there is no association for that user, account and partition already, or

sacctmgr modify user where user=baduser partition=special set grptresmin=cpu=1000

To place the restriction on an account instead, add/modify the account with a partition field.

On Tue, Jan 15, 2019 at 11:33 AM Skouson, Gary <gb...@psu.edu> wrote:
Slurm accounting info is stored based on user, cluster, partition and account. I'd like to be able to enforce limits for an account based on the partition it's running in. Sadly, I'm not seeing how to use sacctmgr to change the partition as part of the association. The add, modify and delete commands seem to only apply to user, account and cluster entities.

How do I add a partition to a particular account association, and how do I set GrpTRES for an association that includes a partition? I know I can change the partition configuration in slurm.conf and use AllowAccounts, but that doesn't change the usage limits on a partition for a particular account. Maybe there's another way to work around this that I'm missing.

I'd like to be able to use GrpTRESMins to limit overall cumulative account usage. I also want to limit accounts to differing resources (GrpTRES) on some partitions (for preemption/priority, etc.)

Thoughts?

-----
Gary Skouson

--
Tom Payerle
DIT-ACIGS/Mid-Atlantic Crossroads        paye...@umd.edu
5825 University Research Park            (301) 405-6135
University of Maryland
College Park, MD 20740-3831