On 10/29/19 4:49 PM, Lux, Jim (US 337K) via Beowulf wrote:
True, there’s tons of info in qstat -f, however, doesn’t qstat stop showing my job after it completes, though? Maybe there’s a switch that retrieves “last data”?

Hi Jim,

I think you're looking for tracejob. Without sufficient perms you won't be able to get access to accounting, but should still get the info you need from other logs it queries.

Here's real usage of it, albeit snipped extensively. It shows memory and CPU usage at the end, though it won't say how many cores you actually used. IMHO that's something you design for. If you find CPU time to be way lower than run-time, and your code scales out to the number of cores available, you can request fewer cores until your CPU time roughly approximates your run-time.

ellisw@snip ~ $ sudo tracejob -n1 2100762.snip.panasas.com
/var/spool/torque/mom_logs/20191029: No matching job records located
/var/spool/torque/sched_logs/20191029: No such file or directory

Job: 2100762.snip.panasas.com

10/29/2019 16:33:32  S    enqueuing into route, state 1 hop 1
10/29/2019 16:33:32  S    dequeuing from route, state QUEUED
10/29/2019 16:33:32  S    enqueuing into eng, state 1 hop 1
10/29/2019 16:33:32  S    Job Queued at request of s...@snip.panasas.com, owner = s...@snip.panasas.com, job name = pr_one_run, queue = eng
10/29/2019 16:33:32  A    queue=route
10/29/2019 16:33:32  A    queue=eng
10/29/2019 17:16:03  S    Job Run at request of r...@snip.panasas.com
10/29/2019 17:16:06  S    Not sending email: job requested no e-mail
10/29/2019 17:16:06  A    user=snip group=users jobname=pr_one_run queue=eng ctime=1572381212 qtime=1572381212 etime=1572381212 start=1572383766 owner=s...@snip.panasas.com exec_host=snip/0 Resource_List.neednodes=1:freebsd_104_amd64:ppn=1:pfsr Resource_List.nodect=1 Resource_List.nodes=1:freebsd_104_amd64:ppn=1:pfsr Resource_List.walltime=02:00:00
10/29/2019 17:17:17  S    Not sending email: job requested no e-mail
10/29/2019 17:17:17  S    Exit_status=0 resources_used.cput=00:00:11 resources_used.mem=1092436kb resources_used.vmem=2817552kb resources_used.walltime=00:01:14
10/29/2019 17:17:17  A    user=snip group=users jobname=pr_one_run queue=eng ctime=1572381212 qtime=1572381212 etime=1572381212 start=1572383766 owner=s...@snip.panasas.com exec_host=snip/0 Resource_List.neednodes=1:freebsd_104_amd64:ppn=1:pfsr Resource_List.nodect=1 Resource_List.nodes=1:freebsd_104_amd64:ppn=1:pfsr Resource_List.walltime=02:00:00 session=21205 end=1572383837 Exit_status=0 resources_used.cput=00:00:11 resources_used.mem=1092436kb resources_used.vmem=2817552kb resources_used.walltime=00:01:14
10/29/2019 17:17:18  S    dequeuing from eng, state COMPLETE
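
If you want to turn that last record into a number, here's a rough Python sketch of the CPU-time vs. run-time comparison. The function names are made up for illustration, the record string is just the relevant fields pasted from the output above, and the nodes parsing assumes a single spec with a leading node count and an optional ppn= (multi-part specs joined with "+" would need more work).

```python
# Fields copied from the final accounting record in the tracejob output.
RECORD = (
    "Resource_List.nodes=1:freebsd_104_amd64:ppn=1:pfsr "
    "Resource_List.walltime=02:00:00 Exit_status=0 "
    "resources_used.cput=00:00:11 resources_used.mem=1092436kb "
    "resources_used.vmem=2817552kb resources_used.walltime=00:01:14"
)


def parse_fields(record):
    """Split a space-separated key=value record into a dict."""
    return dict(tok.split("=", 1) for tok in record.split() if "=" in tok)


def hms_to_seconds(hms):
    """Convert an HH:MM:SS string to seconds."""
    hours, minutes, seconds = (int(part) for part in hms.split(":"))
    return hours * 3600 + minutes * 60 + seconds


def requested_cores(nodes_spec):
    """Return nodes * ppn for a spec like '1:freebsd_104_amd64:ppn=1:pfsr'.

    Assumption: one spec only, node count first, ppn defaults to 1.
    """
    parts = nodes_spec.split(":")
    nodes = int(parts[0]) if parts[0].isdigit() else 1
    ppn = 1
    for part in parts[1:]:
        if part.startswith("ppn="):
            ppn = int(part[len("ppn="):])
    return nodes * ppn


def cpu_efficiency(record):
    """CPU time divided by (run-time * requested cores)."""
    fields = parse_fields(record)
    cput = hms_to_seconds(fields["resources_used.cput"])
    wall = hms_to_seconds(fields["resources_used.walltime"])
    cores = requested_cores(fields["Resource_List.nodes"])
    return cput / (wall * cores)


# 11 s of CPU over 74 s of run-time on one requested core.
print(f"CPU efficiency: {cpu_efficiency(RECORD):.2f}")
```

A ratio near 1.0 means the requested cores were kept busy. For a one-core job like this one, a low ratio can't be fixed by requesting fewer cores; it more likely means the job spent its time waiting on something else.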

Best,

ellis

--
Ellis H. Wilson III, Ph.D.
     www.ellisv3.com
_______________________________________________
Beowulf mailing list, Beowulf@beowulf.org sponsored by Penguin Computing
To change your subscription (digest mode or unsubscribe) visit 
https://beowulf.org/cgi-bin/mailman/listinfo/beowulf
