On 10/29/19 4:49 PM, Lux, Jim (US 337K) via Beowulf wrote:
> True, there’s tons of info in qstat -f, however, doesn’t qstat stop
> showing my job after it completes, though? Maybe there’s a switch that
> retrieves “last data”?
Hi Jim,
I think you're looking for tracejob. Without sufficient permissions you
won't be able to read the accounting logs, but you should still get the
info you need from the other logs it queries.
Here's real usage of it, albeit snipped extensively. It shows memory
and cpu usage at the end, though it won't say how many cores you used.
IMHO that's something you design for. If you find cpu time to be way
lower than walltime, and your code scales out to the number of cores
available, you can request fewer cores until your cpu time roughly
approximates your walltime.
ellisw@snip ~ $ sudo tracejob -n1 2100762.snip.panasas.com
/var/spool/torque/mom_logs/20191029: No matching job records located
/var/spool/torque/sched_logs/20191029: No such file or directory
Job: 2100762.snip.panasas.com
10/29/2019 16:33:32 S enqueuing into route, state 1 hop 1
10/29/2019 16:33:32 S dequeuing from route, state QUEUED
10/29/2019 16:33:32 S enqueuing into eng, state 1 hop 1
10/29/2019 16:33:32 S Job Queued at request of
s...@snip.panasas.com, owner = s...@snip.panasas.com, job name =
pr_one_run, queue = eng
10/29/2019 16:33:32 A queue=route
10/29/2019 16:33:32 A queue=eng
10/29/2019 17:16:03 S Job Run at request of r...@snip.panasas.com
10/29/2019 17:16:06 S Not sending email: job requested no e-mail
10/29/2019 17:16:06 A user=snip group=users jobname=pr_one_run
queue=eng ctime=1572381212 qtime=1572381212 etime=1572381212
start=1572383766 owner=s...@snip.panasas.com exec_host=snip/0
Resource_List.neednodes=1:freebsd_104_amd64:ppn=1:pfsr
Resource_List.nodect=1
Resource_List.nodes=1:freebsd_104_amd64:ppn=1:pfsr
Resource_List.walltime=02:00:00
10/29/2019 17:17:17 S Not sending email: job requested no e-mail
10/29/2019 17:17:17 S Exit_status=0 resources_used.cput=00:00:11
resources_used.mem=1092436kb resources_used.vmem=2817552kb
resources_used.walltime=00:01:14
10/29/2019 17:17:17 A user=snip group=users jobname=pr_one_run
queue=eng ctime=1572381212 qtime=1572381212 etime=1572381212
start=1572383766 owner=s...@snip.panasas.com exec_host=snip/0
Resource_List.neednodes=1:freebsd_104_amd64:ppn=1:pfsr
Resource_List.nodect=1
Resource_List.nodes=1:freebsd_104_amd64:ppn=1:pfsr
Resource_List.walltime=02:00:00 session=21205 end=1572383837
Exit_status=0 resources_used.cput=00:00:11
resources_used.mem=1092436kb
resources_used.vmem=2817552kb resources_used.walltime=00:01:14
10/29/2019 17:17:18 S dequeuing from eng, state COMPLETE
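If you want to automate the cpu-time vs. walltime comparison, here is a
minimal Python sketch. It assumes the key=value layout of the accounting
record above; other Torque versions may format things differently. The
function names (hms_to_seconds, parse_fields, summarize) are made up for
illustration and are not part of Torque.

```python
import re


def hms_to_seconds(hms):
    """Convert an HH:MM:SS string (hours may exceed 24) to seconds."""
    hours, minutes, seconds = (int(part) for part in hms.split(":"))
    return hours * 3600 + minutes * 60 + seconds


def parse_fields(text):
    """Collect key=value tokens; a later occurrence of a key overwrites an earlier one."""
    return dict(re.findall(r"([\w.]+)=(\S+)", text))


def summarize(text):
    """Pull cpu time, walltime and queue wait out of a pasted job record."""
    fields = parse_fields(text)
    cput = hms_to_seconds(fields["resources_used.cput"])
    wall = hms_to_seconds(fields["resources_used.walltime"])
    return {
        "cput_s": cput,
        "walltime_s": wall,
        # Ratio of cpu time to walltime; None if walltime is zero.
        "cpu_to_wall": cput / wall if wall else None,
        # qtime and start are epoch seconds in the record above.
        "queue_wait_s": int(fields["start"]) - int(fields["qtime"]),
    }


# Final accounting record copied from the tracejob output above.
sample = """
10/29/2019 17:17:17 A user=snip group=users jobname=pr_one_run
queue=eng ctime=1572381212 qtime=1572381212 etime=1572381212
start=1572383766 exec_host=snip/0
Resource_List.nodes=1:freebsd_104_amd64:ppn=1:pfsr
Resource_List.walltime=02:00:00 session=21205 end=1572383837
Exit_status=0 resources_used.cput=00:00:11
resources_used.mem=1092436kb
resources_used.vmem=2817552kb resources_used.walltime=00:01:14
"""

if __name__ == "__main__":
    print(summarize(sample))
```

For the job above this gives 11 s of cpu time against 74 s of walltime
(a ratio of about 0.15) and a queue wait of 2554 s, which matches the
16:33:32 to 17:16:06 gap in the log.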
Best,
ellis
--
Ellis H. Wilson III, Ph.D.
www.ellisv3.com