condor_q usage
Common usage of condor_q
Generic usage
By default, condor_q only shows the queued jobs of the calling user on the local submit machine. To see the queued jobs of all users, use condor_q -all instead. To list all jobs currently known to HTCondor in the local pool, use condor_q -global -all, but please do not do this often, as it puts considerable load on the systems. In general, it is better to use our overview page.
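If the pool-wide listing is needed repeatedly, for example in a monitoring script, one way to keep the load down is to cache its output for a few minutes. The following is only a sketch: the helper name cached_run, the cache file location and the ten-minute interval are made up for illustration and are not part of HTCondor.

```shell
# Hypothetical helper: run a command at most once per MAX_AGE_MIN minutes
# and serve its output from a cache file in between.
# Usage: cached_run CACHE_FILE MAX_AGE_MIN COMMAND [ARGS...]
cached_run() {
    cache=$1
    max_age_min=$2
    shift 2
    # find prints the file name only if the file exists and was modified
    # less than max_age_min minutes ago; otherwise the cache is refreshed.
    if [ -z "$(find "$cache" -mmin "-$max_age_min" 2>/dev/null)" ]; then
        if ! "$@" > "$cache.tmp"; then
            rm -f "$cache.tmp"
            return 1
        fi
        mv "$cache.tmp" "$cache" || return 1
    fi
    cat "$cache"
}

# Intended use (not executed here, as it needs a working HTCondor pool):
#   cached_run "$HOME/.condor_q_global.txt" 10 condor_q -global -all
```

Writing to a temporary file first means that a failed query does not overwrite the last good listing.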
Detailed list of jobs
Since HTCondor version 8.6, condor_q only prints a batched summary by default. Use the -nobatch option to get one line per job, e.g.
condor_q -nobatch USER
-- Schedd: atlas9.atlas.local : <10.20.30.9:7863?... @ 03/15/18 15:23:37
ID OWNER SUBMITTED RUN_TIME ST PRI SIZE CMD
5198708.0 USER 3/13 16:59 1+22:24:28 R 0 0.3 condor_dagman -p 0 -f -l . -Lockfile /path/to/job
5198713.0 USER 3/13 16:59 1+22:23:13 R 0 1221.0 lalinference_nest ...
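The RUN_TIME column in the listing above has the form days+HH:MM:SS. To compare run times more easily, the value can be converted into hours. This is a minimal sketch; the helper name runtime_hours is made up for illustration.

```shell
# Hypothetical helper: convert a RUN_TIME value of the form
# days+HH:MM:SS into hours, rounded to one decimal place.
runtime_hours() {
    echo "$1" | awk -F'[+:]' \
        '{ printf "%.1f\n", $1 * 24 + $2 + $3 / 60 + $4 / 3600 }'
}

# First job of the listing above: 1 day, 22 hours, 24 minutes, 28 seconds
runtime_hours 1+22:24:28
```

For the value 1+22:24:28 this prints 46.4.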
To learn on which machine a job runs, add the -run option to condor_q:
condor_q -run -nobatch | head
-- Schedd: atlas7.atlas.local : <10.20.30.7:16305> @ 01/15/20 17:59:38
ID OWNER SUBMITTED RUN_TIME HOST(S)
5689638.0 USER1 12/28 17:27 18+00:32:32 atlas7.atlas.local
5689642.0 USER1 12/28 17:27 18+00:32:00 atlas7.atlas.local
5689645.0 USER1 12/28 17:28 18+00:31:32 atlas7.atlas.local
5724879.0 USER2 1/14 09:15 1+08:43:47 atlas7.atlas.local
5724884.0 USER2 1/14 09:15 1+08:43:11 slot1_12@a2116.atlas.local
5724896.0 USER2 1/14 09:15 1+08:43:10 slot1_10@a2721.atlas.local
[...]
The first four jobs displayed here are DAGMan jobs, i.e. they run on the submit machine itself, from where they submit and orchestrate jobs running on execute machines. The final two jobs listed are running on two different execute machines with a run time of more than 30 hours (assuming the jobs were not killed in between).
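The distinction made above can also be counted automatically from the HOST(S) column. The sketch below assumes, as in the listing, that jobs on execute machines appear as slot@host while jobs on the submit machine show a bare host name. The helper name count_by_location is made up for illustration.

```shell
# Hypothetical helper: count running jobs per location, based on the
# HOST(S) column (last field) of "condor_q -run -nobatch" output.
# Assumption: execute slots contain an "@", submit-machine jobs do not.
count_by_location() {
    awk '$1 ~ /^[0-9]+\.[0-9]+$/ {    # only job lines (ID like 5689638.0)
             if ($NF ~ /@/) exec_jobs++; else submit_jobs++
         }
         END {
             printf "submit machine: %d\n", submit_jobs
             printf "execute machines: %d\n", exec_jobs
         }'
}

# Demonstration on the listing shown above; with a live pool one would run:
#   condor_q -run -nobatch | count_by_location
count_by_location <<'EOF'
ID OWNER SUBMITTED RUN_TIME HOST(S)
5689638.0 USER1 12/28 17:27 18+00:32:32 atlas7.atlas.local
5689642.0 USER1 12/28 17:27 18+00:32:00 atlas7.atlas.local
5689645.0 USER1 12/28 17:28 18+00:31:32 atlas7.atlas.local
5724879.0 USER2 1/14 09:15 1+08:43:47 atlas7.atlas.local
5724884.0 USER2 1/14 09:15 1+08:43:11 slot1_12@a2116.atlas.local
5724896.0 USER2 1/14 09:15 1+08:43:10 slot1_10@a2721.atlas.local
EOF
```

For the six jobs of the listing this reports four jobs on the submit machine and two on execute machines.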
Gathering even more information
To learn even more about a single job, one can use the -long option of condor_q, which lists all attributes of the job's ClassAd, e.g. condor_q -long 5724884.0 would yield a list of more than 120 attributes. Most of these values are only useful for HTCondor itself, but a few of them may be of interest to the user.
For example, there are attributes describing how many resources were requested at submit time (RequestCpus, RequestDisk and RequestMemory) and how much memory a job “currently” uses (ImageSize).
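Since the long listing prints one "Attribute = value" pair per line, the attributes of interest can be picked out by name. The following is a minimal sketch; the helper name show_resources is made up for illustration, and the sample values in the here-document are invented rather than taken from a real job.

```shell
# Hypothetical helper: keep only the resource-related attributes from
# "Attribute = value" lines as printed by condor_q -long.
show_resources() {
    grep -E '^(RequestCpus|RequestDisk|RequestMemory|ImageSize) = '
}

# Demonstration on invented sample input; with a real job one would run:
#   condor_q -long 5724884.0 | show_resources
show_resources <<'EOF'
ImageSize = 150000
JobStatus = 2
RequestCpus = 4
RequestDisk = 1000000
RequestMemory = 13000
EOF
```

The pattern is anchored on the full attribute name followed by " = ", so attributes that merely start with the same name are not matched.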
Custom output
Knowing the attribute names, one can create customized output with the -autoformat option, e.g.
condor_q -constraint 'JobStatus==2' -autoformat RequestCpus RequestMemory ImageSize
will list the requested number of CPU cores and the requested amount of memory (in MByte) along with the current image size of each job (in kByte). Which jobs are considered is determined by the given constraint; here, JobStatus==2 selects the running jobs.
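The number in the JobStatus constraint is HTCondor's numeric status code. To avoid remembering the codes, a small lookup can build the constraint from a name. This is only a sketch; the helper name status_constraint is made up for illustration, and it covers just the most common codes.

```shell
# Hypothetical helper: translate a status name into a constraint
# expression on the JobStatus attribute.
status_constraint() {
    case $1 in
        idle)      echo 'JobStatus==1' ;;
        running)   echo 'JobStatus==2' ;;
        removed)   echo 'JobStatus==3' ;;
        completed) echo 'JobStatus==4' ;;
        held)      echo 'JobStatus==5' ;;
        *)         echo "unknown status: $1" >&2; return 1 ;;
    esac
}

# Prints the constraint used in the example above
status_constraint running

# Intended use (not executed here, as it needs a working HTCondor pool):
#   condor_q -constraint "$(status_constraint idle)" \
#       -autoformat RequestCpus RequestMemory ImageSize
```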
As this list can be quite long, one can sort the output and count identical lines, e.g.
condor_q -constraint 'JobStatus==2' -autoformat RequestCpus \
RequestMemory ImageSize | \
sort --general-numeric-sort --key=3,3 --key=2,2 --key=1,1 | \
uniq --count
1 1 500 10
5 1 500 300
522 4 13000 15000
5 4 13000 100000
71 4 13000 125000
140 4 13000 150000
140 4 13000 175000
103 4 13000 200000
105 4 13000 225000
69 4 13000 250000
93 4 13000 275000
94 4 13000 300000
46 4 13000 325000
22 4 13000 350000
3 4 13000 375000
2 4 13000 400000
3 4 13000 500000
6 4 13000 1000000
5 4 13000 1250000
1 4 13000 1500000
7 4 13000 1750000
5 4 13000 2000000
3 4 13000 2250000
1 4 13000 3000000
1 4 13000 3250000
1 4 13000 3750000
1 4 13000 7500000
3 4 13000 10000000
1772 4 13000 12500000
38 4 13000 15000000
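A histogram like the one above can be condensed further, for example into the number of jobs whose image size exceeds the requested memory. The sketch below assumes the four columns shown above (count, RequestCpus, RequestMemory in MByte, ImageSize in kByte). The helper name count_over_request is made up for illustration, and ImageSize should be taken as a rough indicator only.

```shell
# Hypothetical helper: read "count RequestCpus RequestMemory ImageSize"
# lines and sum the counts of jobs whose image size exceeds their request.
# ImageSize is in kByte, RequestMemory in MByte, hence the factor 1024.
count_over_request() {
    awk '$4 / 1024 > $3 { over += $1 }
         { total += $1 }
         END { printf "%d of %d jobs exceed their memory request\n", over, total }'
}

# Demonstration on three lines taken from the output above
count_over_request <<'EOF'
    522 4 13000 15000
   1772 4 13000 12500000
     38 4 13000 15000000
EOF
```

For these three lines, only the last group (38 of 2332 jobs) lies above its request of 13000 MByte.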