Common usage of condor_q

Generic usage

By default, condor_q shows only the calling user's queued jobs on the local submit machine. To list the queued jobs of all users, use condor_q -all instead. To list all jobs currently known to Condor in the local pool, use condor_q -global -all, but please do so sparingly, as this puts considerable load on the systems. In general, it is better to use our overview page.

Detailed list of jobs

Since HTCondor version 8.6, condor_q prints only a batched summary by default. Use the -nobatch option to get one line per job, e.g.

condor_q -nobatch USER

-- Schedd: atlas9.atlas.local : <10.20.30.9:7863?... @ 03/15/18 15:23:37
 ID         OWNER            SUBMITTED     RUN_TIME ST PRI SIZE   CMD
5198708.0   USER            3/13 16:59   1+22:24:28 R  0      0.3 condor_dagman -p 0 -f -l . -Lockfile /path/to/job
5198713.0   USER            3/13 16:59   1+22:23:13 R  0   1221.0 lalinference_nest ...

To learn on which machine a job runs, add the -run option to condor_q:

condor_q -run -nob | head
-- Schedd: atlas7.atlas.local : <10.20.30.7:16305> @ 01/15/20 17:59:38
ID         OWNER            SUBMITTED     RUN_TIME HOST(S)
5689638.0   USER1          12/28 17:27  18+00:32:32 atlas7.atlas.local
5689642.0   USER1          12/28 17:27  18+00:32:00 atlas7.atlas.local
5689645.0   USER1          12/28 17:28  18+00:31:32 atlas7.atlas.local
5724879.0   USER2           1/14 09:15   1+08:43:47 atlas7.atlas.local
5724884.0   USER2           1/14 09:15   1+08:43:11 slot1_12@a2116.atlas.local
5724896.0   USER2           1/14 09:15   1+08:43:10 slot1_10@a2721.atlas.local
[...]

The first four jobs shown here are DAGMan jobs, i.e. they run on the submit machine itself, where they submit and orchestrate jobs running on execute machines. The last two jobs are running on two different execute machines, each with a run time of more than 30 hours (assuming the jobs were not killed in between).
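The RUN_TIME column uses the format days+hours:minutes:seconds. As a minimal sketch, the conversion to whole hours can be done with awk; the helper name runtime_hours is hypothetical and not part of HTCondor.

```shell
# Hypothetical helper: convert a RUN_TIME value "D+HH:MM:SS" into whole hours.
# The field separator splits on "+" and ":", so $1 = days and $2 = hours.
runtime_hours() {
    printf '%s\n' "$1" | awk -F'[+:]' '{ print $1 * 24 + $2 }'
}

# Run time of job 5724884.0 from the listing above
runtime_hours '1+08:43:11'   # prints 32
```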

Gathering even more information

To learn even more about a single job, one can use the -long option of condor_q, which lists all attributes of the job ClassAd, e.g. condor_q -long 5724884.0 would yield a list of more than 120 attributes. Most of these values are only useful to Condor itself, but a few may be of interest to the user.

For example, one learns that there are attributes related to how many resources were requested at submit time (RequestCpus, RequestDisk and RequestMemory) and how much a job “currently” uses (ImageSize).
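A minimal sketch of how one might pick just these attributes out of the long listing. The helper name show_resources is hypothetical, and the sample values are made up for illustration; on a submit machine one would pipe the real output of condor_q -long into the helper instead.

```shell
# Hypothetical helper: keep only the resource-related attributes from
# "condor_q -long" output read on stdin (lines of the form "Name = value").
show_resources() {
    grep -E '^(RequestCpus|RequestDisk|RequestMemory|ImageSize) = '
}

# Made-up sample standing in for: condor_q -long 5724884.0
sample_ad='ClusterId = 5724884
ImageSize = 150000
Owner = "USER2"
RequestCpus = 4
RequestDisk = 1000000
RequestMemory = 13000'

printf '%s\n' "$sample_ad" | show_resources
```

Real usage would then look like: condor_q -long 5724884.0 | show_resources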

Custom output

With the knowledge of used attribute names, one can create a customized output with the -autoformat option, e.g.

condor_q -constraint 'JobStatus==2' -autoformat RequestCpus RequestMemory ImageSize

will list the requested number of CPU cores and amount of memory (in MByte) along with the current image size of the job (in kByte). Only jobs matching the given constraint are considered; here JobStatus==2 selects running jobs.

As this list can be quite long, one can sort the output and count identical lines, e.g.

condor_q -constraint 'JobStatus==2' -autoformat RequestCpus \
RequestMemory ImageSize | \
sort --general-numeric-sort --key=3,3 --key=2,2 --key=1,1 | \
uniq --count

1 1 500 10
5 1 500 300
522 4 13000 15000
5 4 13000 100000
71 4 13000 125000
140 4 13000 150000
140 4 13000 175000
103 4 13000 200000
105 4 13000 225000
69 4 13000 250000
93 4 13000 275000
94 4 13000 300000
46 4 13000 325000
22 4 13000 350000
3 4 13000 375000
2 4 13000 400000
3 4 13000 500000
6 4 13000 1000000
5 4 13000 1250000
1 4 13000 1500000
7 4 13000 1750000
5 4 13000 2000000
3 4 13000 2250000
1 4 13000 3000000
1 4 13000 3250000
1 4 13000 3750000
1 4 13000 7500000
3 4 13000 10000000
1772 4 13000 12500000
38 4 13000 15000000
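
Because RequestMemory is given in MByte and ImageSize in kByte, the two columns cannot be compared directly. The following is a minimal sketch that flags summary lines whose image size exceeds the requested memory. The helper name over_request is hypothetical, and the conversion factor of 1024 kByte per MByte is an assumption. Note that the image size is only a rough indicator of the memory a job really uses.

```shell
# Hypothetical helper: read "count cpus request_mb image_kb" lines (the
# uniq --count format shown above) and print those whose image size,
# converted from kByte to MByte (assuming a factor of 1024), exceeds the request.
over_request() {
    awk '{
        image_mb = $4 / 1024
        if (image_mb > $3)
            printf "%d jobs: %d MB requested, %.0f MB image\n", $1, $3, image_mb
    }'
}

# Two lines copied from the output above; only the second one is flagged
printf '%s\n' '1772 4 13000 12500000' '38 4 13000 15000000' | over_request
```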