Typical HTCondor CommandsΒΆ

HTCondor has been configured on the local LSST cluster, and can be used to submit jobs to local resources, or to remote (XSEDE) systems.

There are several commands to be aware of, outlined below:

:condor_q

The command shows which jobs are in the queue. When specified without arguments, it shows only jobs submitted from the machine on which the command was executed. Here’s an example, run on the machine “lsst-dev”:

$ condor_q
-- Schedd: lsst-dev.ncsa.illinois.edu : <141.142.225.160:37253>

ID      OWNER            SUBMITTED     RUN_TIME ST PRI SIZE CMD
7092.0   srp             8/21 09:04   0+03:40:23 R  0   0.3  condor_dagman
7094.0   srp             8/21 09:05   0+00:43:04 I  0   97.7 matrix.sh visit=88
7095.0   srp             8/21 09:05   0+00:42:54 I  0   97.7 matrix.sh visit=88
7096.0   srp             8/21 09:05   0+00:42:54 I  0   97.7 matrix.sh visit=88

4 jobs; 0 completed, 0 removed, 3 idle, 1 running, 0 held, 0 suspended
condor_rm

You can use condor_rm to remove jobs from the queue. If you want to remove job 7096, run:

$ condor_rm 7096
Cluster 7096 has been marked for removal.
$ condor_q

-- Submitter: lsst-dev.ncsa.illinois.edu : <141.142.225.160:37253> : lsst-dev.ncsa.illinois.edu

ID      OWNER            SUBMITTED     RUN_TIME ST PRI SIZE CMD
7092.0   srp             8/21 09:04   0+03:48:26 R  0   0.3  condor_dagman
7094.0   srp             8/21 09:05   0+00:43:04 I  0   97.7 matrix.sh visit=88
7095.0   srp             8/21 09:05   0+00:42:54 I  0   97.7 matrix.sh visit=88
3 jobs; 0 completed, 0 removed, 2 idle, 1 running, 0 held, 0 suspended

Usually the condor_rm command doesn’t instantaneously remove the job from the queue; it may take several seconds for it to be removed.

To remove all the jobs you submitted, use the -all option:

$ condor_rm -all
All jobs marked for removal.
$ condor_q
-- Submitter: lsst-dev.ncsa.illinois.edu : <141.142.225.160:37253> : lsst-dev.ncsa.illinois.edu
ID      OWNER            SUBMITTED     RUN_TIME ST PRI SIZE CMD

0 jobs; 0 completed, 0 removed, 0 idle, 0 running, 0 held, 0 suspended
condor_status

The condor_status command shows the status of machines in your Condor pool.

$ condor_status
Name               OpSys      Arch   State     Activity LoadAv Mem   ActvtyTime
slot1@lsst-run1.nc LINUX      X86_64 Unclaimed Idle     0.000  1916  0+23:05:16
slot2@lsst-run1.nc LINUX      X86_64 Unclaimed Idle     0.000  1916  0+23:05:19
slot1@lsst-run2.nc LINUX      X86_64 Unclaimed Idle     0.000  1916 11+01:31:35
slot2@lsst-run2.nc LINUX      X86_64 Unclaimed Idle     0.000  1916 11+01:31:58

Total Owner Claimed Unclaimed Matched Preempting Backfill
         X86_64/LINUX     4     0       0         4       0          0        0
                Total     4     0       0         4       0          0        0

Further details on Condor, and other commands are available from the HTCondor manual.