The objective of this page is to document where things go when you submit a job on the UAF, and how to check and/or change the priorities of users.

Architecture diagram


How to figure things out

The above diagram shows that there are multiple components involved in getting a job started. Let's start here by explaining what they are.

  • You can think of "startd" as the actual batch slot that can run your job.
  • Think of the "schedd" as the queue of the batch system that you are submitting into. As you can see from the picture, there is a hierarchy of queues. Each UAF has its own local queue, and all of those local queues "forward" whatever they have queued into the queue on cmssubmit-r1.
  • The "frontend" is the input to the provisioning system. In the way we use Condor, we make a distinction between provisioning resources and scheduling jobs. To be able to schedule a job, you first have to provision a resource into the "pool". The pool is implemented on glidein-collector. To summarize:
    • the frontend watches the schedd on cmssubmit-r1 for queued jobs. If it finds any, it tries to provision a batch slot. To do so, a glidein gets submitted to each of the sites you indicated in your DESIRED_Sites statement.
    • when the glidein starts at a site, it runs a startd. That startd then calls back to the pool to announce its availability.
    • once there are resources (startd's) in the pool, the schedd can schedule whoever is first in line onto those resources.
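The pieces above show up directly in an ordinary HTCondor submit file. A minimal sketch, with a hypothetical executable name and an example site list (adapt both to your job; this is an illustration, not an official template from this page):

```
# Minimal sketch of a UAF submit file; run.sh and the site list
# are hypothetical examples.
universe                = vanilla
executable              = run.sh
output                  = job.$(Cluster).$(Process).out
error                   = job.$(Cluster).$(Process).err
log                     = job.$(Cluster).$(Process).log
should_transfer_files   = YES
when_to_transfer_output = ON_EXIT
+DESIRED_Sites          = "T2_US_UCSD,T1_US_FNAL"
queue 1
```

The +DESIRED_Sites line is the custom ClassAd attribute the frontend matches against when deciding where to submit glideins.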

To get a job started on a startd, the files that define the job thus typically need to be copied at least twice: once from the UAF to cmssubmit-r1, and a second time from there to the startd. While this copying is happening, the job may go into the "H" (hold) state. It typically recovers from that after a few minutes. This also means that if you delete the directory for a job after you submit it, your job is guaranteed to go into the "H" state, because it uses files in that directory to communicate the state it is in.

From all this, it should be obvious that getting a job started takes a few minutes or so. It thus makes no sense to have runtimes for a job of only a few minutes; you should structure your work such that execution times per job are an hour or more. You also need to make sure that the sum of all files that define your job doesn't become too large, because each job carries them with it. This includes the executable, scripts, libraries, etc., but of course not the files you read via XRootd or the like.
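To get a feel for your sandbox size before submitting, you can simply add up the files your job ships with it. A small sketch using throwaway placeholder files (the file names are hypothetical stand-ins for your executable and scripts):

```shell
# Create two placeholder input files, then sum their size the way
# you would for a real job sandbox.
mkdir -p /tmp/sandbox_check
dd if=/dev/zero of=/tmp/sandbox_check/my_exe bs=1024 count=100 2>/dev/null
dd if=/dev/zero of=/tmp/sandbox_check/run.sh bs=1024 count=4   2>/dev/null
total_kb=$(du -ck /tmp/sandbox_check/* | awk '/total$/ {print $1}')
echo "input sandbox: ${total_kb} kB"
rm -rf /tmp/sandbox_check
```

If this number runs into the hundreds of megabytes, rethink what you are shipping with each job.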

How to figure out why your job isn't running

  • Start with condor_q on the UAF from which you submitted the job.
  • If your job is in the "H" state then see below for how to understand why it is "held".
  • If your job is in the "I" state then first check if you have properly set the "DESIRED_Sites" attribute:
    • condor_q -l integer-job-ID | grep DESIRED
    • this should give you something like: DESIRED_Sites = "T2_US_UCSD,T2_US_Nebraska,T2_US_Wisconsin,T2_US_MIT,T1_US_FNAL,T2_US_Purdue" Note that the quotes are crucial. Also, any typo means that your job may never run.
  • once you have ruled that out, you can try this command:
    • condor_q -analyze integer-job-ID
    • Note that the information you get from this is often too cryptic to understand what is going on.
  • you can do condor_q -help or google for condor_q to learn more.
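Since a malformed DESIRED_Sites string is the most common culprit, here is a small sketch of the kind of sanity check you can do on the value that condor_q -l prints. The site list is the example from this page; the regex is a rough illustration of the TN_CC_Name site-naming pattern, not an official format:

```shell
# Rough sanity check of a DESIRED_Sites line as printed by condor_q -l.
line='DESIRED_Sites = "T2_US_UCSD,T2_US_Nebraska,T1_US_FNAL"'
if echo "$line" | grep -Eq '^DESIRED_Sites = "(T[0-9]_[A-Z]{2}_[A-Za-z0-9]+,?)+"$'; then
  verdict="looks well-formed"
else
  verdict="check the quotes and site names for typos"
fi
echo "$verdict"
```

This catches missing quotes and obviously malformed site names; it cannot catch a valid-looking name for a site that does not exist.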

How to figure out why your job was held

Each job has a description of its state. You can query that description using condor_q -long jobId. When Condor holds a job, it records a (more or less cryptic) reason for doing so.

E.g. a very common reason for a job being held is that your proxy is about to expire. Here's what that would look like:

condor_q -l 27600.0 | grep -i reason
ReleaseReason = undefined
HoldReasonSubCode = 0
HoldReason = "Error from Proxy about to expire"
HoldReasonCode = 4

Similarly, you can also find out details like when your proxy expires:

condor_q -l 27600.0 | grep -i x509
x509UserProxyVOName = "cms"
x509UserProxyExpiration = 1441937795

 date -d @1441937795
Thu Sep 10 19:16:35 PDT 2015
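The epoch arithmetic above can also be done directly in the shell. A small sketch using the expiration epoch from the example; the "current" epoch is a made-up reference time (exactly 72 hours earlier) so the subtraction is explicit:

```shell
# Convert the proxy expiration epoch to UTC and compute the remaining
# lifetime against a fixed, hypothetical "now".
expiry=1441937795
now=1441678595
expires_utc=$(date -u -d "@${expiry}" '+%a %b %d %H:%M:%S UTC %Y')
hours_left=$(( (expiry - now) / 3600 ))
echo "expires:    ${expires_utc}"
echo "hours left: ${hours_left}"
```

In practice you would set now=$(date +%s) instead of a fixed value.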

To avoid this particular problem, you will want to extend your proxy lifetime to 72h with "voms-proxy-init -valid 72:00".

How to query the schedd on cmssubmit-r1

  • condor_q -name cmssubmit-r1 -pool glidein-collector

Basically, any Condor command given -name and -pool arguments as above will talk to the schedd on cmssubmit-r1. Among the useful commands are:

  • condor_q -help
  • condor_q -analyze
  • condor_q -long

The -analyze and -long options are kind of heavy, so you should run them only against a single job ID. I.e., first do condor_q to figure out which job IDs you want to look at, then look at just one of them.

How to get the status of the pool

  • condor_status -pool glidein-collector

This will show you all the startd's connected to the pool at this moment. It will tell you which ones are busy and which ones are idle. An idle resource is one that can be used when a job shows up that is willing to run on it.

How to figure out the relative priority between different users that submit jobs from the UAF

You need special privileges to do this.

  • ssh condor@glidein-collector
  • condor_userprio -all

This then dumps out the priorities for different users based on the names HTCondor knows about.

You then need to figure out who is who based on the GUMS mapping to DN. The DN will have the name in them. E.g.: /DC=ch/DC=cern/OU=Organic Units/OU=Users/CN=mderdzin/CN=760843/CN=Mark Derdzinski is uscms5606

If I wanted to change the relative priority of different users on the UAF then I'd use the commands:

  • condor_userprio -setfactor <user> <factor>
  • condor_userprio -setprio <user> <prio>

This affects who gets the next free CPU among all those queued up, and willing to run on that CPU.

E.g., if Joe and Jane are both willing to run at Caltech or UCSD, then their relative priority as set here will determine who gets the first free slot at either Caltech or UCSD. If Joe insists on UCSD while Jane is OK with both, then a free slot at Caltech will go to Jane irrespective of any settings here.

How to figure out what DN corresponds to which username inside the UCSD T2 cluster

Note, with "usernames in the UCSD T2 cluster" I mean the names that GUMS maps the DN to at each of the OSG-CEs of the cluster. This username is then used to submit to HTCondor, and thus the name under which the job is known inside the cluster.

The important ones here are those mapped to /DC=ch/DC=cern/OU=computers/CN=cmspilotXY/ where XY is a 2-digit integer, e.g. 01.

E.g., as of August 28th 2015, the DN /DC=ch/DC=cern/OU=computers/CN=cmspilot01/, which is used by the glidein frontend for the UAF, is mapped on our cluster to the username cp0035. So if I want to adjust the relative priority of submissions via the UAF against submissions via CRAB3 or WMAgent, I need to change the relative priority of username cp0035.

How to modify priorities on the cluster

You need superuser privileges to do this.

  • ssh root@osg-gw-1
  • condor_userprio -all

This gives the priorities of all recently queued or running users on the cluster.

  • condor_userprio -setfactor cp0035 <factor>
  • condor_userprio -setprio cp0035 1

are the two ways of changing the priority of the user cp0035. The first sets a multiplicative factor, the second resets the absolute priority to 1, the lowest number it can be.

HTCondor will start whichever job has the lowest priority number and meets the criteria for an open slot. So setting the priority to 1 is equivalent to resetting it to the best priority it can have. Likewise, the smaller the factor, the better: its minimum value is the best priority factor you can have.

The effective priority number is prio x factor.
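A tiny sketch of that arithmetic, with made-up numbers for two hypothetical users (lower effective priority wins):

```shell
# Effective priority = real priority x priority factor; the LOWER
# number gets the next free slot. Numbers are purely illustrative.
joe_prio=50;   joe_factor=100
jane_prio=200; jane_factor=1
joe_eff=$(( joe_prio * joe_factor ))
jane_eff=$(( jane_prio * jane_factor ))
if [ "$jane_eff" -lt "$joe_eff" ]; then
  winner="Jane"
else
  winner="Joe"
fi
echo "next free slot goes to: ${winner} (${jane_eff} vs ${joe_eff})"
```

Note that even though Jane's real priority number is worse, Joe's large factor dominates, so Jane wins the slot.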

-- FkW - 2015/08/28

Topic revision: r4 - 2015/09/16 - 09:07:03 - FkW