Difference: FkwSTEP09GlideinWMS (2 vs. 3)

Revision 3, 2009/05/26 - Main.FkW

Line: 1 to 1
 
META TOPICPARENT name="FkwGlideinWMS"
Line: 18 to 18
 Both the development and production versions of the CRAB server submit to the same glideinWMS, i.e., we do not operate a development version of glideinWMS at this point for STEP09.
Added:
In the following, we go through one piece of hardware after the other and note some useful commands to figure out what's going on.

glidein-2

which condor_q
/data/glidecondor/bin/condor_q
condor_q

This lists all jobs presently known to the schedd on glidein-2. As glidein-2 is where the CRAB server lives, this means it lists all the jobs that the CRAB server has pushed into glideinWMS and that have not yet completed. A typical line looks like this:

8733.0   uscms2294       5/25 13:56   0+00:00:00 I  0   0.0  CMSSW.sh 119      
The first column is the Condor job ID. You can use it to get all the gory details about this job by doing:
condor_q -long 8733.0 >& junk.log
I redirected it into a file here because you will most likely want to look at this at your leisure, and carefully. There are a few particularly useful pieces of information in this long listing:
NumJobStarts = 0
NumRestarts = 0
If this isn't 0 or 1, then the glidein executing the job probably failed at that site once, and the job got rescheduled. This is a sign that either the site or glideinWMS is having trouble.
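
The two lookups above can be scripted. What follows is only a minimal sketch, not part of the STEP09 setup: the helper names idle_job_ids and classad_value are made up for illustration, and the column positions are assumed from the sample line shown earlier (job ID in the first column, state letter such as "I" in the sixth). Other condor_q versions may lay out their columns differently. Both helpers read text on stdin, so they can be tried on a saved listing such as junk.log.

```shell
#!/bin/sh
# Hypothetical helpers (names are illustrative, not part of Condor).

# Print the job ID (first column) of every idle job in plain condor_q output.
# Assumption: the state letter ("I" for idle) is the sixth
# whitespace-separated field, as in the sample line above.
idle_job_ids() {
    awk '$1 ~ /^[0-9]+\.[0-9]+$/ && $6 == "I" { print $1 }'
}

# Print the value of one attribute from "condor_q -long" output,
# which consists of "Name = value" lines. Only the first word of the
# value is printed, which is enough for numeric attributes.
classad_value() {
    awk -v name="$1" '$1 == name && $2 == "=" { print $3; exit }'
}

# Example use on glidein-2:
#   condor_q | idle_job_ids
#   classad_value NumJobStarts < junk.log
```

Reading from stdin rather than calling condor_q inside the helpers keeps them usable on the redirected file, which fits the habit of looking at the long listing at leisure.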
 

Other Details for STEP09

 