OSG User Tutorial
Introduction
This wiki is meant as a tutorial on submitting a job to the
Open Science Grid as part of the NBCR summer institute.
According to the OSG website, the OSG is:
"A distributed computing infrastructure for large-scale scientific research. It is built and operated by a consortium of universities, national laboratories, scientific collaborations and software developers. The OSG is supported by the National Science Foundation and the U.S. Department of Energy's Office of Science."
In practical terms, the OSG is a collection of clusters that all share the same middleware and authentication standard. OSG middleware is provided by the VDT with some OSG-specific customizations.
Requirements to submit to the OSG
- OSG-recognized certificate from which you can generate a valid proxy
- Membership in an OSG-recognized Virtual Organization (VO)
- Working OSG Client installation
- Working Condor Schedd (possibly as part of OSG Client Installation)
Sites On the OSG
Sites
Virtual Organization Resource Selector
Resources
Gratia Accounting
Setting up your environment
To start using the OSG Client software, you first need to run its setup shell script. This script sets the library and program paths along with the other required environment variables.
source /export/vdt/vdt/setup.sh
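After sourcing the setup script, it can be worth confirming that the client tools used later in this tutorial are actually on your PATH. The following is a minimal sketch and not part of the OSG Client; the function name missing_tools is made up for illustration, and the list of tools to check is whatever you pass in.

```shell
# Hypothetical helper: print each named command that is NOT found on PATH.
# Returns 0 if every command was found, 1 otherwise.
missing_tools() {
    status=0
    for tool in "$@"; do
        if ! command -v "$tool" >/dev/null 2>&1; then
            echo "$tool"
            status=1
        fi
    done
    return $status
}
```

For example, running missing_tools voms-proxy-init globusrun condor_submit condor_q globus-url-copy should print nothing if the setup script did its job.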
Creating Your Proxy
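DO NOT RUN THIS COMMAND during the tutorial. It requires your own grid certificate and is shown for reference only; use the shared proxy described under "Copying an existing proxy" instead.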
All access to the Open Science Grid, storage and computing, is through GSI x509 authentication. Using your x509 certificate a temporary proxy is created that can be used to submit jobs and access storage.
While the OSG still supports plain grid proxies, it is recommended that a VOMS proxy be used. A VOMS proxy is a standard x509 proxy with the addition of extended attributes signed by an OSG VO.
In this example I am asking for and receiving a proxy with extended attributes from the CompBioGrid VO. By using the CompBioGrid VOMS I will be identified as a CompBioGrid user on remote sites, rather than as a member of my usual CMS VO.
For example:
$ voms-proxy-init -debug --voms CompBioGrid
Detected Globus version: 22
Unspecified proxy version, settling on Globus version: 2
Number of bits in key :512
Using configuration file /data/vdt/glite/etc/vomses
Using configuration file $prefix/etc/vomses
Cannot find file or dir: $prefix/etc/vomses
Files being used:
CA certificate file: none
Trusted certificates directory : /data/vdt/globus/TRUSTED_CA
Proxy certificate file : /tmp/x509up_u583
User certificate file: /home/users/tmartin/.globus/usercert.pem
User key file: /home/users/tmartin/.globus/userkey.pem
Output to /tmp/x509up_u583
Your identity: /DC=org/DC=doegrids/OU=People/CN=Terrence Martin 525658
Enter GRID pass phrase:
Creating temporary proxy to /tmp/tmp_x509up_u583_30957 ...++++++++++++
.++++++++++++
Done
Contacting voms.compbiogrid.org:15001 [/DC=org/DC=doegrids/OU=Services/CN=fs0.vcell.uchc.edu] "CompBioGrid" Done
Creating proxy to /tmp/x509up_u583 ..++++++++++++
..++++++++++++
Done
Your proxy is valid until Thu Aug 2 22:46:21 2007
Copying an existing proxy
Because this tutorial does not assume you have a valid x509 grid certificate, I have generated a proxy that everyone can copy and use. This proxy is valid for only 12 hours.
In the example below, substitute your own UID and username as reported by the id(1) command. My UID is 518, so my copy of the proxy goes to /tmp/x509up_u518.
$ id
uid=518(tmartin) gid=518(tmartin) groups=518(tmartin)
$ cp /tmp/proxy-x509up_u583 /tmp/x509up_u518
$ chmod 600 /tmp/x509up_u518
$ chown tmartin:tmartin /tmp/x509up_u518
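The copy step above can be sketched so that the UID is filled in automatically. This is only an illustration: the function names proxy_path and install_proxy are hypothetical, and the sketch assumes the proxy lives at /tmp/x509up_u followed by your UID, as in the examples above.

```shell
# Build the proxy file name for a UID (defaults to the current user's UID).
proxy_path() {
    uid=${1:-$(id -u)}
    echo "/tmp/x509up_u${uid}"
}

# Copy a shared proxy into place and make it readable by the owner only.
# $1 = shared proxy file, $2 = destination (defaults to proxy_path).
install_proxy() {
    dest=${2:-$(proxy_path)}
    cp "$1" "$dest" && chmod 600 "$dest"
}
```

In this tutorial the call would be install_proxy /tmp/proxy-x509up_u583. Because cp creates the new file as the user who runs it, the copy is already owned by you.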
Testing Authentication
Before proceeding, test that you can authenticate to the OSG gatekeeper at the CMS T2 center.
$ globusrun -a -r osg-gw-2.t2.ucsd.edu
GRAM Authentication test successful
Creating a Condor-G Submission Script
When interacting with the OSG you will almost always be using Condor-G. Even if your VO has its own custom job submission system, in almost every case that system uses Condor-G under the hood. While it is possible to submit jobs with GRAM alone, this is very strongly discouraged because of the load it can place on the gatekeepers.
- First create a directory for your submission script
mkdir ~/osg-tutorial
cd ~/osg-tutorial
- Create a Condor-G submission script called tutorial.cmd (you can use your favorite editor) and enter the following text.
NOTE: Replace <YOURUSERNAME> with your username.
universe=grid
Grid_Resource=gt2 osg-gw-2.t2.ucsd.edu:/jobmanager-condor
executable=/home/<YOURUSERNAME>/osg-tutorial/myjob.sh
stream_output = False
stream_error = False
WhenToTransferOutput = ON_EXIT
transfer_input_files =
transfer_Output_files =
log = /tmp/<YOURUSERNAME>-osg.log
Notification = Never
+Owner = undefined
arguments=10
output = ./myjob.$(Cluster).$(Process).out
error = ./myjob.$(Cluster).$(Process).err
queue
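To avoid editing the username by hand, the submit file above can also be generated from a small shell function. This is a sketch only: write_submit_file is a hypothetical name, the file contents are copied from the listing above, and the home directory is assumed to be /home/ followed by the username, as in that listing.

```shell
# Hypothetical helper: write the tutorial's Condor-G submit file.
# $1 = username, $2 = output file (defaults to tutorial.cmd).
write_submit_file() {
    user=$1
    out=${2:-tutorial.cmd}
    # \$ keeps $(Cluster) and $(Process) literal so Condor expands them.
    cat > "$out" <<EOF
universe=grid
Grid_Resource=gt2 osg-gw-2.t2.ucsd.edu:/jobmanager-condor
executable=/home/${user}/osg-tutorial/myjob.sh
stream_output = False
stream_error = False
WhenToTransferOutput = ON_EXIT
transfer_input_files =
transfer_Output_files =
log = /tmp/${user}-osg.log
Notification = Never
+Owner = undefined
arguments=10
output = ./myjob.\$(Cluster).\$(Process).out
error = ./myjob.\$(Cluster).\$(Process).err
queue
EOF
}
```

Calling write_submit_file "$(id -un)" from inside ~/osg-tutorial would produce tutorial.cmd for the current user.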
Creating Your Job
This is a very simple job meant as an example. In your favorite editor create the file myjob.sh
#!/bin/bash
# bash rather than plain sh: the script relies on `source` and $RANDOM
# First setup the OSG environment
source $OSG_GRID/setup.sh
# Create a working directory in the local worker node
MYDIR=tmartin-$RANDOM
mkdir $OSG_WN_TMP/$MYDIR
cd $OSG_WN_TMP/$MYDIR
# Grab your application
wget http://hepuser.ucsd.edu/~tmartin/osg-tutorial/makenumbers.tar
# Setup the application
tar xvf makenumbers.tar
chmod 755 makenumbers
# Run the application
./makenumbers myrand.dat
# Create an area to stage the data out to
mkdir -p $OSG_DATA/tmartin-stageout/
# Stageout the results (remove any old results)
/bin/cp -fv myrand.dat $OSG_DATA/tmartin-stageout/
# Cleanup your temporary work area
rm -rf $OSG_WN_TMP/$MYDIR
# Exit
exit 0
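The script above removes its work area only if execution reaches the final rm. One possible variation, sketched here under assumptions, wraps the work in a helper that always cleans up, even when a step fails. The name run_in_scratch is hypothetical, and mktemp -d is used in place of $RANDOM to pick a unique directory name.

```shell
# Hypothetical helper: run a command inside a fresh scratch directory created
# under $1, then remove that directory whether or not the command succeeded.
run_in_scratch() {
    base=$1
    shift
    workdir=$(mktemp -d "$base/job.XXXXXX") || return 1
    rc=0
    # Subshell: the caller's working directory is left untouched.
    ( cd "$workdir" && "$@" ) || rc=$?
    rm -rf "$workdir"
    return $rc
}
```

Inside a job, the download, unpack, run, and stage-out steps could be collected in a shell function and invoked as run_in_scratch "$OSG_WN_TMP" that_function. The helper returns the exit status of the wrapped command.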
Submitting Your Job
condor_submit tutorial.cmd
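Because the submit file names its output myjob.$(Cluster).$(Process).out, it helps to know the cluster number your job was given. The sketch below assumes condor_submit reports it in a line of the form "1 job(s) submitted to cluster 3."; the function names cluster_id and output_file are made up for illustration.

```shell
# Read condor_submit output on stdin and print the cluster number, assuming
# a line such as "1 job(s) submitted to cluster 3." is present.
cluster_id() {
    sed -n 's/.*submitted to cluster \([0-9][0-9]*\)\..*/\1/p'
}

# Name of the standard output file for a cluster ($1) and process ($2,
# default 0), following the output = line of tutorial.cmd.
output_file() {
    echo "./myjob.$1.${2:-0}.out"
}
```

A possible use is cid=$(condor_submit tutorial.cmd | cluster_id), after which output_file "$cid" gives the file to inspect once the job finishes.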
Checking the Status of Your Job
To see whether your job is still waiting for a slot, running, or completed, check the Condor queue on the machine you submitted from.
$ condor_q
-- Submitter: pebble.ucsd.edu : <132.239.236.97:32811> : pebble.ucsd.edu
 ID    OWNER     SUBMITTED     RUN_TIME    ST PRI SIZE CMD
 3.0   tmartin   8/2 11:50     0+00:00:00  I  0   9.8  myjob.sh
1 jobs; 1 idle, 0 running, 0 held
In this case the job is still idle, waiting for a job slot and a response from the remote gatekeeper. Once the job is running you will see an R instead of an I in the ST column. Once it is complete you may briefly see a C while the job downloads its standard output and error from the remote gatekeeper.
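The status letters can be turned into a readable summary with a little shell. This is a sketch that assumes the column layout of the condor_q listing shown above, with the status in the sixth whitespace-separated field. The codes I, R, and C are the ones described in this tutorial; H (held) and X (removed) are additional Condor status codes included here for completeness. Both function names are hypothetical.

```shell
# Map a condor_q ST column value to a short description.
describe_status() {
    case "$1" in
        I) echo "idle" ;;
        R) echo "running" ;;
        C) echo "completed" ;;
        H) echo "held" ;;
        X) echo "removed" ;;
        *) echo "unknown" ;;
    esac
}

# Read condor_q output on stdin and print "ID description" for each job line.
# Only lines whose first field looks like cluster.process are treated as jobs.
summarize_queue() {
    awk '$1 ~ /^[0-9]+\.[0-9]+$/ { print $1, $6 }' | while read -r id st; do
        echo "$id $(describe_status "$st")"
    done
}
```

With the listing above, condor_q | summarize_queue would be expected to report job 3.0 as idle.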
Getting Your Output
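Once your job is complete, the end of its standard output file will contain text similar to the following. The file name includes the cluster and process numbers of your job, so use the numbers that condor_q reported for you.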
$ tail myjob.2.0.out
992
993
994
995
996
997
998
999
1000
`myrand.dat' -> `/osgfs/data/tmartin-stageout/myrand.dat'
The target of the verbose copy is the path to the file you created. With this path you can create a globus-url-copy command to retrieve the data file.
globus-url-copy -vb gsiftp://osg-gw-2.t2.ucsd.edu//osgfs/data/tmartin-stageout/myrand.dat file://localhost//`pwd`/myrand.dat
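The source URL for the transfer can be derived from the job output rather than typed by hand. The sketch below assumes the last "source -> target" line in the output file comes from the verbose cp in myjob.sh; stageout_path and gsiftp_url are hypothetical helper names.

```shell
# Print the target path from the last "src -> dst" line in file $1,
# stripping the quote characters that cp -v puts around the path.
stageout_path() {
    grep -- ' -> ' "$1" | tail -n 1 | sed -e "s/^.* -> .//" -e "s/.\$//"
}

# Build a gsiftp URL from a gatekeeper host ($1) and an absolute path ($2).
gsiftp_url() {
    printf 'gsiftp://%s/%s\n' "$1" "$2"
}
```

The result of gsiftp_url osg-gw-2.t2.ucsd.edu "$(stageout_path myjob.2.0.out)" can then be passed to globus-url-copy as the source, with the same file:// destination as in the command above.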
Glossary
- VDT - Virtual Data Toolkit
- OSG - Open Science Grid
- CE - Compute Element (OSG Gatekeeper)
- VO - Virtual Organization
- Proxy - X509 Proxy generated from a grid certificate
- VOMS - Virtual Organization Management Service
- GUMS - Grid User Management System
Authors
-- TerrenceMartin - 02 Aug 2007