March 2007 OSG Workshop Installation Notes
Introduction
This document outlines the steps for installing an OSG CE (Compute Element). It is a supplement to the official OSG CE installation documentation and is meant to streamline a few of the steps, to help you complete a basic CE installation in about 2 hours. All of the information here was obtained from the official OSG documentation, with the exception of a couple of helper scripts.
This installation does not assume you have a batch system installed, but if you do, we will install the packages required to hook the OSG CE into it.
Getting Started
The complete instructions for installing an OSG CE are available at
0.5.n CE Install Guide
Check List
- Make sure your system time is synchronized to a remote time server (quick checks for this and the next item follow this list).
- Make sure your machine's IP address has correct forward and reverse DNS resolution.
- For most setups you will probably need a mechanism to synchronize usernames between the CE and the worker nodes.
- Make sure your batch system, if you have one, is installed and working.
- Email the hostname of your CE to tmartin@ucsd.edu with subject OSG T3 Install.
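A quick sketch for spot-checking the first two items, assuming the ntpdate and host utilities are installed (my-host.some.domain is a placeholder for your CE's hostname):
# Query a remote time server without setting the clock
ntpdate -q pool.ntp.org
# Forward resolution of the CE hostname...
host my-host.some.domain
# ...then reverse resolution (substitute the IP address the previous command returned)
host x.x.x.x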
Firewall Configuration
Even though the install document does this last, we really want to get it out of the way first. If you are running a host firewall on your CE via iptables, add the following lines to your /etc/sysconfig/iptables file (assuming a RHEL-based distro), then reload the firewall as shown after the rules.
# MonaLisa
-A RH-Firewall-1-INPUT -m state --state NEW -p tcp -m tcp --dport 9000:9010 -j ACCEPT
# Globus
-A RH-Firewall-1-INPUT -m state --state NEW -p tcp -m tcp --dport 40000:50000 -j ACCEPT
# GRAM
-A RH-Firewall-1-INPUT -m state --state NEW -p tcp -m tcp --dport 2119 -j ACCEPT
# Gridftp
-A RH-Firewall-1-INPUT -m state --state NEW -p tcp -m tcp --dport 2811 -j ACCEPT
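After saving the file, reload the firewall so the new rules take effect. On a RHEL-based distro this is typically:
service iptables restart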
Creating your OSG base directories
First you need to select a base directory for your OSG CE. This directory should be kept separate from the rest of the system; I usually create a directory off the filesystem root that is unique to the OSG, e.g.:
mkdir /osglocal
cd /osglocal
Second, if you do not have a separate NFS server, you need to create a path that will be exported to all your nodes (a sample export entry follows the commands below). If you do have a remote NFS server, you should instead create a subdirectory under its exported file system.
mkdir /osgremote
cd /osgremote
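If the CE itself is the NFS server, this is the path you will eventually export to the worker nodes. A minimal sketch of an /etc/exports entry follows; the host pattern and mount options are assumptions to adapt to your site:
# /etc/exports on the NFS server (assumed example)
/osgremote *.your.domain(rw,sync,no_root_squash)
# Re-read the exports after editing the file
exportfs -ra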
Creating all the Users
The OSG CE requires at least one username for each VO that you support. This installation assumes that you will create a username for each VO currently listed in the official OSG CE installation documentation. In the interest of time I have prepared a script that takes a base directory as an argument and creates all the necessary users. The goal here is to simplify mounting this area via NFS and autofs (a sketch follows the commands below).
cd /osgremote
wget http://hepuser.ucsd.edu/~tmartin/osgce/makeusers.sh
sh ./makeusers.sh /osgremote/users
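Since these home areas will be served to the worker nodes, here is a minimal autofs sketch for the worker-node side. The map file name, mount options, and the server name my-nfs-server are assumptions to adapt to your site:
# /etc/auto.master on each worker node (assumed example)
/osgremote/users /etc/auto.osgusers
# /etc/auto.osgusers (assumed example; the wildcard entry covers every VO user)
* -rw,hard,intr my-nfs-server:/osgremote/users/&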
Downloading pacman
Download the following script into /osglocal and run it. This will install pacman and initialize your environment.
cd /osglocal
wget http://hepuser.ucsd.edu/~tmartin/osgce/pacmaninstall.sh
sh ./pacmaninstall.sh
cd pacman/pacman-3.19
source setup.sh
cd /osglocal
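A quick check that pacman is now on your PATH (not part of the official steps):
which pacman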
Creating a directory for your OSG CE packages
cd /osglocal
mkdir osgce
Installing the OSG CE
Note: Pacman may not recognize your platform. In that case you have to select whichever of its supported platforms is closest to your actual platform. The only way to find out is to try running pacman to install the CE.
cd /osglocal/osgce
pacman -get OSG:ce
If this fails due to platform identification problems, run the following to get a list of possible platforms:
pacman -platforms
Then re-run the pacman install. You have to purge the old cache first, though, as pacman apparently will not let you change platforms otherwise.
cd /osglocal/osgce
rm -rf o..pacman..o/
pacman -pretend-platform:[PLATFORM] -get OSG:ce
Note: Sometimes when pacman fails you have to clear the entire contents of the /osglocal/osgce directory, e.g. rm -rf /osglocal/osgce/*.
Get a Drink
It takes a few minutes to download and install all the packages.
Creating Additional Paths
Assuming your CE is also your NFS server for OSG_APP, you need to create an area that will be exported to the nodes. This is easiest if it is a completely different path from /osglocal/. We will use the /osgremote directory that we used for the users' home area.
OSG_APP (required)
mkdir /osgremote/osg_app
chmod 1777 /osgremote/osg_app
OSG_DATA (optional, although it can be handy and some VOs need this directory in order to run at your site)
mkdir /osgremote/osg_data
chmod 1777 /osgremote/osg_data
Mode 1777 makes these directories world-writable with the sticky bit set, so users cannot delete each other's files.
Setting up the OSG Environment
This step initializes the OSG environment so you can run OSG CE commands and configuration scripts.
cd /osglocal/osgce
source setup.sh
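A quick way to confirm the environment took effect is to echo a couple of the variables the setup script defines:
echo $VDT_LOCATION     # should print /osglocal/osgce
echo $GLOBUS_LOCATION  # should print /osglocal/osgce/globus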
Installing Packages for the Batch System
Depending on your batch system, select the appropriate command below.
If you are using an existing installation of Condor, you need to tell the installer where to find it:
export VDTSETUP_CONDOR_LOCATION=/yourcondorelease/
export VDTSETUP_CONDOR_CONFIG=${VDTSETUP_CONDOR_LOCATION}/etc/condor_config
pacman -get OSG:Globus-Condor-Setup
or
pacman -get OSG:Globus-PBS-Setup
or
pacman -get OSG:Globus-LSF-Setup
or
pacman -get OSG:Globus-SGE-Setup
Configuring the Public Key Infrastructure
/osglocal/osgce/vdt/setup/setup-cert-request
Then hit q
Requesting and installing a host certificate
Each OSG CE requires a host certificate signed by a trusted Certificate Authority. The following are the steps to acquire a signed host cert for your CE.
cd /osglocal/osgce
source setup.sh
mkdir /osglocal/hostcerts
cd /osglocal/hostcerts
cert-request -ou s -dir . -label _fully-qualified-hostname_
Once this is complete you should receive an email notifying you that your cert has been approved. Retrieve your cert as follows.
cert-retrieve -certnum 0xXXX -dir . -label my-host
mkdir -p /etc/grid-security/
mv ./usercert.pem /etc/grid-security/hostcert.pem
mv ./userkey.pem /etc/grid-security/hostkey.pem
chmod 444 /etc/grid-security/hostcert.pem
chmod 400 /etc/grid-security/hostkey.pem
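A quick sanity check of the installed certificate using openssl (not part of the official steps; it just prints the subject and validity dates):
openssl x509 -in /etc/grid-security/hostcert.pem -noout -subject -dates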
You also need an LDAP service cert:
mkdir /osglocal/hostcerts/ldap
cd /osglocal/hostcerts/ldap
cert-request -ou s \
-dir . \
-host my-host.some.domain \
-service ldap \
-label my-host-ldap
cert-retrieve -certnum 0xXXXX -dir . -label my-host-ldap
mkdir /etc/grid-security/ldap
mv ./usercert.pem /etc/grid-security/ldap/ldapcert.pem
mv ./userkey.pem /etc/grid-security/ldap/ldapkey.pem
chmod 444 /etc/grid-security/ldap/ldapcert.pem
chmod 400 /etc/grid-security/ldap/ldapkey.pem
chown -R daemon.daemon /etc/grid-security/ldap
Finally, we need to install the HTTP service cert.
mkdir /osglocal/hostcerts/http
cd /osglocal/hostcerts/http
cert-request -ou s \
-dir . \
-host my-host.some.domain \
-service http \
-label my-host-http
cert-retrieve -certnum 0xXXXX -dir . -label my-host-http
mkdir /etc/grid-security/http
mv ./usercert.pem /etc/grid-security/http/httpcert.pem
mv ./userkey.pem /etc/grid-security/http/httpkey.pem
chmod 444 /etc/grid-security/http/httpcert.pem
chmod 400 /etc/grid-security/http/httpkey.pem
chown -R daemon.daemon /etc/grid-security/http
Starting the grid services
The OSG CE services must be started before the configuration step in OSG 0.6.0.
vdt-register-service --name MLD --type init --enable
vdt-control --on
Example output:
root@ppe-ce /osglocal/osgce# vdt-control --on
enabling cron service fetch-crl... no crontab for root
ok
enabling cron service vdt-rotate-logs... ok
skipping init service 'gris' -- marked as disabled
enabling inetd service globus-gatekeeper... ok
enabling inetd service gsiftp... ok
enabling init service mysql... ok
enabling init service globus-ws... FAILED! (see vdt-install.log)
skipping cron service 'edg-mkgridmap' -- marked as disabled
enabling cron service gums-host-cron... ok
skipping init service 'MLD' -- marked as disabled
enabling init service apache... ok
enabling init service tomcat-5... ok
enabling cron service gratia-condor... ok
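If you want to review afterwards which services ended up enabled or disabled, vdt-control can list them (assuming your VDT version provides this option; see vdt-control --help otherwise):
vdt-control --list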
Configuring OSG Attributes
cd /osglocal/osgce
./monitoring/configure-osg.sh
A latitude/longitude finder, useful for the location questions below:
http://www.satsig.net/maps/lat-long-finder.htm
***********************************************************************
################# Configuration for the OSG CE Node ###################
***********************************************************************
This script collects the necessary information required by the various
monitoring and discovery systems for operating in the OSG.
A definition of the attributes that you will have to enter below is in:
http://osg.ivdgl.org/twiki/bin/view/Integration/LocalStorageRequirements
Instructions on how to use this script are in:
http://osg.ivdgl.org/twiki/bin/view/Integration/LocalStorageConfiguration
Your CE may not provide some of the CE-Storages (DATA, SITE_READ, SITE_WRITE,
DEFAULT_SE). In those instances, the value to enter is UNAVAILABLE
At any time, you can <CNTL-C> out of the script and no updates will be applied.
Preset information you are not prompted for
--------------------------------------------
These variables are preset at installation and cannot be changed:
OSG location
Globus location
User-VO map file
gridftp.log location
Information about your site in general
--------------------------------------
Group: The monitoring group your site is participating in.
- for the integration testbed, use OSG-ITB.
- for production, use OSG.
Hostname: The hostname by which you want this node to be identified.
It is used in setting the jobmanager contact identification as in
ppe-ce.ucsd.edu/jobmanager-blah.
Site name: The name by which the monitoring infrastructure
will refer to this resource.
Sponsors: The VO sponsors for your site.
For example: usatlas, ivdgl, ligo, uscms, sdss...
You must express the percentage of sponsorship using
the following notation.
myvo:50 yourvo:10 anothervo:20 local:20
Policy URL: This is the URL for the document describing the usage policy /
agreement for this resource
Specify your OSG GROUP [OSG-ITB]:
Specify your OSG HOSTNAME [ppe-ce.ucsd.edu]:
Specify your OSG SITE NAME [UCSDPPETest]:
Specify your VO sponsors [cms:75 cdf:25]:
Specify your policy url [http://tier2.ucsd.edu]:
Information about your site administrator
-------------------------------------------
Contact name: The site administrator's full name.
Contact email: The site administrator's email address.
Specify a contact for your server (full name) [Terrence Martin]:
Specify the contact's email address [tmartin@ucsd.edu]:
Information about your server's location
----------------------------------------
City: The city your server is located in or near.
Country: The country your server is located in.
Longitude/Latitude: For your city. This will determine your placement on any
world maps used for monitoring. You can find some approximate values
for your geographic location from:
http://geotags.com/
or you can search your location on Google
For USA: LAT is about 29 (South) ... 48 (North)
LONG is about -123 (West coast) ... -71 (East coast)
Specify your server's city [La Jolla]:
Specify your server's country [USA]:
Specify your server's longitude [-117.0703]:
Specify your server's latitude [32.5468]:
Information about the available storage on your server
------------------------------------------------------
GRID: Location where the OSG WN Client (wn-client.pacman) has
been installed.
APP: Typically used to store the applications which will run on
this gatekeeper. As a rule of thumb, the OSG APP should be on
- dedicated partition
- size: at least 10 GB.
DATA: Typically used to hold output from jobs while it is staged out to a
Storage Element.
- dedicated partition
- size: at least 2 GB times the maximum number of simultaneously
running jobs that your cluster's batch system can support.
WN_TMP: Used to hold input and output from jobs on a worker node where the
application is executing.
- local partition
- size: at least 2 GB
SITE_READ: Used to stage-in input for jobs using a Storage Element or for
persistent storage between jobs. It may be the mount point of a
dCache SE accessed read-only using dcap.
SITE_WRITE: Used to store to a Storage Element output from jobs or for
persistent storage between jobs. It may be the mount point of a
dCache SE accessed write-only using dcap.
Specify your OSG GRID path [/osglocal/osgce]:
Specify your OSG APP path [/osgremote/osg_app]:
Specify your OSG DATA path [/osgremote/osg_data]:
Specify your OSG WN_TMP path [/tmp]:
Specify your OSG SITE_READ path [UNAVAILABLE]:
Specify your OSG SITE_WRITE path [UNAVAILABLE]:
Information about the Storage Element available from your server
----------------------------------------------------------------
A storage element exists for this node.
This is the Storage Element (SE) that is visible from all the nodes of this
server (CE). It may be a SE local or close to the CE that is preferred as
destination SE if the job does not have other preferences.
Is a storage element (SE) available [y] (y/n):
Specify your default SE [ppe-ce.ucsd.edu]:
Information needed for the MonALISA monitoring.
-----------------------------------------------
MonALISA services are being used.
If you do not intend to run MonALISA for monitoring purposes, you can
skip this section.
Ganglia host: The host machine ganglia is running on.
Ganglia port: The host machine's port ganglia is using.
VO Modules: (y or n) If 'y', this will activate the VO Modules module
in the MonALISA configuration file.
Would you like to start the MonALISA monitoring services [y] (y/n):
Are you using Ganglia [n] (y/n):
Do you want to run the OSG VO Modules [y] (y/n):
Information needed for the squid caching.
-----------------------------------------------
squid services are being used.
If you do not intend to run squid for web caching purposes, you can
skip this section.
Would you like to use the squid caching service [y] (y/n): n
Information about the batch queue manager used on your server
-------------------------------------------------------------
The supported batch managers are:
condor pbs fbs lsf sge
For condor: The CONDOR_CONFIG variable value is needed.
For sge: The SGE_ROOT variable value is needed
Specify your batch queue manager OSG_JOB_MANAGER [condor]:
Specify installation directory for condor [/osglocal/condor]:
Specify the Condor config location [/etc/condor/condor_config]:
Are you using the ManagedFork service [n] (y/n):
##### ##### ##### ##### ##### ##### ##### #####
Please review the information below:
***********************************************************************
################# Configuration for the OSG CE Node ###################
***********************************************************************
Preset information you are not prompted for
--------------------------------------------
OSG location: /osglocal/osgce
Globus location: /osglocal/osgce/globus
User-VO map file: /osglocal/osgce/monitoring/grid3-user-vo-map.txt
gridftp.log file: /osglocal/osgce/globus/var/gridftp.log
Information about your site in general
--------------------------------------
Group: OSG-ITB
Hostname: ppe-ce.ucsd.edu
Site name: UCSDPPETest
Sponsors: cms:75 cdf:25
Policy URL: http://tier2.ucsd.edu
Information about your site administrator
-------------------------------------------
Contact name: Terrence Martin
Contact email: tmartin@ucsd.edu
Information about your server's location
----------------------------------------
City: La Jolla
Country: USA
Longitude: -117.0703
Latitude: 32.5468
Information about the available storage on your server
------------------------------------------------------
WN client: /osglocal/osgce
Directories:
Application: /osgremote/osg_app
Data: /osgremote/osg_data
WN tmp: /tmp
Site read: UNAVAILABLE
Site write: UNAVAILABLE
Information about the Storage Element available from your server
----------------------------------------------------------------
A storage element exists for this node.
Storage Element: ppe-ce.ucsd.edu
Information needed for the MonALISA monitoring.
-----------------------------------------------
MonALISA services are being used.
Ganglia host: UNAVAILABLE
Ganglia port: UNAVAILABLE
VO Modules: y
Information needed for the squid caching.
-----------------------------------------------
squid services are NOT being used.
Squid host: UNAVAILABLE
Squid caching policy:
Squid disk cache size:
Information about the batch queue manager used on your server
-------------------------------------------------------------
Batch queue: condor
Job queue: ppe-ce.ucsd.edu/jobmanager-condor
Utility queue: ppe-ce.ucsd.edu/jobmanager
Condor location: /osglocal/condor
Condor config: /etc/condor/condor_config
PBS location:
FBS location:
SGE location:
SGE_ROOT:
LSF location:
Is ManagedFork being used? n
##################################################
##################################################
Is the above information correct (y/n)?: y
##-----------------------------------------##
Updating /osglocal/osgce/monitoring/osg-attributes.conf file now.
... creating new /osglocal/osgce/monitoring/osg-attributes.conf
... previous file saved as /osglocal/osgce/monitoring/osg-attributes.conf.osgsave.2
DONE
##-----------------------------------------##
Creating /osglocal/osgce/monitoring/osg-job-environment.conf file now.
... creating new /osglocal/osgce/monitoring/osg-job-environment.conf
DONE
##-----------------------------------------##
Checking for grid3-locations.txt file now.
... already exists
-rw-rw-rw- 1 root root 383 Mar 6 10:28 /osgremote/osg_app/etc/grid3-locations.txt
... no need to copy it again
DONE
##-----------------------------------------##
Configuring MonALISA now.
... MonALISA service are being used.
... executing configure_monalisa script as
/osglocal/osgce/vdt/setup/configure_monalisa --server y --ganglia-used n --vdt-install /osglocal/osgce --user daemon --farm "UCSDPPETest" --monitor-group "OSG-ITB" --contact-name "Terrence Martin" --contact-email "tmartin@ucsd.edu" --city "La Jolla" --country "USA" --latitude "32.5468" --longitude "-117.0703" --vo-modules "y" --globus-location "/osglocal/osgce/globus" --condor-location "/osglocal/condor" --condor-config "/etc/condor/condor_config" --pbs-location "" --lsf-location "" --fbs-location "" --sge-location "" --auto-update n
... MonALISA should NOT be running
... /etc/init.d/MLD should not exist.
DONE
##-----------------------------------------##
Configuring GIP now.
...executing configure-osg-gip.sh
Information status of GUMS Service
----------------------------------
Would you like to publish the status of the GUMS server that you may have
configured? More information about how to properly setup monitoring can be
found at the following URL.
- http://vdt.cs.wisc.edu/releases/1.6.0/notes/GUMS.html
Do you want to publish your gums status through GIP (Y/n): [y]
Information about a possible SRM storage element
------------------------------------------------
If an SRM (Storage Resource Management) Storage Element exists that you would
like to associate with this Compute Element, please answer 'Y'
Do you want to publish your SRM information through GIP (Y/n): [n]
writing configuration files...
Configuring GIP...
WARNING: VO list file /osglocal/osgce/monitoring/osg-user-vo-map.txt not
found.
... executing configure_gip script as
/osglocal/osgce/vdt/setup/configure_gip
WARNING: VO list file /osglocal/osgce/monitoring/osg-user-vo-map.txt not
found.
DONE
... squid service NOT being used.
##-----------------------------------------##
Configuring squid now.
Squid not being used, skipping vdt configure invocation.
##-----------------------------------------##
Configuring CEMon now.
Configuring CEMon to subscribe to ITB data consumers
Executing configure_cemon as: /osglocal/osgce/vdt/setup/configure_cemon --server y --consumer=https://osg-ress-4.fnal.gov:8443/ig/services/CEInfoCollector --topic=OSG_CE --dialect=OLD_CLASSAD
The following consumer subscription has been installed:
HOST: https://osg-ress-4.fnal.gov:8443/ig/services/CEInfoCollector
TOPIC: OSG_CE
DIALECT: OLD_CLASSAD
Executing configure_cemon as: /osglocal/osgce/vdt/setup/configure_cemon --server y --consumer=http://is-itb.grid.iu.edu:14001 --topic=OSG_CE --dialect=RAW
The following consumer subscription has been installed:
HOST: http://is-itb.grid.iu.edu:14001
TOPIC: OSG_CE
DIALECT: RAW
DONE
##-----------------------------------------##
Configuring Gratia now.
Configuring Gratia to report to ITB server
Executing configure_gratia as: /osglocal/osgce/vdt/setup/configure_gratia --probe-cron --site-name UCSDPPETest --probe condor --report-to gratia-osg.fnal.gov:8881
Enabling gratia using: /osglocal/osgce/vdt/sbin/vdt-control --on gratia-condor
enabling cron service gratia-condor... ok
DONE
*** configure-osg.sh completed ***
Gridmap File Authorization
This installation assumes gridmap-file authorization. GUMS authorization is outside the scope of this document.
mkdir -p /etc/grid-security
cd /osglocal/osgce
source setup.sh
./vdt/setup/configure_edg_make_gridmap
No output is good!
Note: We run edg-mkgridmap manually (see the sketch below) so we do not have to wait for the cron job to populate the gridmap file.
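A minimal sketch of the manual run; the edg-mkgridmap path and the --conf/--output options below follow the customary VDT layout, but treat them as assumptions and verify them against your install (run as root):
cd /osglocal/osgce
source setup.sh
# Paths below are the usual VDT locations; check them in your install
$VDT_LOCATION/edg/sbin/edg-mkgridmap \
    --conf=$VDT_LOCATION/edg/etc/edg-mkgridmap.conf \
    --output=/etc/grid-security/grid-mapfile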
Monitoring Setup
This should have been done when you ran ./configure-osg.sh.
CEMon
CEMon is installed and it is configured when you run ./configure-osg.sh
Site Verification
Although outside the scope of the install, the official site verification script can be run as follows.
su - myuser
cd /osglocal/osgce
source ./setup.sh
grid-proxy-init
cd verify
./site_verify.pl
Worker Node Client Install
Note: Do this at the end
There are two choices here. The first is installing wn-client on the file server you use to serve OSG_APP to the cluster. Not all sites have a file server separate from the CE, although this is recommended. If you install the wn-client on the CE itself, you need to log out and log back in afterwards to clean up your environment.
cd /osglocal/pacman/pacman-3.19
source setup.sh
mkdir -p /osgremote/wn-client
cd /osgremote/wn-client
VDTSETUP_AGREE_TO_LICENSES=y
export VDTSETUP_AGREE_TO_LICENSES
VDTSETUP_INSTALL_CERTS=l
export VDTSETUP_INSTALL_CERTS
VDTSETUP_EDG_CRL_UPDATE=y
export VDTSETUP_EDG_CRL_UPDATE
VDTSETUP_ENABLE_ROTATE=n
export VDTSETUP_ENABLE_ROTATE
pacman -trust-all-caches -get OSG:wn-client
If you needed to use -pretend-platform before, please add that option to the OSG:wn-client install.
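For example, combining the commands above (where [PLATFORM] is the value you selected earlier from pacman -platforms):
cd /osgremote/wn-client
pacman -trust-all-caches -pretend-platform:[PLATFORM] -get OSG:wn-client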
--
TerrenceMartin - 01 Mar 2007