Installation Walkthrough and Notes on the GlideinWMS System

DISCLAIMER: Any information you read on this page is a lie. Actually, it isn't but you should treat it as such. These are my personal notes so YMMV! I do not want to be held responsible for your destruction of the world. Having said all of that, enjoy! :)

STEP 1: Obtain, install, and update a Linux distribution

I used Scientific Linux 5.3 from: http://www.scientificlinux.org/

You can also use Red Hat Enterprise Linux 5.3 or CentOS 5.3. My installation was on two VMware Workstation 6.5.3 virtual machines named SL#1 and SL#2, each with 512 MiB of RAM and a hard disk of up to 16 GiB. GlideinWMS does not require a 64-bit system: you can use a 32-bit system and all of the software will be compatible. The reverse is not necessarily true, but I have not confirmed this yet. It's also a good idea to update the system with the latest recommended patches and security updates before continuing any further.

I also reconfigured /etc/ssh/sshd_config to use a different port and set the VMs to acquire an IP address from my router, so that they are reachable remotely through SSH. I had to change /etc/hosts to list the FQDNs, aliases, and local IP addresses of my systems. After switching to a static IP in the GUI, I also had to edit /etc/resolv.conf to set up DNS, since it was no longer being configured by DHCP.
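A quick way to confirm that /etc/hosts really lists each machine is to search it for the FQDN and the alias. The helper below is only a sketch: the function name hosts_has_name is made up for this page, and the example names (sl1.mydomain.com, sl1) are placeholders for whatever your machines are called.

```shell
# Succeed if NAME appears as a hostname or alias (not as an IP address)
# on a non-comment line of the hosts file. Defaults to /etc/hosts.
hosts_has_name() {
    name=$1
    hosts_file=${2:-/etc/hosts}
    awk -v name="$name" '
        { sub(/#.*/, "") }   # strip comments; awk re-splits the fields
        { for (i = 2; i <= NF; i++) if ($i == name) found = 1 }
        END { exit found ? 0 : 1 }
    ' "$hosts_file"
}

# Example (placeholder names):
#   hosts_has_name sl1.mydomain.com && hosts_has_name sl1 && echo "SL#1 is listed"
```

Field 1 of each line is skipped on purpose, so an IP address is never mistaken for a hostname.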

STEP 2: Install Condor

Option 1: Install from Pre-compiled Binary / RPM

This is the typical way most users will work with Condor. It is the fastest and easiest way to get it up and running.

Use a web browser to download the file "condor-7.3.2-linux-x86_64-rhel5-dynamic-unstripped.tar.gz" from http://www.cs.wisc.edu/condor/downloads-v2/download.pl

Copy the downloaded file (the tarball, or the RPM if you chose that instead) to the servers you wish to deploy it on.

Let's start installing it:

su - root
groupadd -g 5000 condor
useradd -c "Condor Daemon" -g 5000 -m -s /bin/bash -u 5000 condor

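The user's home directory must be world readable. For a directory that also means searchable (the x bit), otherwise other users cannot reach the files inside it.

chmod 755 /home/condor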

We'll be installing as root since the documentation recommends this procedure. Now, move the tarball we downloaded into the condor directory.

mv condor-7.3.2-linux-x86_64-rhel5-dynamic-unstripped.tar.gz /home/condor
cd /home/condor

Since we got the tarball, we need to decompress it.

gunzip -c condor-7.3.2-linux-x86_64-rhel5-dynamic-unstripped.tar.gz | tar xvf -

[OPTIONAL & UNTESTED] If we got the RPM, the above command might look something like this instead:

rpm -ivh condor-7.3.2-linux-x86_64-rhel5-1.x86_64.rpm --prefix=/home/condor

Now, let's install Condor as the central manager, since this is the first machine. We must run condor_configure (invoked below as condor_install) as root so that it can install the files and set their permissions correctly.

cd condor-7.3.2
./condor_install --verbose --prefix=/home/condor --local-dir=/home/condor/local --type=manager,execute --owner=condor

Ok, now that the Central Manager is installed, we should repeat the above installation on our other nodes which will run Condor. However, we modify the installation command slightly for execute and submit systems:

./condor_install --verbose --prefix=/home/condor --local-dir=/home/condor/local --type=submit,execute --owner=condor

If you wanted all three (not recommended for production or large systems), it might look something like this:

./condor_install --verbose --prefix=/home/condor --local-dir=/home/condor/local --type=manager,submit,execute --owner=condor

Just for good measure, make sure the owner is correct on all condor files:

chown -R condor:condor /home/condor
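To double-check the result of the chown, find can list anything that is still owned by someone else. This is a small sketch; the function name not_owned_by is my own and is not part of Condor.

```shell
# List everything under DIR that is not owned by OWNER.
# OWNER may be a user name or a numeric UID. No output means all is well.
not_owned_by() {
    owner=$1
    dir=$2
    find "$dir" ! -user "$owner" -print
}

# Example, run as root after the chown above:
#   not_owned_by condor /home/condor
```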

CONDOR IS NOW INSTALLED! Skip to the next section for configuration.

Option 2: Install from Source

WARNING! The instructions to install from source are not complete yet.

Since we're setting up a development system, we'll install the latest bleeding edge version which is currently 7.3.2. The installation manual for Condor 7.3.2 can be located here: http://www.cs.wisc.edu/condor/manual/v7.3/3_2Installation.html

The condor manual asks some preparation questions. Here are my answers:

  1. What machine will be the central manager? SL#1
  2. What machines should be allowed to submit jobs? SL#1, SL#2
  3. Will Condor run as root or not? Yes, since the installation documentation recommends it.
  4. Who will be administering Condor on the machines in your pool? I will! 8D
  5. Will you have a Unix user named condor and will its home directory be shared? Yes, but it will not be shared.
  6. Where should the machine-specific directories for Condor go? They can live with the rest of the condor application inside the condor daemon's directory.
  7. Where should the parts of the Condor system be installed? Inside the condor directory.
    • Configuration file: maybe in /condor/config
    • Release directory: maybe in /condor/release
      • user binaries: maybe in /condor/bin
      • system binaries: maybe in /condor/sbin
      • lib directory: maybe in /condor/lib
      • etc directory: maybe in /condor/etc
    • Documentation: maybe in /condor/doc
  8. Am I using AFS? Nope
  9. Do I have enough disk space for Condor? Yah, 50 MiB, right?

Ok, that's it for pre-installation questions. We'll be installing Condor on SL#1 first to set up the central information repository.

Let's start installing it:

su - root
groupadd -g 5000 condor
useradd -c "Condor Daemon" -g 5000 -m -s /bin/bash -u 5000 condor

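The user's home directory must be world readable. For a directory that also means searchable (the x bit), otherwise other users cannot reach the files inside it.

chmod 755 /home/condor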
su - condor

Ok, the Condor download site is a little goofy, which prevents me from just using wget. Just download the source tarball "condor_src-7.3.2-all-all.tar.gz" and move it into /home/condor/ on SL#1 however you like. We're going to compile from source instead of using a binary, which gives us greater flexibility for the installation.

gunzip -c condor_src-7.3.2-all-all.tar.gz | tar xvf -
cd condor-7.3.2

Let's read the file README.building in this directory. It seems that we need some packages pre-installed. Luckily, SL 5.3 already has everything I need. I checked using yum list <package name> as the root user.

Ok, now let's start compiling.

cd src
./build_init

FAILURE. O M G. It was worth a try.

Required tools are present and valid, attempting to initialize build
configure.ac:2547: /usr/bin/m4: builtin `mkstemp' requested by frozen file is not supported
autom4te: /usr/bin/m4 failed with exit status: 1
autoheader: /usr/bin/autom4te failed with exit status: 1
Failed to initialize build, check errors and try again once fixed

Let's update m4 from source. Log in as root again in another window.

wget ftp://ftp.scientificlinux.org/linux/scientific/5x/SRPMS/SL/m4-1.4.8-1.src.rpm

rpm -ivh m4-1.4.8-1.src.rpm

rpmbuild -bb /usr/src/redhat/SPECS/m4.spec

My installation is 64-bit, so install this:

rpm -ivhF /usr/src/redhat/RPMS/x86_64/m4-1.4.8-1.x86_64.rpm

The 32-bit equivalent would be:

rpm -ivhF /usr/src/redhat/RPMS/i386/m4-1.4.8-1.i386.rpm

Let's try running the build initialization tool again:

./build_init

[condor@localhost src]$ ./build_init
Checking for version of autoheader >= 2.59...succeeded. (2.59)
Checking for version of autoconf >= 2.59...succeeded. (2.59)
Required tools are present and valid, attempting to initialize build
Build initialized, you can now run "./configure; make"

SUCCESS!!! Let's enable as many optional features as possible. NOTE: --prefix doesn't work yet.

./configure --enable-full-port --enable-soft-is-hard --enable-job-hooks --enable-hibernation --enable-ssh-to-job --with-buildid

make

make release

...TO BE CONTINUED

STEP 3: Configure Condor

Now that Condor is installed, it must be configured. At the end of the installation, this message is displayed:

In order for Condor to work properly you must set your CONDOR_CONFIG environment variable to point to your Condor configuration file:
/home/condor/etc/condor_config before running Condor commands/daemons. Created scripts which can be sourced by users to setup their
Condor environment variables. These are:
sh: /home/condor/condor.sh
csh: /home/condor/condor.csh

This means the CONDOR_CONFIG environment variable must be set every time you log in as the condor user.

I just added this line to /home/condor/.bash_profile:

. ~/condor.sh

Let's log in as the condor user:

su - condor
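Before going further, it may be worth confirming that the login shell actually picked up CONDOR_CONFIG from condor.sh. A minimal sketch (the function name check_condor_config is hypothetical, not a Condor tool):

```shell
# Verify that CONDOR_CONFIG is set and points to a readable file.
check_condor_config() {
    if [ -z "${CONDOR_CONFIG:-}" ]; then
        echo "CONDOR_CONFIG is not set" >&2
        return 1
    fi
    if [ ! -r "$CONDOR_CONFIG" ]; then
        echo "CONDOR_CONFIG points to an unreadable file: $CONDOR_CONFIG" >&2
        return 1
    fi
    echo "Using $CONDOR_CONFIG"
}

# Example, as the condor user:
#   check_condor_config
```

If the first message appears, the .bash_profile line was probably not read, for example because the shell was not started as a login shell.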

Now, we must manually edit the condor configuration file:

vi $CONDOR_CONFIG

So, according to this file, we must fill out Part 1 in order for condor to work. Here are my answers for the SL1 virtual machine:

LOCAL_DIR = /home/condor/local
CONDOR_ADMIN = me@mydomain.com
UID_DOMAIN = $(FULL_HOSTNAME)
FILESYSTEM_DOMAIN = $(FULL_HOSTNAME)
COLLECTOR_NAME = ChrisB Pool

There is also a required field to edit in Part 2 despite the file stating that Part 2 is optional (oh well):

HOSTALLOW_WRITE = *.mydomain.com

Let's also edit the local configuration for each machine:

vi /home/condor/local/condor_config.local

Here are my changes:

CONDOR_ADMIN = me@mydomain.com

UID_DOMAIN = $(FULL_HOSTNAME)

FILESYSTEM_DOMAIN = $(FULL_HOSTNAME)
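Since the same settings now live in two files, it helps to see which value wins. The sketch below reads plain "KEY = value" lines and prints the last assignment it finds, on the assumption that the local file is read after the global one. The function name show_setting is hypothetical, and it does not expand macros such as $(FULL_HOSTNAME). Once Condor is on the PATH, condor_config_val is the proper tool for this.

```shell
# Print the last value assigned to KEY across the given config files.
# Files are read in order, so list the global file first and the local one last.
show_setting() {
    key=$1
    shift
    awk -v key="$key" '
        /^[ \t]*#/ { next }                      # skip comment lines
        {
            eq = index($0, "=")
            if (eq == 0) next                    # not an assignment
            name = substr($0, 1, eq - 1)
            gsub(/^[ \t]+|[ \t]+$/, "", name)    # trim the key
            if (name == key) {
                value = substr($0, eq + 1)
                gsub(/^[ \t]+|[ \t]+$/, "", value)
                found = 1
            }
        }
        END { if (found) print value; else exit 1 }
    ' "$@"
}

# Example:
#   show_setting UID_DOMAIN "$CONDOR_CONFIG" /home/condor/local/condor_config.local
```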

STEP 4: Start Condor

As the condor user, let's start the condor daemon:

su - condor

condor_master

STEP 5: Install GlideinWMS

Part 1: Check and Install Prerequisite Software

First, we need to verify all the required software is loaded on our system.

  • A reasonably recent Linux OS (RH/SL4 and RH/SL5 tested at press time).
cat /etc/redhat-release

Scientific Linux SL release 5.3 (Boron)

  • The Python interpreter (v2.3.4 or above)

python -V
Python 2.4.3

  • The perl-Time-HiRes rpm.
perl -e 'use Time::HiRes;'
# NO ERROR - MEANS THAT IT IS INSTALLED
  • The OSG client software.
    • Installed from the GlideinWMS installer by selecting "Components" and then "OSG VDT client".
  • A HTTP server, like Apache or TUX.
    • Installed from the GlideinWMS installer by selecting "Components" and then "Web server".
  • The Condor distribution.
    • This is NOT (?) required if you install: pool Collector AND User Schedd
    • Installed from the GlideinWMS installer by selecting "Components" and then "Base Condor installation".
  • The RRDTool package (v1.2.18 or later)
    • The GlideinWMS installer tells us what to install by selecting "Components" and then "rrdtool graphics package".
    • Since no installation takes place, you have to go find the files on the web. The website that lists the installation RPMS points to broken links. I found the RPMs for my 64-bit installation here:
    • http://apt.sw.be/redhat/el5/en/x86_64/rpmforge/RPMS/
yum install ruby

# This is installed with --nodeps because it has a cross-dependency with rrdtool
rpm -ivh --nodeps perl-rrdtool-1.3.8-2.el5.rf.x86_64.rpm
rpm -ivh rrdtool-1.3.8-2.el5.rf.x86_64.rpm
rpm -ivh rrdtool-devel-1.3.8-2.el5.rf.x86_64.rpm
rpm -ivh python-rrdtool-1.3.8-2.el5.rf.x86_64.rpm
  • The M2Crypto Python module.
python -c "import M2Crypto"
# NO ERROR - MEANS THAT IT IS INSTALLED
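The version checks above (Python v2.3.4 or above, RRDTool v1.2.18 or later) can also be done by a script instead of by eye. The helper below is a sketch that compares purely numeric dotted versions. The name version_ge is my own, and it avoids sort -V because I am not sure the coreutils on SL 5 supports it.

```shell
# Return success when dotted version $1 is greater than or equal to $2.
# Only numeric components are supported; missing components count as 0.
version_ge() {
    have=$1
    want=$2
    while [ -n "$have" ] || [ -n "$want" ]; do
        h=${have%%.*}
        w=${want%%.*}
        # Drop the component just read; an exhausted string becomes empty.
        case $have in *.*) have=${have#*.} ;; *) have= ;; esac
        case $want in *.*) want=${want#*.} ;; *) want= ;; esac
        h=${h:-0}
        w=${w:-0}
        [ "$h" -gt "$w" ] && return 0
        [ "$h" -lt "$w" ] && return 1
    done
    return 0
}

# Examples (Python 2 prints its version on stderr, hence 2>&1):
#   py=$(python -V 2>&1 | awk '{print $2}')
#   version_ge "$py" 2.3.4 && echo "Python OK"
#   rrd=$(rpm -q --qf '%{VERSION}\n' rrdtool)
#   version_ge "$rrd" 1.2.18 && echo "RRDTool OK"
```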

Part 2: Install GlideinWMS Collector (my SL1)

Ok. Let's proceed with installing the collector on SL1. First we need a new user to own the installation. Let's call them gfactory.

su - root
groupadd -g 5001 gfactory
useradd -c "GFactory Daemon" -g 5001 -m -s /bin/bash -u 5001 gfactory

Let's download and copy condor to the gfactory's home directory:

mv condor-7.3.1-linux-x86_64-rhel5-dynamic.tar.gz /home/gfactory/
cd /home/gfactory/
chown -R gfactory:gfactory /home/gfactory/condor-7.3.1-linux-x86_64-rhel5-dynamic.tar.gz

We want to begin the installation as our new user. MAKE SURE TO DO THIS AS THE gfactory USER: the installation adds a cron job for gfactory, and the Linux OS may not support cron jobs for the root user. Here is how to start the installation:

mv glideinWMS/ /home/gfactory/
chown -R gfactory:gfactory /home/gfactory/glideinWMS
su - gfactory
cd glideinWMS/install
./glideinWMS_install

Choose the option "glideinWMS Collector" and then let's answer the installer's questions interactively. Ignore warnings/errors about CA certificates not being installed. The VDT installer doesn't install them, but the glideinWMS installer does. Here are the installer answers:

Do you have already a VDT installation?: (y/n) n
Do you want to install the full OSG VDT client?: (y/n) n
Do you want to install a minimal Grid VDT client?: (y/n) y
Where do you want the VDT installed?: [/home/gfactory/vdt] /home/gfactory/vdt
Directory '/home/gfactory/vdt' does not exist, should I create it?: (y/n) y
What pacman version should I use?: [pacman-3.26]

This next question is a trick question, don't type anything here, just press enter or you'll get an error.

Which platform do you want to use (leave empty for autodetect):
WARNING: It appears that SELinux is enabled on this computer. ... Please press enter to continue the installation, or control-c to cancel.
Do you agree to the licenses? [y/n] y
Where would you like to install CA files? l
Where should I fetch the CAs from?: [http://software.grid.iu.edu/pacman/cadist/ca-certs-version]
Where do you have the Condor tarball? /home/gfactory/condor-7.3.1-linux-x86_64-rhel5-dynamic.tar.gz
Where do you want to install it?: [/home/gfactory/glidecondor]
Directory '/home/gfactory/glidecondor' does not exist, should I create it?: (y/n) y
If something goes wrong with Condor, who should get email about it?: me@mydomain.com
Do you want to split the config files between condor_config and condor_config.local?: (y/n) [y]

In order for the Factory to submit to the grid, we need a proxy on all of the servers. However, we needed VDT to be installed before this is possible. Well, at least the minimal VDT is now installed, so we can use VOMS (Virtual Organization Membership Service) to create the proxy. Let's pause our installation in this window and open up another terminal window. In this new terminal, we will generate the proxy:

Using root, I put my certificate file cert.p12 in the gfactory directory, then I switched back to gfactory.

su - root
cd <the directory where the certificate is stored>
cp cert.p12 /home/gfactory
exit
su - gfactory

Now we need to source the VDT setup script in order to work with the VDT software.

. /home/gfactory/vdt/setup.sh

In order to initialize the VOMS proxy, I need the private key out of my cert. I was able to get my private key using this command, and I saved it to a file called cert.key:

openssl pkcs12 -in cert.p12 -info
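The -info form prints the whole bundle to the terminal, so the key has to be copied out by hand. An alternative that I did not use for these notes is to let openssl write only the key to a file with -nocerts and -out. The sketch below wraps that in a hypothetical function, extract_key. openssl prompts for the import password of the .p12 and then for a pass phrase to protect the extracted key.

```shell
# Write only the private key from a PKCS#12 bundle to a PEM file.
# Any extra arguments are passed straight through to openssl pkcs12.
extract_key() {
    p12_file=$1
    key_file=$2
    shift 2
    # The subshell keeps the restrictive umask from leaking into the caller,
    # and ensures the key file is created readable by its owner only.
    ( umask 077; openssl pkcs12 -in "$p12_file" -nocerts -out "$key_file" "$@" )
}

# Example, as gfactory:
#   extract_key /home/gfactory/cert.p12 /home/gfactory/cert.key
```

The chmod 400 in the next step still applies to the resulting cert.key.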

Now the permissions need to be adjusted on these files in order to generate the proxy:

chmod 644 /home/gfactory/cert.p12
chmod 400 /home/gfactory/cert.key

Run this command to generate the proxy for 500 hours:

/home/gfactory/vdt/glite/bin/voms-proxy-init -cert /home/gfactory/cert.p12 -key /home/gfactory/cert.key -out /home/gfactory/.globus/x509_service_proxy -valid 500:0.0

Now, let's make the cert and key owned by root to prevent any bad stuff from happening:

su - root
chown root:root /home/gfactory/cert.p12
chown root:root /home/gfactory/cert.key

Back in the previous terminal window with our Collector installation, let's continue where we left off.

Will you be using a proxy or a cert? (proxy/cert) proxy
Where is your proxy located?: /home/gfactory/.globus/x509_service_proxy
What name would you like to use for this pool?: [My glideinWMS pool] ChrisB Pool
How many secondary schedds do you want?: [9] 3

******************************************
WMS collector successfully installed
******************************************

Part 3: Install GlideinWMS Factory (my SL1)

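First, we must install some additional prerequisite software. We need to get the latest flot tarball, move it into gfactory's home directory, and extract it. The installer also asks for a javascriptRRD installation; mine was unpacked in /home/gfactory/javascriptrrd-0.4.2. Here is the website to get flot from: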

http://code.google.com/p/flot/

We will be using the same proxy used in Part 2, so no additional configuration will be necessary for the proxy. This file is:

/home/gfactory/.globus/x509_service_proxy

We also need to add a web directory for glidein submission:

su - root
mkdir /var/www/html/glidefactory
chown gfactory:gfactory /var/www/html/glidefactory

We are now installing the factory. Select option 2 from the glideinWMS installation script.

su - gfactory
cd /home/gfactory/glideinWMS/install
./glideinWMS_install

Here are my answers:

Do you have already a javascriptRRD installation?: (y/n) y
Where is javascriptRRD installed?: /home/gfactory/javascriptrrd-0.4.2
Do you have already a Flot installation?: (y/n) y
Where is Flot installed?: /home/gfactory/flot
Where is your proxy located?: /home/gfactory/.globus/x509_service_proxy
Where will you host your config and log files?: [/home/gfactory/glideinsubmit]
Directory '/home/gfactory/glideinsubmit' does not exist, should I create it?: (y/n) y
Where will the web data be hosted?: [/var/www/html/glidefactory]
What Web URL will you use?: [http://myhost.mydomain/glidefactory/] http://sl1/glidefactory/
Give a name to this Glidein Factory?: [mySites-sl1] gfactory-sl1
Give a name to this Glidein instance?: [v1_0]
What is the Condor base directory?: [/home/gfactory/glidecondor]
The following glidein schedds have been found: ... Do you want to use all of them?: (y/n) y
Do you want to use CCB (requires Condor 7.3.0 or better)?: (y/n) n
Please list all the GCB servers you will be using ... Leave an empty line when finished ... GCB node:
Do you want to use gLExec?: (y/n) y
Force VO frontend to provide its own proxy?: (y/n) [y] y
Do you want to fetch entries from RESS?: (y/n) [n] n

Part 4: Install GlideinWMS Frontend System (my SL2)

This is the frontend system.

Topic revision: r7 - 2009/10/13 - 00:36:30 - ChrisBoynton
 