CREAM Tests
Introduction
OSG is evaluating various Grid Compute Elements (aka Gatekeepers) to determine which ones it should support. Feature list, ease of use, performance and reliability are all important aspects of the evaluation.
CREAM is a Grid Compute Element developed in Europe by gLite.
Client tests
OSG relies heavily on Condor-G for Grid submissions, so it was used for the client testing.
Condor-G first gained functional support for CREAM in v7.3.2, but one should use at least the next stable release series, 7.4.X. The v7.5.X development series adds further improvements, so anyone looking for maximum performance should use that.
Condor-G also needs gridFTP and VOMS certs installed in order to talk to a CREAM CE.
The
glideinWMS installer can be used for this purpose:
cvs -d :pserver:anonymous@cdcvs.fnal.gov:/cvs/cd_read_only co -r snapshot_100518_v2plus_Igor_CREAM glideinWMS
Moreover, our group has collaborated with the Condor team for a long time to get to a release of Condor-G with CREAM support. Details can be found on the
CREAM Support for CMS page.
Server tests
CREAM installation is only supported via RPMs.
Installation on osg-gw-3
Massimo Sgaravatto helped with the installation, resulting in the following instructions:
Installation
(updated Apr 8th 2011)
Copied in /etc/yum.repos.d the following repos:
http://grid-deployment.web.cern.ch/grid-deployment/glite/repos/3.2/dag.repo
http://grid-deployment.web.cern.ch/grid-deployment/glite/repos/3.2/glite-CREAM.repo
http://repository.egi.eu/sw/production/cas/1/current/repo-files/egi-trustanchors.repo
The OS repo files are of course needed as well.
yum clean all
yum update
yum install ca-policy-egi-core
yum install java-1.6.0-openjdk
yum install xml-commons-apis
yum install glite-CREAM
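The repository setup and package installation above can be collected into one script. This is only a sketch: the use of wget to fetch the repo files and the run/install_cream helper names are assumptions made for illustration (the original notes only say the files were copied). Commands are printed rather than executed unless DRY_RUN=0 is set.

```shell
# Dry-run by default: set DRY_RUN=0 (and run as root) to really execute.
DRY_RUN=${DRY_RUN:-1}

# Hypothetical helper: print the command in dry-run mode, run it otherwise.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# Hypothetical helper: fetch the three repo files, then install the packages
# in the same order as listed above.
install_cream() {
    for url in \
        http://grid-deployment.web.cern.ch/grid-deployment/glite/repos/3.2/dag.repo \
        http://grid-deployment.web.cern.ch/grid-deployment/glite/repos/3.2/glite-CREAM.repo \
        http://repository.egi.eu/sw/production/cas/1/current/repo-files/egi-trustanchors.repo
    do
        # Assumption: wget is available; -P sets the download directory.
        run wget -P /etc/yum.repos.d "$url"
    done
    run yum clean all
    run yum update
    for pkg in ca-policy-egi-core java-1.6.0-openjdk xml-commons-apis glite-CREAM; do
        run yum install "$pkg"
    done
}
```

Calling install_cream with the default settings only lists the commands, which makes it easy to review them before running with DRY_RUN=0.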
Installed/updated these RPMs (this won't be needed anymore when glite-CONDOR_utils is released):
http://eticssoft.web.cern.ch/eticssoft/repository/org.glite/glite-info-dynamic-scheduler-condor/1.0.0/noarch/glite-info-dynamic-scheduler-condor-1.0.0-1.noarch.rpm
http://eticssoft.web.cern.ch/eticssoft/repository/org.glite/glite-yaim-condor-utils/5.1.0/noarch/glite-yaim-condor-utils-5.1.0-1.noarch.rpm
http://eticssoft.web.cern.ch/eticssoft/repository/org.glite/org.glite.apel.condor/2.0.6/slc4_ia32_gcc346/glite-apel-condor-2.0.6-2.noarch.rpm
PS: The official instructions are located at
http://igrelease.forge.cnaf.infn.it/doku.php?id=doc:guides:devel:install-cream32 but do not include how to install the Condor part.
Configuration and customizations:
Customized the configuration files (/root/SiteInfo/site-info.def and
/root/SiteInfo/services/glite-creamce).
An example site-info.def file is available at /opt/glite/yaim/examples/siteinfo/site-info.def
(the siteinfo used on osg-gw-3 is attached).
(updated April 8th 2011: Add CONDOR_GROUP_ENABLE=False to the end of site-info.def)
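The CONDOR_GROUP_ENABLE setting can be appended without risking a duplicate entry on repeated runs. The following is a minimal sketch; the function name is made up for illustration, and it assumes the file ends with a newline.

```shell
# Hypothetical helper: append CONDOR_GROUP_ENABLE=False to the given
# site-info.def unless some CONDOR_GROUP_ENABLE line is already present.
ensure_condor_group_disabled() {
    conf=$1
    if ! grep -q '^CONDOR_GROUP_ENABLE=' "$conf"; then
        echo 'CONDOR_GROUP_ENABLE=False' >> "$conf"
    fi
}
```

For the setup described here it would be called as: ensure_condor_group_disabled /root/SiteInfo/site-info.def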
Run yaim:
/opt/glite/yaim/bin/yaim -c -s /root/SiteInfo/site-info.def -n creamCE -n CONDOR_utils
Because the proper environment is not already defined on the WN, the CREAM
JobWrapper had to be customized.
This was done following the instructions reported at:
http://grid.pd.infn.it/cream/field.php?n=Main.HowToCustomizeTheCREAMJobWrapper
("Instructions for CREAM CE >= 1.6 (glite-ce-cream >= 1.12)" section)
adding the following 2 lines:
export OSG_GRID=/code/osgcode/wn-client
. $OSG_GRID/setup.sh
after:
for((idx=0; idx<${#__environment[*]}; idx++)); do
eval export ${__environment[$idx]}
done
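The edit can also be scripted. The sketch below prints a copy of a wrapper template with the two lines added right after the "done" that closes the __environment export loop. The function name is hypothetical and the template path is left as an argument, since it depends on the installation; the function does not check whether the lines were already added.

```shell
# Hypothetical helper: print the template given as $1 with the two OSG
# setup lines inserted after the loop that exports __environment.
add_osg_setup() {
    awk '
        { print }
        # Remember that we have seen the body of the export loop.
        index($0, "eval export ${__environment[$idx]}") { in_loop = 1; next }
        # The first "done" after that body closes the loop.
        in_loop && $1 == "done" {
            print "export OSG_GRID=/code/osgcode/wn-client"
            print ". $OSG_GRID/setup.sh"
            in_loop = 0
        }
    ' "$1"
}
```

Writing the result to a new file and comparing it with the original (for example with diff) before replacing the template keeps the change easy to review.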
Because the UCSD Condor installation requires a special arguments setting
in the Condor submit file, /opt/glite/bin/condor_submit.sh was modified, changing:
arguments = $arguments
into:
arguments = -wrapper_iwd $_CONDOR_SCRATCH_DIR $arguments
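The same substitution can be expressed with sed. This is a sketch under the assumption that the line appears in condor_submit.sh exactly as quoted above; the function name is made up for illustration. It prints the patched script to standard output and leaves the original untouched.

```shell
# Hypothetical helper: print the script given as $1 with the arguments
# line rewritten. An already patched line no longer matches the pattern,
# so applying the function twice gives the same result.
patch_arguments() {
    sed 's/arguments = \$arguments/arguments = -wrapper_iwd $_CONDOR_SCRATCH_DIR $arguments/' "$1"
}
```

A cautious way to apply it is to write the output to a temporary file, keep a backup of /opt/glite/bin/condor_submit.sh, and only then move the new file into place.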
Service startup
- service tomcat5 stop
- /opt/glite/etc/init.d/glite-ce-blahparser stop
- service tomcat5 start
Client configuration
Condor 7.5.0 was used to run against CREAM.
A GridFTP server with an appropriate grid-mapfile had to be installed.
The condor submit file had the following lines in it:
Universe = grid
grid_resource = cream https://osg-gw-3.t2.ucsd.edu:8443/ce-cream/services/CREAM2 condor osg-gw-3
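For context, a complete submit file built around these two lines might look like the output of the sketch below. Only the Universe and grid_resource lines come from the test setup; the executable, arguments, file names and the write_submit_file helper are placeholders chosen for illustration, not the values used in the test.

```shell
# Hypothetical helper: print a minimal Condor-G submit file for CREAM.
write_submit_file() {
    cat <<'EOF'
# These two lines are the ones used in the test.
Universe = grid
grid_resource = cream https://osg-gw-3.t2.ucsd.edu:8443/ce-cream/services/CREAM2 condor osg-gw-3
# Everything below is a placeholder for this sketch.
executable = /bin/sleep
arguments = 1800
output = job.$(Cluster).$(Process).out
error = job.$(Cluster).$(Process).err
log = job.log
queue 1
EOF
}
```

A possible use would be: write_submit_file > cream_test.sub, followed by condor_submit cream_test.sub.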
The CREAM client in Condor-G needs valid VOMS CA pub keys; i.e. the vomsdir must be populated.
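A quick check that the vomsdir is populated can be scripted before submitting. This is a minimal sketch: the function name is hypothetical, and /etc/grid-security/vomsdir is assumed as the default location, which may differ on a given installation. It only checks that the directory is non-empty, not that its contents are valid.

```shell
# Hypothetical helper: succeed when the directory exists and has at
# least one entry. Default path is an assumption; pass another as $1.
vomsdir_populated() {
    dir=${1:-/etc/grid-security/vomsdir}
    [ -d "$dir" ] && [ -n "$(ls -A "$dir" 2>/dev/null)" ]
}
```

For example: vomsdir_populated || echo "vomsdir is missing or empty"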
During the test, the load on the client became very high:
top - 21:08:42 up 310 days, 9:30, 1 user, load average: 41.64, 40.04, 38.89
This was likely due to all the GridFTP sessions that were calling back.
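To keep an eye on the client load during such a run, the 1-minute load average can be extracted from a line like the one above and compared against a limit. This is a sketch; the function names and any threshold are illustrative, and it assumes the "load average: a, b, c" format printed by top and uptime on Linux.

```shell
# Hypothetical helper: read a top/uptime header line on stdin and print
# the 1-minute load average.
load1() {
    sed -n 's/.*load average: \([0-9.]*\),.*/\1/p'
}

# Hypothetical helper: succeed when the 1-minute load read from stdin
# is above the limit given as $1.
load_above() {
    limit=$1
    awk -v load="$(load1)" -v limit="$limit" 'BEGIN { exit !(load + 0 > limit + 0) }'
}
```

For example: uptime | load_above 20 && echo "client node is heavily loaded"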
Test run
A test on glidein-c against osg-gw-3 (on 2010/02/19):
- Ran 10k 30 minute jobs against CREAM on osg-gw-3, using Condor-G v7.5.0 on glidein-c
- CREAM finished in about 7 hours; for comparison, a similar GT2 run took about 9 hours.
- Condor-G numbers are really close to the CE ones under CREAM
- Under CREAM Condor-G was almost always reporting the same numbers as the CE
- For comparison, under GT2 Condor-G thought jobs were running almost 5 hours after they finished on the CE.
- The test run ran only ~8k jobs; ~2k jobs got held
- ~1.4k failed while staging in the input sandbox
- ~600 failed while staging out the output sandbox
- For comparison, under GT2 I observed no held jobs
- CREAM jobs took much longer than 30 minutes to complete
- Under CREAM, over half took more than 90 minutes, with a non-negligible fraction taking over 3 hours
This may be related to the heavy load on the Condor-G node, due to I/O handled by the GridFTP server
- For comparison, under GT2 all jobs finished within 31 minutes
- For detailed results, see cream_10k.ods.
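The hold numbers above can be put in proportion with a little arithmetic. The counts below are the rounded figures quoted in the list (10k, ~1.4k, ~600), so the percentages are approximate; the variable and function names are chosen for illustration.

```shell
# Rounded figures from the test run above.
submitted=10000
stage_in_failed=1400
stage_out_failed=600

held=$(( stage_in_failed + stage_out_failed ))
completed=$(( submitted - held ))

# Integer percentage of the submitted jobs.
pct() { echo $(( $1 * 100 / submitted )); }

echo "held:      $held ($(pct "$held")%)"
echo "completed: $completed ($(pct "$completed")%)"
echo "stage-in failures:  $(pct "$stage_in_failed")%"
echo "stage-out failures: $(pct "$stage_out_failed")%"
```

With these figures, roughly 20% of the submitted jobs were held, about 14% at stage-in and about 6% at stage-out.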

OSG use of CREAM
OSG is planning to officially support CREAM in the near term.
The first step in the process is represented by this
planning document.
--
IgorSfiligoi - 2009/11/03