CREAM Tests

Introduction

OSG is evaluating various Grid Compute Elements (aka Gatekeepers) to determine which ones it should support. Feature list, ease of use, performance and reliability are all important aspects of the evaluation.

CREAM is a Grid Compute Element developed in Europe by gLite.

Client tests

OSG relies heavily on Condor-G for Grid submissions, so it was used for the client testing.

Condor-G only added functional support for CREAM in v7.3.2, but one should use at least the next stable release series, 7.4.X.

The v7.5.X development series adds further improvements, so anyone looking for maximum performance should use that.

Our group has collaborated with the Condor team for a long time to get to a release of Condor-G with CREAM support. Details can be found on the CREAM Support for CMS page.

Server tests

CREAM installation is only supported via RPMs.

Installation on osg-gw-3

Massimo Sgaravatto helped with the installation, resulting in the following instructions:

Installation


- Copied in /etc/yum.repos.d the following repos:

http://grid-deployment.web.cern.ch/grid-deployment/glite/repos/3.2/dag.repo
http://grid-deployment.web.cern.ch/grid-deployment/glite/repos/3.2/lcg-CA.repo
http://etics-repository.cern.ch:8080/repository/pm/registered/repomd/name/patch_3179/etics-registered-build-by-name.repo (CREAM repo)


The third one will be replaced with a "cleaner" one as soon as gLite patch #3179 is in production
(its current state is certified).



- The OS repo files are needed as well


- yum install yum-protectbase
Set protect=1 in the dag and OS repos.
This is needed because the temporary CREAM repo may also contain RPMs provided by the OS/DAG repos.
It will no longer be needed once the temporary CREAM repo is replaced with the production
one.
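
A minimal sketch of the protect=1 step, assuming the repo files use the usual INI-style [section] layout. The helper name protect_repo is hypothetical; it prints the modified file to stdout instead of editing it in place:

```shell
# Hypothetical helper: print a copy of a yum .repo file in which every
# [section] carries "protect=1". Any existing "protect" line is dropped
# first so the setting is not duplicated.
protect_repo() {
    awk '
        /^[ \t]*protect[ \t]*=/ { next }                 # drop any old setting
        { print }                                        # keep every other line
        /^\[.*\]/               { print "protect=1" }    # add it after each header
    ' "$1"
}
```

For example, `protect_repo /etc/yum.repos.d/dag.repo > /tmp/dag.repo.new` leaves the original untouched so the result can be reviewed before it is copied back. It should be applied only to the dag and OS repo files, not to the temporary CREAM repo.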


- yum clean all
- yum update
- yum install java-1.6.0-openjdk tomcat5
- yum install lcg-CA
- yum install xml-commons-apis
- yum install glite-CREAM


- Installed/updated these RPMs (this won't be needed anymore when glite-CONDOR_utils is released):

http://eticssoft.web.cern.ch/eticssoft/repository/org.glite/glite-info-dynamic-scheduler-condor/1.0.0/noarch/glite-info-dynamic-scheduler-condor-1.0.0-1.noarch.rpm

http://eticssoft.web.cern.ch/eticssoft/repository/org.glite/glite-yaim-condor-utils/5.1.0/noarch/glite-yaim-condor-utils-5.1.0-1.noarch.rpm

http://eticssoft.web.cern.ch/eticssoft/repository/org.glite/org.glite.apel.condor/2.0.6/slc4_ia32_gcc346/glite-apel-condor-2.0.6-2.noarch.rpm

Configuration and customizations:


- Customized the conf files (/root/SiteInfo/site-info.def and
/root/SiteInfo/services/glite-creamce).
An example site-info.def file is in /opt/glite/yaim/examples/siteinfo/site-info.def



- Run yaim:

/opt/glite/yaim/bin/yaim -c -s /root/SiteInfo/site-info.def -n creamCE -n CONDOR_utils



- Because the proper environment is not already defined on the WN, the CREAM
JobWrapper had to be customized.
This was done following the instructions reported at:

http://grid.pd.infn.it/cream/field.php?n=Main.HowToCustomizeTheCREAMJobWrapper

("Instructions for CREAM CE >= 1.6 (glite-ce-cream >= 1.12)" section)

adding the following 2 lines:

export OSG_GRID=/code/osgcode/wn-client
. $OSG_GRID/setup.sh


after:

for((idx=0; idx<${#__environment[*]}; idx++)); do
eval export ${__environment[$idx]}
done


- Because the UCSD Condor installation requires a special arguments setting
in the Condor submit file, /opt/glite/bin/condor_submit.sh was modified, changing:

arguments = $arguments

into:

arguments = -wrapper_iwd $_CONDOR_SCRATCH_DIR $arguments
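
The same change can be sketched with sed, assuming the line appears in the script exactly as quoted above (possibly indented). The helper name patch_condor_submit is hypothetical, and it prints the patched script to stdout instead of modifying the file:

```shell
# Hypothetical helper: print a copy of the given script in which the
# "arguments = $arguments" line carries the extra -wrapper_iwd setting.
# The single quotes keep the shell from expanding the $ signs; inside the
# sed pattern "\$" matches a literal dollar sign.
patch_condor_submit() {
    sed 's|^\([[:space:]]*arguments = \)\$arguments|\1-wrapper_iwd $_CONDOR_SCRATCH_DIR $arguments|' "$1"
}
```

Running it on an already patched file changes nothing, since the pattern only matches when $arguments directly follows the equals sign.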

Service startup

- service tomcat5 stop
- /opt/glite/etc/init.d/glite-ce-blahparser stop
- service tomcat5 start

Client configuration

Condor 7.5.0 was used to run against CREAM.

Had to install a GridFTP server with an appropriate grid-mapfile.

The condor submit file had the following lines in it:

Universe = grid
grid_resource = cream https://osg-gw-3.t2.ucsd.edu:8443/ce-cream/services/CREAM2 condor osg-gw-3
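
For context, a complete submit file might look like the sketch below. Only the Universe and grid_resource lines come from the test setup; the executable, the file names, the queue count and the helper name write_cream_submit are illustrative assumptions:

```shell
# Hypothetical helper: print a minimal Condor-G submit description for the
# CREAM CE to stdout. The first argument is the number of jobs to queue.
write_cream_submit() {
    njobs=${1:-1}
    cat <<EOF
Universe      = grid
grid_resource = cream https://osg-gw-3.t2.ucsd.edu:8443/ce-cream/services/CREAM2 condor osg-gw-3
executable    = /bin/sleep
arguments     = 1800
output        = job.\$(Cluster).\$(Process).out
error         = job.\$(Cluster).\$(Process).err
log           = job.log
queue $njobs
EOF
}
```

Something like `write_cream_submit 100 > cream.sub` followed by `condor_submit cream.sub` would then queue 100 jobs that each sleep for 30 minutes.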

The CREAM client in Condor-G needs valid VOMS CA public keys; i.e. the vomsdir must be populated.
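
A quick way to check this before submitting is sketched below. The helper name vomsdir_populated is hypothetical, and the default path is the conventional location, which is an assumption about the client node:

```shell
# Hypothetical check: succeed only if the vomsdir exists and holds at least
# one entry. Pass the actual directory as the first argument if it differs
# from the assumed default.
vomsdir_populated() {
    dir=${1:-/etc/grid-security/vomsdir}
    [ -d "$dir" ] && [ -n "$(ls -A "$dir" 2>/dev/null)" ]
}
```

For example, `vomsdir_populated || echo "vomsdir is missing or empty"`. This only checks that entries exist, not that they are valid.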

During the test, the load on the client became very high:
top - 21:08:42 up 310 days, 9:30, 1 user, load average: 41.64, 40.04, 38.89
This was likely due to all the GridFTP sessions that were calling back.

Test run

A test on glidein-c against osg-gw-3 (on 2010/02/19):

  • Ran 10k 30 minute jobs against CREAM on osg-gw-3, using Condor-G v7.5.0 on glidein-c
  • CREAM finished in about 7 hours; for comparison, a similar GT2 run took about 9 hours.
  • Condor-G numbers are really close to the CE ones under CREAM
    • Under CREAM Condor-G was almost always reporting the same numbers as the CE
    • For comparison, under GT2 Condor-G thought jobs were running almost 5 hours after they finished on the CE.
  • The test run ran only ~8k jobs; ~2k jobs got held
    • ~1.4k failed while staging in the input sandbox
    • ~600 failed while staging out the output sandbox
    • For comparison, under GT2 I observed no held jobs
  • CREAM jobs took much longer than 30min to complete
    • Under CREAM, over half took more than 90 minutes, with a non-negligible fraction taking over 3 hours
      This may be related to the heavy load on the Condor-G node, due to I/O handled by the gridFTP server
    • For comparison, under GT2 all jobs finished within 31 minutes
  • For detailed results, see cream_10k.ods.
    cream_abs.png
    cream_30min_job_spread.png
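
The approximate job counts above translate into the fractions computed below. This is a small sketch using shell integer arithmetic; the helper name percent is hypothetical and the inputs are the rounded numbers from the list, so the results are approximate as well:

```shell
# Hypothetical helper: percentage of $1 relative to $2, rounded to the
# nearest integer using integer arithmetic only.
percent() {
    echo $(( ($1 * 100 + $2 / 2) / $2 ))
}

# Rounded figures from the test run (all approximate).
total=10000; held=2000; stage_in=1400; stage_out=600

echo "held jobs:             $(percent "$held" "$total")% of submitted"
echo "held during stage-in:  $(percent "$stage_in" "$held")% of held"
echo "held during stage-out: $(percent "$stage_out" "$held")% of held"
```

With these inputs, roughly 20% of the submitted jobs were held, about 70% of them during stage-in and about 30% during stage-out.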

-- IgorSfiligoi - 2009/11/03

Topic attachments

Attachment | Size | Date | Who | Comment
cream_10k.ods | 59.5 K | 2010/02/20 - 06:10 | IgorSfiligoi | Running 10k jobs against CREAM (and compared to GT2) - UCSD LAN - Condor 7.5.0
cream_10k_abs.png | 45.4 K | 2010/02/20 - 05:49 | IgorSfiligoi | Abs values of the 10k run
cream_30min_job_spread.png | 27.2 K | 2010/02/20 - 05:58 | IgorSfiligoi | Time spread of the 30min jobs
Topic revision: r9 - 2010/04/30 - 21:28:33 - IgorSfiligoi
 