Changes needed to port OSG modifications to GRAM5 jobmanager-condor

Overview

OSG uses a patched and augmented version of jobmanager-condor (condor.in -> condor.pm). The jobmanager-condor code changed slightly between GT2 and GRAM5, so we had to merge the two versions.

-- IgorSfiligoi - 2009/11/16

A) Differences between the "vanilla" GRAM5 condor.pm file and the OSG/VDT "vanilla" condor.pm file:

  • In the OSG/VDT file they specify "use Config" so that they can use either 32 or 64-bit libraries.
  • The GRAM5 file specifies variables for condor_config, condor_check_vanilla_files, and condor_mpi_script. The last of these is there to support the "parallel" universe case.
  • GRAM5 allows a "parallel" universe case.
  • OSG/VDT makes individual condor log files for pre-WS GRAM jobs.
  • OSG/VDT makes a number of patches for Gratia. GRAM5 does not initialize Gratia.
  • GRAM5 allows more user-specified variables to be passed to the SCRIPT FILE (e.g. WhenToTransferOutput); this is not something we want to allow.
  • The OSG/VDT file verifies OSG installation and OSG install location on worker nodes.

B) NFS-lite changes made to the condor.pm file:

  • NFS-Lite determines which scratch directory to use.
  • NFS-Lite configures the following variables in the SCRIPT FILE:

should_transfer_files = YES
when_to_transfer_output = ON_EXIT
transfer_output = true
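
As a rough illustration of how these three settings could be written into the SCRIPT FILE, here is a minimal Python sketch. The real condor.pm is Perl; the helper names `nfs_lite_submit_lines` and `append_nfs_lite` are hypothetical and do not come from the OSG/VDT code. Only the three name/value pairs are taken from this page.

```python
# Hypothetical sketch: these helper names are illustrative and are not
# taken from condor.pm. Only the three settings come from this page.
NFS_LITE_SETTINGS = [
    ("should_transfer_files", "YES"),
    ("when_to_transfer_output", "ON_EXIT"),
    ("transfer_output", "true"),
]


def nfs_lite_submit_lines(settings=NFS_LITE_SETTINGS):
    """Render each (name, value) pair as a 'name = value' line."""
    return [f"{name} = {value}" for name, value in settings]


def append_nfs_lite(script_text):
    """Return script_text with the NFS-lite lines appended at the end."""
    body = script_text.rstrip("\n")
    parts = ([body] if body else []) + nfs_lite_submit_lines()
    return "\n".join(parts) + "\n"


if __name__ == "__main__":
    print(append_nfs_lite("universe = vanilla\n"), end="")
```

Keeping the settings in one ordered list makes it easy to see exactly which lines the NFS-lite variant adds on top of the stock file.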

C) UCSD Specific changes made to the condor.pm file:

  • UCSD condor pool is configured with a wrapper so extra parameters are required in the condor.pm file.
  • UCSD matches lognames to the appropriate condor groups for accounting purposes.
  • UCSD defines an architecture and an operating system.
  • UCSD tests to see if there is a scratch directory.
  • UCSD defines maxRunTime, maxQTime, and periodic_remove within the SCRIPT FILE. This is important so that jobs do not stay in the queue for an extended period of time.
  • UCSD checks the version of condor to make sure an older version is not being used.
  • UCSD adds a lot of extra logging for troubleshooting purposes.
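
The maxRunTime, maxQTime, and periodic_remove item above can be sketched as follows. This is not the UCSD expression, which is not shown on this page; it is one plausible way to turn two limits into a periodic_remove line. The function name `periodic_remove_expr` is hypothetical, the limits are assumed to be in seconds, and the expression assumes the standard Condor job attributes JobStatus (1 = idle, 2 = running), CurrentTime, EnteredCurrentStatus, and QDate.

```python
# Hypothetical sketch: not the actual UCSD expression. Limits are assumed
# to be in seconds; the page does not state the units.
def periodic_remove_expr(max_run_time, max_q_time):
    """Build a periodic_remove line from a run-time and a queue-time limit.

    A job is removed if it has been running (JobStatus == 2) longer than
    max_run_time, or idle (JobStatus == 1) and in the queue longer than
    max_q_time.
    """
    if max_run_time <= 0 or max_q_time <= 0:
        raise ValueError("limits must be positive")
    running_too_long = (
        f"(JobStatus == 2 && (CurrentTime - EnteredCurrentStatus) > {max_run_time})"
    )
    queued_too_long = f"(JobStatus == 1 && (CurrentTime - QDate) > {max_q_time})"
    return f"periodic_remove = {running_too_long} || {queued_too_long}"


if __name__ == "__main__":
    # Example values only: 48 hours of run time, 7 days in the queue.
    print(periodic_remove_expr(48 * 3600, 7 * 24 * 3600))
```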

-- ChristopherTheissen - 2009/12/16

Topic revision: r5 - 2009/12/16 - 18:23:03 - ChristopherTheissen
 