OSG Scalability, Reliability and Usability

The Open Science Grid relies on several network-facing services to deliver resources to its users. To avoid unexpected structural service problems, OSG has established a working group dedicated to testing the scalability and reliability of the services it uses. This team tests both existing and proposed software packages up to (and possibly slightly beyond) the scales expected to be reached in the foreseeable future, evaluating and documenting both the response times and the failure modes.

This page serves as the aggregation point for the above-mentioned activity.

Activities list

Major activities:

Minor activities:

Publications

Conference publications:

  • Scalability of network facing services used in the Open Science Grid, CHEP2010 [ Preprint ]
  • An update on the scalability limits of the Condor batch system, CHEP2010 [ Preprint ]
  • Using Condor glideins for distributed testing of network-facing services, IWHGA2010 [ Paper | Preprint ]

Technical reports:

  • Loadtest_condor – A workload generating framework for testing scalability and reliability of the Condor system [ OSG Doc 1013 ]
  • Procs_monitor - A process-level resource monitor [ OSG Doc 1012 ]
  • Evaluation of new Compute Element software for the Open Science Grid: GRAM5 and CREAM [ OSG Doc 1006 ]

Internal documents

-- IgorSfiligoi - 2011/02/01

Topic revision: r4 - 2011/04/19 - 18:37:03 - IgorSfiligoi
 