Computing Projects
Ground Rules for this Page
Projects listed here should always include the following:
- Brief description
- Starting date
- Primary person responsible, plus nominal supervisor in the case of students
- Link to "clean" copy of accomplishments and status
- Link to "dirty" copy of status, i.e. the electronic logbook for collaborative development
Projects involving students
Projects involving Undergraduates
Pnfs Consistency Checker
Brief Description
Develop, deploy, and package a system that checks the integrity of pnfs on a regular schedule,
finds missing as well as corrupted files, and, if possible, automatically re-downloads the missing
or corrupt files using PhEDEx.
Starting Date: April 2006
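The check described above can be sketched in a few lines. This is only an illustration, not the project's implementation: it assumes a catalog of expected file sizes and Adler-32 checksums is already available as a Python dictionary, and the names `adler32_of` and `check_catalog` are hypothetical. It does not talk to pnfs or PhEDEx; the re-download step is left out.

```python
import os
import zlib
from typing import Dict, List, Tuple


def adler32_of(path: str, chunk_size: int = 1 << 20) -> int:
    """Return the Adler-32 checksum of a file, reading it in chunks."""
    checksum = 1  # standard Adler-32 starting value
    with open(path, "rb") as handle:
        while True:
            chunk = handle.read(chunk_size)
            if not chunk:
                break
            checksum = zlib.adler32(chunk, checksum)
    return checksum & 0xFFFFFFFF


def check_catalog(catalog: Dict[str, Tuple[int, int]]) -> Dict[str, List[str]]:
    """Classify each catalog entry as ok, missing, or corrupt.

    catalog maps a file path to (expected_size, expected_adler32).
    This layout is an assumption made for the sketch.
    """
    report: Dict[str, List[str]] = {"ok": [], "missing": [], "corrupt": []}
    for path, (size, checksum) in sorted(catalog.items()):
        if not os.path.isfile(path):
            report["missing"].append(path)
        elif os.path.getsize(path) != size or adler32_of(path) != checksum:
            report["corrupt"].append(path)
        else:
            report["ok"].append(path)
    return report
```

A scheduled job (for example, from cron) could call `check_catalog` and pass the `missing` and `corrupt` lists to whatever transfer tool is in use.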
Projects involving Masters students (mostly CS)
Squid Scalability Investigation
Brief Description
Develop a set of tests to understand squid scalability, use them to test squid running in Xen, and
compare the results with squid deployed on normal hardware.
Starting Date: April 2006
KiranKalyan supervised by AbhishekSinghRana
Projects involving Ph.D. Graduate students
On demand Job Monitoring
Brief Description
Develop an on-demand job monitoring tool that lets users run ps, ls, top, less, etc. on
their jobs running on the grid without having to know where the jobs run.
Completed: Fall 2005
JobMon
Generic Connection Broker Scalability Tests
Summary
The Condor Generic Connection Broker (GCB) is a piece of software developed independently for Condor that allows Condor daemons to run
across a firewall or through a NAT without difficulty. Scalability was tested both in a local deployment and in a beta deployment on the OSG.
GCB is currently used by both the NamCAF and the OSGCAF.
Completed: Spring 2006
MatthewNorman supervised by Igor Sfiligoi (FNAL)
Condor-C Scalability Tests
Summary
Condor-C is Condor's distributed schedd approach to computing, which deploys schedulers on multiple nodes to reduce load. Scaling
tests were done with an eye toward deployment at FNAL. They concluded that Condor-C does not scale with Kerberos and that switching to
GSI was impractical at the time.
Completed: December 2005
MatthewNorman supervised by Igor Sfiligoi (FNAL)
Resources:
CDF Condor-C guide
Condor Manual Page for Grid Computing
Condor Scalability Tests
Summary
Tests were done with multiple schedds deployed on a single node to relieve the single-scheduler bottleneck. The
first tests failed, but a second set, which used web caching to offload the burden of user tarball transfer, was more successful. The system was
declared satisfactory and was deployed at FNAL in Spring 2006.
Completed: Spring 2006
MatthewNorman supervised by ElliotLipeles and Igor Sfiligoi (FNAL)
Resources:
Old Multiple Schedd Page
OSG CAF
Brief Description
OSG-CAF was to be a CDF gateway onto the OSG, allowing single-point-of-submission access to a glide-in based Condor pool that could
harvest opportunistic resources from the entire Grid. Buzzwords aside, it lets CDF users treat the OSG as a giant CAF and keeps the
headnodes and other maintenance tasks centralized at FNAL. The project was later split into two functions.
Starting Date: January 2006
MatthewNorman supervised by Igor Sfiligoi (FNAL)
Current Status: OSGCAF delayed due to lack of interest; NamCAF deployed
Resources:
OSG Monitoring Page
Projects involving Staff
NFS Lite Compute Element for OSG
Brief Description
Develop a configuration for an OSG 0.4.1 Compute Element that does not require any filesystem exports.
Completed: April 2006
NfsLiteComputeElement
Performance testing of NfsLiteComputeElement
Brief Description
Starting Date: June 2006
NfsLiteComputeElementPerformance
NfsLiteComputeElementPerformanceElog
-- FkW - 27 Jun 2006
-- MatthewNorman - 27 Jun 2006