PNFS Monitoring System

Introduction

The PNFS Monitoring software is an integrated system that checks the integrity of PNFS by finding missing and corrupted files. The system uses a MySQL database to store information about each file, and users can view the results via a web interface.

PNFS Crawler

The PNFS Crawler iterates over the PNFS filesystem and inserts information about each PNFS file into a MySQL database. This information is subsequently used by the PNFS Tester.

What PNFS Crawler Does

When executed, pnfscrawler.py iterates over the PNFS file system within the directory tree below a starting path defined in the pnfscrawler.config file. The crawler then queries each file in the chosen directory tree and writes the following information into the database:

  1. The file's full path.
  2. The size of the file as recorded by PNFS.
  3. The Adler32 checksum of the file as recorded by PNFS.

The file paths are obtained with Python's os.path module, which is used while crawling a given directory tree; the same module returns the size of each file. To find the recorded Adler32 checksum, a special 'cat' command must be issued for each file. The format of this command is: cat '.(use)(2)(filename.data)'

Example for a file called "file.0":

  cat '.(use)(2)(file.0)'
  2,0,0,0.0,0.0
  :c=1:f9b03ea3;
  cabinet-1-1-12_1 cabinet-1-1-9_1

The last entries indicate which pools the file is stored on; "f9b03ea3" is the recorded Adler32 checksum.
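
For illustration, the crawler's loop might boil down to something like the sketch below. The starting path, the parsing of the dot-command output, and the final print (standing in for the MySQL insert) are assumptions here; the real script reads its starting path from pnfscrawler.config.

  import os

  # Hypothetical starting path; the real one comes from pnfscrawler.config.
  START_PATH = "/pnfs/some/starting/path"

  def pnfs_adler32(path):
      # Ask PNFS for the file's recorded metadata via the '.(use)(2)(name)' command.
      dirname, name = os.path.split(path)
      f = open(os.path.join(dirname, ".(use)(2)(%s)" % name))
      text = f.read()
      f.close()
      # The checksum follows the 'c=1:' marker, e.g. ':c=1:f9b03ea3;'.
      i = text.find("c=1:")
      if i == -1:
          return None
      return text[i + 4:].split(";")[0].strip()

  for root, dirs, files in os.walk(START_PATH):
      for name in files:
          path = os.path.join(root, name)
          size = os.path.getsize(path)
          # A real run would INSERT (path, size, checksum) into the MySQL table here.
          print("%s %d %s" % (path, size, pnfs_adler32(path)))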

PNFS File Tester

The files in the table are checked using Condor jobs. The system is scalable: there is no fixed minimum or maximum number of jobs that can run in parallel, because each job simply keeps working until the entire table has been checked. The more jobs running in parallel, the faster the integrity check completes.

What PNFS File Tester Does

There are 4 components required for the Condor job to work properly. These are outlined below:

  1. pnfsmonitor.run: The batch script that serves as the executable in the Condor job file. It sets the correct permissions on the dependencies and runs the Python script which performs the file checking.
  2. adlercount: A dependency of the Python script; a C program that calculates the Adler32 checksum as well as the number of bytes it reads in.
  3. dccp: A dependency of the Python script which transfers the file from the node it is on to the worker node. The transfer is done to a FIFO file, so no data is actually written to disk.
  4. job.py: The Python script which performs the check of each file in the MySQL table.

When the batch job is executed, the dependencies are made executable (chmod +x is run for adlercount, dccp and job.py). Then job.py runs, going through the following process:

  1. Determine whether the file needs to be tested, based on when it was last checked. See below for details on how parallelism is handled.
  2. Use dccp to transfer the file from the node it is on to the worker node. The transfer goes to a FIFO file which is read by adlercount, which reads in the bytes and calculates the Adler32 checksum (a sketch of this step follows the list).
  3. The calculated Adler32 checksum is compared to the checksum in the table and a statusid is produced for the file.
  4. The following data is written for each file checked:
    1. The time it was checked.
    2. Tag and tagexp are set back to 0 so that the file is free to be checked again.
    3. The statusid produced by the Python script.
    4. The calculated Adler32 checksum.
    5. The node from which the file was transferred.
    6. The number of bytes read in by adlercount, which is used to calculate transfer rates for timed-out files.
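
As an illustration of steps 2 and 3, a stripped-down version of the transfer-and-checksum logic might look like the sketch below. It folds the adlercount C program's job (an Adler32 plus a byte count) into Python's zlib module for brevity; the function name is made up, and error handling, the transfer timeout, and the database writes are all omitted.

  import os, subprocess, tempfile, zlib

  def check_file(pnfs_path, expected_adler32):
      # dccp writes into a FIFO, so no data ever lands on local disk.
      fifo = os.path.join(tempfile.mkdtemp(), "pipe")
      os.mkfifo(fifo)
      proc = subprocess.Popen(["dccp", pnfs_path, fifo])
      checksum, nbytes = 1, 0             # 1 is the Adler32 seed value
      f = open(fifo, "rb")                # stands in for the adlercount C program
      while True:
          chunk = f.read(65536)
          if not chunk:
              break
          checksum = zlib.adler32(chunk, checksum) & 0xffffffff
          nbytes += len(chunk)
      f.close()
      proc.wait()
      calculated = "%08x" % checksum
      # The real job would also enforce the transfer timeout (500 s by default)
      # and turn this comparison into a statusid written back to the table.
      return calculated == expected_adler32, calculated, nbytes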

Details on handling parallelism: each job assigns a tag (a random number) to a file, along with the current time (seconds from epoch) as the tag expiration. The tag lets the job locate the file in the database and run the check on it. To deal with crashed jobs, a job will also look for files with old tag expirations.
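
In SQL terms, the claiming step might look like the following sketch. The statuses table and its tag/tagexp columns are described in the schema section below; the connection parameters, the reclaim window, and the LIMIT-based claim are assumptions.

  import random, time
  import MySQLdb   # the classic MySQL driver for Python

  # Connection parameters are placeholders.
  db = MySQLdb.connect(host="dbhost", user="pnfs", passwd="secret", db="pnfsmonitor")
  cur = db.cursor()

  tag = random.randint(1, 2 ** 31 - 1)   # this job's random tag
  now = int(time.time())
  stale = now - 3600                     # reclaim window for crashed jobs (a guess)

  # Claim one file that is unclaimed (tag = 0) or was abandoned by a crashed job
  # (old tagexp). The real query would also skip recently checked files.
  cur.execute("""UPDATE statuses SET tag = %s, tagexp = %s
                 WHERE tag = 0 OR tagexp < %s LIMIT 1""", (tag, now, stale))
  db.commit()

  # Look up the row we just claimed, if any.
  cur.execute("SELECT path FROM statuses WHERE tag = %s", (tag,))
  row = cur.fetchone()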

Database Schema

There are two tables used by the PNFS checking system: statuses and events. The statuses table contains the results produced by the PNFS Tester, and the events table contains a running log of every file checked. The latter allows us to "calibrate" the performance of each dCache pool. (A sketch of possible table definitions follows the two column lists below.)

Statuses Table

Information recorded includes:

  1. The file's full path.
  2. The time it was checked (seconds from epoch).
  3. Tag and tagexp, which are used by each Condor job to keep track of the files it is checking.
  4. A statusid which is the result of the check.
  5. The PNFS recorded Adler32 checksum.
  6. The determined Adler32 checksum.
  7. The node from which the file was obtained.
  8. The number of bytes read in by the Adler32 program (useful for determining transfer rates).
  9. The size of the file.

Events Table

Information recorded includes:

  1. The file's full path.
  2. The time and date it was checked.
  3. The node on which the check was performed.
  4. The number of seconds it took to check the file.
  5. The node from which the file was obtained.
  6. The size of the file.
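
The authoritative schema files ship with the pnfschecker-dbload tarball (see the HOWTO below). Purely as a sketch, the two column lists might translate into DDL along the following lines, where every column name and type is a guess:

  import MySQLdb

  # Guessed DDL; the real schema files come with pnfschecker-dbload.
  SCHEMA = [
      """CREATE TABLE statuses (
             path      VARCHAR(255) PRIMARY KEY,
             checked   INT,              -- time of last check, seconds from epoch
             tag       INT,
             tagexp    INT,
             statusid  INT,
             pnfsadler CHAR(8),          -- PNFS-recorded Adler32
             calcadler CHAR(8),          -- calculated Adler32
             node      VARCHAR(64),      -- node the file was obtained from
             bytesread BIGINT,
             size      BIGINT
         )""",
      """CREATE TABLE events (
             path      VARCHAR(255),
             checked   DATETIME,
             worker    VARCHAR(64),      -- node the check ran on
             seconds   INT,              -- time taken to check the file
             node      VARCHAR(64),      -- node the file was obtained from
             size      BIGINT
         )""",
  ]

  db = MySQLdb.connect(host="dbhost", user="pnfs", passwd="secret", db="pnfsmonitor")
  cur = db.cursor()
  for ddl in SCHEMA:
      cur.execute(ddl)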

Web Reports Interface

The web interface provides a series of canned reports that display the current PNFS Crawler and PNFS File Tester progress, as well as the status of any file within the PNFS system, e.g. whether the file is missing or corrupted.

The interface consists of PHP scripts which query the MySQL tables and present the information in a browser.

What Are Web Reports

There are 3 components to the web interface. They are outlined below:

  1. missing.php: Presents the files determined to be missing.
  2. corrupted.php: Presents the files determined to be corrupted (i.e. the calculated Adler32 checksum and the recorded Adler32 checksum do not match).
  3. timeout.php: Presents the files which could not be fully transferred within the specified timeout (500 seconds by default).
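
Each of these pages amounts to a simple query against the statuses table. In Python terms (the PHP does the equivalent), the corrupted-files report might reduce to the following, where the statusid value and column names are placeholders consistent with the schema sketch above:

  import MySQLdb

  CORRUPTED = 2   # hypothetical statusid for a checksum mismatch

  db = MySQLdb.connect(host="dbhost", user="pnfs", passwd="secret", db="pnfsmonitor")
  cur = db.cursor()
  cur.execute("SELECT path, pnfsadler, calcadler FROM statuses"
              " WHERE statusid = %s", (CORRUPTED,))
  for path, recorded, calculated in cur.fetchall():
      print("%s %s %s" % (path, recorded, calculated))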

HOWTO

Installation

There is an rpm package located at http://heppc5.ucsd.edu/pnfschecker-0.1-2.i386.rpm

This package must be installed on all nodes.
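
For example, assuming rpm can fetch over HTTP on each node:

  rpm -ivh http://heppc5.ucsd.edu/pnfschecker-0.1-2.i386.rpm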

Database Setup

There are 2 steps to the database setup. They are outlined below:
  1. Creating the tables needed via schema files.
  2. Loading the database with a script which crawls PNFS and records filenames into the MySQL tables.

The schemas as well as the script are located at http://heppc5.ucsd.edu/pnfschecker-dbload-0.1-2.tar.gz

There are 3 different schema files which should be used to create the necessary tables. The script must be run on a system which has PNFS mounted, and can be started by simply typing ./load.py

Condor Job Submission
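
This page does not include a submit description file, but a minimal one for the job described in the PNFS File Tester section might look like the sketch below. The executable and input-file names come from the component list above; everything else (log file names, the job count) is an assumption. Submit it with condor_submit.

  universe                = vanilla
  executable              = pnfsmonitor.run
  transfer_input_files    = adlercount, dccp, job.py
  should_transfer_files   = YES
  when_to_transfer_output = ON_EXIT
  output                  = pnfsmonitor.$(Process).out
  error                   = pnfsmonitor.$(Process).err
  log                     = pnfsmonitor.log
  # Any number of parallel jobs works; more jobs finish the table faster.
  queue 20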

Web Interface

There is an rpm package located at http://heppc5.ucsd.edu/pnfsmonitor-website-0.1-1.i386.rpm

This package must be installed on the webserver.

-- RamiVanguri - 14 Jul 2006
