osgmonitoring.rpm Package Description

osgmonitoring-2.10.0-1.noarch.rpm: The rpm containing the files for the OSG monitoring package

Note: At installation time, the rpm looks for the lowest-numbered local facility that is still available in the syslog configuration (the /etc/syslog.conf file). It adds an entry there that sends the osgmonitoring messages to the file /var/log/osgmonitoring.log, and it writes the chosen facility to the file /etc/osgmonitoring.conf.

You may edit those two files to suit your needs, but a reinstallation may reset them to the original values.
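The facility selection described in the note can be sketched as follows. This is only an illustration in Python, not the rpm's actual post-install code (which is not shown on this page). The function name lowest_free_local_facility and the sample syslog.conf content are assumptions made for the example.

```python
import re

def lowest_free_local_facility(syslog_conf_text):
    """Return the lowest localN (N = 0..7) not mentioned in the text, or None."""
    used = set()
    for line in syslog_conf_text.splitlines():
        line = line.split("#", 1)[0]  # ignore comments
        used.update(int(n) for n in re.findall(r"\blocal([0-7])\b", line))
    for n in range(8):
        if n not in used:
            return f"local{n}"
    return None  # every local facility is already in use

# Made-up syslog.conf content, for illustration only.
sample = """\
# Save boot messages also to boot.log
local7.*    /var/log/boot.log
local0.info /var/log/other.log
"""

facility = lowest_free_local_facility(sample)
print(facility)  # local1
# The kind of line that would then be added to /etc/syslog.conf:
print(f"{facility}.*\t/var/log/osgmonitoring.log")
```

If you pick the facility by hand instead, the same value has to go into both /etc/syslog.conf and the syslogFacility variable of /etc/osgmonitoring.conf.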

What it does

It monitors processes associated with the OSG infrastructure (using ps and lsof) and logs their memory consumption, CPU utilization and so on through the syslog facility to the file /var/log/osgmonitoring.log and, if configured, to a DB.


The rpm requires: perl, perl-DBI, lsof.

Contents of the package


Comment: The log file /var/log/osgmonitoring.log will be created but is not part of the distribution.

Contents explained


The file /etc/osgmonitoring.conf contains the basic configuration of the package. An example of the file:

#Syslog local facility
#Configure it too in the /etc/syslog.conf
syslogFacility = Substitute_It_By_A_Valid_syslogFacility
# Database Parameters
useDB = no
Database_user = cactisource
#The pass variable has to be in quotes "mypassword"
Database_password = "MyPasswd"
Database_db = MyDatabase
Database_server = myDatabaseServer.MyDomain
Database_TABLE = osgmonitoring
# System parameters
ExecutableFile_ps = /bin/ps
ExecutableFile_lsof = /usr/sbin/lsof

The variable syslogFacility has to name a valid syslog facility. The rpm post-install process works out which one is available and configures it to log to the file /var/log/osgmonitoring.log. If it does not, you have to configure it yourself.

The ExecutableFile_{ps,lsof} variables have to point to the locations of the ps and lsof executables.

The DB parameters are optional. If you provide them, the package will use the DB too. The value of Database_password has to be enclosed in double quotes ("); the quotes do NOT belong to the password.
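A minimal sketch of how a "name = value" file like this could be read, including the stripping of the double quotes around the password. It is written in Python for illustration and is not the package's Perl code. The FILTERS table, the function names parse_conf and missing, and the specific regular expressions are all assumptions; the real script's validation rules are not documented here.

```python
import re

# Hypothetical per-variable filters; the real script's expressions are unknown.
FILTERS = {
    "syslogFacility": r"local[0-7]",
    "useDB": r"yes|no",
    "Database_password": r'"[^"]*"',
    "ExecutableFile_ps": r"/\S+",
    "ExecutableFile_lsof": r"/\S+",
}

def parse_conf(text):
    """Parse 'name = value' lines, skipping comments and invalid values."""
    values = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        key, sep, value = line.partition("=")
        if not sep:
            continue  # not a 'name = value' line
        key, value = key.strip(), value.strip()
        pattern = FILTERS.get(key)
        if pattern and not re.fullmatch(pattern, value):
            continue  # value rejected by its filter, variable stays unset
        if key == "Database_password":
            value = value[1:-1]  # the quotes are not part of the password
        values[key] = value
    return values

def missing(values, required):
    """Return the required variables that were not read properly."""
    return [name for name in required if name not in values]

sample = """\
#Syslog local facility
syslogFacility = Substitute_It_By_A_Valid_syslogFacility
useDB = no
Database_password = "MyPasswd"
ExecutableFile_ps = /bin/ps
ExecutableFile_lsof = /usr/sbin/lsof
"""

conf = parse_conf(sample)
print(conf["Database_password"])  # MyPasswd
print(missing(conf, ["syslogFacility", "ExecutableFile_ps", "ExecutableFile_lsof"]))  # ['syslogFacility']
```

With the placeholder value from the example file, syslogFacility fails its (assumed) filter and is reported as missing, which is the situation in which you would have to set the facility by hand.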


The file /etc/osgmonitoring-procs_to_watch.conf holds the configuration of the processes to monitor. An example of the file:

#A line starting with # is a comment
#Here you can not use the # caracter at the begining of the line for the NameToBePutToTheDB
globus-job-manager-script-real,/usr/bin/perl /osglocal/osgce/globus/libexec/globus-job-manager-script-real\.pl.*
globus-job-manager-script,/bin/sh /osglocal/osgce/globus/libexec/globus-job-manager-script\.pl.*
grid_manager_monitor_agent,perl .*grid_manager_monitor_agent.*
grid-monitor,perl /osglocal.*grid-monitor.*/grid-monitor-job-status.*
#Specially in the submitter

It consists of lines, each made of two strings separated by a comma.

The first string is the name by which you tag the results.

The second is a regular expression that the program applies to the output of ps (actually to the column containing the command). All the processes that match the expression are grouped together as one entry under the name given in the first string. The regular expression itself (probably) can not contain a comma. You have to escape the . if it is a literal.

You can add more lines to the file (with this format) to monitor more processes. The ones given in the file by default are for a Condor-based OSG Computing Element or submitter.
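The "name,regular expression" format and the grouping of matching processes can be sketched like this. It is a Python illustration, not the package's Perl implementation. Several points are assumptions: the line is split at the first comma, the expression is searched anywhere in the command column, and the ps rows (pid, %CPU, resident memory in KB, command) are made-up sample data rather than real measurements.

```python
import re

def parse_watch_list(text):
    """Return (name, compiled regex) pairs from 'name,regex' lines."""
    entries = []
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # blank line or comment
        name, sep, pattern = line.partition(",")  # assumption: split at first comma
        if sep:
            entries.append((name, re.compile(pattern)))
    return entries

def group_processes(entries, ps_rows):
    """Sum count, %CPU and memory of all processes matching each entry."""
    totals = {name: {"count": 0, "pcpu": 0.0, "rss_kb": 0} for name, _ in entries}
    for pid, pcpu, rss_kb, command in ps_rows:
        for name, regex in entries:
            if regex.search(command):  # assumption: match anywhere in the command
                totals[name]["count"] += 1
                totals[name]["pcpu"] += pcpu
                totals[name]["rss_kb"] += rss_kb
    return totals

watch_list = r"""
#A line starting with # is a comment
grid_manager_monitor_agent,perl .*grid_manager_monitor_agent.*
globus-job-manager-script,/bin/sh /osglocal/osgce/globus/libexec/globus-job-manager-script\.pl.*
"""

# Made-up rows standing in for parsed ps output.
rows = [
    (101, 1.5, 2048, "perl /tmp/grid_manager_monitor_agent.1234 --delete-self"),
    (102, 0.5, 1024, "perl /tmp/grid_manager_monitor_agent.5678"),
    (103, 0.0, 512, "/usr/sbin/sshd"),
]

totals = group_processes(parse_watch_list(watch_list), rows)
print(totals["grid_manager_monitor_agent"])  # {'count': 2, 'pcpu': 2.0, 'rss_kb': 3072}
print(totals["globus-job-manager-script"]["count"])  # 0
```

Both agent processes end up in a single entry tagged grid_manager_monitor_agent, which is the name that would appear in the log (and in the DB, if used).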


File to make the log rotation happen.


File that defines when the script runs and redirects its standard output and standard error to the file /var/log/osgmonitoring.log. You can change the interval here. Make sure that one run has time to finish before the next one is launched. Monitoring should not occur too often; otherwise it perturbs the system and distorts the results. It is set to once every 5 minutes by default.


The perl script that does the job. It is called by cron (as root). What it does:

1) It reads the /etc/osgmonitoring.conf file; if it can not, it exits with an error, which goes to standard output (all output goes to standard output until stated otherwise below).

2) If there is a filter for a variable (a regular expression that decides whether a value is valid), it applies the filter and assigns the value to the variable only if the value passes.

3) It checks that all variables were read properly; otherwise it exits with an error.

4) It checks that ps and lsof exist and are readable and executable.

5) If the database is configured to be used, it tries to connect to it and exits with an error if it can not.

6) From this point on, it writes its messages (including errors) to the syslog facility.

7) It reads the /etc/osgmonitoring-procs_to_watch.conf file and, line by line:

- reads the processes to be monitored

- gets the information for them and sends it to the syslog facility and, if the database is configured to be used, to the database. This is the place where you would write the results to a different destination. A GRATIA DB?

The list of open files is obtained with the lsof command after the first command has run, so a process may already be dead by the time lsof is executed. In this case you will get a -1 for the value. Very short-lived processes show this behaviour.

8) It will close the DB connection.
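The -1 value mentioned in step 7 can be illustrated with a small sketch. It is written in Python, not taken from the Perl script, and it assumes that the open files are counted from the output of "lsof -p PID" (one header line plus one line per open file). The function name count_open_files and the two fake runners, which stand in for real lsof calls so the example runs without lsof installed, are hypothetical.

```python
import subprocess
from types import SimpleNamespace

def count_open_files(pid, lsof="/usr/sbin/lsof", run=subprocess.run):
    """Count open files of a process; return -1 if it can no longer be inspected."""
    try:
        result = run([lsof, "-p", str(pid)], capture_output=True, text=True)
    except OSError:
        return -1  # lsof itself could not be executed
    lines = result.stdout.splitlines()
    if len(lines) < 2:
        return -1  # no output beyond the header: the process is gone
    return len(lines) - 1  # the first line is the column header

# Fake runners used only for this illustration.
def fake_run_gone(cmd, **kwargs):
    return SimpleNamespace(stdout="", returncode=1)

def fake_run_alive(cmd, **kwargs):
    out = "COMMAND PID USER FD\nperl 4242 root cwd\nperl 4242 root txt\n"
    return SimpleNamespace(stdout=out, returncode=0)

print(count_open_files(4242, run=fake_run_gone))   # -1
print(count_open_files(4242, run=fake_run_alive))  # 2
```

A process that exits between the ps sample and the lsof call falls into the first case, which is why very short-lived processes tend to be logged with -1.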

Topic attachments
- osgmonitoring-2.10.0-1.noarch.rpm (7.3 K, 2008/10/02 - 18:12, ToniCoarasa): The rpm containing the files for the OSG monitoring package
- osgcetest.tar.Z (4.0 K, 2009/04/07 - 11:45, ToniCoarasa): osgcetest package with scripts to run the jobs
Topic revision: r8 - 2009/04/07 - 11:46:47 - ToniCoarasa