osgmonitoring.rpm Package Description

osgmonitoring-2.10.0-1.noarch.rpm: The rpm containing the files for the OSG monitoring package

Note: At installation time, the rpm looks in the syslog configuration (/etc/syslog.conf) for the lowest-numbered local facility that is available. It adds an entry that sends the osgmonitoring messages for that facility to the file /var/log/osgmonitoring.log and writes the chosen facility to the file /etc/osgmonitoring.conf.

You may edit those two files to suit your needs, but reinstalling the rpm may reset them to the original values.
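The facility selection described in the note can be illustrated with a short sketch. It is not the rpm's actual post-install code: it assumes that "available" means "not referenced anywhere in /etc/syslog.conf", and the function name and the sample syslog.conf content are made up for the illustration.

```python
import re


def lowest_free_local_facility(syslog_conf_text):
    """Return the lowest 'localN' (N in 0..7) not referenced in the text, or None.

    Assumption: a facility counts as available when no uncommented line
    of the syslog configuration mentions it.
    """
    used = set()
    for line in syslog_conf_text.splitlines():
        line = line.split("#", 1)[0]  # ignore comments
        used.update(int(n) for n in re.findall(r"\blocal([0-7])\b", line))
    for n in range(8):
        if n not in used:
            return f"local{n}"
    return None


# Hypothetical syslog.conf content, for illustration only
sample = """\
# Save boot messages also to boot.log
local7.*    /var/log/boot.log
local0.*    /var/log/other.log
"""

facility = lowest_free_local_facility(sample)
# The kind of line that would then be appended to /etc/syslog.conf
print(f"{facility}.*\t/var/log/osgmonitoring.log")
```

With this sample, local0 and local7 are taken, so the sketch picks local1.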

What it does

It monitors the processes associated with the OSG infrastructure (using ps and lsof) and logs their memory consumption, CPU utilization, and so on through the syslog facility to the file /var/log/osgmonitoring.log and, if configured, to a DB.

Dependencies

The rpm requires: perl, perl-DBI, lsof.

Contents of the package

/etc/osgmonitoring.conf
/etc/osgmonitoring-procs_to_watch.conf
/etc/logrotate.d/osgmonitoring
/etc/cron.d/osgmonitoring.cron
/usr/local/bin/osgmonitoring.pl

Comment: The log file /var/log/osgmonitoring.log will be created but is not part of the distribution.

Contents explained

/etc/osgmonitoring.conf

Contains the basic configuration of the package. An example of the file:

#Syslog local facility
#Configure it too in the /etc/syslog.conf
syslogFacility = Substitute_It_By_A_Valid_syslogFacility
# Database Parameters
useDB = no
Database_user = cactisource
#The pass variable has to be in quotes "mypassword"
Database_password = "MyPasswd"
Database_db = MyDatabase
Database_server = myDatabaseServer.MyDomain
Database_TABLE = osgmonitoring
# System parameters
ExecutableFile_ps = /bin/ps
ExecutableFile_lsof = /usr/sbin/lsof

The variable syslogFacility has to name a valid syslog facility. The rpm post-install process works out which one is available and configures it to log to the file /var/log/osgmonitoring.log. If that does not happen, you have to configure it yourself.

The ExecutableFile_{ps,lsof} variables have to point to the locations of the ps and lsof executables.

The DB parameters are optional. If you fill them in and configure the package to use the DB, the results are written to the DB as well. The value of Database_password has to be enclosed in double quotes ("), which are NOT part of the password.
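As an illustration of the file format, here is a minimal sketch of a reader for "key = value" lines like the ones above. It is written in Python for illustration, not taken from osgmonitoring.pl; the function name parse_conf is hypothetical, and the handling of blank lines and of lines without "=" is an assumption.

```python
def parse_conf(text):
    """Parse 'key = value' lines into a dict; lines starting with '#' are comments."""
    settings = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        key, sep, value = line.partition("=")
        if not sep:
            continue  # assumption: silently skip lines without '='
        value = value.strip()
        # The surrounding double quotes delimit the value and are not part of it
        if len(value) >= 2 and value[0] == '"' and value[-1] == '"':
            value = value[1:-1]
        settings[key.strip()] = value
    return settings


sample = '''\
# Database Parameters
useDB = no
#The pass variable has to be in quotes "mypassword"
Database_password = "MyPasswd"
# System parameters
ExecutableFile_ps = /bin/ps
'''

conf = parse_conf(sample)
print(conf["Database_password"])   # quotes stripped
print(conf["ExecutableFile_ps"])
```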

/etc/osgmonitoring-procs_to_watch.conf

This file holds the configuration of the processes to monitor. An example of the file:

#A line starting with # is a comment
#NameToBePutToTheDB,RegularExpressionToFindTheProcesses
#Here you can not use the # caracter at the begining of the line for the NameToBePutToTheDB
condor_master,.*condor_master.*
condor_procd,.*condor_procd.*
condor_schedd,.*condor_schedd.*
condor_shadow,.*condor_shadow.*
condor_starter,.*condor_starter.*
globus-job-manager,globus-job-manager.*
globus-job-manager-script-real,/usr/bin/perl /osglocal/osgce/globus/libexec/globus-job-manager-script-real\.pl.*
globus-job-manager-script,/bin/sh /osglocal/osgce/globus/libexec/globus-job-manager-script\.pl.*
grid_manager_monitor_agent,perl .*grid_manager_monitor_agent.*
grid-monitor,perl /osglocal.*grid-monitor.*/grid-monitor-job-status.*
globus-url-copy,.*globus-url-copy.*
#Specially in the submitter
condor_gridmanager,.*condor_gridmanager.*
gahp_server,.*gahp_server.*

It consists of lines, each holding two strings separated by a comma.

The first string is the name by which you tag the results.

The second is a regular expression that the program applies to the output of ps (specifically, the column containing the command). All processes that match the expression are grouped together as one entry under the name given in the first string. The regular expression probably cannot contain a comma. A literal . has to be escaped.

You can add more lines to the file (in this format) to monitor more processes. The defaults shipped in the file are for a Condor-based OSG Computing Element or submitter.
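The grouping described above can be sketched as follows. This is an illustration in Python, not the Perl script itself. Two points are assumptions: the line is split at the first comma, and the expression is anchored at the start of the command (re.match); the text does not say how the real script does either. The function names and the sample command lines are invented for the example.

```python
import re


def parse_watch_list(text):
    """Return a list of (name, compiled_regex) from 'name,regex' lines."""
    entries = []
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # comment or blank line
        name, sep, pattern = line.partition(",")  # assumption: split at first comma
        if sep:
            entries.append((name, re.compile(pattern)))
    return entries


def group_commands(entries, commands):
    """Group command lines under the name of every entry whose regex matches."""
    groups = {name: [] for name, _ in entries}
    for command in commands:
        for name, regex in entries:
            if regex.match(command):  # assumption: anchored at the start
                groups[name].append(command)
    return groups


watch_list = """\
#A line starting with # is a comment
condor_master,.*condor_master.*
globus-job-manager,globus-job-manager.*
"""

# Hypothetical contents of the ps command column
commands = [
    "/usr/sbin/condor_master -f",
    "globus-job-manager -conf x",
    "/bin/sh /tmp/other.sh",
]

groups = group_commands(parse_watch_list(watch_list), commands)
for name, matched in groups.items():
    print(name, len(matched))
```

In a full monitor, the per-process figures (memory, CPU) of each group would then be summed and reported under the group name.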

/etc/logrotate.d/osgmonitoring

The logrotate configuration file that makes the log rotation happen.

/etc/cron.d/osgmonitoring.cron

This file defines when the script runs and redirects its standard output and standard error to the file /var/log/osgmonitoring.log. You can change the interval here, but make sure each run has time to finish before the next one starts. Monitoring should not run too often; otherwise it perturbs the system and distorts the results. The default is once every 5 minutes.

/usr/local/bin/osgmonitoring.pl

The Perl script that does the job. It is called by cron (as root). What it does:

1) It will read the /etc/osgmonitoring.conf file; if it cannot, it will exit with an error, which goes to standard output (all output goes to standard output until stated otherwise below).

2) If a variable has a filter (a regular expression that checks whether a value is valid), it will apply the filter and assign the value to the variable only if the value passes.

3) It will check that all variables were read properly; otherwise it will exit with an error.

4) It will check that ps and lsof are there and are readable and executable.

5) If the Database is configured to be used, it will try to connect to it and exit with an error if it cannot.

6) From now on, it will write its messages (including errors) to the syslog facility.

7) It will read the /etc/osgmonitoring-procs_to_watch.conf file and, line by line:

- read the processes to be monitored

- get the information for them and write it to the syslog facility and, if the Database is configured to be used, to the Database. This is the place where you would write the results to a different destination. A GRATIA DB?

The list of open files is collected with the lsof command after the ps information has been gathered, so a process may already be dead by the time lsof runs. In that case you will get -1 for the value. Very short-lived processes show this behaviour.

8) It will close the DB connection.
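The -1 value mentioned in step 7 can be illustrated with a small sketch. It is written in Python, not taken from the Perl script, and how the real script counts open files is not documented here. The function name count_open_files and the injectable run parameter are hypothetical; the sketch assumes that lsof -p PID prints a header line plus one line per open file and exits with a non-zero status when the process no longer exists.

```python
import subprocess
from types import SimpleNamespace


def count_open_files(pid, lsof="/usr/sbin/lsof", run=subprocess.run):
    """Return the number of open files of pid, or -1 if they cannot be listed."""
    try:
        result = run([lsof, "-p", str(pid)], capture_output=True, text=True)
    except OSError:
        return -1  # lsof itself could not be started
    lines = result.stdout.splitlines()
    # Expect one header line followed by one line per open file
    if result.returncode != 0 or len(lines) < 2:
        return -1  # typically: the process died between ps and lsof
    return len(lines) - 1


# Stand-ins for lsof so the example runs without the real command
def fake_alive(cmd, **kwargs):
    out = "COMMAND PID USER FD\nperl 42 root cwd\nperl 42 root 0u\n"
    return SimpleNamespace(returncode=0, stdout=out)


def fake_dead(cmd, **kwargs):
    return SimpleNamespace(returncode=1, stdout="")


print(count_open_files(42, run=fake_alive))
print(count_open_files(42, run=fake_dead))
```

Passing the runner as a parameter is only a convenience for trying the logic out; a real monitor would call lsof directly.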

Topic attachments
Attachment | Size | Date | Who | Comment
osgcetest.tar.Z | 4.0 K | 2009/04/07 - 11:45 | ToniCoarasa | osgcetest package with scripts to run the jobs
osgmonitoring-2.10.0-1.noarch.rpm | 7.3 K | 2008/10/02 - 18:12 | ToniCoarasa | The rpm containing the files for the OSG monitoring package
Topic revision: r8 - 2009/04/07 - 11:46:47 - ToniCoarasa