Difference: OsgmonitoringrpmPackageDescription (1 vs. 8)

Revision 8 2009/04/07 - Main.ToniCoarasa

Line: 1 to 1
 
META TOPICPARENT name="OSGCESubmitterTestingAndMonitoring"

osgmonitoring.rpm Package Description

Line: 117 to 117
 8) It will close the DB connection.
Deleted:
<
<

Installation Instructions for dummies

Installation on client and server

ssh root@uaf-2
wget http://hepuser.ucsd.edu/twiki2/pub/UCSDTier2/OsgmonitoringrpmPackageDescription/osgmonitoring-2.10.0-1.noarch.rpm
rpm -i osgmonitoring-2.10.0-1.noarch.rpm 

Then you need to change /etc/osgmonitoring.conf to look like the one in uaf-2:/home/users/fkw

Verifying that installation was successful

After 5 minutes, check /var/log/osgmonitoring.log. The monitoring writes both to the local logfile on disk and to the database, as configured in /etc/osgmonitoring.conf.

To see the cacti monitoring for this, check here. Looks like we don't have the right plots in t2sentry0 cacti display any more for the uaf machines. Need to ask Terrence about this.

Running the tests for dummies

Initial comments:

  • the clients are monitored only if the osgmonitoring rpm is installed on that node
  • the tests work even if the clients are not monitored

On uaf-2 (other uaf machines do not have the osgmonitoring rpm installed) you untar [[][this]] tarball. Then:

 
cd osgcetest
./SKEL/SubmitSKELTimes.sh 100 1 
.

This creates 100 jobs that get submitted automatically to the cluster. Everything the script produces goes into a directory named after the current date, inside the current working directory.

Things left to do

  1. create the client side tarball, and document it on this twiki, and attach the tarball to the twiki page.
  2. communicate with Terrence to make sure that we have the client monitoring online on uaf-2, and uaf-1.
  3. document the process of putting monitoring via t2sentry0 and cacti into place

Goal to be finished: Monday April 6th

-- ToniCoarasa - 26 Sep 2008

 
META FILEATTACHMENT attachment="osgmonitoring-2.10.0-1.noarch.rpm" attr="" comment="The rpm containing the files for the OSG monitoring package" date="1222971168" name="osgmonitoring-2.10.0-1.noarch.rpm" path="osgmonitoring-2.10.0-1.noarch.rpm" size="7458" stream="osgmonitoring-2.10.0-1.noarch.rpm" tmpFilename="/usr/tmp/CGItemp36473" user="ToniCoarasa" version="1"
Changed:
<
<
META FILEATTACHMENT attachment="osgcetest.tar.Z" attr="" comment="osgcetest package with scripts to run the jobs" date="1238776065" name="osgcetest.tar.Z" path="osgcetest.tar.Z" size="3729" stream="osgcetest.tar.Z" tmpFilename="/usr/tmp/CGItemp16384" user="ToniCoarasa" version="1"
>
>
META FILEATTACHMENT attachment="osgcetest.tar.Z" attr="" comment="osgcetest package with scripts to run the jobs" date="1239104731" name="osgcetest.tar.Z" path="osgcetest.tar.Z" size="4063" stream="osgcetest.tar.Z" tmpFilename="/usr/tmp/CGItemp19287" user="ToniCoarasa" version="2"

Revision 7 2009/04/03 - Main.ToniCoarasa

Line: 1 to 1
 
META TOPICPARENT name="OSGCESubmitterTestingAndMonitoring"

osgmonitoring.rpm Package Description

Line: 154 to 154
 

Things left to do

Added:
>
>
 
  1. create the client side tarball, and document it on this twiki, and attach the tarball to the twiki page.
  2. communicate with Terrence to make sure that we have the client monitoring online on uaf-2, and uaf-1.
  3. document the process of putting monitoring via t2sentry0 and cacti into place
Line: 162 to 164
  -- ToniCoarasa - 26 Sep 2008
Added:
>
>
 
META FILEATTACHMENT attachment="osgmonitoring-2.10.0-1.noarch.rpm" attr="" comment="The rpm containing the files for the OSG monitoring package" date="1222971168" name="osgmonitoring-2.10.0-1.noarch.rpm" path="osgmonitoring-2.10.0-1.noarch.rpm" size="7458" stream="osgmonitoring-2.10.0-1.noarch.rpm" tmpFilename="/usr/tmp/CGItemp36473" user="ToniCoarasa" version="1"
Added:
>
>
META FILEATTACHMENT attachment="osgcetest.tar.Z" attr="" comment="osgcetest package with scripts to run the jobs" date="1238776065" name="osgcetest.tar.Z" path="osgcetest.tar.Z" size="3729" stream="osgcetest.tar.Z" tmpFilename="/usr/tmp/CGItemp16384" user="ToniCoarasa" version="1"

Revision 6 2009/03/30 - Main.FkW

Line: 1 to 1
 
META TOPICPARENT name="OSGCESubmitterTestingAndMonitoring"

osgmonitoring.rpm Package Description

Line: 116 to 116
  8) It will close the DB connection.
Added:
>
>

Installation Instructions for dummies

Installation on client and server

ssh root@uaf-2
wget http://hepuser.ucsd.edu/twiki2/pub/UCSDTier2/OsgmonitoringrpmPackageDescription/osgmonitoring-2.10.0-1.noarch.rpm
rpm -i osgmonitoring-2.10.0-1.noarch.rpm 

Then you need to change /etc/osgmonitoring.conf to look like the one in uaf-2:/home/users/fkw

Verifying that installation was successful

After 5 minutes, check /var/log/osgmonitoring.log. The monitoring writes both to the local logfile on disk and to the database, as configured in /etc/osgmonitoring.conf.

To see the cacti monitoring for this, check here. Looks like we don't have the right plots in t2sentry0 cacti display any more for the uaf machines. Need to ask Terrence about this.

Running the tests for dummies

Initial comments:

  • the clients are monitored only if the osgmonitoring rpm is installed on that node
  • the tests work even if the clients are not monitored

On uaf-2 (other uaf machines do not have the osgmonitoring rpm installed) you untar [[][this]] tarball. Then:

 
cd osgcetest
./SKEL/SubmitSKELTimes.sh 100 1 
.

This creates 100 jobs that get submitted automatically to the cluster. Everything the script produces goes into a directory named after the current date, inside the current working directory.

Things left to do

  1. create the client side tarball, and document it on this twiki, and attach the tarball to the twiki page.
  2. communicate with Terrence to make sure that we have the client monitoring online on uaf-2, and uaf-1.
  3. document the process of putting monitoring via t2sentry0 and cacti into place

Goal to be finished: Monday April 6th

 -- ToniCoarasa - 26 Sep 2008

META FILEATTACHMENT attachment="osgmonitoring-2.10.0-1.noarch.rpm" attr="" comment="The rpm containing the files for the OSG monitoring package" date="1222971168" name="osgmonitoring-2.10.0-1.noarch.rpm" path="osgmonitoring-2.10.0-1.noarch.rpm" size="7458" stream="osgmonitoring-2.10.0-1.noarch.rpm" tmpFilename="/usr/tmp/CGItemp36473" user="ToniCoarasa" version="1"

Revision 5 2009/01/03 - Main.ToniCoarasa

Line: 1 to 1
 
META TOPICPARENT name="OSGCESubmitterTestingAndMonitoring"

osgmonitoring.rpm Package Description

Line: 24 to 24
 /etc/cron.d/osgmonitoring.cron /usr/local/bin/osgmonitoring.pl
Changed:
<
<
Comment: The log file /var/log/osgmonitoring.log will be created but is not part or the distribution.
>
>
Comment: The log file /var/log/osgmonitoring.log will be created but is not part of the distribution.
 

Contents explained

/etc/osgmonitoring.conf

Line: 54 to 54
 

/etc/osgmonitoring-procs_to_watch.conf

Changed:
<
<
This file holds the configuration of the processes to monitor. An Example of the file:
>
>
This file holds the configuration of the processes to monitor. An Example of the file:
 
#A line starting with # is a comment
#NameToBePutToTheDB,RegularExpressionToFindTheProcesses

Revision 4 2008/10/02 - Main.ToniCoarasa

Line: 1 to 1
 
META TOPICPARENT name="OSGCESubmitterTestingAndMonitoring"

osgmonitoring.rpm Package Description

Added:
>
>
osgmonitoring-2.10.0-1.noarch.rpm: The rpm containing the files for the OSG monitoring package
 
Added:
>
>
Note: At installation time, the rpm looks for the lowest-numbered local facility still available in the syslog configuration (/etc/syslog.conf). It adds an entry there that sends the osgmonitoring messages to the file /var/log/osgmonitoring.log, and writes the chosen facility into /etc/osgmonitoring.conf.

You may edit those two files to suit your needs, but a reinstallation may reset them to the original values.

 

What it does

It monitors processes (using ps and lsof) associated to the OSG infrastructure and logs their memory consumption, CPU utilization... through the syslog facility to a file /var/log/osgmonitoring.log and to a DB if configured.

Dependencies

Changed:
<
<
The rpm requires: perl, perl-DBI, lsof.
>
>
The rpm requires: perl, perl-DBI, lsof.
 

Contents of the package

/etc/osgmonitoring.conf
Line: 83 to 89
 

/etc/cron.d/osgmonitoring.cron

Changed:
<
<
File holding when the script will run. You can change the interval here. Be careful that it has time to finish before launching it again. Monitoring should not occur too often. Otherwise it modifies the results from the unperturbed system. It is set to once every 5 minutes by default.
>
>
File holding when the script will run and directing the output of the standard out and error to the file /var/log/osgmonitoring.log. You can change the interval here. Be careful that it has time to finish before launching it again. Monitoring should not occur too often. Otherwise it modifies the results from the unperturbed system. It is set to once every 5 minutes by default.
 

/usr/local/bin/osgmonitoring.pl

The perl script that does the job. It is called by cron (as root). What it does:

Changed:
<
<
1) It will read the /etc/osgmonitoring.conf file, otherwise exit with error.
>
>
1) It will read the /etc/osgmonitoring.conf file, otherwise exit with error which goes to the standard out (all output goes to standard out till stated later).
  2) If there is a filter (which holds the regular expression that filters if it is a valid value) for the variable, it will apply the filter and if it passes the filter, it will assign the value to the variable.
Line: 112 to 118
 8) It will close the DB connection.

-- ToniCoarasa - 26 Sep 2008 \ No newline at end of file

Added:
>
>

META FILEATTACHMENT attachment="osgmonitoring-2.10.0-1.noarch.rpm" attr="" comment="The rpm containing the files for the OSG monitoring package" date="1222971168" name="osgmonitoring-2.10.0-1.noarch.rpm" path="osgmonitoring-2.10.0-1.noarch.rpm" size="7458" stream="osgmonitoring-2.10.0-1.noarch.rpm" tmpFilename="/usr/tmp/CGItemp36473" user="ToniCoarasa" version="1"

Revision 3 2008/09/30 - Main.ToniCoarasa

Line: 1 to 1
 
META TOPICPARENT name="OSGCESubmitterTestingAndMonitoring"

osgmonitoring.rpm Package Description

Line: 9 to 9
 It monitors processes (using ps and lsof) associated to the OSG infrastructure and logs their memory consumption, CPU utilization... through the syslog facility to a file /var/log/osgmonitoring.log and to a DB if configured.

Dependencies

Changed:
<
<
The rpm needs: perl perl-DBI lsof
>
>
The rpm requires: perl, perl-DBI, lsof.
 

Contents of the package

/etc/osgmonitoring.conf

Revision 2 2008/09/26 - Main.ToniCoarasa

Line: 1 to 1
 
META TOPICPARENT name="OSGCESubmitterTestingAndMonitoring"

osgmonitoring.rpm Package Description

Changed:
<
<

What it does:

>
>

What it does

  It monitors processes (using ps and lsof) associated to the OSG infrastructure and logs their memory consumption, CPU utilization... through the syslog facility to a file /var/log/osgmonitoring.log and to a DB if configured.

Dependencies

The rpm needs: perl perl-DBI lsof

Changed:
<
<

Contents of the package:

>
>

Contents of the package

 
/etc/osgmonitoring.conf
/etc/osgmonitoring-procs_to_watch.conf
/etc/logrotate.d/osgmonitoring
Line: 17 to 19
 /usr/local/bin/osgmonitoring.pl

Comment: The log file /var/log/osgmonitoring.log will be created but is not part or the distribution.

Changed:
<
<

Contents explained:

>
>

Contents explained

 

/etc/osgmonitoring.conf

Changed:
<
<
Contains the basic configuration of the package: An example of the file:
>
>
Contains the basic configuration of the package. An example of the file:
 
#Syslog local facility
#Configure it too in the /etc/syslog.conf
syslogFacility = Substitute_It_By_A_Valid_syslogFacility
Line: 40 to 42
  The variable syslogFacility has to point to a valid syslog facility. The rpm post install process will figure out which one is available and define it to point to a file: /var/log/osgmonitoring.log. Otherwise you have to do it.
Changed:
<
<
The ExecutableFile? _{ps,lsof} variables have to point to where the executables exist.
>
>
The ExecutableFile _{ps,lsof} variables have to point to where the executables exist.
  The DB parameters are not needed, but you can put them and it will use the DB too. The Database_password variable has to contain the double quotes (") that do NOT belong to the password.

/etc/osgmonitoring-procs_to_watch.conf

Added:
>
>
This file holds the configuration of the processes to monitor.
 An Example of the file:

#A line starting with # is a comment
Line: 66 to 69
 condor_gridmanager,.*condor_gridmanager.* gahp_server,.*gahp_server.*
Deleted:
<
<
This file holds the configuration of the processes to monitor.
 It consists of lines of two strings separated by a comma.

The first string is the name by which you tag the results.

Line: 100 to 101
  6) It will write the messages (including errors) from now on, to the syslog facility.
Changed:
<
<
7) It will read the /etc/osgmonitoring-procs_to_watch.conf and line by line - read the processes to be monitored - Get the information for them and put it to the syslog facility and, if the Database is configured to be used, to the Database. It would be here, where you would write the results to a different place. A GRATIA DB?
>
>
7) It will read the /etc/osgmonitoring-procs_to_watch.conf and line by line

- read the processes to be monitored

- Get the information for them and put it to the syslog facility and, if the Database is configured to be used, to the Database. It would be here, where you would write the results to a different place. A GRATIA DB?

  The list of open files is taken with the lsof command after the first query, so a process may already be dead by the time lsof runs. In that case you will get a -1 for the value. Very short-lived processes show this behaviour.

Revision 1 2008/09/26 - Main.ToniCoarasa

Line: 1 to 1
Added:
>
>
META TOPICPARENT name="OSGCESubmitterTestingAndMonitoring"

osgmonitoring.rpm Package Description

What it does:

It monitors processes (using ps and lsof) associated to the OSG infrastructure and logs their memory consumption, CPU utilization... through the syslog facility to a file /var/log/osgmonitoring.log and to a DB if configured.

Dependencies

The rpm needs: perl perl-DBI lsof

Contents of the package:

/etc/osgmonitoring.conf
/etc/osgmonitoring-procs_to_watch.conf
/etc/logrotate.d/osgmonitoring
/etc/cron.d/osgmonitoring.cron
/usr/local/bin/osgmonitoring.pl

Comment: The log file /var/log/osgmonitoring.log will be created but is not part or the distribution.

Contents explained:

/etc/osgmonitoring.conf

Contains the basic configuration of the package: An example of the file:

#Syslog local facility
#Configure it too in the /etc/syslog.conf
syslogFacility = Substitute_It_By_A_Valid_syslogFacility
# Database Parameters
useDB = no
Database_user = cactisource
#The pass variable has to be in quotes "mypassword"
Database_password = "MyPasswd"
Database_db = MyDatabase
Database_server = myDatabaseServer.MyDomain
Database_TABLE = osgmonitoring
# System parameters
ExecutableFile_ps = /bin/ps
ExecutableFile_lsof = /usr/sbin/lsof

The variable syslogFacility has to point to a valid syslog facility. The rpm post install process will figure out which one is available and define it to point to a file: /var/log/osgmonitoring.log. Otherwise you have to do it.

The ExecutableFile? _{ps,lsof} variables have to point to where the executables exist.

The DB parameters are not needed, but you can put them and it will use the DB too. The Database_password variable has to contain the double quotes (") that do NOT belong to the password.
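The "name = value" layout and the quoting rule for Database_password can be illustrated with a short Python sketch. This is not the package's Perl code: the function names parse_conf and database_password are hypothetical, and the facility value local3 in the sample is only a placeholder.

```python
def parse_conf(text):
    """Parse 'name = value' lines; lines starting with '#' are comments."""
    settings = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        name, sep, value = line.partition("=")
        if not sep:
            continue  # not a 'name = value' line, ignore it in this sketch
        settings[name.strip()] = value.strip()
    return settings


def database_password(settings):
    """Return the password without the enclosing double quotes.

    The quotes are required in the file but do not belong to the password.
    """
    value = settings["Database_password"]
    if len(value) >= 2 and value[0] == '"' and value[-1] == '"':
        return value[1:-1]
    raise ValueError("Database_password must be enclosed in double quotes")


# Sample input; 'local3' is a placeholder, not a value taken from the package.
SAMPLE = """\
#Syslog local facility
syslogFacility = local3
# Database Parameters
useDB = no
Database_password = "MyPasswd"
ExecutableFile_ps = /bin/ps
"""

settings = parse_conf(SAMPLE)
```

With this sample, database_password(settings) yields MyPasswd without the quotes, and settings["useDB"] is the string no.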

/etc/osgmonitoring-procs_to_watch.conf

An Example of the file:

#A line starting with # is a comment
#NameToBePutToTheDB,RegularExpressionToFindTheProcesses
#Here you can not use the # character at the beginning of the line for the NameToBePutToTheDB
condor_master,.*condor_master.*
condor_procd,.*condor_procd.*
condor_schedd,.*condor_schedd.*
condor_shadow,.*condor_shadow.*
condor_starter,.*condor_starter.*
globus-job-manager,globus-job-manager.*
globus-job-manager-script-real,/usr/bin/perl /osglocal/osgce/globus/libexec/globus-job-manager-script-real\.pl.*
globus-job-manager-script,/bin/sh /osglocal/osgce/globus/libexec/globus-job-manager-script\.pl.*
grid_manager_monitor_agent,perl .*grid_manager_monitor_agent.*
grid-monitor,perl /osglocal.*grid-monitor.*/grid-monitor-job-status.*
globus-url-copy,.*globus-url-copy.*
#Specially in the submitter
condor_gridmanager,.*condor_gridmanager.*
gahp_server,.*gahp_server.*

This file holds the configuration of the processes to monitor.

It consists of lines of two strings separated by a comma.

The first string is the name by which you tag the results.

The second is a regular expression the program will apply to the output of ps (actually the column containing the command) to get all the processes that match the expression and group them together as one entry with the previously stated name. The comma cannot appear in the regular expression (probably). You have to escape the . if it is meant literally.

You can add more lines to the file (with this format) to monitor more processes. The ones given in the file as a default are for a condor based OSG Computing Element or submitter.
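As a rough illustration of the format, the following Python sketch splits each line at the first comma and groups command strings under the tag whose expression matches. It is not the package's Perl code: the function names are hypothetical, the command strings are invented sample data, and the use of an unanchored search is an assumption about how the expressions are applied.

```python
import re


def parse_watch_list(text):
    """Return (name, compiled_regex) pairs from 'name,regex' lines."""
    entries = []
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # blank line or comment
        # Split at the first comma only: the name comes first, the rest is the regex.
        name, sep, pattern = line.partition(",")
        if not sep:
            continue  # malformed line, skipped in this sketch
        entries.append((name, re.compile(pattern)))
    return entries


def group_commands(entries, commands):
    """Count how many command strings match each entry's expression."""
    counts = {name: 0 for name, _ in entries}
    for command in commands:
        for name, regex in entries:
            if regex.search(command):  # assumption: unanchored match
                counts[name] += 1
    return counts


WATCH = """\
#A line starting with # is a comment
condor_master,.*condor_master.*
gahp_server,.*gahp_server.*
"""

# Invented stand-ins for the command column of ps.
COMMANDS = [
    "/usr/sbin/condor_master -f",
    "/usr/sbin/gahp_server",
    "/usr/sbin/gahp_server",
    "sshd: root",
]

counts = group_commands(parse_watch_list(WATCH), COMMANDS)
```

Here the two gahp_server commands end up grouped under the single tag gahp_server, which mirrors the "group them together as one entry" behaviour described above.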

/etc/logrotate.d/osgmonitoring

File to make the log rotation happen.

/etc/cron.d/osgmonitoring.cron

File holding when the script will run. You can change the interval here. Be careful that it has time to finish before launching it again. Monitoring should not occur too often. Otherwise it modifies the results from the unperturbed system. It is set to once every 5 minutes by default.

/usr/local/bin/osgmonitoring.pl

The perl script that does the job. It is called by cron (as root). What it does:

1) It will read the /etc/osgmonitoring.conf file, otherwise exit with error.

2) If there is a filter (which holds the regular expression that filters if it is a valid value) for the variable, it will apply the filter and if it passes the filter, it will assign the value to the variable.

3) It will check that all variables were read properly otherwise exit with error.

4) It will check that ps and lsof are there and are readable and executable.

5) If the Database is configured to be used, it will try to connect to it and exit with an error if it can not.

6) It will write the messages (including errors) from now on, to the syslog facility.

7) It will read the /etc/osgmonitoring-procs_to_watch.conf and line by line - read the processes to be monitored - Get the information for them and put it to the syslog facility and, if the Database is configured to be used, to the Database. It would be here, where you would write the results to a different place. A GRATIA DB?

The list of open files is taken with the lsof command after the first query, so a process may already be dead by the time lsof runs. In that case you will get a -1 for the value. Very short-lived processes show this behaviour.

8) It will close the DB connection.
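Two of the steps above, the executable check in step 4 and the -1 value for processes that vanish before lsof runs, can be sketched in Python as follows. This is only an illustration under stated assumptions, not the Perl script's logic: the function names are hypothetical, and counting the lines of "lsof -p PID" minus its header line is an assumed way to obtain the number of open files.

```python
import os
import subprocess


def is_usable(path):
    """Step 4: the file exists and is both readable and executable."""
    return os.path.isfile(path) and os.access(path, os.R_OK | os.X_OK)


def count_open_files(pid, lsof_path="/usr/sbin/lsof", run=subprocess.run):
    """Return the number of open files lsof lists for pid, or -1.

    -1 stands for 'the process was already gone when lsof ran'.
    The 'run' parameter exists only so the function can be tried
    without a real lsof binary.
    """
    result = run([lsof_path, "-p", str(pid)], capture_output=True, text=True)
    lines = result.stdout.splitlines()
    # lsof exits non-zero when it finds nothing for the requested pid.
    if result.returncode != 0 or len(lines) < 2:
        return -1
    return len(lines) - 1  # the first line is the column header
```

Calling count_open_files(os.getpid()) on a machine where lsof is installed at the configured path returns a positive count, while a pid that no longer exists gives -1.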

-- ToniCoarasa - 26 Sep 2008

 