--
JohnWeigand - 2010/09/21 - test results:
I did not know how to reproduce the condition that triggers this specific type of exception.
Tests I did perform:
*1. Stopped the globus-gatekeeper on the entry CE.*
Results - Only 1 glidein was started on the WMS Collector. This message in the client log file indicated the resource was down:
In submit_20100921_cms_jgw-v2_4_3.main.log.. 026 (004.000.000) 09/21 08:59:38 Detected Down Grid Resource GridResource: gt2 gr9x0.fnal.gov/jobmanager-condor
However, at 09:38, this 'tuple index out of range' stack trace appeared in the log files and did not appear again over the next 1 hour 30 minutes. I do not believe it is related to this issue. It appeared in the factory log files (not the client logs):
- glidein_v2_4_3/log/entry_ress_ITB_GRATIA_TEST_2/factory.20100921.info.log
- glidein_v2_4_3/log/entry_ress_ITB_GRATIA_TEST_2/factory.20100921.err.log
[2010-09-21T09:38:37-05:00 14896] WARNING: Exception occurred: ['Traceback (most recent call last):\n', ' File "/home/weigand/glidein/glideinWMS.v2_4_3_alpha_1/factory/glideFactoryEntry.py", line 453, in iterate\n write_stats()\n', ' File "/home/weigand/glidein/glideinWMS.v2_4_3_alpha_1/factory/glideFactoryEntry.py", line 357, in write_stats\n glideFactoryLib.factoryConfig.log_stats.write_file()\n', ' File "/home/weigand/glidein/glideinWMS.v2_4_3_alpha_1/factory/glideFactoryMonitoring.py", line 870, in write_file\n diff_summary=self.get_diff_summary()\n', ' File "/home/weigand/glidein/glideinWMS.v2_4_3_alpha_1/factory/glideFactoryMonitoring.py", line 795, in get_diff_summary\n sdel[4][\'username\']=username\n', 'IndexError: tuple index out of range\n']
2. Started the entry CE globus-gatekeeper at 10:40. %BR% Results: It processed the user jobs successfully. I then shut down the entry CE gatekeeper. It recognized the down resource correctly. The glidein pilots on the WMS Collector continued running. I did not get the warning message again.
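
For reference, here is a minimal sketch of how the IndexError in the traceback under test 1 can arise and how it could be guarded against. It assumes only what the traceback shows: that `sdel` was a tuple and that the code assigned to `sdel[4]['username']`. The function name `annotate_usernames`, the sample data, and the placeholder username are hypothetical. The real layout of `sdel` in glideFactoryMonitoring.py is not visible in the log, so this is an illustration of the failure mode and not a proposed patch.

```python
def annotate_usernames(entries, username):
    """Set 'username' on the dict at index 4 of each entry.

    Entries that are too short (or whose fifth element is not a dict)
    are collected and returned instead of raising IndexError.
    """
    skipped = []
    for sdel in entries:
        # Guard: sdel[4] raises IndexError when the tuple has fewer than 5 elements.
        if len(sdel) > 4 and isinstance(sdel[4], dict):
            sdel[4]['username'] = username
        else:
            skipped.append(sdel)
    return skipped


# Hypothetical sample data; the actual contents of these tuples are an assumption.
entries = [
    (1, 2, 3, 4, {}),  # well-formed: index 4 exists and holds a dict
    (1, 2, 3, 4),      # short tuple: an unguarded sdel[4] would fail here
]
skipped = annotate_usernames(entries, "weigand")  # placeholder username
print(entries[0][4])
print(skipped)
```

If the one-time warning was caused by a short entry like the second one above, logging the skipped entries would help identify which monitoring record was incomplete at 09:38.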