OSG Bdii Testing

Introduction

A series of tests was run to better understand how OSG's central US server responds to a high volume of bdii queries. The tests use the script bdii_par.py (from bdiisrc.tar.gz). One of the main goals is to measure the failure rate of bdii queries as a function of the query rate. We also measure the rates of both successful and failed queries as a function of the number of script instances running.

The tarball can be found at: /pnfs/t2.ucsd.edu/data4/cms/phedex/store/user/spadhi/osg/bdii/bdiisrc.tar.gz

Implementation

The script bdii_par.py is configured to send 15 queries at a time to the server is-dev.grid.iu.edu:2170. It waits for the queries to return and counts the successful and failed queries, repeating until a designated total run time has elapsed.
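
The batch-and-count loop described above can be sketched as follows. This is a minimal illustration and not the code of bdii_par.py: the names run_batches and query, and the use of a thread pool, are assumptions made here. The query callable stands in for a single bdii lookup against is-dev.grid.iu.edu:2170 and is assumed to return True on success.

```python
import time
from concurrent.futures import ThreadPoolExecutor

BATCH_SIZE = 15  # queries sent at a time, as stated in the text


def run_batches(query, duration_s, batch_size=BATCH_SIZE):
    """Send batches of queries until duration_s has elapsed.

    `query` is a hypothetical callable returning True on success; an
    exception or a falsy return value is counted as a failure.
    Returns (passed, failed, elapsed_seconds).
    """
    def safe_query(_):
        try:
            return bool(query())
        except Exception:
            return False

    passed = failed = 0
    start = time.monotonic()
    with ThreadPoolExecutor(max_workers=batch_size) as pool:
        while time.monotonic() - start < duration_s:
            # Submit one full batch and wait for every query to return.
            results = list(pool.map(safe_query, range(batch_size)))
            passed += sum(results)
            failed += batch_size - sum(results)
    return passed, failed, time.monotonic() - start
```

In this sketch the clock is checked only between batches, so the elapsed time always slightly exceeds the requested duration.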

To measure the success and failure rates, we vary the number of script instances running simultaneously. For all of the tests, the instances are distributed over five 8-core machines at the UCSD T2. We ran 5, 10, 15, 20, 25, and 30 instances, so at most 6 instances run on each 8-core machine. Each of these points was run for 12 hours; the 5, 15, 20, and 30 points were also run for shorter periods (2 or 3 hours), as listed in the run summary table.
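
The arithmetic behind these numbers can be checked with a short sketch. It assumes the instances are spread as evenly as possible over the five machines (the text does not state the exact placement), and the function names are hypothetical.

```python
QUERIES_PER_INSTANCE = 15  # each script instance sends 15 queries at a time
N_MACHINES = 5             # five 8-core machines at the UCSD T2


def spread_instances(n_instances, n_machines=N_MACHINES):
    """Spread instances as evenly as possible (an assumption of this sketch).

    Returns a list with the number of instances on each machine.
    """
    base, extra = divmod(n_instances, n_machines)
    return [base + (1 if i < extra else 0) for i in range(n_machines)]


def queries_in_flight(n_instances):
    """Upper bound on simultaneous queries across all instances."""
    return n_instances * QUERIES_PER_INSTANCE


for n in (5, 10, 15, 20, 25, 30):
    per_machine = spread_instances(n)
    print(n, per_machine, queries_in_flight(n))
```

At the largest point, 30 instances, this gives 6 instances per machine and up to 450 queries outstanding at once.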

Results

Summary of Results

Our main findings are:

  • The number of failures (and the failure rate) increases as the number of script instances increases.
  • There are no failures with fewer than 10 instances.
  • There are no failures at query rates below 10 Hz.
  • The total rate of queries never exceeds 20 Hz. As we run more simultaneous queries, the queries take longer and a larger fraction of them fail.

Detailed Results

Run Summary Table

The following table summarizes the results of the different runs. Table entries are the total of all the processes (script instances) in a given run, unless the column says average, in which case it is the average of the different processes.

'Input time' is the script parameter that sets how long the script should run. 'Average run time' is the average of the actual run times reported by the script instances. 'Average pass (fail) rate' is 'Total queries passed (failed)' divided by 'Average run time'; it is therefore the combined rate over all processes in the run, not a per-process rate.
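
As a check of this definition, the short sketch below recomputes two table entries from the totals. The function name rate_hz is hypothetical; the input numbers are taken from the table.

```python
def rate_hz(total_queries, average_run_time_s):
    """Rate in Hz: total queries over all processes / average run time."""
    return total_queries / average_run_time_s


# 5 processes, 2 h input time: 70095 queries passed in 7204 s on average
print(round(rate_hz(70095, 7204), 2))   # 9.73

# 10 processes, 12 h input time: 5 queries failed in 43218 s on average
print(round(rate_hz(5, 43218), 5))      # 0.00012
```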

| Num processes | Input time (h) | Average run time (s) | Total queries passed | Average pass rate (Hz) | Total queries failed | Average fail rate (Hz) |
| 5  | 2  | 7204  | 70095  | 9.73  | 0    | 0       |
| 5  | 12 | 43202 | 425055 | 9.84  | 0    | 0       |
| 10 | 12 | 43218 | 635995 | 14.7  | 5    | 0.00012 |
| 15 | 3  | 10872 | 160183 | 14.7  | 47   | 0.0043  |
| 15 | 12 | 43221 | 820458 | 19.0  | 102  | 0.0023  |
| 20 | 3  | 10847 | 161719 | 14.90 | 176  | 0.016   |
| 20 | 12 | 43312 | 834140 | 19.3  | 775  | 0.017   |
| 25 | 12 | 43380 | 666060 | 15.4  | 1665 | 0.038   |
| 30 | 2  | 7337  | 109202 | 14.9  | 388  | 0.052   |
| 30 | 12 | 43464 | 659036 | 15.16 | 2419 | 0.055   |
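
The statement that a larger fraction of queries fails as more instances run can be checked directly from the table. The sketch below computes failed / (passed + failed) for the 12-hour runs; the numbers are copied from the table, and the function name is hypothetical.

```python
# (num processes, total queries passed, total queries failed), 12-hour runs
RUNS_12H = [
    (5, 425055, 0),
    (10, 635995, 5),
    (15, 820458, 102),
    (20, 834140, 775),
    (25, 666060, 1665),
    (30, 659036, 2419),
]


def failure_fraction(passed, failed):
    """Fraction of all completed queries that failed."""
    return failed / (passed + failed)


for n_proc, passed, failed in RUNS_12H:
    print(f"{n_proc:2d} processes: {failure_fraction(passed, failed):.2e}")
```

For these runs the failure fraction rises with every increase in the number of processes, even though the total pass rate peaks at 15 to 20 processes and then drops.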

Failure rate versus Number of processes

The plot below shows the failure rate as a function of the total number of script instances running concurrently. These data appear in the first and last columns of the table above. Recall that each process is an instance of our script, and each instance runs 15 queries in parallel.

bdii_fails_v_proc.jpg

Script Instances Histogram

The histograms below show the rates of the individual script instances. The top histogram shows each instance's success rate: the number of queries passed by that instance divided by its total run time. The bottom histogram shows each instance's failure rate: the number of failed queries divided by that instance's total run time. In both histograms, the color indicates the total number of script instances running in that run, as shown in the legend. The different distributions are overlaid, not stacked.

bdii_query_data.jpg

-- WarrenAndrews - 2008/12/18

Topic revision: r6 - 2008/12/22 - 06:38:06 - FkW
 