BDIItests

Revision 6 - 2008/10/10 - Main.SanjayPadhi
This page documents bdii load tests done by Sanjay Padhi for OSG.

Description of tools used

We use the python interface to ROOT in order to plot results in real time while the test is running, and then store the result histograms in a ROOT file. The logic of the test is as follows:

  • Run N python threads
  • Each thread queries the bdii with:
    • os.popen("ldapsearch -xLLL -p2170 -h is-dev.grid.iu.edu -b o=grid","r")
  • Each thread collects the return, and records the time it took to complete the query.
  • The average return time for the N threads is logged in an hprof histogram after all N threads have returned.
  • Run the next N threads. The N threads are launched at most once per second; however, since each ldapsearch takes several seconds to return, new batches of N threads actually start much less often than once per second.
  • Continue doing this for a fixed amount of time dt
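The loop described above can be sketched in plain python. This is a minimal sketch, not the code actually used: the real test issued the query via os.popen with the ldapsearch command shown above and filled PyROOT histograms, whereas this sketch uses subprocess and plain lists, and the function names are illustrative.

```python
import subprocess
import threading
import time

def timed_query(cmd, results, idx):
    """Run one query command and record (duration, succeeded) for this thread."""
    start = time.time()
    proc = subprocess.run(cmd, shell=True, capture_output=True)
    results[idx] = (time.time() - start, proc.returncode == 0)

def run_batch(cmd, n_threads):
    """Launch N query threads, wait for all, return (average time, #failed)."""
    results = [None] * n_threads
    threads = [threading.Thread(target=timed_query, args=(cmd, results, i))
               for i in range(n_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    avg = sum(r[0] for r in results) / n_threads
    failed = sum(1 for r in results if not r[1])
    return avg, failed

def load_test(cmd, n_threads, dt):
    """Repeat batches for dt seconds, launching at most one batch per second."""
    log = []
    deadline = time.time() + dt
    while time.time() < deadline:
        batch_start = time.time()
        log.append(run_batch(cmd, n_threads))
        # Throttle: batches start at most once per second; slow queries
        # already space them out further than that on their own.
        time.sleep(max(0.0, 1.0 - (time.time() - batch_start)))
    return log
```

For example, the one-hour N=15 run would correspond to something like `load_test('ldapsearch -xLLL -p2170 -h is-dev.grid.iu.edu -b o=grid', 15, 3600)`.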

We then record a few different things:


Final Results

Minimum sustainability:
  • Rate is 20.289463005545787 Hz with 1 failed thread

For the final results, we ran this as follows:
  • Once at CERN for 1h with N=15.
  • 8 instances of the test program run in parallel on our 8-core desktop at CERN.

50 instances run as jobs submitted to various clusters

  • success.gif:
    success.gif

  • failure.gif:
    failure.gif

  • avgqtime.gif:
    avgqtime.gif

  • entries.gif:
    entries.gif

Consistency Checks on these data

It's probably a good idea to do some consistency checks of these data by comparing the entries of the histograms at the same time, and checking if it all makes sense.
    • Between the epoch times 1221857546 and 1221866546 (a span of 9000 seconds, i.e. 2.5 hours) the bdii was basically unusable!
  • The turn-on curve where failures are starting to happen is very sharp at around 200-300 queries per minute, or 15-20 jobs with 15 threads of queries in parallel.
  • Surprisingly enough, the bdii recovers from this after the load subsides.
    • In fact, by around time=700min it has completely recovered and is operating with about 15x15=225 queries in parallel, each query taking about 4 seconds, thus reaching a peak of more than 3000 successful queries per minute.
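The recovery figure is easy to cross-check from the numbers quoted above: 15 jobs of 15 threads each keep 225 queries in flight, and at roughly 4 seconds per query that turns over well above 3000 queries per minute.

```python
# Sanity check of the peak query rate, using the numbers quoted in the text.
parallel_queries = 15 * 15        # 15 jobs, each running 15 query threads
seconds_per_query = 4.0           # observed average ldapsearch response time
queries_per_minute = parallel_queries / seconds_per_query * 60
print(queries_per_minute)         # 3375.0, i.e. "more than 3000 per minute"
```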
 

One instance run from CERN


Results from miscellaneous initial test runs

Sunday September 14th

Ran a few different short tests, then one longer test of a few hours. For the longer runs we picked N = 15, with dt = 12000 seconds = 200 minutes = 3h 20min and dt = 18000 seconds = 300 minutes = 5h respectively.

We then ran this test simultaneously from CERN (12000 seconds) and UCSD (18000 seconds). The CERN test ended at 2:35 Monday September 15th CERN time, while the UCSD one ended at 19:39 Pacific on the 14th, i.e. 2h and 4min later.

  • Response time for the bdii queries from CERN:
    bdii-from-cern.gif

  • Response time for the bdii queries from UCSD:
    bdii-from-ucsd.gif

  • bdii host system monitoring: network traffic:
    riley-if_eth0-day.png

  • bdii host system monitoring: netstat:
    riley-netstat-day.png

  • bdii host system monitoring: processes:
    riley-processes-day.png

  • bdii host system monitoring: loadavg:
    riley-load-day.png

Understanding the client profile better (Monday September 15th)

To understand the client profile better, we did a series of tests where we varied N, first on just one machine, and then with the same N but running the test program 4 times in parallel on 4 different (but identical hardware) hosts.

We find that the time per query depends significantly on the number of parallel python threads, but not significantly on whether we run one instance or 4 simultaneously.

  N   time for 4 in parallel   time for one by itself
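A sweep like the one described here can be sketched as follows. This is an illustrative helper, not the code actually used; the command string and the list of thread counts are placeholders.

```python
import subprocess
import threading
import time

def avg_query_time(cmd, n_threads):
    """Run n_threads copies of cmd in parallel; return the mean wall-clock time."""
    durations = [0.0] * n_threads
    def worker(i):
        start = time.time()
        subprocess.run(cmd, shell=True, capture_output=True)
        durations[i] = time.time() - start
    threads = [threading.Thread(target=worker, args=(i,))
               for i in range(n_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return sum(durations) / n_threads

def sweep(cmd, n_values):
    """Tabulate average per-query time against the number of parallel threads N."""
    return {n: avg_query_time(cmd, n) for n in n_values}
```

Running the same sweep once on a single host and once with several instances in parallel yields the two columns of the table above.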
 