Difference: OSGCESubmitterTestingAndMonitoring (1 vs. 18)

Revision 18 - 2009/09/10 - Main.IgorSfiligoi

Line: 1 to 1
 
META TOPICPARENT name="WebHome"

OSG CE And Submitter Testing And Monitoring

Line: 16 to 16
 

Test hardware

UCSD Test Cluster

  • CE hardware
    • 2 x AMD Opteron 275 (4 cores total)
    • 8GB of memory
    • 2TB disk space, mounted as RAID
  • CE software
    • CentOS 5.2 (x86_64)
    • Condor 7.2
  • worker nodes
    • uses test slots on the production worker nodes
      (shadow pool)
    • there are about 4 test slots per production slot (about 3k total)
    • only very low resource jobs are allowed on these test nodes (policy)

FNAL Test Cluster (operated by FNAL)

  • CE hardware
    • 2 x Intel Xeon E5335 (8 cores total)
    • 8GB of memory
    • 500GB disk space
  • CE software
    • Scientific Linux 4.4 (x86_64)
    • Condor 6.9
  • worker nodes
    • each production worker node runs a second copy of the Condor daemons as a non-privileged user
      (shadow pool)
    • there is 1 test slot per production slot (about 5k total, but this can increase to 4 to 1 when needed)
    • only very low resource jobs are allowed on these test nodes (policy)
Changed:
<
<

UCSD test client

  • 2x Intel Xeon 3.0GHz (4 cores total)
  • 4GB of memory
  • CentOS 4.7 (i386)

FNAL test client (owned by FNAL)

  • 2x Intel Xeon 3.2GHz (4 cores total)
  • 4GB of memory
  • Scientific Linux 4.2 (i386)

Italy test client (owned by INFN Pisa)

  • 1x AMD Opteron 148 (1 core total)
  • 1GB of memory
  • Scientific Linux 5.2 (x86_64)
>
>

UCSD test client

  • 1x Intel Xeon 3.0GHz (2 hyperthreaded cores, 4 threads total)
  • 4GB of memory
  • CentOS 4.7 (i386)

FNAL test client (owned by FNAL)

  • 2x Intel Xeon 3.2GHz (4 cores total)
  • 4GB of memory
  • Scientific Linux 4.2 (i386)

Italy test client (owned by INFN Pisa)

  • 1x AMD Opteron 148 (1 core total)
  • 1GB of memory
  • Scientific Linux 5.2 (x86_64)
 

Results

Revision 17 - 2009/09/08 - Main.IgorSfiligoi

Line: 1 to 1
 
META TOPICPARENT name="WebHome"

OSG CE And Submitter Testing And Monitoring

Line: 13 to 13
 
    • The code was refactored by Igor Sfiligoi to make it more user friendly, flexible and able to stress test non-Grid resources.
      The package can be downloaded here: loadtest_condor.v1_0_0.tgz
      The programs are in the bin directory; they rely only on command line options, which are documented in each program's help screen.
  1. a load monitoring package
    • The first version was developed by Toni Coarasa.
      Follow the links for a description.
Added:
>
>

Test hardware

UCSD Test Cluster

  • CE hardware
    • 2 x AMD Opteron 275 (4 cores total)
    • 8GB of memory
    • 2TB disk space, mounted as RAID
  • CE software
    • CentOS 5.2 (x86_64)
    • Condor 7.2
  • worker nodes
    • uses test slots on the production worker nodes
      (shadow pool)
    • there are about 4 test slots per production slot (about 3k total)
    • only very low resource jobs are allowed on these test nodes (policy)

FNAL Test Cluster (operated by FNAL)

  • CE hardware
    • 2 x Intel Xeon E5335 (8 cores total)
    • 8GB of memory
    • 500GB disk space
  • CE software
    • Scientific Linux 4.4 (x86_64)
    • Condor 6.9
  • worker nodes
    • each production worker node runs a second copy of the Condor daemons as a non-privileged user
      (shadow pool)
    • there is 1 test slot per production slot (about 5k total, but this can increase to 4 to 1 when needed)
    • only very low resource jobs are allowed on these test nodes (policy)

UCSD test client

  • 2x Intel Xeon 3.0GHz (4 cores total)
  • 4GB of memory
  • CentOS 4.7 (i386)

FNAL test client (owned by FNAL)

  • 2x Intel Xeon 3.2GHz (4 cores total)
  • 4GB of memory
  • Scientific Linux 4.2 (i386)

Italy test client (owned by INFN Pisa)

  • 1x AMD Opteron 148 (1 core total)
  • 1GB of memory
  • Scientific Linux 5.2 (x86_64)
 

Results

The first round of tests was performed in Fall 2008 (by Toni Coarasa):

Revision 16 - 2009/09/03 - Main.IgorSfiligoi

Line: 1 to 1
 
META TOPICPARENT name="WebHome"

OSG CE And Submitter Testing And Monitoring

Line: 28 to 28
 
  • GT2 at UCSD:
    Submitted 5k jobs per user.
    With 4 parallel users the job rate limit is 26 jobs/min (compare this to 7.5 jobs/min using a single user).
    Can sustain 47 jobs/min if monitoring not important (compare this to 33 jobs/min using a single user).
    More details at: rates_gt2_single_vs_multi_30min.ods or rates_gt2_single_vs_multi_30min.pdf
  • GT2 between Italy and UCSD:
    Submitted 5k jobs per user. Due to limited client resources (single CPU, 1GB of memory) -maxidle 1k was used when using multiple DNs.
    With 4 parallel users the job rate limit is 21 jobs/min (compare this to 5.2 jobs/min using a single user).
    Can sustain 43 jobs/min if monitoring not important (compare this to 14 jobs/min using a single user).
    More details at: rates_gt2_r2_single_vs_multi_30min.ods or rates_gt2_r2_single_vs_multi_30min.pdf
  • Network latency is much more noticeable when a single DN is used. Using multiple users, the networking latencies don't seem to be a major issue.
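The single-user versus 4-user comparison above can be reduced to speedup ratios. The following is a minimal Python sketch that uses only the rates quoted in the bullets; the names RATES, speedup and speedup_table are illustrative and are not part of the loadtest_condor package.

```python
# Rates quoted in the bullets above, in jobs/min,
# stored as (single user, 4 parallel users).
RATES = {
    ("UCSD", "rate limit"): (7.5, 26.0),
    ("UCSD", "sustained, monitoring not important"): (33.0, 47.0),
    ("Italy to UCSD", "rate limit"): (5.2, 21.0),
    ("Italy to UCSD", "sustained, monitoring not important"): (14.0, 43.0),
}


def speedup(single, multi):
    """Ratio of the 4-user rate to the single-user rate."""
    return multi / single


def speedup_table(rates=RATES):
    """Map each (site, kind) key to its speedup, rounded to 2 decimals."""
    return {key: round(speedup(*pair), 2) for key, pair in rates.items()}


if __name__ == "__main__":
    for (site, kind), ratio in speedup_table().items():
        print(f"{site:15s} {kind:38s} x{ratio}")
```

With these inputs the gain from using 4 users is larger for the Italy to UCSD tests than for the local UCSD tests in both categories, which is consistent with the observation that latency matters most when a single DN is used.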
Added:
>
>
  -- IgorSfiligoi - 2009/08/28

Deprecated information... left only for historical reference

Line: 67 to 68
 
META FILEATTACHMENT attachment="rates_gt2_single_vs_multi_30min.pdf" attr="" comment="Job startup rates for GT2 - Single vs multi user" date="1251415697" name="rates_gt2_single_vs_multi_30min.pdf" path="rates_gt2_single_vs_multi_30min.pdf" size="300972" stream="rates_gt2_single_vs_multi_30min.pdf" tmpFilename="/tmp/kcWPidk1QU" user="IgorSfiligoi" version="1"
META FILEATTACHMENT attachment="rates_gt2_r2_single_vs_multi_30min.ods" attr="" comment="Job startup rates for GT2 - Single vs multi user - from Italy" date="1251999655" name="rates_gt2_r2_single_vs_multi_30min.ods" path="rates_gt2_r2_single_vs_multi_30min.ods" size="102140" stream="rates_gt2_r2_single_vs_multi_30min.ods" tmpFilename="/tmp/TdZdDFqgKv" user="IgorSfiligoi" version="1"
META FILEATTACHMENT attachment="rates_gt2_r2_single_vs_multi_30min.pdf" attr="" comment="Job startup rates for GT2 - Single vs multi user - from Italy" date="1251999678" name="rates_gt2_r2_single_vs_multi_30min.pdf" path="rates_gt2_r2_single_vs_multi_30min.pdf" size="301979" stream="rates_gt2_r2_single_vs_multi_30min.pdf" tmpFilename="/tmp/Xf1KeMcpPI" user="IgorSfiligoi" version="1"
Added:
>
>
META FILEATTACHMENT attachment="rates_gt2_r2_resource_constraint.ods" attr="" comment="Job startup rates for GT2 on a resource constrained client - from Italy" date="1252013929" name="rates_gt2_r2_resource_constraint.ods" path="rates_gt2_r2_resource_constraint.ods" size="105738" stream="rates_gt2_r2_resource_constraint.ods" tmpFilename="/tmp/q6Jhf7tEFp" user="IgorSfiligoi" version="1"
META FILEATTACHMENT attachment="rates_gt2_r2_resource_constraint.pdf" attr="" comment="Job startup rates for GT2 on a resource constrained client - from Italy" date="1252013943" name="rates_gt2_r2_resource_constraint.pdf" path="rates_gt2_r2_resource_constraint.pdf" size="306486" stream="rates_gt2_r2_resource_constraint.pdf" tmpFilename="/tmp/qANcRMGeBX" user="IgorSfiligoi" version="1"

Revision 15 - 2009/09/03 - Main.IgorSfiligoi

Line: 1 to 1
 
META TOPICPARENT name="WebHome"

OSG CE And Submitter Testing And Monitoring

Line: 26 to 26
  A third round of tests, using multiple users, was performed in Summer 2009 (by Igor Sfiligoi):
Changed:
<
<
  • GT2 between Italy and UCSD:
    In progress
>
>
  • GT2 between Italy and UCSD:
    Submitted 5k jobs per user. Due to limited client resources (single CPU, 1GB of memory) -maxidle 1k was used when using multiple DNs.
    With 4 parallel users the job rate limit is 21 jobs/min (compare this to 5.2 jobs/min using a single user).
    Can sustain 43 jobs/min if monitoring not important (compare this to 14 jobs/min using a single user).
    More details at: rates_gt2_r2_single_vs_multi_30min.ods or rates_gt2_r2_single_vs_multi_30min.pdf
  • Network latency is much more noticeable when a single DN is used. Using multiple users, the networking latencies don't seem to be a major issue.
  -- IgorSfiligoi - 2009/08/28

Deprecated information... left only for historical reference

Line: 64 to 65
 
META FILEATTACHMENT attachment="gt2_scaling_limit_rtt.png" attr="" comment="Image: GT2 scaling vs RTT - Limiting factor" date="1251150640" name="gt2_scaling_limit_rtt.png" path="gt2_scaling_limit_rtt.png" size="5654" stream="gt2_scaling_limit_rtt.png" tmpFilename="/tmp/275mbihVES" user="IgorSfiligoi" version="1"
META FILEATTACHMENT attachment="rates_gt2_single_vs_multi_30min.ods" attr="" comment="Job startup rates for GT2 - Single vs multi user" date="1251415685" name="rates_gt2_single_vs_multi_30min.ods" path="rates_gt2_single_vs_multi_30min.ods" size="98771" stream="rates_gt2_single_vs_multi_30min.ods" tmpFilename="/tmp/H8LnuWL6V6" user="IgorSfiligoi" version="1"
META FILEATTACHMENT attachment="rates_gt2_single_vs_multi_30min.pdf" attr="" comment="Job startup rates for GT2 - Single vs multi user" date="1251415697" name="rates_gt2_single_vs_multi_30min.pdf" path="rates_gt2_single_vs_multi_30min.pdf" size="300972" stream="rates_gt2_single_vs_multi_30min.pdf" tmpFilename="/tmp/kcWPidk1QU" user="IgorSfiligoi" version="1"
Added:
>
>
META FILEATTACHMENT attachment="rates_gt2_r2_single_vs_multi_30min.ods" attr="" comment="Job startup rates for GT2 - Single vs multi user - from Italy" date="1251999655" name="rates_gt2_r2_single_vs_multi_30min.ods" path="rates_gt2_r2_single_vs_multi_30min.ods" size="102140" stream="rates_gt2_r2_single_vs_multi_30min.ods" tmpFilename="/tmp/TdZdDFqgKv" user="IgorSfiligoi" version="1"
META FILEATTACHMENT attachment="rates_gt2_r2_single_vs_multi_30min.pdf" attr="" comment="Job startup rates for GT2 - Single vs multi user - from Italy" date="1251999678" name="rates_gt2_r2_single_vs_multi_30min.pdf" path="rates_gt2_r2_single_vs_multi_30min.pdf" size="301979" stream="rates_gt2_r2_single_vs_multi_30min.pdf" tmpFilename="/tmp/Xf1KeMcpPI" user="IgorSfiligoi" version="1"

Revision 14 - 2009/08/28 - Main.IgorSfiligoi

Line: 1 to 1
 
META TOPICPARENT name="WebHome"

OSG CE And Submitter Testing And Monitoring

Line: 26 to 26
  A third round of tests, using multiple users, was performed in Summer 2009 (by Igor Sfiligoi):
Changed:
<
<
  • GT2 between Italy and UCSD:
    Submitted 5k jobs per user.
    With 4 parallel users the CE died due to overload (the last registered load was over 600).
    GT2 seems effectively unusable over long distances when multiple users submit many jobs.
>
>
  • GT2 between Italy and UCSD:
    In progress
  -- IgorSfiligoi - 2009/08/28

Deprecated information... left only for historical reference

Revision 13 - 2009/08/28 - Main.IgorSfiligoi

Line: 1 to 1
 
META TOPICPARENT name="WebHome"

OSG CE And Submitter Testing And Monitoring

Line: 26 to 26
  A third round of tests, using multiple users, was performed in Summer 2009 (by Igor Sfiligoi):
Added:
>
>
  • GT2 between Italy and UCSD:
    Submitted 5k jobs per user.
    With 4 parallel users the CE died due to overload (the last registered load was over 600).
    GT2 seems effectively unusable over long distances when multiple users submit many jobs.
 
Changed:
<
<
-- IgorSfiligoi - 2009/08/27
>
>
-- IgorSfiligoi - 2009/08/28
 

Deprecated information... left only for historical reference

Monitoring

Revision 12 - 2009/08/27 - Main.IgorSfiligoi

Line: 1 to 1
 
META TOPICPARENT name="WebHome"

OSG CE And Submitter Testing And Monitoring

Line: 10 to 10
 Two software packages have been developed:
  1. a load generating engine
Changed:
<
<
    • The code was refactored by Igor Sfiligoi to make it more user friendly, flexible and able to stress test non-Grid resources.
      Instructions to be provided shortly.
>
>
    • The code was refactored by Igor Sfiligoi to make it more user friendly, flexible and able to stress test non-Grid resources.
      The package can be downloaded here: loadtest_condor.v1_0_0.tgz
      The programs are in the bin directory; they rely only on command line options, which are documented in each program's help screen.
 
  1. a load monitoring package
    • The first version was developed by Toni Coarasa.
      Follow the links for a description.

Results

Line: 19 to 19
 
  • GT2 at UCSD:
    Submitted more than 30k jobs.
    The observed limit was 0.44Hz (i.e. 26.4 jobs/min).
    See the following presentation.
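The unit conversion quoted above (0.44Hz to jobs/min) can be checked with a few lines of Python. This is a minimal sketch; the function names are illustrative, and the duration estimate assumes the observed limit held constant for the whole run, which the test notes do not state. Since more than 30k jobs were submitted, the figure for 30k jobs is only a lower bound.

```python
def hz_to_jobs_per_min(rate_hz):
    """Convert a job start rate from Hz (jobs/s) to jobs/min."""
    return rate_hz * 60.0


def minutes_to_start(n_jobs, rate_hz):
    """Minutes needed to start n_jobs at a constant rate given in Hz."""
    return n_jobs / hz_to_jobs_per_min(rate_hz)


if __name__ == "__main__":
    observed_limit_hz = 0.44  # limit reported for GT2 at UCSD
    print(f"{hz_to_jobs_per_min(observed_limit_hz):.1f} jobs/min")
    hours = minutes_to_start(30_000, observed_limit_hz) / 60.0
    print(f"at least {hours:.1f} hours to start 30k jobs")
```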

The second round of tests was performed in Summer 2009 (by Igor Sfiligoi):

Changed:
<
<
  • GT2 at UCSD:
    Submitted 5k jobs.
    Job rate limited to 7.5 jobs/min under heavy load.
    Can sustain 33 jobs/min if monitoring not important.
    More details at: rates_gt2_ucsd.pdf
  • GT2 between FNAL and UCSD :
    Submitted 5k jobs - from UCSD to FNAL, and from FNAL to UCSD.
    Job rate limited to 6.9 jobs/min under heavy load.
    Can sustain 25-28 jobs/min if monitoring not important (CPU speed seems a factor here).
    Network latency seems to be a factor.
    More details for UCSD to FNAL at: rates_gt2_fnal_r.pdf
    and for FNAL to UCSD at: rates_gt2_ucsd_r.pdf
  • GT2 between Italy and UCSD :
    Submitted 5k jobs - from Italy to UCSD.
    Job rate limited to 3.5 jobs/min under heavy load.
    Can sustain 7 jobs/min if monitoring not important.
    Network latency is definitely a factor.
    More details: rates_gt2_ucsd_r2.pdf
  • Network latency is definitely a factor with GT2; below you can see it graphically:
    Image of GT2 scalability vs RTT
    The following shows only the limiting rate:
    GT2 scaling limit vs RTT
>
>
  • GT2 at UCSD:
    Submitted 5k jobs.
    Job rate limited to 7.5 jobs/min under heavy load.
    Can sustain 33 jobs/min if monitoring not important.
    More details at: rates_gt2_ucsd.pdf
  • GT2 between FNAL and UCSD :
    Submitted 5k jobs - from UCSD to FNAL, and from FNAL to UCSD.
    Job rate limited to 6.9 jobs/min under heavy load.
    Can sustain 25-28 jobs/min if monitoring not important (CPU speed seems a factor here).
    Network latency seems to be a factor.
    More details for UCSD to FNAL at: rates_gt2_fnal_r.pdf
    and for FNAL to UCSD at: rates_gt2_ucsd_r.pdf
  • GT2 between Italy and UCSD :
    Submitted 5k jobs - from Italy to UCSD.
    Job rate limited to 3.5 jobs/min under heavy load.
    Can sustain 7 jobs/min if monitoring not important.
    Network latency is definitely a factor.
    More details: rates_gt2_ucsd_r2.pdf
  • Network latency is definitely a factor with GT2; below you can see it graphically:
    Image of GT2 scalability vs RTT
    The following shows only the limiting rate:
    GT2 scaling limit vs RTT
 
Changed:
<
<
-- IgorSfiligoi - 2009/08/14
>
>
A third round of tests, using multiple users, was performed in Summer 2009 (by Igor Sfiligoi):

-- IgorSfiligoi - 2009/08/27

 

Deprecated information... left only for historical reference

Monitoring

Line: 58 to 61
 
META FILEATTACHMENT attachment="rates_gt2_ucsd_rtt.ods" attr="" comment="Spreadsheet with rates vs RTT - UCSD" date="1251150589" name="rates_gt2_ucsd_rtt.ods" path="rates_gt2_ucsd_rtt.ods" size="24044" stream="rates_gt2_ucsd_rtt.ods" tmpFilename="/tmp/OE7CebPdVc" user="IgorSfiligoi" version="1"
META FILEATTACHMENT attachment="gt2_scaling_rtt.png" attr="" comment="Image: GT2 scaling vs RTT" date="1251150622" name="gt2_scaling_rtt.png" path="gt2_scaling_rtt.png" size="10731" stream="gt2_scaling_rtt.png" tmpFilename="/tmp/1we92wyAmd" user="IgorSfiligoi" version="1"
META FILEATTACHMENT attachment="gt2_scaling_limit_rtt.png" attr="" comment="Image: GT2 scaling vs RTT - Limiting factor" date="1251150640" name="gt2_scaling_limit_rtt.png" path="gt2_scaling_limit_rtt.png" size="5654" stream="gt2_scaling_limit_rtt.png" tmpFilename="/tmp/275mbihVES" user="IgorSfiligoi" version="1"
Added:
>
>
META FILEATTACHMENT attachment="rates_gt2_single_vs_multi_30min.ods" attr="" comment="Job startup rates for GT2 - Single vs multi user" date="1251415685" name="rates_gt2_single_vs_multi_30min.ods" path="rates_gt2_single_vs_multi_30min.ods" size="98771" stream="rates_gt2_single_vs_multi_30min.ods" tmpFilename="/tmp/H8LnuWL6V6" user="IgorSfiligoi" version="1"
META FILEATTACHMENT attachment="rates_gt2_single_vs_multi_30min.pdf" attr="" comment="Job startup rates for GT2 - Single vs multi user" date="1251415697" name="rates_gt2_single_vs_multi_30min.pdf" path="rates_gt2_single_vs_multi_30min.pdf" size="300972" stream="rates_gt2_single_vs_multi_30min.pdf" tmpFilename="/tmp/kcWPidk1QU" user="IgorSfiligoi" version="1"

Revision 11 - 2009/08/24 - Main.IgorSfiligoi

Line: 1 to 1
 
META TOPICPARENT name="WebHome"

OSG CE And Submitter Testing And Monitoring

Line: 21 to 21
 The second round of tests was performed in Summer 2009 (by Igor Sfiligoi):
  • GT2 at UCSD:
    Submitted 5k jobs.
    Job rate limited to 7.5 jobs/min under heavy load.
    Can sustain 33 jobs/min if monitoring not important.
    More details at: rates_gt2_ucsd.pdf
  • GT2 between FNAL and UCSD :
    Submitted 5k jobs - from UCSD to FNAL, and from FNAL to UCSD.
    Job rate limited to 6.9 jobs/min under heavy load.
    Can sustain 25-28 jobs/min if monitoring not important (CPU speed seems a factor here).
    Network latency seems to be a factor.
    More details for UCSD to FNAL at: rates_gt2_fnal_r.pdf
    and for FNAL to UCSD at: rates_gt2_ucsd_r.pdf
Changed:
<
<
  • More will be posted soon.
>
>
  • GT2 between Italy and UCSD :
    Submitted 5k jobs - from Italy to UCSD.
    Job rate limited to 3.5 jobs/min under heavy load.
    Can sustain 7 jobs/min if monitoring not important.
    Network latency is definitely a factor.
    More details: rates_gt2_ucsd_r2.pdf
  • Network latency is definitely a factor with GT2; below you can see it graphically:
    Image of GT2 scalability vs RTT
    The following shows only the limiting rate:
    GT2 scaling limit vs RTT
  -- IgorSfiligoi - 2009/08/14

Deprecated information... left only for historical reference

Line: 52 to 53
 
META FILEATTACHMENT attachment="rates_gt2_fnal_r.odt" attr="" comment="Job startup rates for GT2 for FNAL - from UCSD" date="1250788615" name="rates_gt2_fnal_r.odt" path="rates_gt2_fnal_r.odt" size="29301" stream="rates_gt2_fnal_r.odt" tmpFilename="/tmp/aqG9xFdED2" user="IgorSfiligoi" version="2"
META FILEATTACHMENT attachment="rates_gt2_ucsd_r.pdf" attr="" comment="Job startup rates for GT2 for UCSD - from FNAL" date="1250633705" name="rates_gt2_ucsd_r.pdf" path="rates_gt2_ucsd_r.pdf" size="528002" stream="rates_gt2_ucsd_r.pdf" tmpFilename="/tmp/MYWPPwvlyZ" user="IgorSfiligoi" version="1"
META FILEATTACHMENT attachment="rates_gt2_ucsd_r.odt" attr="" comment="Job startup rates for GT2 for UCSD - from FNAL" date="1250633718" name="rates_gt2_ucsd_r.odt" path="rates_gt2_ucsd_r.odt" size="69954" stream="rates_gt2_ucsd_r.odt" tmpFilename="/tmp/53yA50xgdM" user="IgorSfiligoi" version="1"
Added:
>
>
META FILEATTACHMENT attachment="rates_gt2_ucsd_r2.pdf" attr="" comment="Job startup rates for GT2 for UCSD - from Italy" date="1251149559" name="rates_gt2_ucsd_r2.pdf" path="rates_gt2_ucsd_r2.pdf" size="537613" stream="rates_gt2_ucsd_r2.pdf" tmpFilename="/tmp/LqcwVbgAQl" user="IgorSfiligoi" version="1"
META FILEATTACHMENT attachment="rates_gt2_ucsd_r2.odt" attr="" comment="Job startup rates for GT2 for UCSD - from Italy" date="1251149574" name="rates_gt2_ucsd_r2.odt" path="rates_gt2_ucsd_r2.odt" size="80613" stream="rates_gt2_ucsd_r2.odt" tmpFilename="/tmp/lSows0owtJ" user="IgorSfiligoi" version="1"
META FILEATTACHMENT attachment="rates_gt2_ucsd_rtt.ods" attr="" comment="Spreadsheet with rates vs RTT - UCSD" date="1251150589" name="rates_gt2_ucsd_rtt.ods" path="rates_gt2_ucsd_rtt.ods" size="24044" stream="rates_gt2_ucsd_rtt.ods" tmpFilename="/tmp/OE7CebPdVc" user="IgorSfiligoi" version="1"
META FILEATTACHMENT attachment="gt2_scaling_rtt.png" attr="" comment="Image: GT2 scaling vs RTT" date="1251150622" name="gt2_scaling_rtt.png" path="gt2_scaling_rtt.png" size="10731" stream="gt2_scaling_rtt.png" tmpFilename="/tmp/1we92wyAmd" user="IgorSfiligoi" version="1"
META FILEATTACHMENT attachment="gt2_scaling_limit_rtt.png" attr="" comment="Image: GT2 scaling vs RTT - Limiting factor" date="1251150640" name="gt2_scaling_limit_rtt.png" path="gt2_scaling_limit_rtt.png" size="5654" stream="gt2_scaling_limit_rtt.png" tmpFilename="/tmp/275mbihVES" user="IgorSfiligoi" version="1"

Revision 10 - 2009/08/20 - Main.IgorSfiligoi

Line: 1 to 1
 
META TOPICPARENT name="WebHome"

OSG CE And Submitter Testing And Monitoring

Line: 48 to 48
 
META FILEATTACHMENT attachment="loadtest_condor.v1_0_0.tgz" attr="" comment="" date="1250546864" name="loadtest_condor.v1_0_0.tgz" path="loadtest_condor.v1_0_0.tgz" size="6955" stream="loadtest_condor.v1_0_0.tgz" tmpFilename="/tmp/lYC7w3LRA8" user="IgorSfiligoi" version="1"
META FILEATTACHMENT attachment="rates_gt2_ucsd_08.odp" attr="" comment="Job startup rates for GT2 - Toni's tests" date="1250549164" name="rates_gt2_ucsd_08.odp" path="rates_gt2_ucsd_08.odp" size="282212" stream="rates_gt2_ucsd_08.odp" tmpFilename="/tmp/zanxjlsluj" user="IgorSfiligoi" version="1"
META FILEATTACHMENT attachment="rates_gt2_ucsd_08.pdf" attr="" comment="Job startup rates for GT2 - Toni's tests" date="1250549176" name="rates_gt2_ucsd_08.pdf" path="rates_gt2_ucsd_08.pdf" size="325145" stream="rates_gt2_ucsd_08.pdf" tmpFilename="/tmp/i73Ot2Q7rd" user="IgorSfiligoi" version="1"
Changed:
<
<
META FILEATTACHMENT attachment="rates_gt2_fnal_r.pdf" attr="" comment="Job startup rates for GT2 for FNAL - from UCSD" date="1250627776" name="rates_gt2_fnal_r.pdf" path="rates_gt2_fnal_r.pdf" size="400635" stream="rates_gt2_fnal_r.pdf" tmpFilename="/tmp/XlOuzIHUPd" user="IgorSfiligoi" version="1"
META FILEATTACHMENT attachment="rates_gt2_fnal_r.odt" attr="" comment="Job startup rates for GT2 for FNAL - from UCSD" date="1250627799" name="rates_gt2_fnal_r.odt" path="rates_gt2_fnal_r.odt" size="29304" stream="rates_gt2_fnal_r.odt" tmpFilename="/tmp/itNyWMmrZQ" user="IgorSfiligoi" version="1"
>
>
META FILEATTACHMENT attachment="rates_gt2_fnal_r.pdf" attr="" comment="Job startup rates for GT2 for FNAL - from UCSD" date="1250788637" name="rates_gt2_fnal_r.pdf" path="rates_gt2_fnal_r.pdf" size="400630" stream="rates_gt2_fnal_r.pdf" tmpFilename="/tmp/f24xangFFS" user="IgorSfiligoi" version="2"
META FILEATTACHMENT attachment="rates_gt2_fnal_r.odt" attr="" comment="Job startup rates for GT2 for FNAL - from UCSD" date="1250788615" name="rates_gt2_fnal_r.odt" path="rates_gt2_fnal_r.odt" size="29301" stream="rates_gt2_fnal_r.odt" tmpFilename="/tmp/aqG9xFdED2" user="IgorSfiligoi" version="2"
 
META FILEATTACHMENT attachment="rates_gt2_ucsd_r.pdf" attr="" comment="Job startup rates for GT2 for UCSD - from FNAL" date="1250633705" name="rates_gt2_ucsd_r.pdf" path="rates_gt2_ucsd_r.pdf" size="528002" stream="rates_gt2_ucsd_r.pdf" tmpFilename="/tmp/MYWPPwvlyZ" user="IgorSfiligoi" version="1"
META FILEATTACHMENT attachment="rates_gt2_ucsd_r.odt" attr="" comment="Job startup rates for GT2 for UCSD - from FNAL" date="1250633718" name="rates_gt2_ucsd_r.odt" path="rates_gt2_ucsd_r.odt" size="69954" stream="rates_gt2_ucsd_r.odt" tmpFilename="/tmp/53yA50xgdM" user="IgorSfiligoi" version="1"

Revision 9 - 2009/08/18 - Main.IgorSfiligoi

Line: 1 to 1
 
META TOPICPARENT name="WebHome"

OSG CE And Submitter Testing And Monitoring

Line: 15 to 15
 
    • The first version was developed by Toni Coarasa.
      Follow the links for a description.

Results

Changed:
<
<
The first round of tests was performed in Fall 2008:
>
>
The first round of tests was performed in Fall 2008 (by Toni Coarasa):
 
  • GT2 at UCSD:
    Submitted more than 30k jobs.
    The observed limit was 0.44Hz (i.e. 26.4 jobs/min).
    See the following presentation.
Changed:
<
<
The second round of tests was performed in Summer 2009:
>
>
The second round of tests was performed in Summer 2009 (by Igor Sfiligoi):
 
  • GT2 at UCSD:
    Submitted 5k jobs.
    Job rate limited to 7.5 jobs/min under heavy load.
    Can sustain 33 jobs/min if monitoring not important.
    More details at: rates_gt2_ucsd.pdf
Changed:
<
<
  • GT2 at FNAL - from UCSD :
    Submitted 5k jobs.
    Job rate limited to 6.9 jobs/min under heavy load.
    Can sustain 28 jobs/min if monitoring not important.
    More details at: rates_gt2_fnal_r.pdf
>
>
  • GT2 between FNAL and UCSD :
    Submitted 5k jobs - from UCSD to FNAL, and from FNAL to UCSD.
    Job rate limited to 6.9 jobs/min under heavy load.
    Can sustain 25-28 jobs/min if monitoring not important (CPU speed seems a factor here).
    Network latency seems to be a factor.
    More details for UCSD to FNAL at: rates_gt2_fnal_r.pdf
    and for FNAL to UCSD at: rates_gt2_ucsd_r.pdf
 
  • More will be posted soon.

-- IgorSfiligoi - 2009/08/14

Line: 50 to 50
 
META FILEATTACHMENT attachment="rates_gt2_ucsd_08.pdf" attr="" comment="Job startup rates for GT2 - Toni's tests" date="1250549176" name="rates_gt2_ucsd_08.pdf" path="rates_gt2_ucsd_08.pdf" size="325145" stream="rates_gt2_ucsd_08.pdf" tmpFilename="/tmp/i73Ot2Q7rd" user="IgorSfiligoi" version="1"
META FILEATTACHMENT attachment="rates_gt2_fnal_r.pdf" attr="" comment="Job startup rates for GT2 for FNAL - from UCSD" date="1250627776" name="rates_gt2_fnal_r.pdf" path="rates_gt2_fnal_r.pdf" size="400635" stream="rates_gt2_fnal_r.pdf" tmpFilename="/tmp/XlOuzIHUPd" user="IgorSfiligoi" version="1"
META FILEATTACHMENT attachment="rates_gt2_fnal_r.odt" attr="" comment="Job startup rates for GT2 for FNAL - from UCSD" date="1250627799" name="rates_gt2_fnal_r.odt" path="rates_gt2_fnal_r.odt" size="29304" stream="rates_gt2_fnal_r.odt" tmpFilename="/tmp/itNyWMmrZQ" user="IgorSfiligoi" version="1"
Added:
>
>
META FILEATTACHMENT attachment="rates_gt2_ucsd_r.pdf" attr="" comment="Job startup rates for GT2 for UCSD - from FNAL" date="1250633705" name="rates_gt2_ucsd_r.pdf" path="rates_gt2_ucsd_r.pdf" size="528002" stream="rates_gt2_ucsd_r.pdf" tmpFilename="/tmp/MYWPPwvlyZ" user="IgorSfiligoi" version="1"
META FILEATTACHMENT attachment="rates_gt2_ucsd_r.odt" attr="" comment="Job startup rates for GT2 for UCSD - from FNAL" date="1250633718" name="rates_gt2_ucsd_r.odt" path="rates_gt2_ucsd_r.odt" size="69954" stream="rates_gt2_ucsd_r.odt" tmpFilename="/tmp/53yA50xgdM" user="IgorSfiligoi" version="1"

Revision 8 - 2009/08/18 - Main.IgorSfiligoi

Line: 1 to 1
 
META TOPICPARENT name="WebHome"

OSG CE And Submitter Testing And Monitoring

Line: 16 to 16
 

Results

The first round of tests was performed in Fall 2008:

Changed:
<
<
  • GT2 on osg-gw-5:
    Submitted more than 30k jobs.
    The observed limit was 0.44Hz (i.e. 26.4 jobs/min).
    See the following presentation.
>
>
  • GT2 at UCSD:
    Submitted more than 30k jobs.
    The observed limit was 0.44Hz (i.e. 26.4 jobs/min).
    See the following presentation.
  The second round of tests was performed in Summer 2009:
Changed:
<
<
  • GT2 on osg-gw-5:
    Submitted 5k jobs.
    Job rate limited to 7.5 jobs/min under heavy load.
    Can sustain 33 jobs/min if monitoring not important.
    More details at: rates_gt2_ucsd.pdf
>
>
  • GT2 at UCSD:
    Submitted 5k jobs.
    Job rate limited to 7.5 jobs/min under heavy load.
    Can sustain 33 jobs/min if monitoring not important.
    More details at: rates_gt2_ucsd.pdf
  • GT2 at FNAL - from UCSD :
    Submitted 5k jobs.
    Job rate limited to 6.9 jobs/min under heavy load.
    Can sustain 28 jobs/min if monitoring not important.
    More details at: rates_gt2_fnal_r.pdf
 
  • More will be posted soon.

-- IgorSfiligoi - 2009/08/14

Line: 42 to 43
  -- ToniCoarasa - 19 Sep 2008
Changed:
<
<
META FILEATTACHMENT attachment="rates_gt2_ucsd.odt" attr="" comment="Job startup rates for GT2 at UCSD" date="1250540184" name="rates_gt2_ucsd.odt" path="rates_gt2_ucsd.odt" size="84481" stream="rates_gt2_ucsd.odt" tmpFilename="/tmp/ZLZE7gjZJF" user="IgorSfiligoi" version="1"
META FILEATTACHMENT attachment="rates_gt2_ucsd.pdf" attr="" comment="Job startup rates for GT2 at UCSD" date="1250540336" name="rates_gt2_ucsd.pdf" path="rates_gt2_ucsd.pdf" size="543873" stream="rates_gt2_ucsd.pdf" tmpFilename="/tmp/NwpSsHi02y" user="IgorSfiligoi" version="1"
>
>
META FILEATTACHMENT attachment="rates_gt2_ucsd.odt" attr="" comment="Job startup rates for GT2 at UCSD" date="1250626593" name="rates_gt2_ucsd.odt" path="rates_gt2_ucsd.odt" size="85038" stream="rates_gt2_ucsd.odt" tmpFilename="/tmp/qmPoaJy244" user="IgorSfiligoi" version="3"
META FILEATTACHMENT attachment="rates_gt2_ucsd.pdf" attr="" comment="Job startup rates for GT2 at UCSD" date="1250626581" name="rates_gt2_ucsd.pdf" path="rates_gt2_ucsd.pdf" size="543947" stream="rates_gt2_ucsd.pdf" tmpFilename="/tmp/rk4yT1YKcB" user="IgorSfiligoi" version="3"
 
META FILEATTACHMENT attachment="loadtest_condor.v1_0_0.tgz" attr="" comment="" date="1250546864" name="loadtest_condor.v1_0_0.tgz" path="loadtest_condor.v1_0_0.tgz" size="6955" stream="loadtest_condor.v1_0_0.tgz" tmpFilename="/tmp/lYC7w3LRA8" user="IgorSfiligoi" version="1"
META FILEATTACHMENT attachment="rates_gt2_ucsd_08.odp" attr="" comment="Job startup rates for GT2 - Toni's tests" date="1250549164" name="rates_gt2_ucsd_08.odp" path="rates_gt2_ucsd_08.odp" size="282212" stream="rates_gt2_ucsd_08.odp" tmpFilename="/tmp/zanxjlsluj" user="IgorSfiligoi" version="1"
META FILEATTACHMENT attachment="rates_gt2_ucsd_08.pdf" attr="" comment="Job startup rates for GT2 - Toni's tests" date="1250549176" name="rates_gt2_ucsd_08.pdf" path="rates_gt2_ucsd_08.pdf" size="325145" stream="rates_gt2_ucsd_08.pdf" tmpFilename="/tmp/i73Ot2Q7rd" user="IgorSfiligoi" version="1"
Added:
>
>
META FILEATTACHMENT attachment="rates_gt2_fnal_r.pdf" attr="" comment="Job startup rates for GT2 for FNAL - from UCSD" date="1250627776" name="rates_gt2_fnal_r.pdf" path="rates_gt2_fnal_r.pdf" size="400635" stream="rates_gt2_fnal_r.pdf" tmpFilename="/tmp/XlOuzIHUPd" user="IgorSfiligoi" version="1"
META FILEATTACHMENT attachment="rates_gt2_fnal_r.odt" attr="" comment="Job startup rates for GT2 for FNAL - from UCSD" date="1250627799" name="rates_gt2_fnal_r.odt" path="rates_gt2_fnal_r.odt" size="29304" stream="rates_gt2_fnal_r.odt" tmpFilename="/tmp/itNyWMmrZQ" user="IgorSfiligoi" version="1"

Revision 7 - 2009/08/17 - Main.IgorSfiligoi

Line: 1 to 1
 
META TOPICPARENT name="WebHome"

OSG CE And Submitter Testing And Monitoring

Line: 15 to 15
 
    • The first version was developed by Toni Coarasa.
      Follow the links for a description.

Results

Changed:
<
<
The first round of tests was performed in Winter 2009:
  • Will be posted soon...
>
>
The first round of tests was performed in Fall 2008:
  • GT2 on osg-gw-5:
    Submitted more than 30k jobs.
    The observed limit was 0.44Hz (i.e. 26.4 jobs/min).
    See the following presentation.
  The second round of tests was performed in Summer 2009:
Changed:
<
<
  • GT2 on osg-gw-5:
    Job rate limited to 7.5 jobs/min under heavy load.
    Can sustain 33 jobs/min if monitoring not important.
    More details at: rates_gt2_ucsd.pdf
>
>
  • GT2 on osg-gw-5:
    Submitted 5k jobs.
    Job rate limited to 7.5 jobs/min under heavy load.
    Can sustain 33 jobs/min if monitoring not important.
    More details at: rates_gt2_ucsd.pdf
 
  • More will be posted soon.

-- IgorSfiligoi - 2009/08/14

Line: 44 to 44
 
META FILEATTACHMENT attachment="rates_gt2_ucsd.odt" attr="" comment="Job startup rates for GT2 at UCSD" date="1250540184" name="rates_gt2_ucsd.odt" path="rates_gt2_ucsd.odt" size="84481" stream="rates_gt2_ucsd.odt" tmpFilename="/tmp/ZLZE7gjZJF" user="IgorSfiligoi" version="1"
META FILEATTACHMENT attachment="rates_gt2_ucsd.pdf" attr="" comment="Job startup rates for GT2 at UCSD" date="1250540336" name="rates_gt2_ucsd.pdf" path="rates_gt2_ucsd.pdf" size="543873" stream="rates_gt2_ucsd.pdf" tmpFilename="/tmp/NwpSsHi02y" user="IgorSfiligoi" version="1"
Added:
>
>
META FILEATTACHMENT attachment="loadtest_condor.v1_0_0.tgz" attr="" comment="" date="1250546864" name="loadtest_condor.v1_0_0.tgz" path="loadtest_condor.v1_0_0.tgz" size="6955" stream="loadtest_condor.v1_0_0.tgz" tmpFilename="/tmp/lYC7w3LRA8" user="IgorSfiligoi" version="1"
META FILEATTACHMENT attachment="rates_gt2_ucsd_08.odp" attr="" comment="Job startup rates for GT2 - Toni's tests" date="1250549164" name="rates_gt2_ucsd_08.odp" path="rates_gt2_ucsd_08.odp" size="282212" stream="rates_gt2_ucsd_08.odp" tmpFilename="/tmp/zanxjlsluj" user="IgorSfiligoi" version="1"
META FILEATTACHMENT attachment="rates_gt2_ucsd_08.pdf" attr="" comment="Job startup rates for GT2 - Toni's tests" date="1250549176" name="rates_gt2_ucsd_08.pdf" path="rates_gt2_ucsd_08.pdf" size="325145" stream="rates_gt2_ucsd_08.pdf" tmpFilename="/tmp/i73Ot2Q7rd" user="IgorSfiligoi" version="1"

Revision 6 2009/08/17 - Main.IgorSfiligoi

Line: 1 to 1
 
META TOPICPARENT name="WebHome"

OSG CE And Submitter Testing And Monitoring

Line: 19 to 19
 
  • Will be posted soon...

Second round of tests were performed in Summer 2009:

Changed:
<
<
  • Will be posted soon.
>
>
  • GT2 on osg-gw-5:
    Job rate limited to 7.5 jobs/min under heavy load.
    Can sustain 33 jobs/min if monitoring not important.
    More details at: rates_gt2_ucsd.pdf
  • More will be posted soon.
  -- IgorSfiligoi - 2009/08/14

Deprecated information... left only for historical reference

Line: 40 to 41
 Goal to be finished: Monday April 6th

-- ToniCoarasa - 19 Sep 2008

Added:
>
>
META FILEATTACHMENT attachment="rates_gt2_ucsd.odt" attr="" comment="Job startup rates for GT2 at UCSD" date="1250540184" name="rates_gt2_ucsd.odt" path="rates_gt2_ucsd.odt" size="84481" stream="rates_gt2_ucsd.odt" tmpFilename="/tmp/ZLZE7gjZJF" user="IgorSfiligoi" version="1"
META FILEATTACHMENT attachment="rates_gt2_ucsd.pdf" attr="" comment="Job startup rates for GT2 at UCSD" date="1250540336" name="rates_gt2_ucsd.pdf" path="rates_gt2_ucsd.pdf" size="543873" stream="rates_gt2_ucsd.pdf" tmpFilename="/tmp/NwpSsHi02y" user="IgorSfiligoi" version="1"

Revision 5 2009/08/15 - Main.IgorSfiligoi

Line: 1 to 1
 
META TOPICPARENT name="WebHome"

OSG CE And Submitter Testing And Monitoring

Changed:
<
<

Monitoring

>
>

Mission statement

This page is meant to gather information about the OSG CE scalability activity.

Tools

Two software packages have been developed:

  1. a load generating engine
    • The first version was developed by Toni Coarasa.
      Follow the links for installation instructions and usage instructions.
    • The code was refactored by Igor Sfiligoi to make it more user friendly, flexible and able to stress test non-Grid resources.
      Instructions to be provided shortly.
  2. a load monitoring package
    • The first version was developed by Toni Coarasa.
      Follow the links for a description.

Results

First round of tests were performed in Winter 2009:

  • Will be posted soon...

Second round of tests were performed in Summer 2009:

  • Will be posted soon.

-- IgorSfiligoi - 2009/08/14

Deprecated information... left only for historical reference

Monitoring

  The monitoring has been done using the package described in "Description of the osgmonitoring.rpm package". and the cacti installed in t2sentry0.t2.ucsd.edu.
Line: 9 to 31
 

Running the tests for dummys

Changed:
<
<

Things left to do

>
>

Things left to do

 
  1. create the client side tarball, and document it on this twiki, and attach the tarball to the twiki page.
  2. communicate with Terrence to make sure that we have the client monitoring online on uaf-2, and uaf-1.

Revision 4 2009/04/08 - Main.ToniCoarasa

Line: 1 to 1
 
META TOPICPARENT name="WebHome"

OSG CE And Submitter Testing And Monitoring

Monitoring

The monitoring has been done using the package described in "Description of the osgmonitoring.rpm package". and the cacti installed in t2sentry0.t2.ucsd.edu.

Changed:
<
<

Intallation Instructions for dummys

>
>

Installation Instructions for dummys

 

Running the tests for dummys

Revision 3 2009/04/07 - Main.ToniCoarasa

Line: 1 to 1
 
META TOPICPARENT name="WebHome"

OSG CE And Submitter Testing And Monitoring

Monitoring

The monitoring has been done using the package described in "Description of the osgmonitoring.rpm package". and the cacti installed in t2sentry0.t2.ucsd.edu.

Added:
>
>

Intallation Instructions for dummys

Running the tests for dummys

Things left to do

  1. create the client side tarball, and document it on this twiki, and attach the tarball to the twiki page.
  2. communicate with Terrence to make sure that we have the client monitoring online on uaf-2, and uaf-1.
  3. document the process of putting monitoring via t2sentry0 and cacti into place

Goal to be finished: Monday April 6th

 -- ToniCoarasa - 19 Sep 2008 \ No newline at end of file

Revision 2 2008/10/02 - Main.ToniCoarasa

Line: 1 to 1
 
META TOPICPARENT name="WebHome"

OSG CE And Submitter Testing And Monitoring

Monitoring

Changed:
<
<
The monitoring has been done using the package described in "Description of the osgmonitoring.rpm package".
>
>
The monitoring has been done using the package described in "Description of the osgmonitoring.rpm package". and the cacti installed in t2sentry0.t2.ucsd.edu.
  -- ToniCoarasa - 19 Sep 2008

Revision 1 2008/09/26 - Main.ToniCoarasa

Line: 1 to 1
Added:
>
>
META TOPICPARENT name="WebHome"

OSG CE And Submitter Testing And Monitoring

Monitoring

The monitoring has been done using the package described in "Description of the osgmonitoring.rpm package".

-- ToniCoarasa - 19 Sep 2008

 