Allocations (as received)
* NERSC : 50M core hours (25M internally earmarked for USCMS) (valid for 2018)
* XSEDE : TACC Stampede2 : 170942 node hours (valid for 2018)
* XSEDE : PSC Bridges : 1196271 SU (valid for 2018)
Allocation Adjustments
* 06/07/2018 : NERSC : added 30M hours (making up for large NOvA use and for new SciDAC use cases)
* 10/08/2018 : XSEDE : transferred 21448 TACC Stampede2 node hours into 500000 PSC Bridges SU
* 10/10/2018 : NERSC : added 5M hours (extra CMS hours)
* 10/11/2018 : NERSC : added 1.2M hours (extra CMS hours)
* 11/12/2018 : XSEDE : asked for extension until 03/31/2019 (would affect both Stampede2 and Bridges)
* 11/16/2018 : NERSC : added 5M hours (extra CMS hours)
* 11/16/2018 : XSEDE : allocation extended until 03/31/2019 (with Stampede2 rollover into 2019 capped at 75k)
* 11/19/2018 : NERSC : added 1.5M hours (extra CMS hours)
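The adjusted totals follow from adding the adjustments above to the original allocations. The snippet below is a minimal sketch of that bookkeeping; the dictionary keys and the function name `adjusted_totals` are illustrative choices, not part of any accounting tool. The 10/08/2018 transfer is entered as two deltas, one negative for Stampede2 and one positive for Bridges.

```python
# Original 2018 allocations as listed above (units differ per facility).
original = {
    "NERSC core hours": 50_000_000,
    "Stampede2 node hours": 170_942,
    "Bridges SU": 1_196_271,
}

# Adjustments from the list above as (date, resource, delta).
# Extension requests change validity only, so they carry no delta.
adjustments = [
    ("06/07/2018", "NERSC core hours", 30_000_000),
    ("10/08/2018", "Stampede2 node hours", -21_448),
    ("10/08/2018", "Bridges SU", 500_000),
    ("10/10/2018", "NERSC core hours", 5_000_000),
    ("10/11/2018", "NERSC core hours", 1_200_000),
    ("11/16/2018", "NERSC core hours", 5_000_000),
    ("11/19/2018", "NERSC core hours", 1_500_000),
]


def adjusted_totals(original, adjustments):
    """Return a copy of the original allocations with all deltas applied."""
    totals = dict(original)
    for _date, resource, delta in adjustments:
        totals[resource] += delta
    return totals


totals = adjusted_totals(original, adjustments)
for resource, amount in totals.items():
    print(f"{resource}: {amount:,}")
```

Rounded, this gives 92.7M NERSC core hours, about 1.7M Bridges SU and about 150k Stampede2 node hours.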
Allocation Usage and Leftovers (as of 11/19/2018)
| HPC allocation and validity | original | equivalent ttbar events (very rough) | adjusted total | used | leftover | monitoring |
| NERSC Cori 2018 | 50M (25M for USCMS) | 160M to 620M | 92.7M | 27.3M | 11.7M | addme |
| PSC Bridges 2018 | 1.2M | 70M | 1.7M | 1.1M | 0.6M | addme |
| TACC Stampede2 2018 | 171k | 100M to 240M | 150k | 6k | 143k | addme |
NERSC Core Hours
The accounting unit is the core hour, which directly represents wall time used on the Edison cluster (which we no longer use and which will be retired in 2019).
For the Cori nodes it is similar, but with a different charge factor. One hour on a Cori Haswell node (dual Intel Xeon E5-2698 v3, 2.3GHz, 128GB RAM, 32 physical cores with 2xHT) is charged at 80 core hours. One hour on a Cori KNL node (Intel Xeon Phi 7250, 1.4GHz, 96GB RAM, 68 physical cores with 4xHT) is charged at 96 core hours. Fractional use of a Cori Haswell node (in the shared partition) is charged in proportion to the fraction of the node used.
We currently use a mix of full Cori Haswell and full Cori KNL nodes.
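As a rough illustration of the Cori charging rules above, the sketch below computes the core hours charged for a job. The function name `nersc_charge` and the `node_fraction` parameter are hypothetical and only mirror the description on this page; they are not a NERSC API.

```python
# Charge factors in core hours per node hour, as stated above.
CHARGE_FACTOR = {"haswell": 80, "knl": 96}


def nersc_charge(node_type, nodes, wall_hours, node_fraction=1.0):
    """Return the core hours charged for a Cori job.

    node_fraction < 1 models the Haswell shared partition, where only
    the used fraction of a node is charged.
    """
    if node_fraction != 1.0 and node_type != "haswell":
        raise ValueError("fractional charging is only described for Haswell nodes")
    return CHARGE_FACTOR[node_type] * nodes * wall_hours * node_fraction


# Ten full nodes for 24 hours on each node type.
print(nersc_charge("haswell", 10, 24))
print(nersc_charge("knl", 10, 24))
# A quarter of one shared Haswell node for 2 hours.
print(nersc_charge("haswell", 1, 2, node_fraction=0.25))
```

For the same number of node hours, a KNL job costs 20% more core hours than a Haswell job.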
PSC SU
The accounting unit is the SU (service unit). We use the regular nodes (dual Intel Haswell E5-2695 v3, 2.3GHz, 128GB RAM, 28 physical cores with HT disabled), where one SU represents one hour of use of one core.
Stampede2 SU
The accounting unit is the SU, which on Stampede2 represents one hour of use of one node, multiplied by a charge rate for the given node type. There are KNL nodes (Intel Xeon Phi 7250, 1.4GHz, 96GB RAM, 68 physical cores with 4xHT) and Skylake nodes (dual Intel Xeon Platinum 8160, 2.1GHz, 192GB RAM, 48 physical cores with 2xHT), but at the moment both have the same charge rate of 1.
We currently use only Skylake nodes.
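The two XSEDE SU definitions differ in granularity: a Bridges SU is a core hour, while a Stampede2 SU is a node hour times a charge rate. The sketch below spells this out; the function and constant names are hypothetical and only restate the definitions on this page.

```python
# Physical cores on a Bridges regular node, as stated above.
BRIDGES_CORES_PER_NODE = 28

# Stampede2 charge rates per node type; both are currently 1.
STAMPEDE2_CHARGE_RATE = {"knl": 1, "skylake": 1}


def bridges_su(cores, wall_hours):
    """Bridges regular nodes: one SU is one core for one hour."""
    return cores * wall_hours


def stampede2_su(node_type, nodes, wall_hours):
    """Stampede2: one SU is one node for one hour times the charge rate."""
    return nodes * wall_hours * STAMPEDE2_CHARGE_RATE[node_type]


# One full node for one hour on each system.
print(bridges_su(BRIDGES_CORES_PER_NODE, 1))
print(stampede2_su("skylake", 1, 1))
```

One full node hour therefore costs 28 SU on Bridges but 1 SU on Stampede2, which is why the raw SU counts of the two allocations are not directly comparable.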
Converting between Allocation Units
These numbers come from performance measurements on nodes equivalent to Cori Haswell and Cori KNL, from performance measurements on Bridges regular nodes, and from the fixed conversion factor used to move allocation units between Bridges and Stampede2. They are internally consistent and should be roughly correct, but should nevertheless be treated with skepticism. Actual numbers will of course depend on the software release and the workflow (these are for ttbar events), but the ratios should remain more or less the same. The estimates of the number of ttbar events in the previous table are based on these numbers.
| HPC and node type | Allocation Units needed to produce 1B CMS MC events |
| Cori Haswell | 40M |
| Cori KNL | 160M |
| PSC Bridges | 16.6M |
| Stampede2 Skylake | 0.71M |
| Stampede2 KNL | 1.6M |
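The conversion table can be turned into event estimates with a single division. The sketch below is a minimal illustration; the names `UNITS_PER_BILLION_EVENTS` and `estimated_events` are hypothetical. It assumes that the NERSC range in the usage table refers to the 25M core hours earmarked for USCMS, which is an inference from the numbers rather than something stated on this page.

```python
# Allocation units needed to produce 1B CMS MC (ttbar) events, from the table above.
UNITS_PER_BILLION_EVENTS = {
    "Cori Haswell": 40e6,
    "Cori KNL": 160e6,
    "PSC Bridges": 16.6e6,
    "Stampede2 Skylake": 0.71e6,
    "Stampede2 KNL": 1.6e6,
}


def estimated_events(allocation_units, node_type):
    """Very rough number of ttbar events a given allocation can produce."""
    return allocation_units / UNITS_PER_BILLION_EVENTS[node_type] * 1e9


# Original allocations; the 25M for NERSC is the assumed USCMS share.
for label, units, node_type in [
    ("NERSC 25M on KNL", 25e6, "Cori KNL"),
    ("NERSC 25M on Haswell", 25e6, "Cori Haswell"),
    ("Bridges", 1_196_271, "PSC Bridges"),
    ("Stampede2 on KNL", 170_942, "Stampede2 KNL"),
    ("Stampede2 on Skylake", 170_942, "Stampede2 Skylake"),
]:
    print(f"{label}: {estimated_events(units, node_type) / 1e6:.0f}M events")

# Compare the Bridges SU per Stampede2 Skylake node hour implied by the
# table with the rate used in the 10/08/2018 transfer (21448 -> 500000).
table_rate = UNITS_PER_BILLION_EVENTS["PSC Bridges"] / UNITS_PER_BILLION_EVENTS["Stampede2 Skylake"]
transfer_rate = 500_000 / 21_448
print(f"table: {table_rate:.1f} SU per node hour, transfer: {transfer_rate:.1f}")
```

Both rates come out at roughly 23 Bridges SU per Stampede2 node hour, which is the sense in which the numbers are internally consistent.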
--
DirkHufnagel - 2018/11/19