Site Commissioning vs. Link Commissioning

Much of the current discussion about link commissioning properly belongs to site commissioning. It does not make sense to talk about commissioning a link between two sites unless both sites already have a verifiably functioning storage element (SE) and an srm door into that SE. That commissioning should be done by the local administrator. Once I have verified that I have a working T2 site with a working SE and srm, I can consider commissioning the links to the various T1s and T2s.

How that verification should be made is open for discussion. For my part, I verify that I can download several files from FNAL to UCSD via a third-party srmcp, and then consider my srm functional. If that does not work, I look into the underlying problems (grid-ftp door problems, pool issues, network issues), but I consider this part of (re)commissioning a site rather than a particular link.
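The check described above could be scripted roughly as follows. This is only a minimal sketch, not the procedure actually used at UCSD: it assumes that srmcp is on the PATH and is invoked in its basic form, `srmcp <source-url> <destination-url>`, and that a zero exit status means the copy succeeded. The helper names and the example hostnames and paths are hypothetical.

```python
import subprocess
from typing import Callable, List, Sequence, Tuple


def srmcp_command(source_url: str, dest_url: str) -> List[str]:
    """Build a basic third-party copy command: srmcp <source> <destination>."""
    return ["srmcp", source_url, dest_url]


def run_command(cmd: List[str], timeout_s: int = 1800) -> bool:
    """Run one command; treat a zero exit status as success (an assumption)."""
    try:
        completed = subprocess.run(cmd, capture_output=True, timeout=timeout_s)
    except (OSError, subprocess.TimeoutExpired):
        # srmcp not installed, not executable, or the transfer hung.
        return False
    return completed.returncode == 0


def site_srm_functional(
    transfers: Sequence[Tuple[str, str]],
    runner: Callable[[List[str]], bool] = run_command,
) -> bool:
    """Consider the srm functional only if every test file was copied.

    `transfers` is a sequence of (source_url, dest_url) pairs. An empty
    sequence counts as "not verified" rather than as a success.
    """
    results = [runner(srmcp_command(src, dst)) for src, dst in transfers]
    return len(results) > 0 and all(results)


# Hypothetical test files; real hostnames, ports and paths will differ.
EXAMPLE_TRANSFERS = [
    ("srm://t1-se.example.org:8443/store/test/file1.root",
     "srm://t2-se.example.org:8443/store/test/file1.root"),
    ("srm://t1-se.example.org:8443/store/test/file2.root",
     "srm://t2-se.example.org:8443/store/test/file2.root"),
]
```

Calling `site_srm_functional(EXAMPLE_TRANSFERS)` would attempt each copy in turn. A `False` result is the cue to look at the grid-ftp door, the pools and the network before thinking about any particular link.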

New Link Commissioning and Problem Troubleshooting

In commissioning a new link between a T1 and my T2, the goal is to use FTS with the server at the T1.

Since the T2 administrator has no access to the logs on the T1 FTS server, and therefore cannot really debug FTS problems or change the configuration of the link, it is essential to have the active help of an FTS expert at the T1. At FNAL, for example, this has been Yujun Wu. If the effort at commissioning links is to come from the T2, this help must come from each T1. A list of FTS experts at every T1 who are available to help would be an essential part of an FTS link commissioning effort.

Testing srmcp between T1 and T2 sites is not relevant here because we must use ftscp for the links, and we should assume that srm is already working at both ends.

The second problem that we run into fairly regularly is network misrouting or failure. To debug this type of problem, we would need a list of networking experts, both locally and at the T1 site, whom we could contact to debug and fix networking issues. The problems we have run into include, for example, poor routing between ASGC and UCSD and outright failures between FNAL and UCSD.

In summary, the resources that we have needed as a T2 site to debug link issues between T1 sites and our T2 have been responsive FTS experts at the T1 and access to networking experts for the route between the T1 and our T2. When we have had those two things, the links are easy to commission and maintain in a working state, but where we have not had access to such experts, the link commissioning has been extremely difficult if not impossible.
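One way to keep track of the two resources named above is a small per-T1 contact table that can be checked for gaps before starting work on a link. The following is a minimal sketch under stated assumptions: the table layout, the role names and the function are hypothetical, and the only contact filled in is the one named in this note. `None` means "not recorded here", not that no such expert exists.

```python
from typing import Dict, List, Optional

# Roles a T2 needs at each T1 before commissioning the link (per the summary).
REQUIRED_ROLES = ("fts_expert", "network_expert")

# Hypothetical contact table; only the FNAL FTS expert comes from this note.
CONTACTS: Dict[str, Dict[str, Optional[str]]] = {
    "FNAL": {"fts_expert": "Yujun Wu", "network_expert": None},
    "ASGC": {"fts_expert": None, "network_expert": None},
}


def missing_contacts(
    contacts: Dict[str, Dict[str, Optional[str]]],
) -> Dict[str, List[str]]:
    """Return, per T1, the required roles that have no contact recorded."""
    gaps: Dict[str, List[str]] = {}
    for site, roles in contacts.items():
        missing = [role for role in REQUIRED_ROLES if not roles.get(role)]
        if missing:
            gaps[site] = missing
    return gaps
```

With the table above, `missing_contacts(CONTACTS)` reports a missing network contact for FNAL and both roles missing for ASGC. Those are the entries to fill in before attempting to commission the corresponding links.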

-- JamesLetts - 03 Jul 2007

Topic revision: r2 - 2007/08/17 - 18:15:14 - JamesLetts
 