Stage-out issues at sites

The Aachen Problem

Aachen is a very large T2 and is set to grow even larger soon. They run all jobs out of a single shared filesystem. Under that load, writes to the startd log are delayed so massively that every attempt by the startd to connect to the schedd times out.
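One way to check whether a site is affected is to time small synchronous writes in the directory that holds the startd log. The following is only a minimal sketch, not part of Condor: the function names `measure_append_latency` and `exceeds_timeout` are hypothetical, and the directory and timeout value are assumptions that have to be filled in per site.

```python
import os
import tempfile
import time


def measure_append_latency(directory, samples=20, payload=b"x" * 256):
    """Time `samples` small append+fsync writes to a scratch file in `directory`.

    Returns a list of per-write latencies in seconds. The scratch file is
    removed afterwards, so the real log file is never touched.
    """
    latencies = []
    fd, path = tempfile.mkstemp(prefix="latency_probe_", dir=directory)
    try:
        for _ in range(samples):
            start = time.monotonic()
            os.write(fd, payload)
            os.fsync(fd)  # force the write through to the filesystem
            latencies.append(time.monotonic() - start)
    finally:
        os.close(fd)
        os.remove(path)
    return latencies


def exceeds_timeout(latencies, timeout_seconds):
    """True if the slowest observed write took longer than the given timeout."""
    return max(latencies) > timeout_seconds


if __name__ == "__main__":
    # Assumption: replace with the directory holding the startd log, and with
    # the connection timeout actually configured at the site.
    log_dir = tempfile.gettempdir()
    assumed_timeout = 20.0
    observed = measure_append_latency(log_dir)
    print("slowest write: %.3f s" % max(observed))
    print("exceeds timeout:", exceeds_timeout(observed, assumed_timeout))
```

Running this on a worker node while the shared filesystem is under load, and again against a node-local directory, gives a rough indication of whether log writes alone can eat up the connection timeout.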

  -- FkW - 30 Apr 2008
