Quick and easy SRM DCache Troubleshooting

Restarting SRM server and all GridFTP doors

This is the most commonly needed procedure. If SRM transfers have started to fail, the most likely cause is a malfunctioning SRM server or one or more malfunctioning GridFTP doors. Please log in to t2data2.local and run the following script with the 'restart' option.

[root@t2data2:~]$ ./restart-SRM-all-GridFTP-servers.sh
Usage: ./restart-SRM-all-GridFTP-servers.sh {start|stop|restart}

Restarting all other dCache servers (except SRM and GridFTP)

If transfers continue to fail after the above step AND you are sufficiently confident that the cause is a malfunctioning lower-level dCache service (e.g., PNFS server, Pool Manager, DCap server, Replica Manager, Broadcast cell), please log in to t2data2.local and run the following script with the 'restart' option.

[root@t2data2:~]$ ./restart-all-other-dCache-servers.sh
Usage: ./restart-all-other-dCache-servers.sh {start|stop|restart}

Restarting everything -- all dCache servers, including SRM and GridFTP

dCache services must be restarted in order. Because a higher-level service depends on lower-level services, it must be stopped before them and started after them. Each of the two scripts above takes care of this order internally. However, when using both scripts together, i.e., to restart everything, run them with different options in the following sequence.

[root@t2data2:~]$ ./restart-SRM-all-GridFTP-servers.sh stop
[root@t2data2:~]$ ./restart-all-other-dCache-servers.sh restart
[root@t2data2:~]$ ./restart-SRM-all-GridFTP-servers.sh start
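
The sequence above could be wrapped in a small helper so that a failed step stops the run instead of starting SRM on top of lower-level services that did not come back. This is only a sketch: the function name restart_everything and the SCRIPT_DIR variable are hypothetical and not part of the existing scripts, and it assumes that both scripts return a non-zero exit status on failure, which has not been verified.

```shell
#!/bin/sh
# Hypothetical wrapper around the two existing restart scripts.
# SCRIPT_DIR is an assumed variable; it defaults to the current directory.
SCRIPT_DIR=${SCRIPT_DIR:-.}

restart_everything() {
    # Stop the higher-level services (SRM and GridFTP doors) first.
    "$SCRIPT_DIR/restart-SRM-all-GridFTP-servers.sh" stop || return 1
    # Restart the lower-level services while nothing depends on them.
    "$SCRIPT_DIR/restart-all-other-dCache-servers.sh" restart || return 2
    # Bring the higher-level services back up last.
    "$SCRIPT_DIR/restart-SRM-all-GridFTP-servers.sh" start || return 3
}
```

Calling restart_everything from the directory that holds the scripts runs the three steps in order; a return value of 1, 2 or 3 indicates which step failed.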

Restarting dCache server(s) running on a known hostname

If you simply want to restart the service(s) on a single known host, please log in to t2data2.local and run the following script with a single short hostname and the 'restart' option. Example cases:

  • restarting all DCap doors on "dcopy-1"
  • restarting a GridFTP door on "gftp-1"
  • restarting Replica Manager services on "replica-1"

[root@t2data2:~]$ ./restart-one-dCache-server-host.sh
Usage: ./restart-one-dCache-server-host.sh {a single short hostname eg., dcopy-1 or gftp-1 or replica-1} {start|stop|restart}
An example: 
[root@t2data2:~]$ ./restart-one-dCache-server-host.sh gftp-1 restart
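
When several known hosts need a restart, the per-host script can be called in a loop. The following is a minimal sketch, not part of the existing tooling: the function name restart_hosts and the RESTART_ONE variable are hypothetical, and it assumes the per-host script returns a non-zero exit status on failure.

```shell
#!/bin/sh
# Hypothetical helper: restart dCache services on several hosts, one at a time.
# RESTART_ONE is an assumed variable pointing at the existing per-host script.
RESTART_ONE=${RESTART_ONE:-./restart-one-dCache-server-host.sh}

restart_hosts() {
    failed=0
    for host in "$@"; do
        echo "Restarting dCache services on $host"
        if ! "$RESTART_ONE" "$host" restart; then
            echo "Restart failed on $host" >&2
            failed=$((failed + 1))
        fi
    done
    # The exit status is non-zero if any host failed.
    [ "$failed" -eq 0 ]
}
```

For example, restart_hosts gftp-1 dcopy-1 restarts the two hosts in the order given and keeps going if the first one fails.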

Mounting PNFS areas on uaf-1 or uaf-2

Two important PNFS areas are mounted on uaf-1 and uaf-2. These are:

  • /pnfs/sdsc.edu/data4/cms/userdata
  • /pnfs/sdsc.edu/data3/cms/phedex

If you suspect that either or both areas are not mounted and want to verify, please log in to uaf-1 or uaf-2 and run the following script with no option.

root@uaf-1 ~# ./PNFS-userdata-mounter
PNFS userdata is ALREADY mounted.
If needed, use --umount to unmount.

If you need to unmount, please run it with the '--umount' option.

root@uaf-1 ~# ./PNFS-userdata-mounter --umount
--umount

If you need to verify again, please run it with no option.

root@uaf-1 ~# ./PNFS-userdata-mounter
PNFS userdata is NOT mounted.
If needed, use --mount to mount.

If you need to mount, please run it with the '--mount' option.

root@uaf-1 ~# ./PNFS-userdata-mounter --mount
--mount
root@uaf-1 ~# ./PNFS-userdata-mounter
PNFS userdata is ALREADY mounted.
If needed, use --umount to unmount.
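
As a cross-check that does not depend on the mounter script, the mount table can be inspected directly. The sketch below is an assumption-laden illustration: the function names pnfs_is_mounted and report_pnfs_areas are hypothetical, it assumes a Linux host where mounts are listed in /proc/mounts, and it assumes each PNFS area shows up there as a mount point at or above the paths listed earlier. It does not reproduce how PNFS-userdata-mounter itself decides.

```shell
#!/bin/sh
# Hypothetical check: succeed if the given path is at or below a mount point
# (other than "/") listed in the mounts table. The table defaults to
# /proc/mounts; a different file can be passed as the second argument.
pnfs_is_mounted() {
    target=$1
    mounts_file=${2:-/proc/mounts}
    awk -v t="$target" '
        # Field 2 of each line is the mount point.
        $2 != "/" && (t == $2 || index(t, $2 "/") == 1) { found = 1 }
        END { exit !found }
    ' "$mounts_file"
}

# Report the state of the two PNFS areas named in this page.
report_pnfs_areas() {
    for area in /pnfs/sdsc.edu/data4/cms/userdata /pnfs/sdsc.edu/data3/cms/phedex; do
        if pnfs_is_mounted "$area" "$@"; then
            echo "$area: mounted"
        else
            echo "$area: NOT mounted"
        fi
    done
}
```

Running report_pnfs_areas on uaf-1 or uaf-2 prints one line per area; the mounter script remains the supported way to mount or unmount.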

-- AsRana - 14 May 2007

Topic revision: r3 - 2007/05/15 - 19:25:47 - AsRana