Difference: HdfsXrootdAdmin (1 vs. 7)

Revision 7 2015/06/30 - Main.JeffreyDost

Line: 1 to 1
 
META TOPICPARENT name="WebHome"

HDFS Xrootd Fallback Administration

Line: 54 to 54
 
  • /cms/phedex/store/relval
  • /cms/phedex/store/mc/Summer12_DR53X
Added:
>
>
2015-06-30:
  • /cms/phedex/store/mc/Summer11Backfill
  • /cms/phedex/store/mc/Summer11
  • /cms/phedex/store/mc/Summer12FS53
  • /cms/phedex/store/mc/Summer12_DR53X
  • /cms/phedex/store/mc/Summer12Backfill
  • /cms/phedex/store/mc/Summer12pLHE
  • /cms/phedex/store/mc/Summer11LegwmLHE
  • /cms/phedex/store/mc/Summer11Leg
  • /cms/phedex/store/mc/Summer11LegDR
  • /cms/phedex/store/mc/Summer12WMLHE
  • /cms/phedex/store/mc/Summer12 (waiting for FKW's go-ahead for this one, since it also has a subdirectory in our replication-3 namespaces)
  • /cms/phedex/store/mc/Summer12DR53X
 

Namespaces with Replication Set to 3

CMS started running some digi-reco on 2015-05-28, which requires xrootd transfers from UCSDT2 to SDSC. To ensure throughput and file availability, we raised replication to 3 for the following namespaces:

Revision 6 2015/05/28 - Main.JeffreyDost

Line: 1 to 1
 
META TOPICPARENT name="WebHome"

HDFS Xrootd Fallback Administration

Line: 54 to 54
 
  • /cms/phedex/store/relval
  • /cms/phedex/store/mc/Summer12_DR53X
Added:
>
>

Namespaces with Replication Set to 3

CMS started running some digi-reco on 2015-05-28, which requires xrootd transfers from UCSDT2 to SDSC. To ensure throughput and file availability, we raised replication to 3 for the following namespaces:

  • /cms/phedex/store/mc/RunIIWinter15GS/MinBias_TuneCUETP8M1_13TeV-pythia8
  • /cms/phedex/store/mc/RunIIFall14GS/MinBias_TuneCUETP8M1_13TeV-pythia8
  • /cms/phedex/store/mc/Summer12/MinBias_TuneZ2star_8TeV-pythia6
 

Healing progress

As of 2014-09-09, performed healing on a small subset:

Revision 5 2014/09/09 - Main.JeffreyDost

Line: 1 to 1
 
META TOPICPARENT name="WebHome"

HDFS Xrootd Fallback Administration

Line: 54 to 54
 
  • /cms/phedex/store/relval
  • /cms/phedex/store/mc/Summer12_DR53X
Added:
>
>

Healing progress

As of 2014-09-09, performed healing on a small subset:

  • /cms/phedex/store/data/Run2012A/DoubleElectron/AOD/22Jan2013-v1/20000

This should have healed 33 of the 95 corrupt files as of this writing. Will confirm on 2014-09-10.

 

Monitoring Fallback

Currently the best place to check that fallback is working as expected is the UDP log on xrootd-proxy.t2.ucsd.edu:

Revision 4 2014/08/05 - Main.JeffreyDost

Line: 1 to 1
 
META TOPICPARENT name="WebHome"

HDFS Xrootd Fallback Administration

Line: 35 to 35
  ALERT! NOTE: the -R option can increase load on the namenode if the file namespace is large enough; use with care!
Changed:
<
<
Using a bash for loop to set replication to 1 over a list of directories in a file called rep1.txt:
>
>
Using a bash while loop to set replication to 1 over a list of directories in a file called rep1.txt:
 
while read -r line; do echo "$(date +%T) $line"; hadoop fs -setrep -R 1 "$line" > /dev/null; done < rep1.txt
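
For longer lists, the one-liner above could be wrapped in a small function so that it can be rehearsed before touching the namenode. The following is only a sketch: the function name setrep_from_list, the DRY_RUN variable, and the convention of skipping blank lines and '#' comment lines in the list file are assumptions for illustration and are not part of the original procedure.

```shell
# Hypothetical wrapper around the loop above (names are illustrative only).
# Usage: setrep_from_list <replication> <list-file>
# Set DRY_RUN=1 to print the commands instead of running them.
setrep_from_list() {
    local rep="$1" list="$2" dir
    while IFS= read -r dir; do
        # Assumed convention: skip blank lines and lines starting with '#'
        case "$dir" in
            ''|'#'*) continue ;;
        esac
        if [ "${DRY_RUN:-0}" = 1 ]; then
            echo "hadoop fs -setrep -R $rep $dir"
        else
            echo "$(date +%T) $dir"
            # stdin is redirected as a precaution so the command cannot
            # read from the list file feeding the loop
            hadoop fs -setrep -R "$rep" "$dir" > /dev/null < /dev/null
        fi
    done < "$list"
}

# Example usage:
#   DRY_RUN=1 setrep_from_list 1 rep1.txt   # preview only
#   setrep_from_list 1 rep1.txt             # apply
```

The dry run makes no hadoop calls at all, so it adds no namenode load; the real run carries the same -R caveat as above.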

Revision 3 2014/08/04 - Main.JeffreyDost

Line: 1 to 1
 
META TOPICPARENT name="WebHome"

HDFS Xrootd Fallback Administration

Line: 35 to 35
  ALERT! NOTE: the -R option can increase load on the namenode if the file namespace is large enough; use with care!
Added:
>
>
Using a bash for loop to set replication to 1 over a list of directories in a file called rep1.txt:

while read -r line; do echo "$(date +%T) $line"; hadoop fs -setrep -R 1 "$line" > /dev/null; done < rep1.txt
 

Namespaces with Replication Set to 1

As of 2014-03-27 we have replication reduced to 1 for the following:

  • /cms/phedex/store/data/Run2012D/
Added:
>
>
2014-08-04:
  • /cms/phedex/store/himc
  • /cms/phedex/store/data/Run2012A
  • /cms/phedex/store/data/Run2012B
  • /cms/phedex/store/data/Run2012C
  • /cms/phedex/store/data/Fall13
  • /cms/phedex/store/data/Summer13
  • /cms/phedex/store/relval
  • /cms/phedex/store/mc/Summer12_DR53X
 

Monitoring Fallback

Currently the best place to check that fallback is working as expected is the UDP log on xrootd-proxy.t2.ucsd.edu:

Revision 2 2014/03/27 - Main.JeffreyDost

Line: 1 to 1
 
META TOPICPARENT name="WebHome"

HDFS Xrootd Fallback Administration

Line: 23 to 23
  These examples only require built-in hadoop commands.
Added:
>
>
ALERT! NOTE: to modify replication you need to either be the cmswriter user or have root access on a node with a hadoop client.
 Set replication to 1 for a single file:

hadoop fs -setrep 1 /some/file/in/hadoop
Line: 33 to 35
  ALERT! NOTE: the -R option can increase load on the namenode if the file namespace is large enough; use with care!
Added:
>
>

Namespaces with Replication Set to 1

As of 2014-03-27 we have replication reduced to 1 for the following:

  • /cms/phedex/store/data/Run2012D/
 

Monitoring Fallback

Currently the best place to check that fallback is working as expected is the UDP log on xrootd-proxy.t2.ucsd.edu:

Revision 1 2014/03/27 - Main.JeffreyDost

Line: 1 to 1
Added:
>
>
META TOPICPARENT name="WebHome"

HDFS Xrootd Fallback Administration

Finding Files with a Given Replication

The examples here use the hadoop_lsr tool, which can be found on uaf at:

~jdost/bin/hadoop_lsr

List files of a given replication (2 in this example):

hadoop_lsr 2 /some/dir/in/hadoop

Find all files of replication 2 recursively in all subdirectories:

hadoop_lsr 2 -R /some/dir/in/hadoop

ALERT! NOTE: the -R option can increase load on the namenode if the file namespace is large enough; use with care!
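
Where hadoop_lsr is not available, a similar listing can probably be approximated with built-in commands. This is a rough sketch, not a description of how hadoop_lsr itself works: the helper name filter_by_replication is made up for illustration, and it assumes the usual `hadoop fs -ls` column layout (permissions, replication, owner, group, size, date, time, path) and paths without spaces.

```shell
# Hypothetical helper: filter `hadoop fs -ls` output by replication factor.
# Assumed listing columns:
#   permissions replication owner group size date time path
# Directories show "-" in the replication column, so they never match.
# The NF >= 8 guard skips the "Found N items" header line.
filter_by_replication() {
    awk -v rep="$1" 'NF >= 8 && $2 == rep { print $NF }'
}

# Example usage:
#   hadoop fs -ls /some/dir/in/hadoop | filter_by_replication 2
#   hadoop fs -ls -R /some/dir/in/hadoop | filter_by_replication 2
# (older Hadoop releases spell the recursive listing `hadoop fs -lsr`)
```

The recursive form walks the whole subtree on the namenode, so the same load warning applies.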

Changing File Replication

These examples only require built-in hadoop commands.

Set replication to 1 for a single file:

hadoop fs -setrep 1 /some/file/in/hadoop

Set replication to 1 recursively for all files in a directory and all its subdirectories:

hadoop fs -setrep -R 1 /some/dir/in/hadoop

ALERT! NOTE: the -R option can increase load on the namenode if the file namespace is large enough; use with care!
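
After changing replication, it may be useful to spot-check a file. The sketch below relies on `hadoop fs -stat %r`, which prints the replication factor recorded for a file; the helper name has_replication is hypothetical and only illustrates one way to script the check.

```shell
# Hypothetical helper: succeed only if a file reports the expected
# replication factor according to `hadoop fs -stat %r`.
# Usage: has_replication <expected> <path>
has_replication() {
    local expected="$1" path="$2" actual
    # Return 2 if the stat call itself fails (e.g. the path does not exist)
    actual="$(hadoop fs -stat %r "$path")" || return 2
    [ "$actual" = "$expected" ]
}

# Example usage:
#   hadoop fs -setrep 1 /some/file/in/hadoop
#   has_replication 1 /some/file/in/hadoop && echo "replication is 1"
```

Note that this checks the replication factor the namenode has recorded, which is not necessarily the number of block copies currently on disk. When raising replication, `hadoop fs -setrep -w` can be used to wait until the target is actually reached.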

Monitoring Fallback

Currently the best place to check that fallback is working as expected is the UDP log on xrootd-proxy.t2.ucsd.edu:

/var/log/xrootd/hdfs-mon-snatcher.log

This section will be updated once more monitoring tools are developed.

-- JeffreyDost - 2014/03/27

 