Difference: HdfsXrootd (1 vs. 13)

Revision 13 - 2013/12/08 - Main.JeffreyDost

Line: 1 to 1
 
META TOPICPARENT name="WebHome"

HdfsXrootd Development Twiki

Line: 89 to 89
 

Obtain Dependencies for HDFS

Deleted:
<
Enable osg-development repo by setting enabled=1 in
/etc/yum.repos.d/osg-el6-development.repo
 Install dependencies:
Changed:
<
yum install maven3
yum install protobuf-compiler
>
yum --enablerepo=osg-development install maven3
yum --enablerepo=osg-development install protobuf-compiler
 

Add mvn to path:

Line: 119 to 116
 Currently the only jar file we modify is hadoop-hdfs-2.0.0-cdh4.1.1.jar. To build this, run:
Changed:
<
cd $PROJDIR/hadoop-2.0.0-cdh4.1.1/src/hadoop-hdfs-project
>
cd $PROJDIR/hadoop-2.0.0-cdh4.1.1/src/hadoop-hdfs-project/hadoop-hdfs
 mvn install -DskipTests

The jar will be located in:

Changed:
<
$PROJDIR/hadoop-2.0.0-cdh4.1.1/src/hadoop-hdfs-project/target/hadoop-hdfs-2.0.0-cdh4.1.1.jar
>
$PROJDIR/hadoop-2.0.0-cdh4.1.1/src/hadoop-hdfs-project/hadoop-hdfs/target/hadoop-hdfs-2.0.0-cdh4.1.1.jar
  To test, this jar should replace:
/usr/lib/hadoop-hdfs/hadoop-hdfs-2.0.0-cdh4.1.1.jar
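For example, a minimal swap that keeps a backup of the stock jar (the .orig name is just a suggestion):

cp /usr/lib/hadoop-hdfs/hadoop-hdfs-2.0.0-cdh4.1.1.jar /usr/lib/hadoop-hdfs/hadoop-hdfs-2.0.0-cdh4.1.1.jar.orig
cp $PROJDIR/hadoop-2.0.0-cdh4.1.1/src/hadoop-hdfs-project/hadoop-hdfs/target/hadoop-hdfs-2.0.0-cdh4.1.1.jar /usr/lib/hadoop-hdfs/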

Revision 12 - 2013/12/08 - Main.JeffreyDost

Line: 1 to 1
 
META TOPICPARENT name="WebHome"

HdfsXrootd Development Twiki

Line: 8 to 8
 

Build Instructions

Changed:
<
This section will use $PROJDIR as shorthand to denote a common area for the project source code to be located. Some optional instructions below are intended only if you plan to modify the Hadoop 2.0 extendable_client.patch and are separated into their own sections.
>
The sections below will use $PROJDIR as shorthand to denote a common area for the project source code to be located.
 

Obtain Dependencies

Line: 31 to 31
 NOTE RHEL >= 6.5 may break some OSG java 7 dependencies. If this happens, it can be fixed by running:
yum install osg-java7-compat
Deleted:
<
NOTE if you plan to rebuild hadoop with the patch as in the optional sections below, you don't need to install hadoop-hdfs.
 Checkout hdfs-xrootd-fallback repo:
svn checkout https://svn.gled.org/var/trunk/hdfs-xrootd-fallback
Line: 41 to 39
 export JAVA_HOME=/usr/lib/jvm/java-1.7.0-openjdk.x86_64
Deleted:
<

Obtain Dependencies for HDFS patch (Optional)

Install dependencies:

yum install maven3
yum install protobuf-compiler

Add mvn to path:

export PATH=$PATH:/usr/share/apache-maven-3.0.4/bin

Obtain cloudera hadoop tarball:

cd $PROJDIR
wget http://archive.cloudera.com/cdh4/cdh/4/hadoop-2.0.0-cdh4.1.1.tar.gz
tar -xzf hadoop-2.0.0-cdh4.1.1.tar.gz

Patch hadoop to allow DFSInputStream inheritance:

cd $PROJDIR/hadoop-2.0.0-cdh4.1.1
patch -p1 < $PROJDIR/hdfs-xrootd/extendable_client.patch
 

Build hdfs-xrootd-fallback

cd $PROJDIR/hdfs-xrootd-fallback
Line: 75 to 49
 
  • hdfs-xrootd-fallback-1.0.0.jar
  • libXrdBlockFetcher.so.1.0.0
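A quick sanity check that the build produced both artifacts (a sketch; it assumes they land at the top of the checkout):

ls -l $PROJDIR/hdfs-xrootd-fallback/hdfs-xrootd-fallback-1.0.0.jar
ls -l $PROJDIR/hdfs-xrootd-fallback/libXrdBlockFetcher.so.1.0.0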
Deleted:
<

Build hdfs-xrootd-fallback including patched hadoop (Optional)

cd $PROJDIR/hdfs-xrootd-fallback
make

This will generate the following files:

  • hadoop-hdfs-2.0.0-cdh4.1.1.jar
  • hdfs-xrootd-fallback.jar
  • libXrdBlockFetcher.so
 

Deployment for Testing

Assumptions - a hadoop cluster is available and a hadoop client node is properly configured to access it. The following instructions are only required on the client node.

Line: 120 to 83
 
hadoop-fuse-dfs -oserver=xfbfs://cabinet-10-10-3.t2.ucsd.edu,port=8020,allow_other,rw,rdbuffer=4096 -d /mnt/hadoop
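With the mount up, a quick check from another terminal (sketch):

ls /mnt/hadoop
grep fuse /proc/mounts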
Added:
>

Modifying Hadoop

The instructions here only need to be followed if you plan to modify the extendable_client.patch.

Obtain Dependencies for HDFS

Enable osg-development repo by setting enabled=1 in

/etc/yum.repos.d/osg-el6-development.repo

Install dependencies:

yum install maven3
yum install protobuf-compiler

Add mvn to path:

export PATH=$PATH:/usr/share/apache-maven-3.0.4/bin
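A quick check that mvn now resolves:

mvn -version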

Obtain cloudera hadoop tarball:

cd $PROJDIR
wget http://archive.cloudera.com/cdh4/cdh/4/hadoop-2.0.0-cdh4.1.1.tar.gz
tar -xzf hadoop-2.0.0-cdh4.1.1.tar.gz

Patch hadoop to allow DFSInputStream inheritance:

cd $PROJDIR/hadoop-2.0.0-cdh4.1.1
patch -p1 < $PROJDIR/hdfs-xrootd-fallback/extendable_client.patch
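If you want to check that the patch applies cleanly before changing anything, patch supports a dry run:

patch -p1 --dry-run < $PROJDIR/hdfs-xrootd-fallback/extendable_client.patch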

Building HDFS

Currently the only jar file we modify is hadoop-hdfs-2.0.0-cdh4.1.1.jar. To build this, run:

cd $PROJDIR/hadoop-2.0.0-cdh4.1.1/src/hadoop-hdfs-project
mvn install -DskipTests

The jar will be located in:

$PROJDIR/hadoop-2.0.0-cdh4.1.1/src/hadoop-hdfs-project/target/hadoop-hdfs-2.0.0-cdh4.1.1.jar

To test, this jar should replace:

/usr/lib/hadoop-hdfs/hadoop-hdfs-2.0.0-cdh4.1.1.jar

Creating Updated Patch File

Assuming changes have been made to $PROJDIR/hadoop-2.0.0-cdh4.1.1 and an original unmodified copy was untarred into $PROJDIR/hadoop-2.0.0-cdh4.1.1.orig, run the following to generate a patch file:

$PROJDIR/hdfs-xrootd-fallback/create_hdpatch.sh $PROJDIR/hadoop-2.0.0-cdh4.1.1.orig $PROJDIR/hadoop-2.0.0-cdh4.1.1

The new patch will be created in the current directory and be named extendable_client.patch.
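For reference, the step presumably amounts to a recursive diff of the two trees, along these lines (a sketch, not the actual script contents):

diff -ruN $PROJDIR/hadoop-2.0.0-cdh4.1.1.orig $PROJDIR/hadoop-2.0.0-cdh4.1.1 > extendable_client.patch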

 

Useful Test Commands

Put File In Hadoop

Line: 187 to 204
  Temporarily rename the file with mv along with its associated meta file (ending in .meta) to "corrupt" it.
Changed:
<
NOTE sometimes you have to restart the namenode to get hadoop to notice the block is "fixed" after replacing the moved block.
>
NOTE sometimes you have to restart the datanode to get hadoop to notice the block is "fixed" after replacing the moved block.
 
Changed:
<
On cabinet-10-10-3.t2.ucsd.edu:
>
On cabinet-10-10-8.t2.ucsd.edu:
 
Changed:
<
service hadoop-hdfs-namenode restart
>
service hadoop-hdfs-datanode restart
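After the restart, re-running the fsck from Useful Test Commands should report the file HEALTHY again:

hdfs fsck /store/user/matevz/xxx_for_jni_test -files -blocks -locations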
  -- JeffreyDost - 2013/07/25

Revision 11 - 2013/12/08 - Main.JeffreyDost

Line: 1 to 1
 
META TOPICPARENT name="WebHome"

HdfsXrootd Development Twiki

Line: 68 to 68
 

Build hdfs-xrootd-fallback

cd $PROJDIR/hdfs-xrootd-fallback
Changed:
<
make -f Makefile_rpm
>
make
 

This will generate the following files:

Line: 90 to 90
  Assumptions - a hadoop cluster is available and a hadoop client node is properly configured to access it. The following instructions are only required on the client node.
Changed:
<

Copy Binaries

>

Install Binaries

make install prefix=/usr sysconfdir=/etc
 
Changed:
<
Replace:
/usr/lib/hadoop-hdfs/hadoop-hdfs-2.0.0-cdh4.1.1.jar
Copy regenfs.jar to:
/usr/lib/hadoop/client/
Copy libXrdBlockFetcher.so to:
/usr/lib/
>
This will install the following:
/usr/lib/hadoop/client/hdfs-xrootd-fallback-1.0.0.jar
/usr/lib64/libXrdBlockFetcher.so
/usr/lib64/libXrdBlockFetcher.so.1
/usr/lib64/libXrdBlockFetcher.so.1.0.0

Also a template config file will be installed in:

/etc/hadoop/conf.osg/xfbfs-site.xml
 

Configuration

Append the following to /etc/hadoop/conf/core-site.xml:

  <property>
Changed:
<
    <name>fs.regfs.impl</name>
    <value>regenfs.RegenerativeFileSystem</value>
>
    <name>fs.xfbfs.impl</name>
    <value>org.xrootd.hdfs.fallback.XrdFallBackFileSystem</value>
 
Added:
>
Copy the /etc/hadoop/conf.osg/xfbfs-site.xml template file to /etc/hadoop/conf and modify the relevant values.
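For example:

cp /etc/hadoop/conf.osg/xfbfs-site.xml /etc/hadoop/conf/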
 

Startup Fuse in Debug Mode

Changed:
<
hadoop-fuse-dfs -oserver=regfs://cabinet-10-10-3.t2.ucsd.edu,port=8020,allow_other,rw,rdbuffer=4096 -d /mnt/hadoop
>
hadoop-fuse-dfs -oserver=xfbfs://cabinet-10-10-3.t2.ucsd.edu,port=8020,allow_other,rw,rdbuffer=4096 -d /mnt/hadoop
 

Useful Test Commands

Revision 10 - 2013/12/07 - Main.JeffreyDost

Line: 1 to 1
 
META TOPICPARENT name="WebHome"

HdfsXrootd Development Twiki

Line: 12 to 12
 

Obtain Dependencies

Changed:
<
Follow steps 1 - 4 at Install the Yum Repositories required by OSG to obtain OSG yum repo.
>
Follow steps at Install the Yum Repositories required by OSG to obtain OSG yum repo >= version 3.2.
  Disable osg-release repo by setting enabled=0 in
/etc/yum.repos.d/osg-el6.repo
Changed:
<
Enable osg-upcoming-development repo by setting enabled=1 in
/etc/yum.repos.d/osg-el6-upcoming-development.repo
>
Enable osg-testing repo by setting enabled=1 in
/etc/yum.repos.d/osg-el6-testing.repo
  Install dependencies:
Line: 28 to 28
 yum install hadoop-hdfs
Added:
>
NOTE RHEL >= 6.5 may break some OSG java 7 dependencies. If this happens, it can be fixed by running:
yum install osg-java7-compat
 NOTE if you plan to rebuild hadoop with the patch as in the optional sections below, you don't need to install hadoop-hdfs.

Checkout hdfs-xrootd-fallback repo:

Revision 9 - 2013/09/15 - Main.JeffreyDost

Line: 1 to 1
 
META TOPICPARENT name="WebHome"

HdfsXrootd Development Twiki

Line: 8 to 8
 

Build Instructions

Changed:
<
This section will use $PROJDIR as shorthand to denote a common area for the project source code to be located.
>
This section will use $PROJDIR as shorthand to denote a common area for the project source code to be located. Some optional instructions below are intended only if you plan to modify the Hadoop 2.0 extendable_client.patch and are separated into their own sections.
 

Obtain Dependencies

Line: 17 to 17
 Disable osg-release repo by setting enabled=0 in
/etc/yum.repos.d/osg-el6.repo
Changed:
<
Enable osg-development repo by setting enabled=1 in
/etc/yum.repos.d/osg-el6-development.repo
>
Enable osg-upcoming-development repo by setting enabled=1 in
/etc/yum.repos.d/osg-el6-upcoming-development.repo

Install dependencies:

yum install java-1.7.0-openjdk-devel
yum install pcre-devel
yum install xrootd-client-devel
yum install hadoop-hdfs

NOTE if you plan to rebuild hadoop with the patch as in the optional sections below, you don't need to install hadoop-hdfs.

Checkout hdfs-xrootd-fallback repo:

svn checkout https://svn.gled.org/var/trunk/hdfs-xrootd-fallback

Make sure JAVA_HOME is set:

export JAVA_HOME=/usr/lib/jvm/java-1.7.0-openjdk.x86_64

Obtain Dependencies for HDFS patch (Optional)

  Install dependencies:
yum install maven3
yum install protobuf-compiler
Deleted:
<
yum install xrootd-client-devel
 

Add mvn to path:

Line: 37 to 56
 tar -xzf hadoop-2.0.0-cdh4.1.1.tar.gz
Deleted:
<
Checkout hdfs-xrootd repo:
svn checkout https://svn.gled.org/var/trunk/hdfs-xrootd
 Patch hadoop to allow DFSInputStream inheritance:
cd $PROJDIR/hadoop-2.0.0-cdh4.1.1
patch -p1 < $PROJDIR/hdfs-xrootd/extendable_client.patch
Changed:
<
NOTE until OSG moves to OpenJDK 1.7 we should build with Oracle Java 1.6. To obtain it, first remove Java if installed, then install Oracle from the OSG repo:
>

Build hdfs-xrootd-fallback

 
Changed:
<
yum remove java-1.7.0-openjdk
yum remove java-1.6.0-openjdk
yum install java-1.6.0-sun-compat
>
cd $PROJDIR/hdfs-xrootd-fallback
make -f Makefile_rpm
 
Changed:
<

Build hdfs-xrootd

>
This will generate the following files:
  • hdfs-xrootd-fallback-1.0.0.jar
  • libXrdBlockFetcher.so.1.0.0

Build hdfs-xrootd-fallback including patched hadoop (Optional)

 
Changed:
<
cd $PROJDIR/hdfs-xrootd
>
cd $PROJDIR/hdfs-xrootd-fallback
 make
Deleted:
<
To alternatively build with Oracle Java 1.6:
make JJ=/usr/lib/jvm/java-1.6.0-sun-1.6.0.45
 This will generate the following files:
  • hadoop-hdfs-2.0.0-cdh4.1.1.jar
Changed:
<
  • regenfs.jar
>
  • hdfs-xrootd-fallback.jar
 
  • libXrdBlockFetcher.so

Deployment for Testing

Revision 8 - 2013/08/14 - Main.JeffreyDost

Line: 1 to 1
 
META TOPICPARENT name="WebHome"

HdfsXrootd Development Twiki

Line: 163 to 163
  NOTE sometimes you have to restart the namenode to get hadoop to notice the block is "fixed" after replacing the moved block.
Changed:
<
On cabinet-10-10-4.t2.ucsd.edu:
>
On cabinet-10-10-3.t2.ucsd.edu:
 
service hadoop-hdfs-namenode restart

Revision 7 - 2013/08/01 - Main.JeffreyDost

Line: 1 to 1
 
META TOPICPARENT name="WebHome"

HdfsXrootd Development Twiki

Line: 161 to 161
  Temporarily rename the file with mv along with its associated meta file (ending in .meta) to "corrupt" it.
Added:
>
NOTE sometimes you have to restart the namenode to get hadoop to notice the block is "fixed" after replacing the moved block.

On cabinet-10-10-4.t2.ucsd.edu:

service hadoop-hdfs-namenode restart
 -- JeffreyDost - 2013/07/25

Revision 6 - 2013/07/31 - Main.JeffreyDost

Line: 1 to 1
 
META TOPICPARENT name="WebHome"

HdfsXrootd Development Twiki

Line: 146 to 146
Added:
>

Corrupt A Block

Run fsck as above to find the node and filename:
1. BP-902182059-169.228.130.162-1360642643875:blk_-1470524685933485700_1033 len=51200 repl=1 [169.228.130.161:50010]

The block filename in this example is: blk_-1470524685933485700

Look in /etc/hadoop/conf/hdfs-site.xml to find the value of dfs.data.dir

Locate block:

find dfs.data.dir -name blk_-1470524685933485700

Temporarily rename the file with mv along with its associated meta file (ending in .meta) to "corrupt" it.
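For example (a sketch; the .hidden suffix is arbitrary, and the meta filename carries the generation stamp from the fsck output above):

mv blk_-1470524685933485700 blk_-1470524685933485700.hidden
mv blk_-1470524685933485700_1033.meta blk_-1470524685933485700_1033.meta.hidden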

 -- JeffreyDost - 2013/07/25

Revision 5 - 2013/07/31 - Main.JeffreyDost

Line: 1 to 1
 
META TOPICPARENT name="WebHome"

HdfsXrootd Development Twiki

Line: 109 to 109
 
hdfs fsck /store/user/matevz/xxx_for_jni_test -files -blocks -locations

sample output:

 
Connecting to namenode via http://cabinet-10-10-3.t2.ucsd.edu:50070
FSCK started by jdost (auth:SIMPLE) from /169.228.131.30 for path /store/user/matevz/xxx_for_jni_test at Tue Jul 30 19:11:13 PDT 2013
Line: 137 to 144
  The filesystem under path '/store/user/matevz/xxx_for_jni_test' is HEALTHY
 -- JeffreyDost - 2013/07/25

Revision 4 - 2013/07/31 - Main.JeffreyDost

Line: 1 to 1
 
META TOPICPARENT name="WebHome"

HdfsXrootd Development Twiki

Line: 94 to 94
 
hadoop-fuse-dfs -oserver=regfs://cabinet-10-10-3.t2.ucsd.edu,port=8020,allow_other,rw,rdbuffer=4096 -d /mnt/hadoop
Added:
>

Useful Test Commands

Put File In Hadoop

This example shows how to specify blocksize:

hadoop fs -Ddfs.blocksize=51200 -put xxx_for_jni_test /store/user/matevz
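To confirm the requested block size took effect (sketch; %o prints the block size in bytes):

hadoop fs -stat %o /store/user/matevz/xxx_for_jni_test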

Use Hadoop fsck

To obtain useful info about block locations:

hdfs fsck /store/user/matevz/xxx_for_jni_test -files -blocks -locations

sample output:

Connecting to namenode via http://cabinet-10-10-3.t2.ucsd.edu:50070
FSCK started by jdost (auth:SIMPLE) from /169.228.131.30 for path /store/user/matevz/xxx_for_jni_test at Tue Jul 30 19:11:13 PDT 2013
/store/user/matevz/xxx_for_jni_test 117499 bytes, 3 block(s):  OK
0. BP-902182059-169.228.130.162-1360642643875:blk_4838916777429210792_1032 len=51200 repl=1 [169.228.130.161:50010]
1. BP-902182059-169.228.130.162-1360642643875:blk_-1470524685933485700_1033 len=51200 repl=1 [169.228.130.161:50010]
2. BP-902182059-169.228.130.162-1360642643875:blk_-2496259640795403674_1034 len=15099 repl=1 [169.228.130.161:50010]

Status: HEALTHY
 Total size:   117499 B
 Total dirs:   0
 Total files:   1
 Total blocks (validated):   3 (avg. block size 39166 B)
 Minimally replicated blocks:   3 (100.0 %)
 Over-replicated blocks:   0 (0.0 %)
 Under-replicated blocks:   0 (0.0 %)
 Mis-replicated blocks:      0 (0.0 %)
 Default replication factor:   1
 Average block replication:   1.0
 Corrupt blocks:      0
 Missing replicas:      0 (0.0 %)
 Number of data-nodes:      1
 Number of racks:      1
FSCK ended at Tue Jul 30 19:11:13 PDT 2013 in 18 milliseconds


The filesystem under path '/store/user/matevz/xxx_for_jni_test' is HEALTHY
 -- JeffreyDost - 2013/07/25

Revision 3 - 2013/07/29 - Main.JeffreyDost

Line: 1 to 1
 
META TOPICPARENT name="WebHome"

HdfsXrootd Development Twiki

Line: 8 to 8
 

Build Instructions

Added:
>
This section will use $PROJDIR as shorthand to denote a common area for the project source code to be located.
 

Obtain Dependencies

Follow steps 1 - 4 at Install the Yum Repositories required by OSG to obtain OSG yum repo.

Line: 25 to 27
 yum install xrootd-client-devel
Added:
>
Add mvn to path:
export PATH=$PATH:/usr/share/apache-maven-3.0.4/bin

Obtain cloudera hadoop tarball:

cd $PROJDIR
wget http://archive.cloudera.com/cdh4/cdh/4/hadoop-2.0.0-cdh4.1.1.tar.gz
tar -xzf hadoop-2.0.0-cdh4.1.1.tar.gz

Checkout hdfs-xrootd repo:

svn checkout https://svn.gled.org/var/trunk/hdfs-xrootd

Patch hadoop to allow DFSInputStream inheritance:

cd $PROJDIR/hadoop-2.0.0-cdh4.1.1
patch -p1 < $PROJDIR/hdfs-xrootd/extendable_client.patch

NOTE until OSG moves to OpenJDK 1.7 we should build with Oracle Java 1.6. To obtain it, first remove Java if installed, then install Oracle from the OSG repo:

yum remove java-1.7.0-openjdk
yum remove java-1.6.0-openjdk
yum install java-1.6.0-sun-compat

Build hdfs-xrootd

cd $PROJDIR/hdfs-xrootd
make

To alternatively build with Oracle Java 1.6:

make JJ=/usr/lib/jvm/java-1.6.0-sun-1.6.0.45

This will generate the following files:

  • hadoop-hdfs-2.0.0-cdh4.1.1.jar
  • regenfs.jar
  • libXrdBlockFetcher.so

Deployment for Testing

Assumptions - a hadoop cluster is available and a hadoop client node is properly configured to access it. The following instructions are only required on the client node.

Copy Binaries

Replace:

/usr/lib/hadoop-hdfs/hadoop-hdfs-2.0.0-cdh4.1.1.jar
Copy regenfs.jar to:
/usr/lib/hadoop/client/
Copy libXrdBlockFetcher.so to:
/usr/lib/

Configuration

Append the following to /etc/hadoop/conf/core-site.xml:

  <property>
    <name>fs.regfs.impl</name>
    <value>regenfs.RegenerativeFileSystem</value>
  </property>

Startup Fuse in Debug Mode

hadoop-fuse-dfs -oserver=regfs://cabinet-10-10-3.t2.ucsd.edu,port=8020,allow_other,rw,rdbuffer=4096 -d /mnt/hadoop
 -- JeffreyDost - 2013/07/25

Revision 2 - 2013/07/29 - Main.JeffreyDost

Line: 1 to 1
 
META TOPICPARENT name="WebHome"

HdfsXrootd Development Twiki

Added:
>

Contents

Build Instructions

Obtain Dependencies

Follow steps 1 - 4 at Install the Yum Repositories required by OSG to obtain OSG yum repo.

Disable osg-release repo by setting enabled=0 in

/etc/yum.repos.d/osg-el6.repo

Enable osg-development repo by setting enabled=1 in

/etc/yum.repos.d/osg-el6-development.repo

Install dependencies:

yum install maven3
yum install protobuf-compiler
yum install xrootd-client-devel
 -- JeffreyDost - 2013/07/25

Revision 1 - 2013/07/25 - Main.JeffreyDost

Line: 1 to 1
Added:
>
META TOPICPARENT name="WebHome"

HdfsXrootd Development Twiki

-- JeffreyDost - 2013/07/25

 