Distributed Computing with Linux and Hadoop

【Installing Java】
■ Install the JDK: Java SE Development Kit (JDK, v1.6 or later recommended)
http://www.oracle.com/technetwork/java/javase/downloads/index.html
http://www.oracle.com/technetwork/java/javase/downloads/jdk-6u32-downloads-1594644.html

[root@colinux hadoop]# ls -l
total 67060
-rw-rw-r-- 1 root root 68593311 2012-05-12 08:03 jdk-6u32-linux-i586-rpm.bin
[root@colinux hadoop]# chmod 755 jdk-6u32-linux-i586-rpm.bin

[root@colinux hadoop]# ./jdk-6u32-linux-i586-rpm.bin
Unpacking…
Checksumming…
Extracting…
UnZipSFX 5.50 of 17 February 2002, by Info-ZIP (Zip-Bugs@lists.wku.edu).
inflating: jdk-6u32-linux-i586.rpm
inflating: sun-javadb-common-10.6.2-1.1.i386.rpm
inflating: sun-javadb-core-10.6.2-1.1.i386.rpm
inflating: sun-javadb-client-10.6.2-1.1.i386.rpm
inflating: sun-javadb-demo-10.6.2-1.1.i386.rpm
inflating: sun-javadb-docs-10.6.2-1.1.i386.rpm
inflating: sun-javadb-javadoc-10.6.2-1.1.i386.rpm
Preparing... ########################################### [100%]
1:jdk ########################################### [100%]
Unpacking JAR files…
rt.jar…
jsse.jar…
charsets.jar…
tools.jar…
localedata.jar…
plugin.jar…
javaws.jar…
deploy.jar…
Installing JavaDB
Preparing... ########################################### [100%]
1:sun-javadb-common ########################################### [ 17%]
2:sun-javadb-core ########################################### [ 33%]
3:sun-javadb-client ########################################### [ 50%]
4:sun-javadb-demo ########################################### [ 67%]
5:sun-javadb-docs ########################################### [ 83%]
6:sun-javadb-javadoc ########################################### [100%]

Java(TM) SE Development Kit 6 successfully installed.

Product Registration is FREE and includes many benefits:
* Notification of new versions, patches, and updates
* Special offers on Oracle products, services and training
* Access to early releases and documentation

[root@colinux hadoop]# java -version
java version "1.6.0_32"
Java(TM) SE Runtime Environment (build 1.6.0_32-b05)
Java HotSpot(TM) Client VM (build 20.7-b02, mixed mode, sharing)
[root@colinux hadoop]#
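The `java -version` check above is done by eye. As a minimal sketch, the same "v1.6 or later" rule can be scripted; the version string is hard-coded here for illustration (in a real script you would capture it from `java -version 2>&1`):

```shell
#!/bin/sh
# Sketch: check a Java version string against the "1.6 or later" minimum.
# The string is hard-coded for illustration; in practice capture it with:
#   ver=$(java -version 2>&1 | awk -F'"' '/version/ {print $2}')
ver="1.6.0_32"
major=$(echo "$ver" | cut -d. -f1)
minor=$(echo "$ver" | cut -d. -f2)
if [ "$major" -gt 1 ] || { [ "$major" -eq 1 ] && [ "$minor" -ge 6 ]; }; then
    echo "OK: Java $ver meets the 1.6 minimum"
else
    echo "NG: Java $ver is too old"
fi
```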

【Installing Hadoop】
※ As of May 2012
http://hadoop.apache.org/common/releases.html#Download
http://ftp.kddilabs.jp/infosystems/apache/hadoop/common/
http://ftp.kddilabs.jp/infosystems/apache/hadoop/common/stable/

1.0.X – current stable version, 1.0 release
1.1.X – current beta version, 1.1 release
0.23.X – current alpha version, MR2
0.22.X – does not include security
0.20.203.X – legacy stable version
0.20.X – legacy version

Release notes
http://ftp.kddilabs.jp/infosystems/apache/hadoop/common/stable/RELEASE_NOTES_HADOOP-1.0.1.html

[root@colinux hadoop]# wget http://ftp.kddilabs.jp/infosystems/apache/hadoop/common/stable/hadoop-1.0.1.tar.gz
--2012-05-12 12:11:03-- http://ftp.kddilabs.jp/infosystems/apache/hadoop/common/stable/hadoop-1.0.1.tar.gz
Resolving ftp.kddilabs.jp... 192.26.91.193, 2001:200:601:10:206:5bff:fef0:466c
Connecting to ftp.kddilabs.jp|192.26.91.193|:80... connected.
HTTP request sent, awaiting response... 200 OK
Length: 60811130 (58M) [application/x-gzip]
Saving to: `hadoop-1.0.1.tar.gz'

100%[=============================================================================================================================>] 60,811,130 3.24M/s in 21s

2012-05-12 12:11:24 (2.72 MB/s) - `hadoop-1.0.1.tar.gz' saved [60811130/60811130]

[root@colinux hadoop]#

[root@colinux hadoop]# mv hadoop-1.0.1.tar.gz /usr/local/
[root@colinux local]# pwd
/usr/local
[root@colinux local]# tar zxf hadoop-1.0.1.tar.gz
[root@colinux local]#

[root@colinux local]# ls -l
total 84
drwxr-xr-x 2 root root 4096 2011-12-10 10:12 bin
drwxr-xr-x 2 root root 4096 2011-12-10 10:12 etc
drwxr-xr-x 2 root root 4096 2007-04-17 21:46 games
drwxr-xr-x 14 root root 4096 2012-02-14 17:18 hadoop-1.0.1
drwxr-xr-x 3 root root 4096 2011-11-05 09:00 include
drwxr-xr-x 2 root root 4096 2007-04-17 21:46 lib
drwxr-xr-x 2 root root 4096 2007-04-17 21:46 libexec
lrwxrwxrwx 1 root root 38 2009-12-26 01:40 mysql -> mysql-5.5.0-m2-linux-i686-icc-glibc23/
drwxr-xr-x 14 mysql mysql 4096 2009-12-22 00:23 mysql-5.1.41-linux-i686-icc-glibc23
drwxr-xr-x 14 mysql mysql 4096 2009-12-26 01:37 mysql-5.5.0-m2-linux-i686-icc-glibc23
drwxr-xr-x 2 root root 4096 2007-04-17 21:46 sbin
drwxr-xr-x 6 root root 4096 2011-12-10 10:12 share
drwxr-xr-x 2 root root 4096 2011-01-09 17:14 src
drwxrwxrwt 2 root root 40 2012-05-12 06:49 tmp
[root@colinux local]# ln -s /usr/local/hadoop-1.0.1 /usr/local/hadoop
[root@colinux local]# ls -l
total 84
drwxr-xr-x 2 root root 4096 2011-12-10 10:12 bin
drwxr-xr-x 2 root root 4096 2011-12-10 10:12 etc
drwxr-xr-x 2 root root 4096 2007-04-17 21:46 games
lrwxrwxrwx 1 root root 23 2012-05-12 12:17 hadoop -> /usr/local/hadoop-1.0.1
drwxr-xr-x 14 root root 4096 2012-02-14 17:18 hadoop-1.0.1
drwxr-xr-x 3 root root 4096 2011-11-05 09:00 include
drwxr-xr-x 2 root root 4096 2007-04-17 21:46 lib
drwxr-xr-x 2 root root 4096 2007-04-17 21:46 libexec
lrwxrwxrwx 1 root root 38 2009-12-26 01:40 mysql -> mysql-5.5.0-m2-linux-i686-icc-glibc23/
drwxr-xr-x 14 mysql mysql 4096 2009-12-22 00:23 mysql-5.1.41-linux-i686-icc-glibc23
drwxr-xr-x 14 mysql mysql 4096 2009-12-26 01:37 mysql-5.5.0-m2-linux-i686-icc-glibc23
drwxr-xr-x 2 root root 4096 2007-04-17 21:46 sbin
drwxr-xr-x 6 root root 4096 2011-12-10 10:12 share
drwxr-xr-x 2 root root 4096 2011-01-09 17:14 src
drwxrwxrwt 2 root root 40 2012-05-12 06:49 tmp
[root@colinux local]#

【Hadoop Service Account Setup (Passwordless Key Authentication)】

[root@colinux local]# /usr/sbin/useradd hadoop
[root@colinux local]# chown -R hadoop:hadoop /usr/local/hadoop-1.0.1
[root@colinux local]#
[root@colinux local]# passwd hadoop
Changing password for user hadoop.
New UNIX password:
Retype new UNIX password:
passwd: all authentication tokens updated successfully.
[root@colinux local]#
[root@colinux local]# id hadoop
uid=503(hadoop) gid=503(hadoop) groups=503(hadoop)
[root@colinux local]#

[root@colinux local]# su – hadoop
[hadoop@colinux ~]$ ssh-keygen -t dsa -P '' -f ~/.ssh/id_dsa
Generating public/private dsa key pair.
Created directory '/home/hadoop/.ssh'.
Your identification has been saved in /home/hadoop/.ssh/id_dsa.
Your public key has been saved in /home/hadoop/.ssh/id_dsa.pub.
The key fingerprint is:
d0:5c:57:22:9b:8e:38:97:e4:47:0f:ac:08:13:4c:ae hadoop@colinux
[hadoop@colinux ~]$ cat ~/.ssh/id_dsa.pub >> ~/.ssh/authorized_keys
[hadoop@colinux ~]$ chmod 600 ~/.ssh/authorized_keys
[hadoop@colinux ~]$

[hadoop@colinux ~]$ ssh localhost
The authenticity of host 'localhost (127.0.0.1)' can't be established.
RSA key fingerprint is a2:b7:25:e3:78:61:15:2a:59:ed:fb:9f:1c:e7:94:db.
Are you sure you want to continue connecting (yes/no)? yes
Warning: Permanently added 'localhost' (RSA) to the list of known hosts.
[hadoop@colinux ~]$ exit
[hadoop@colinux ~]$ ssh localhost
Last login: Sat May 12 12:31:20 2012 from localhost.localdomain
[hadoop@colinux ~]$
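sshd silently refuses key authentication when `~/.ssh` or `authorized_keys` is group- or world-writable, which is why the `chmod 600` above matters. A small sketch of the expected permission bits, run against a scratch directory so it is safe to try anywhere:

```shell
#!/bin/sh
# Sketch: the permission bits sshd expects for passwordless login.
# A scratch directory stands in for the real ~/.ssh.
dir=$(mktemp -d)
mkdir -p "$dir/.ssh"
touch "$dir/.ssh/authorized_keys"
chmod 700 "$dir/.ssh"                  # directory: owner-only
chmod 600 "$dir/.ssh/authorized_keys"  # key file: owner read/write only
perm=$(ls -ld "$dir/.ssh/authorized_keys" | cut -c1-10)
echo "$perm"    # -rw-------
rm -rf "$dir"
```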

[hadoop@colinux ~]$ ls -l /usr/java/
total 4
lrwxrwxrwx 1 root root 16 2012-05-12 08:11 default -> /usr/java/latest
drwxr-xr-x 7 root root 4096 2012-05-12 08:11 jdk1.6.0_32
lrwxrwxrwx 1 root root 21 2012-05-12 08:11 latest -> /usr/java/jdk1.6.0_32
[hadoop@colinux ~]$

【Editing the Hadoop Configuration Files】

[hadoop@colinux ~]$ cd /usr/local/hadoop-1.0.1/conf/

[hadoop@colinux conf]$ vi hadoop-env.sh
# Set Hadoop-specific environment variables here.
—————————————————-
# The java implementation to use. Required.
# export JAVA_HOME=/usr/lib/j2sdk1.5-sun
export JAVA_HOME=/usr/java/default
# Extra Java CLASSPATH elements. Optional.
# export HADOOP_CLASSPATH=
—————————————————-
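hadoop-env.sh sets JAVA_HOME only for the Hadoop daemons. For interactive work as the hadoop user it is convenient to export the same paths in the login profile as well; a sketch (the mktemp file stands in for ~/.bash_profile, and the paths assume the layout used in this walkthrough):

```shell
#!/bin/sh
# Sketch: persist JAVA_HOME/HADOOP_HOME for the hadoop user's shell.
# mktemp stands in for ~/.bash_profile so the sketch runs anywhere.
PROFILE=$(mktemp)
cat >> "$PROFILE" <<'EOF'
export JAVA_HOME=/usr/java/default
export HADOOP_HOME=/usr/local/hadoop
export PATH=$PATH:$HADOOP_HOME/bin
EOF
grep -c '^export' "$PROFILE"    # 3
```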
[hadoop@colinux conf]$ vi core-site.xml
[hadoop@colinux conf]$ cat core-site.xml

<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<!-- Put site-specific property overrides in this file. -->
<configuration>
    <property>
        <name>fs.default.name</name>
        <value>hdfs://localhost:9000</value>
    </property>
</configuration>

[hadoop@colinux conf]$
[hadoop@colinux conf]$ vi hdfs-site.xml
[hadoop@colinux conf]$ cat hdfs-site.xml

<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<!-- Put site-specific property overrides in this file. -->
<configuration>
    <property>
        <name>dfs.replication</name>
        <value>1</value>
    </property>
</configuration>

[hadoop@colinux conf]$
[hadoop@colinux conf]$ vi mapred-site.xml
[hadoop@colinux conf]$ cat mapred-site.xml

<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<!-- Put site-specific property overrides in this file. -->
<configuration>
     <property>
         <name>mapred.job.tracker</name>
         <value>localhost:9001</value>
     </property>
</configuration>

[hadoop@colinux conf]$

【Initial Setup and Starting the Services】

[hadoop@colinux conf]$ /usr/local/hadoop/bin/hadoop namenode -format
12/05/12 13:05:05 INFO namenode.NameNode: STARTUP_MSG:
/************************************************************
STARTUP_MSG: Starting NameNode
STARTUP_MSG: host = colinux/127.0.0.1
STARTUP_MSG: args = [-format]
STARTUP_MSG: version = 1.0.1
STARTUP_MSG: build = https://svn.apache.org/repos/asf/hadoop/common/branches/branch-1.0 -r 1243785; compiled by 'hortonfo' on Tue Feb 14 08:15:38 UTC 2012
************************************************************/
12/05/12 13:05:06 INFO util.GSet: VM type = 32-bit
12/05/12 13:05:06 INFO util.GSet: 2% max memory = 19.33375 MB
12/05/12 13:05:06 INFO util.GSet: capacity = 2^22 = 4194304 entries
12/05/12 13:05:06 INFO util.GSet: recommended=4194304, actual=4194304
12/05/12 13:05:08 INFO namenode.FSNamesystem: fsOwner=hadoop
12/05/12 13:05:08 INFO namenode.FSNamesystem: supergroup=supergroup
12/05/12 13:05:08 INFO namenode.FSNamesystem: isPermissionEnabled=true
12/05/12 13:05:08 INFO namenode.FSNamesystem: dfs.block.invalidate.limit=100
12/05/12 13:05:08 INFO namenode.FSNamesystem: isAccessTokenEnabled=false accessKeyUpdateInterval=0 min(s), accessTokenLifetime=0 min(s)
12/05/12 13:05:08 INFO namenode.NameNode: Caching file names occuring more than 10 times
12/05/12 13:05:09 INFO common.Storage: Image file of size 112 saved in 0 seconds.
12/05/12 13:05:10 INFO common.Storage: Storage directory /tmp/hadoop-hadoop/dfs/name has been successfully formatted.
12/05/12 13:05:10 INFO namenode.NameNode: SHUTDOWN_MSG:
/************************************************************
SHUTDOWN_MSG: Shutting down NameNode at colinux/127.0.0.1
************************************************************/
[hadoop@colinux conf]$
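Note in the log above that the formatted NameNode storage landed in /tmp/hadoop-hadoop/dfs/name, because no storage directory was configured; many systems clear /tmp at reboot, which would destroy the filesystem image. For anything beyond a throwaway test, a common override in hdfs-site.xml points the metadata and block storage somewhere persistent (the paths below are illustrative, not from this walkthrough):

```xml
<!-- hdfs-site.xml: keep HDFS data out of /tmp (paths are illustrative) -->
<property>
    <name>dfs.name.dir</name>
    <value>/var/lib/hadoop/dfs/name</value>
</property>
<property>
    <name>dfs.data.dir</name>
    <value>/var/lib/hadoop/dfs/data</value>
</property>
```

After changing these, the NameNode must be re-formatted before the new directories can be used.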

[hadoop@colinux conf]$ /usr/local/hadoop/bin/start-all.sh
starting namenode, logging to /usr/local/hadoop-1.0.1/libexec/../logs/hadoop-hadoop-namenode-colinux.out
localhost: starting datanode, logging to /usr/local/hadoop-1.0.1/libexec/../logs/hadoop-hadoop-datanode-colinux.out
localhost: starting secondarynamenode, logging to /usr/local/hadoop-1.0.1/libexec/../logs/hadoop-hadoop-secondarynamenode-colinux.out
starting jobtracker, logging to /usr/local/hadoop-1.0.1/libexec/../logs/hadoop-hadoop-jobtracker-colinux.out
localhost: starting tasktracker, logging to /usr/local/hadoop-1.0.1/libexec/../logs/hadoop-hadoop-tasktracker-colinux.out
[hadoop@colinux conf]$

[hadoop@colinux conf]$ jps
4689 Jps
4313 SecondaryNameNode
4062 NameNode
4186 DataNode
4561 TaskTracker
4399 JobTracker
[hadoop@colinux conf]$
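Scanning the `jps` output by eye works once, but the check can be scripted. A sketch, with the sample output hard-coded from the transcript above (in practice you would use `out=$(jps)`):

```shell
#!/bin/sh
# Sketch: verify all five pseudo-distributed daemons are present.
# Sample jps output hard-coded from the transcript; really: out=$(jps)
out="4689 Jps
4313 SecondaryNameNode
4062 NameNode
4186 DataNode
4561 TaskTracker
4399 JobTracker"
missing=0
for d in NameNode DataNode SecondaryNameNode JobTracker TaskTracker; do
    echo "$out" | grep -q "$d" || { echo "missing: $d"; missing=1; }
done
[ "$missing" -eq 0 ] && echo "all daemons running"
```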

【Verifying the Basic Setup】
NameNode (open in a browser)
http://localhost:50070/
 e.g. http://192.168.0.2:50070/dfshealth.jsp

JobTracker (open in a browser)
http://localhost:50030/
 e.g. http://192.168.0.2:50030/jobtracker.jsp

【Sample Test】

[hadoop@colinux hadoop]$ ./bin/hadoop jar hadoop-examples-1.0.1.jar pi 4 1000
Number of Maps = 4
Samples per Map = 1000
Wrote input for Map #0
Wrote input for Map #1
Wrote input for Map #2
Wrote input for Map #3
Starting Job
12/05/12 13:24:36 INFO mapred.FileInputFormat: Total input paths to process : 4
12/05/12 13:24:37 INFO mapred.JobClient: Running job: job_201205121308_0001
12/05/12 13:24:38 INFO mapred.JobClient: map 0% reduce 0%
12/05/12 13:25:35 INFO mapred.JobClient: map 25% reduce 0%
12/05/12 13:25:50 INFO mapred.JobClient: map 50% reduce 0%
12/05/12 13:26:35 INFO mapred.JobClient: map 75% reduce 0%
12/05/12 13:26:54 INFO mapred.JobClient: map 75% reduce 16%
12/05/12 13:27:09 INFO mapred.JobClient: map 100% reduce 25%
12/05/12 13:27:16 INFO mapred.JobClient: map 100% reduce 33%
12/05/12 13:27:29 INFO mapred.JobClient: map 100% reduce 100%
12/05/12 13:27:43 INFO mapred.JobClient: Job complete: job_201205121308_0001
12/05/12 13:27:45 INFO mapred.JobClient: Counters: 30
12/05/12 13:27:45 INFO mapred.JobClient: Job Counters
12/05/12 13:27:45 INFO mapred.JobClient: Launched reduce tasks=1
12/05/12 13:27:45 INFO mapred.JobClient: SLOTS_MILLIS_MAPS=245935
12/05/12 13:27:45 INFO mapred.JobClient: Total time spent by all reduces waiting after reserving slots (ms)=0
12/05/12 13:27:45 INFO mapred.JobClient: Total time spent by all maps waiting after reserving slots (ms)=0
12/05/12 13:27:45 INFO mapred.JobClient: Launched map tasks=4
12/05/12 13:27:45 INFO mapred.JobClient: Data-local map tasks=4
12/05/12 13:27:45 INFO mapred.JobClient: SLOTS_MILLIS_REDUCES=110001
12/05/12 13:27:45 INFO mapred.JobClient: File Input Format Counters
12/05/12 13:27:45 INFO mapred.JobClient: Bytes Read=472
12/05/12 13:27:45 INFO mapred.JobClient: File Output Format Counters
12/05/12 13:27:45 INFO mapred.JobClient: Bytes Written=97
12/05/12 13:27:45 INFO mapred.JobClient: FileSystemCounters
12/05/12 13:27:45 INFO mapred.JobClient: FILE_BYTES_READ=94
12/05/12 13:27:45 INFO mapred.JobClient: HDFS_BYTES_READ=964
12/05/12 13:27:45 INFO mapred.JobClient: FILE_BYTES_WRITTEN=108240
12/05/12 13:27:45 INFO mapred.JobClient: HDFS_BYTES_WRITTEN=215
12/05/12 13:27:45 INFO mapred.JobClient: Map-Reduce Framework
12/05/12 13:27:45 INFO mapred.JobClient: Map output materialized bytes=112
12/05/12 13:27:45 INFO mapred.JobClient: Map input records=4
12/05/12 13:27:45 INFO mapred.JobClient: Reduce shuffle bytes=112
12/05/12 13:27:45 INFO mapred.JobClient: Spilled Records=16
12/05/12 13:27:45 INFO mapred.JobClient: Map output bytes=72
12/05/12 13:27:45 INFO mapred.JobClient: Total committed heap usage (bytes)=816316416
12/05/12 13:27:45 INFO mapred.JobClient: CPU time spent (ms)=44840
12/05/12 13:27:45 INFO mapred.JobClient: Map input bytes=96
12/05/12 13:27:45 INFO mapred.JobClient: SPLIT_RAW_BYTES=492
12/05/12 13:27:45 INFO mapred.JobClient: Combine input records=0
12/05/12 13:27:45 INFO mapred.JobClient: Reduce input records=8
12/05/12 13:27:45 INFO mapred.JobClient: Reduce input groups=8
12/05/12 13:27:45 INFO mapred.JobClient: Combine output records=0
12/05/12 13:27:45 INFO mapred.JobClient: Physical memory (bytes) snapshot=451342336
12/05/12 13:27:45 INFO mapred.JobClient: Reduce output records=0
12/05/12 13:27:45 INFO mapred.JobClient: Virtual memory (bytes) snapshot=1873936384
12/05/12 13:27:45 INFO mapred.JobClient: Map output records=8
Job Finished in 189.673 seconds
Estimated value of Pi is 3.14000000000000000000
[hadoop@colinux hadoop]$
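The pi example estimates π by throwing points into the unit square and counting how many land inside the quarter circle; each map task samples its share of points and the reduce task combines the counts. (The bundled Hadoop job actually uses a quasi-Monte Carlo Halton sequence; the awk sketch below uses plain pseudo-random sampling just to show the idea a single map task implements.)

```shell
#!/bin/sh
# Sketch of the sampling one map task performs: throw n points in the unit
# square; the fraction landing inside the quarter circle approximates pi/4.
est=$(awk 'BEGIN {
    srand(42); n = 100000; hits = 0
    for (i = 0; i < n; i++) { x = rand(); y = rand(); if (x*x + y*y <= 1) hits++ }
    printf "%.4f", 4 * hits / n
}')
echo "pi ~ $est"
```

With 100,000 samples the estimate is typically within a few hundredths of π, which is why the Hadoop run above (4 maps × 1000 samples) only reports the coarser 3.14.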

[hadoop@colinux hadoop]$ /usr/local/hadoop/bin/stop-all.sh
stopping jobtracker
localhost: stopping tasktracker
stopping namenode
localhost: stopping datanode
localhost: stopping secondarynamenode
[hadoop@colinux hadoop]$

Admin screen during job execution (1)

Admin screen during job execution (2)

Admin screen during job execution (3): execution time, etc.

Admin screen during job execution (4): other details

Admin screen during job execution (5)

Reference Sites
What Is the Apache Hadoop Project?

Hadoop Primer – Hadoop and High Availability (09-MAY-2012)

Installing and Trying Out Hadoop (06-APR-2011)

Monitoring Hadoop with Nagios (21-APR-2011)

Monitoring Hadoop with Ganglia (22-APR-2011)
