Hadoop - Installation and Configuration

Hadoop is a top-level open-source project of the Apache Foundation. It is a software framework for distributed storage and computation that scales simply by adding or removing machines, and it also provides high availability and data replication.

Machine overview:

  1. Prepare five machines (two master nodes, three worker nodes)

     IP            FQDN                HOSTNAME  Role
     192.168.1.30  test30.example.org  test30    Master node (NameNode)
     192.168.1.31  test31.example.org  test31    Master node (ResourceManager)
     192.168.1.32  test32.example.org  test32    Worker node
     192.168.1.33  test33.example.org  test33    Worker node
     192.168.1.34  test34.example.org  test34    Worker node

  2. OS: Ubuntu 18.04

  3. Resources:

    • CPU: 4 cores
    • RAM: 8 GB
    • Disk: 50 GB
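
Before installing, every node should be able to resolve the others by FQDN. A minimal sketch of the `/etc/hosts` entries implied by the table above (written to a local example file here for illustration, not to the real `/etc/hosts`):

```shell
# Sketch: host entries derived from the machine table above.
# On a real cluster these lines would go into /etc/hosts on every node.
cat > hosts.example <<'EOF'
192.168.1.30 test30.example.org test30
192.168.1.31 test31.example.org test31
192.168.1.32 test32.example.org test32
192.168.1.33 test33.example.org test33
192.168.1.34 test34.example.org test34
EOF
cat hosts.example
```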

Setup steps - installation and configuration:

1. Download and install Hadoop (as administrator)

  1. Download

cd
wget http://ftp.tc.edu.tw/pub/Apache/hadoop/common/hadoop-3.2.1/hadoop-3.2.1.tar.gz

Info
If the download link is dead, get the archive from the official Apache Hadoop website.

  2. Extract

tar -tvf hadoop-3.2.1.tar.gz  # inspect the archive contents first
tar -xvf hadoop-3.2.1.tar.gz -C /usr/local

  3. Rename

mv /usr/local/hadoop-3.2.1 /usr/local/hadoop

  4. Change the owner of the directory and its files

chown -R hadoop:hadoop /usr/local/hadoop

2. Set the hadoop user's environment variables (as the hadoop user)

  1. Edit .bashrc

nano ~/.bashrc

https://i.imgur.com/FdzPaih.png

# Set HADOOP_HOME
export HADOOP_HOME=/usr/local/hadoop
# Set HADOOP_MAPRED_HOME
export HADOOP_MAPRED_HOME=${HADOOP_HOME} 
# Add Hadoop bin and sbin directory to PATH
export PATH=$PATH:$HADOOP_HOME/bin:$HADOOP_HOME/sbin
  2. Reload the configuration file

source ~/.bashrc  # or: . ~/.bashrc
  3. Check the environment variables

https://i.imgur.com/mLRFOgM.png
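
The check above can also be scripted. A minimal sketch, re-creating the exports from the `.bashrc` snippet earlier and confirming the Hadoop bin directory landed on PATH:

```shell
# Re-create the exports from .bashrc (assumption: same values as shown above)
export HADOOP_HOME=/usr/local/hadoop
export HADOOP_MAPRED_HOME=${HADOOP_HOME}
export PATH=$PATH:$HADOOP_HOME/bin:$HADOOP_HOME/sbin

# Verify that the Hadoop bin directory is now on PATH
case ":$PATH:" in
  *":$HADOOP_HOME/bin:"*) echo "OK: hadoop bin on PATH" ;;
  *)                      echo "MISSING" ;;
esac
```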


3. Edit the Hadoop runtime environment script (as the hadoop user)

nano /usr/local/hadoop/etc/hadoop/hadoop-env.sh

https://i.imgur.com/SaJCOJS.png

export JAVA_HOME=/usr/lib/jvm/java-8-openjdk-amd64
export HADOOP_HOME=/usr/local/hadoop
export HADOOP_CONF_DIR=/usr/local/hadoop/etc/hadoop

4. Edit Hadoop core-site.xml (as the hadoop user)

nano /usr/local/hadoop/etc/hadoop/core-site.xml 

https://i.imgur.com/ECXvLk6.png

<property>
   <name>hadoop.tmp.dir</name>
   <value>/home/hadoop/data</value>
   <description>Temporary Directory.</description>
</property>
<property>
   <name>fs.defaultFS</name>
   <value>hdfs://test30.example.org</value>
   <description>Use HDFS as file storage engine</description>
</property>
Tip

Hadoop 3.2.0 and later ship a command that checks configuration file syntax:

hadoop conftest

https://i.imgur.com/floyZJQ.png


  • See Hortonworks' recommended configuration values for each component

https://i.imgur.com/83zJ63V.png

Click here for the source on GitHub


5. Edit Hadoop mapred-site.xml (as the hadoop user)

nano /usr/local/hadoop/etc/hadoop/mapred-site.xml

https://i.imgur.com/IXQ4srT.png

<property>
	<name>mapreduce.map.memory.mb</name>
	<value>2048</value>
</property>
<property>
	<name>mapreduce.map.java.opts</name>
	<value>-Xmx1638m</value>
</property>
<property>
	<name>mapreduce.reduce.memory.mb</name>
	<value>4096</value>
</property>
<property>
	<name>mapreduce.reduce.java.opts</name>
	<value>-Xmx3276m</value>
</property>
<property>
	<name>yarn.app.mapreduce.am.resource.mb</name>
	<value>4096</value>
</property>
<property>
	<name>yarn.app.mapreduce.am.command-opts</name>
	<value>-Xmx3276m</value>
</property>
<property>
	<name>mapreduce.task.io.sort.mb</name>
	<value>819</value>
</property>
<property>
       <name>mapreduce.framework.name</name>
       <value>yarn</value>
</property>
<property>
       <name>mapreduce.jobhistory.address</name>
       <value>test32.example.org:10020</value>
</property>
<property>
       <name>mapreduce.jobhistory.webapp.address</name>
       <value>test32.example.org:19888</value>
</property>
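
The `-Xmx` values above track the common guideline (seen in the Hortonworks recommendations referenced earlier) of sizing the JVM heap at roughly 80% of the container's memory, leaving headroom for non-heap usage. A quick sanity check of the numbers:

```shell
# Rule-of-thumb check (assumption: heap = 80% of container memory)
map_container_mb=2048
reduce_container_mb=4096

echo $(( map_container_mb    * 8 / 10 ))  # 1638 -> matches -Xmx1638m
echo $(( reduce_container_mb * 8 / 10 ))  # 3276 -> matches -Xmx3276m
```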

6. Edit Hadoop yarn-site.xml (as the hadoop user)

nano /usr/local/hadoop/etc/hadoop/yarn-site.xml

https://i.imgur.com/5rMard1.png

<property>
   <name>yarn.nodemanager.aux-services</name>
   <value>mapreduce_shuffle</value>
</property>
<property>
   <name>yarn.nodemanager.env-whitelist</name>
   <value>JAVA_HOME,HADOOP_COMMON_HOME,HADOOP_HDFS_HOME,HADOOP_CONF_DIR,CLASSPATH_PREPEND_DISTCACHE,HADOOP_YARN_HOME,HADOOP_MAPRED_HOME</value>
</property>
<property>
   <name>yarn.scheduler.minimum-allocation-mb</name>
   <value>2048</value>
</property>
<property>
   <name>yarn.scheduler.maximum-allocation-mb</name>
   <value>6144</value>
</property>
<property>
   <name>yarn.nodemanager.resource.memory-mb</name>
   <value>6144</value>
</property>
<property>
   <name>yarn.nodemanager.resource.detect-hardware-capabilities</name>
   <value>true</value>
</property>
<property>
   <name>yarn.nodemanager.resource.cpu-vcores</name>
   <value>3</value>
</property>
<property>
   <name>yarn.resourcemanager.hostname</name>
   <value>test31.example.org</value>
</property>	
<property>
   <name>yarn.resourcemanager.scheduler.class</name>
   <value>org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler</value>
</property>
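
One relationship worth checking in these values: no container can be larger than what a single NodeManager offers, so `yarn.scheduler.maximum-allocation-mb` must not exceed `yarn.nodemanager.resource.memory-mb` (both 6144 here). YARN schedulers also typically round each request up to a multiple of an allocation increment; the sketch below assumes the minimum allocation as that increment:

```shell
# Sanity checks on the yarn-site.xml values above
min_alloc_mb=2048
max_alloc_mb=6144
nm_resource_mb=6144

# The largest container must fit on a single NodeManager
[ "$max_alloc_mb" -le "$nm_resource_mb" ] && echo "max allocation fits on a node"

# Example: a 3000 MB request rounded up to the next multiple of the increment
request_mb=3000
granted_mb=$(( (request_mb + min_alloc_mb - 1) / min_alloc_mb * min_alloc_mb ))
echo "$granted_mb"  # 4096
```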

7. Edit Hadoop hdfs-site.xml (as the hadoop user)

nano /usr/local/hadoop/etc/hadoop/hdfs-site.xml

https://i.imgur.com/CfFZs01.png

<property>
   <name>dfs.permissions.superusergroup</name>
   <value>hadoop</value>
   <description>The name of the group of super-users. The value should be a single group name.</description>
</property>

8. Create the Hadoop workers file (as administrator)

nano /usr/local/hadoop/etc/hadoop/workers

https://i.imgur.com/ZCqlC08.png
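
The workers file simply lists one worker hostname per line. Based on the machine table at the top, it would contain the three worker FQDNs (sketched here into a local example file for illustration):

```shell
# Sketch: expected content of /usr/local/hadoop/etc/hadoop/workers,
# based on the worker nodes in the machine table above
cat > workers.example <<'EOF'
test32.example.org
test33.example.org
test34.example.org
EOF
cat workers.example
```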

Success
Once all the steps above are complete, you can move on to the next part: Hadoop - Starting the Services

