1. Overview
- This document describes how to set up a Hadoop HDFS high-availability (HA) cluster.
- The HDFS HA cluster depends on ZooKeeper. For the ZooKeeper cluster setup, see the previous article: 01. ZooKeeper Cluster Setup.
- ZooKeeper ensemble:
  - zookeeper001.local.com:2181
  - zookeeper002.local.com:2181
  - zookeeper003.local.com:2181
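Before moving on, it can help to confirm the ensemble is reachable from the HDFS hosts. A minimal check, assuming nc (netcat) is installed and ZooKeeper's four-letter admin commands are enabled (on ZooKeeper 3.5+ they must be whitelisted via 4lw.commands.whitelist):
# Each node should answer with a "Zookeeper version: ..." banner
$ for h in zookeeper001 zookeeper002 zookeeper003; do echo srvr | nc $h.local.com 2181 | head -1; done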
2. Server Information

| Hostname | IP | CPU | Memory | Disk | OS |
| --- | --- | --- | --- | --- | --- |
| hdfs001.local.com | 172.21.0.21 | 4 cores | 8 GiB | 200 GiB | CentOS 7.x |
| hdfs002.local.com | 172.21.0.22 | 4 cores | 8 GiB | 200 GiB | CentOS 7.x |
| hdfs003.local.com | 172.21.0.23 | 4 cores | 8 GiB | 200 GiB | CentOS 7.x |
- Commands to set the hostnames:
# Run on 172.21.0.21
$ sudo hostnamectl set-hostname hdfs001.local.com
# Run on 172.21.0.22
$ sudo hostnamectl set-hostname hdfs002.local.com
# Run on 172.21.0.23
$ sudo hostnamectl set-hostname hdfs003.local.com
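Confirm the change took effect on each server:
$ hostname
hdfs001.local.com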
3. Create the Deployment Account
# Create hdfsuser on each of hdfs001, hdfs002, and hdfs003
# Create the account
$ sudo useradd -m -s /bin/bash -r hdfsuser
# Set its password
$ sudo passwd hdfsuser
# Grant the account passwordless sudo (visudo validates the syntax before saving)
$ sudo visudo
hdfsuser ALL=(ALL) NOPASSWD: ALL
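A quick sanity check that the new account really has passwordless sudo:
$ su - hdfsuser
$ sudo whoami
root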
4. Adjust Security Settings
- Run the following on each of hdfs001, hdfs002, and hdfs003.
- Disable SELinux:
# Disable for the current session
$ sudo setenforce 0
# Disable permanently (takes effect after a reboot)
$ sudo sed -i 's/SELINUX=enforcing/SELINUX=disabled/g' /etc/selinux/config
- Disable the firewall:
$ sudo systemctl stop firewalld
$ sudo systemctl disable firewalld
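Verify both changes:
$ getenforce
Permissive
$ systemctl is-active firewalld
inactive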
5. Configure Passwordless SSH Between Servers
- Run the following on all three servers (hdfs001, hdfs002, hdfs003).
- Edit /etc/hosts and add hostname mappings for all nodes:
$ sudo vim /etc/hosts
172.21.0.11 zookeeper001.local.com
172.21.0.12 zookeeper002.local.com
172.21.0.13 zookeeper003.local.com
172.21.0.21 hdfs001.local.com
172.21.0.22 hdfs002.local.com
172.21.0.23 hdfs003.local.com
- Generate an SSH key pair:
# Switch to the hdfsuser account
$ su - hdfsuser
# Press Enter at every prompt
$ ssh-keygen
- Copy the public key to every server (including the local one):
$ ssh-copy-id hdfs001.local.com
$ ssh-copy-id hdfs002.local.com
$ ssh-copy-id hdfs003.local.com
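A one-line loop to confirm that hdfsuser can reach every node without a password prompt (run on each server; it should print the three hostnames with no interaction):
$ for h in hdfs001 hdfs002 hdfs003; do ssh $h.local.com hostname; done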
6. Deploy the Java Environment
- Extract jdk1.8.0_151.tar.gz into /usr/java:
$ sudo mkdir -p /usr/java
$ sudo tar -xvf jdk1.8.0_151.tar.gz
$ sudo mv jdk1.8.0_151 /usr/java/
- Create /etc/profile.d/javaenv.sh with the Java environment variables:
$ sudo vim /etc/profile.d/javaenv.sh
#!/bin/bash
#java
export JAVA_HOME=/usr/java/jdk1.8.0_151
export PATH=$JAVA_HOME/bin:$PATH
export CLASSPATH=.:$JAVA_HOME/lib/dt.jar:$JAVA_HOME/lib/tools.jar
- Apply the configuration:
$ source /etc/profile.d/javaenv.sh
- Verify:
$ java -version
java version "1.8.0_151"
Java(TM) SE Runtime Environment (build 1.8.0_151-b12)
Java HotSpot(TM) 64-Bit Server VM (build 25.151-b12, mixed mode)
7. Build the HDFS Cluster
7.1 Configuration
- Extract hadoop-3.3.3.tar.gz into /opt/software:
$ tar -xvf hadoop-3.3.3.tar.gz
# Create the installation directory
$ sudo mkdir /opt/software
$ sudo chown -R hdfsuser:hdfsuser /opt/software
# Move the extracted tree into /opt/software
$ mv hadoop-3.3.3 /opt/software
- Create /etc/profile.d/hadoopenv.sh with the Hadoop environment variables:
$ sudo vim /etc/profile.d/hadoopenv.sh
#!/bin/bash
#hadoop
export HADOOP_PREFIX=/opt/software/hadoop-3.3.3
export HADOOP_HOME=/opt/software/hadoop-3.3.3
export HADOOP_HDFS_HOME=/opt/software/hadoop-3.3.3
export HADOOP_CONF_DIR=/opt/software/hadoop-3.3.3/etc/hadoop
export HADOOP_COMMON_LIB_NATIVE_DIR=/opt/software/hadoop-3.3.3/lib/native
export HADOOP_OPTS="-Djava.library.path=$HADOOP_HOME/lib:$HADOOP_COMMON_LIB_NATIVE_DIR"
export PATH=$HADOOP_HOME/bin:$HADOOP_HOME/sbin:$PATH
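Apply the file and confirm the Hadoop binaries resolve:
$ source /etc/profile.d/hadoopenv.sh
$ hadoop version
Hadoop 3.3.3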
- Edit hadoop-env.sh:
$ cd /opt/software/hadoop-3.3.3/etc/hadoop
$ vim hadoop-env.sh
# Append the following; the Hadoop daemons run as the current system user
export JAVA_HOME=/usr/java/jdk1.8.0_151
export HDFS_NAMENODE_USER="hdfsuser"
export HDFS_DATANODE_USER="hdfsuser"
export HDFS_SECONDARYNAMENODE_USER="hdfsuser"
export YARN_RESOURCEMANAGER_USER="hdfsuser"
export YARN_NODEMANAGER_USER="hdfsuser"
- Edit core-site.xml. fs.defaultFS points at the logical nameservice ns1 rather than a single host; ns1 is backed by the two NameNodes hdfs001.local.com and hdfs002.local.com (defined in hdfs-site.xml below), which keeps the filesystem available if one of them fails.
$ cd /opt/software/hadoop-3.3.3/etc/hadoop
$ vim core-site.xml
<configuration>
  <property>
    <name>fs.defaultFS</name>
    <value>hdfs://ns1/</value>
  </property>
  <property>
    <name>hadoop.proxyuser.hdfsuser.hosts</name>
    <value>*</value>
  </property>
  <property>
    <name>hadoop.proxyuser.hdfsuser.groups</name>
    <value>*</value>
  </property>
  <property>
    <name>hadoop.tmp.dir</name>
    <value>/hdata/hadoop_data/temDir</value>
  </property>
  <property>
    <name>ha.zookeeper.quorum</name>
    <value>zookeeper001.local.com:2181,zookeeper002.local.com:2181,zookeeper003.local.com:2181</value>
  </property>
  <property>
    <name>fs.hdfs.impl</name>
    <value>org.apache.hadoop.hdfs.DistributedFileSystem</value>
    <description>The FileSystem for hdfs: uris.</description>
  </property>
</configuration>
- Edit hdfs-site.xml:
$ cd /opt/software/hadoop-3.3.3/etc/hadoop
$ vim hdfs-site.xml
<configuration>
  <!-- Logical nameservice ns1 and its two NameNodes -->
  <property>
    <name>dfs.nameservices</name>
    <value>ns1</value>
  </property>
  <property>
    <name>dfs.ha.namenodes.ns1</name>
    <value>nn1,nn2</value>
  </property>
  <property>
    <name>dfs.namenode.rpc-address.ns1.nn1</name>
    <value>hdfs001.local.com:9000</value>
  </property>
  <property>
    <name>dfs.namenode.http-address.ns1.nn1</name>
    <value>hdfs001.local.com:50070</value>
  </property>
  <property>
    <name>dfs.namenode.rpc-address.ns1.nn2</name>
    <value>hdfs002.local.com:9000</value>
  </property>
  <property>
    <name>dfs.namenode.http-address.ns1.nn2</name>
    <value>hdfs002.local.com:50070</value>
  </property>
  <!-- Shared edit log on the JournalNodes -->
  <property>
    <name>dfs.namenode.shared.edits.dir</name>
    <value>qjournal://hdfs001.local.com:8485;hdfs002.local.com:8485;hdfs003.local.com:8485/ns1</value>
  </property>
  <property>
    <name>dfs.journalnode.edits.dir</name>
    <value>/hdata/hadoop_data/journaldata</value>
  </property>
  <property>
    <name>dfs.ha.automatic-failover.enabled</name>
    <value>true</value>
  </property>
  <property>
    <name>dfs.client.failover.proxy.provider.ns1</name>
    <value>org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider</value>
  </property>
  <!-- sshfence kills the old active NameNode over SSH; shell(/bin/true) lets failover proceed if that host is unreachable -->
  <property>
    <name>dfs.ha.fencing.methods</name>
    <value>sshfence
shell(/bin/true)</value>
  </property>
  <property>
    <name>dfs.ha.fencing.ssh.private-key-files</name>
    <value>/home/hdfsuser/.ssh/id_rsa</value>
  </property>
  <property>
    <name>dfs.ha.fencing.ssh.connect-timeout</name>
    <value>30000</value>
  </property>
  <property>
    <name>dfs.replication</name>
    <value>2</value>
  </property>
  <property>
    <name>dfs.namenode.name.dir</name>
    <value>/hdata/hadoop_data/namenode</value>
  </property>
  <property>
    <name>dfs.datanode.data.dir</name>
    <value>/hdata/hadoop_data/datanode</value>
  </property>
</configuration>
- Edit mapred-site.xml:
$ cd /opt/software/hadoop-3.3.3/etc/hadoop
$ vim mapred-site.xml
<configuration>
  <property>
    <name>mapreduce.framework.name</name>
    <value>yarn</value>
  </property>
</configuration>
- Edit yarn-site.xml:
$ cd /opt/software/hadoop-3.3.3/etc/hadoop
$ vim yarn-site.xml
<configuration>
  <property>
    <name>yarn.resourcemanager.ha.enabled</name>
    <value>true</value>
  </property>
  <property>
    <name>yarn.resourcemanager.cluster-id</name>
    <value>yrc</value>
  </property>
  <property>
    <name>yarn.resourcemanager.ha.rm-ids</name>
    <value>rm1,rm2</value>
  </property>
  <property>
    <name>yarn.resourcemanager.hostname.rm1</name>
    <value>hdfs001.local.com</value>
  </property>
  <property>
    <name>yarn.resourcemanager.hostname.rm2</name>
    <value>hdfs002.local.com</value>
  </property>
  <property>
    <name>yarn.resourcemanager.zk-address</name>
    <value>zookeeper001.local.com:2181,zookeeper002.local.com:2181,zookeeper003.local.com:2181</value>
  </property>
  <property>
    <name>yarn.nodemanager.aux-services</name>
    <value>mapreduce_shuffle</value>
  </property>
  <property>
    <name>yarn.nodemanager.env-whitelist</name>
    <value>JAVA_HOME,HADOOP_COMMON_HOME,HADOOP_HDFS_HOME,HADOOP_CONF_DIR,CLASSPATH_PREPEND_DISTCACHE,HADOOP_YARN_HOME,HADOOP_MAPRED_HOME</value>
  </property>
  <property>
    <name>yarn.nodemanager.vmem-check-enabled</name>
    <value>false</value>
  </property>
  <!-- Sized for the 8 GiB nodes listed in section 2; raise these on larger hardware -->
  <property>
    <name>yarn.nodemanager.resource.memory-mb</name>
    <value>6144</value>
  </property>
  <property>
    <name>yarn.scheduler.maximum-allocation-mb</name>
    <value>6144</value>
  </property>
</configuration>
- Configure the authorized account: edit hadoop-policy.xml and change the value of the following property from * to hdfsuser, restricting the client protocol to that account:
$ cd /opt/software/hadoop-3.3.3/etc/hadoop
$ vim hadoop-policy.xml
<property>
  <name>security.client.protocol.acl</name>
  <value>hdfsuser</value>
</property>
- Edit the workers file (the list of hosts that run DataNodes):
$ cd /opt/software/hadoop-3.3.3/etc/hadoop
$ vim workers
hdfs001.local.com
hdfs002.local.com
hdfs003.local.com
- Copy hadoop-3.3.3 to /opt/software on the other nodes (see the sketch below), and create the data directories on every node:
$ sudo mkdir /hdata
$ sudo chown -R hdfsuser /hdata
$ mkdir -p /hdata/hadoop_data/{namenode,datanode,temDir,journaldata}
- On the other nodes, add the same Hadoop environment variables under /etc/profile.d and source the file:
$ source /etc/profile.d/hadoopenv.sh
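A minimal way to push the build and the environment file from hdfs001 to the other two nodes, relying on the passwordless SSH and directory ownership set up earlier:
$ for h in hdfs002.local.com hdfs003.local.com; do scp -r /opt/software/hadoop-3.3.3 $h:/opt/software/; done
$ for h in hdfs002.local.com hdfs003.local.com; do scp /etc/profile.d/hadoopenv.sh $h:/tmp/ && ssh $h 'sudo mv /tmp/hadoopenv.sh /etc/profile.d/'; done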
7.2 First Start
- Start the JournalNodes:
# Run on hdfs001, hdfs002, and hdfs003 (hadoop-daemon.sh is deprecated in Hadoop 3)
$ hdfs --daemon start journalnode
# Check that the JournalNode process is running
$ jps
1539 JournalNode
- Format HDFS:
# Format the NameNode, on hdfs001.local.com
$ hdfs namenode -format
# After formatting, start the NameNode on the same server
$ hdfs --daemon start namenode
- Bootstrap and start the standby NameNode:
# On hdfs002.local.com, sync the formatted metadata from the active NameNode
$ hdfs namenode -bootstrapStandby
# Then start the NameNode on hdfs002.local.com
$ hdfs --daemon start namenode
- Format ZKFC (run on hdfs001.local.com only):
# Answer y at the interactive prompt
$ hdfs zkfc -formatZK
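Optionally confirm that the HA znode now exists in ZooKeeper; a quick check, assuming zkCli.sh from the ZooKeeper installation is on the PATH of some node:
$ zkCli.sh -server zookeeper001.local.com:2181 ls /hadoop-ha
[ns1]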
- Start HDFS, on hdfs001.local.com:
$ cd /opt/software/hadoop-3.3.3/sbin
$ ./start-dfs.sh
- Start YARN; if you do not use Hadoop's compute layer, this can be skipped:
$ cd /opt/software/hadoop-3.3.3/sbin
$ ./start-yarn.sh
- The Hadoop HA cluster is now up; you can exercise it with a few common commands, as shown below.
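A minimal smoke test through the ns1 nameservice (the paths are arbitrary examples). Each node should also show DataNode and JournalNode in jps, with NameNode and DFSZKFailoverController additionally present on hdfs001 and hdfs002:
# Write a file into HDFS and read it back
$ hdfs dfs -mkdir -p /tmp/smoke
$ hdfs dfs -put /etc/hosts /tmp/smoke/
$ hdfs dfs -ls /tmp/smoke
$ hdfs dfs -cat /tmp/smoke/hosts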
7.3 Subsequent Starts
- Start HDFS on hdfs001.local.com:
$ cd /opt/software/hadoop-3.3.3/sbin
$ ./start-dfs.sh
- Start YARN on hdfs001.local.com, if your workloads need it:
$ cd /opt/software/hadoop-3.3.3/sbin
$ ./start-yarn.sh
7.4 Query Cluster Status
- Get the state of each NameNode:
# State of hdfs001.local.com
$ hdfs haadmin -getServiceState nn1
# State of hdfs002.local.com
$ hdfs haadmin -getServiceState nn2
- Check whether the cluster is in safe mode:
$ hdfs dfsadmin -safemode get
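Two further checks that are useful here: a one-shot view of both NameNode states (available in recent Hadoop releases), and a report of DataNodes, capacity, and replication health:
$ hdfs haadmin -getAllServiceState
hdfs001.local.com:9000    active
hdfs002.local.com:9000    standby
$ hdfs dfsadmin -report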