Open Source Big Data Deployment Manual
Hadoop + HBase + Hive + MySQL + Spark Deployment Manual
Chapter 1 Solution Overview
This solution is a data-processing platform built on Hadoop as the distributed base architecture, with HBase providing distributed storage and the Spark analytics engine connecting to the Hive data warehouse to read data and write the computed statistics into MySQL. It is a high-availability design aimed at improving visual analytics, semantic processing, and big data collection, access, quality, management, and processing capabilities.
Chapter 2 Solution Planning
2.1. Hardware Information
Hostname   CPU   Memory (GB)   IP             Disk
master     2     4             172.16.96.51   sda: 70 GB
slave1     2     4             172.16.96.52   sda: 70 GB
slave2     2     4             172.16.96.53   sda: 70 GB
2.2. Software Information
Software     Version   Notes
UOS Server   V20
Hadoop       2.8.5
HBase        2.2.4
Hive         2.3.4
MySQL        5.7
Spark        2.4.0
2.3. Installation and Deployment
2.3.1. Environment Preparation Before Cluster Deployment
2.3.1.1. Add host mappings to the /etc/hosts file on every host
172.16.96.51 master
172.16.96.52 slave1
172.16.96.53 slave2
2.3.1.2. Passwordless SSH login between hosts
# master
ssh-keygen -t rsa -C "master"
# slave1
ssh-keygen -t rsa -C "slave1"
# slave2
ssh-keygen -t rsa -C "slave2"
# on each node, copy its public key to all three hosts
ssh-copy-id -i ~/.ssh/id_rsa.pub master
ssh-copy-id -i ~/.ssh/id_rsa.pub slave1
ssh-copy-id -i ~/.ssh/id_rsa.pub slave2
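A quick way to verify that passwordless login works (a minimal sketch; it assumes the hostnames above already resolve via /etc/hosts):
for h in master slave1 slave2; do
    ssh "$h" hostname    # should print each hostname without asking for a password
done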
2.3.1.3. Time synchronization
# master
apt install chrony -y
vim /etc/chrony/chrony.conf
allow 172.16.96.0/24    # which clients are allowed to synchronize time from this host
local stratum 10        # added line; this host does not synchronize from any other host and acts as the time source
systemctl restart chronyd
netstat -antulp | grep chronyd
The output looks like the following:
udp        0      0 127.0.0.1:323           0.0.0.0:*                           6395/chronyd
udp        0      0 172.16.96.51:40048      162.159.200.123:123     ESTABLISHED 6395/chronyd
udp        0      0 0.0.0.0:123             0.0.0.0:*                           6395/chronyd
udp6       0      0 ::1:323                 :::*                                6395/chronyd
# slave1 and slave2: same steps on both
apt install chrony -y
vim /etc/chrony/chrony.conf
#pool 2.debian.pool.ntp.org    # comment out this line
server 172.16.96.51 iburst     # point the time server at our own server; iburst sends a burst of packets at startup to speed up the initial synchronization
systemctl restart chronyd
chronyc sources -v
The output looks like the following:
210 Number of sources = 8
MS Name/IP address              Stratum Poll Reach LastRx Last sample
^- master                             4    6    17     11    +32ms[ -666us] +/- 145ms
^- undefined.hostname.local>          2    6    17     10    +32ms[ -185us] +/- 127ms
^* 202.118.1.130                      1    6   113      5  +3021us[  -29ms] +/-  11ms
^- time.cloudflare.com                3    6    17     11    +33ms[ +794us] +/- 139ms
^? chl.la                             0    6     0      -     +0ns[  +0ns] +/-   0ns
^? ntp6.flashdance.cx                 0    6     0      -     +0ns[  +0ns] +/-   0ns
^? tock.ntp.infomaniak.ch             0    6     0      -     +0ns[  +0ns] +/-   0ns
^? time.cloudflare.com                0    6     0      -     +0ns[  +0ns] +/-   0ns
2.3.1.4. Install OpenJDK 8 on all three nodes
apt install -y openjdk-8-jdk
# configure environment variables in ~/.bashrc
vim ~/.bashrc
export JAVA_HOME=/usr/lib/jvm/java-1.8.0-openjdk-amd64
export PATH=$PATH:$JAVA_HOME/bin
source ~/.bashrc
java -version
The output looks like the following:
openjdk version "1.8.0_212"
OpenJDK Runtime Environment (build 1.8.0_212-8u212-b01-1~deb9u1-b01)
OpenJDK 64-Bit Server VM (build 25.212-b01, mixed mode)
2.3.2 Hadoop Installation and Deployment
Download the Hadoop package from the official site: https://hadoop.apache.org/releases.html
2.3.1.5. Extract the Hadoop package
Extract hadoop-2.8.5.tar.gz to the /opt directory.
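For example (assuming the tarball sits in the current directory):
tar -zxvf hadoop-2.8.5.tar.gz -C /opt/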
2.3.1.6. Modify the Hadoop configuration files
Go to the /opt/hadoop-2.8.5/etc/hadoop directory and modify the configuration files.
# Note: some of these xml files already exist, others only ship as .template or .default files; copy those to the file names used below.
# Edit core-site.xml
fs.defaultFS                                    hdfs://master:9000
hadoop.tmp.dir                                  /home/fay/tmp
# Edit hdfs-site.xml
dfs.replication                                 2
dfs.permissions.enabled                         false
dfs.datanode.max.xcievers                       4096    # a DataNode has an upper limit on files handled concurrently; at least 4096
dfs.namenode.secondary.http-address             master:9001
dfs.webhdfs.enabled                             true
# Edit mapred-site.xml
mapreduce.framework.name                        yarn
mapreduce.jobhistory.address                    master:10020
mapreduce.jobhistory.webapp.address             master:19888
# Edit yarn-site.xml
yarn.nodemanager.aux-services                   mapreduce_shuffle
yarn.log-aggregation-enable                     true
yarn.log-aggregation.retain-seconds             86400
yarn.resourcemanager.address                    master:8032
yarn.resourcemanager.scheduler.address          master:8030
yarn.resourcemanager.resource-tracker.address   master:8031
yarn.resourcemanager.admin.address              master:8033
yarn.resourcemanager.webapp.address             master:8088
# Edit yarn-env.sh and hadoop-env.sh: set "export JAVA_HOME=/usr/lib/jvm/java-1.8.0-openjdk-amd64" where JAVA_HOME is referenced
# Edit the slaves file: remove localhost and add
slave1
slave2
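The property lists above are written as flattened name/value pairs; in the actual *-site.xml files each pair goes inside a standard <property> block. A minimal sketch for core-site.xml, using the two values listed above:
<configuration>
  <property>
    <name>fs.defaultFS</name>
    <value>hdfs://master:9000</value>
  </property>
  <property>
    <name>hadoop.tmp.dir</name>
    <value>/home/fay/tmp</value>
  </property>
</configuration>
hdfs-site.xml, mapred-site.xml and yarn-site.xml follow the same <name>/<value> pattern.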
2.3.1.7. Add Hadoop to the environment variables
Add Hadoop to the environment variables by editing ~/.bashrc:
export JAVA_HOME=/usr/lib/jvm/java-1.8.0-openjdk-amd64
export HADOOP_HOME=/opt/hadoop-2.8.5
export HADOOP_COMMON_LIB_NATIVE_DIR=$HADOOP_HOME/lib/native
export HADOOP_OPTS="-Djava.library.path=$HADOOP_HOME/lib/native"
export PATH=$PATH:$JAVA_HOME/bin:$HADOOP_HOME/bin:$HADOOP_HOME/sbin
source ~/.bashrc
# the other two machines are simple: just copy everything over
scp -r /opt root@slave1:/
scp -r /opt root@slave2:/
scp ~/.bashrc root@slave1:/root
scp ~/.bashrc root@slave2:/root
2.3.1.8. Format the NameNode
On the master node, format the NameNode:
hdfs namenode -format
The output looks like the following:
21/02/27 22:03:55 INFO namenode.NameNode: STARTUP_MSG:
/************************************************************
STARTUP_MSG: Starting NameNode
STARTUP_MSG: user = root
STARTUP_MSG: host = master/172.16.96.51
STARTUP_MSG: args = [-format]
STARTUP_MSG: version = 2.8.5
STARTUP_MSG: classpath = /opt/hadoop-2.8.5/etc/hadoop:/opt/hadoop-2.8.5/share/hadoop/common/lib/jersey-server-1.9.jar:/opt/hadoop-2.8.5/share/hadoop/common/lib/commons-compres
--- some output omitted ---
21/02/27 22:03:55 INFO namenode.NNStorageRetentionManager: Going to retain 1 images with txid >= 0
21/02/27 22:03:55 INFO util.ExitUtil: Exiting with status 0
21/02/27 22:03:55 INFO namenode.NameNode: SHUTDOWN_MSG:
/************************************************************
SHUTDOWN_MSG: Shutting down NameNode at master/172.16.96.51
************************************************************/
2.3.1.9. Start Hadoop
# on the master node
start-dfs.sh
The output looks like the following:
Starting namenodes on [master]
master: starting namenode, logging to /opt/hadoop-2.8.5/logs/hadoop-root-namenode-master.out
slave2: starting datanode, logging to /opt/hadoop-2.8.5/logs/hadoop-root-datanode-slave2.out
slave1: starting datanode, logging to /opt/hadoop-2.8.5/logs/hadoop-root-datanode-slave1.out
Starting secondary namenodes [master]
master: starting secondarynamenode, logging to /opt/hadoop-2.8.5/logs/hadoop-root-secondarynamenode-master.out
start-yarn.sh
The output looks like the following:
starting yarn daemons
starting resourcemanager, logging to /opt/hadoop-2.8.5/logs/yarn-root-resourcemanager-master.out
slave1: starting nodemanager, logging to /opt/hadoop-2.8.5/logs/yarn-root-nodemanager-slave1.out
slave2: starting nodemanager, logging to /opt/hadoop-2.8.5/logs/yarn-root-nodemanager-slave2.out
Running jps on the master node shows the following:
10755 NameNode
11300 Jps
10917 SecondaryNameNode
11049 ResourceManager
Running jps on the two slave nodes shows the following:
9921 DataNode
10137 Jps
10013 NodeManager
2.3.1.10. Test Hadoop
At this point Hadoop is basically installed. To test it, open 172.16.96.51:8088 in a browser.
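The cluster can also be checked from the command line; both commands ship with Hadoop:
hdfs dfsadmin -report    # lists the live DataNodes and HDFS capacity
yarn node -list          # lists the NodeManagers registered with the ResourceManager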
2.3.1.10.1. Run the MapReduce example that ships with Hadoop
# on master, run from the Hadoop installation directory
hdfs dfs -mkdir /user
hdfs dfs -mkdir /user/root
hdfs dfs -put etc/hadoop input
hadoop jar share/hadoop/mapreduce/hadoop-mapreduce-examples-2.8.5.jar grep input output 'dfs[a-z.]+'
The output looks like the following:
21/02/27 22:54:39 INFO client.RMProxy: Connecting to ResourceManager at master/172.16.96.51:8032
21/02/27 22:54:39 INFO input.FileInputFormat: Total input files to process : 29
--- some output omitted ---
Map input records=17
Map output records=17
Map output bytes=428
Map output materialized bytes=468
Input split bytes=127
Combine input records=0
Combine output records=0
Reduce input groups=5
Reduce shuffle bytes=468
Reduce input records=17
Reduce output records=17
Spilled Records=34
Shuffled Maps =1
Failed Shuffles=0
Merged Map outputs=1
GC time elapsed (ms)=98
CPU time spent (ms)=600
Physical memory (bytes) snapshot=431075328
Virtual memory (bytes) snapshot=3865038848
Total committed heap usage (bytes)=285736960
Shuffle Errors
BAD_ID=0
CONNECTION=0
IO_ERROR=0
WRONG_LENGTH=0
WRONG_MAP=0
WRONG_REDUCE=0
File Input Format Counters
Bytes Read=650
File Output Format Counters
Bytes Written=326
If no Java errors were reported, everything is fine. Check the results in the output directory:
hdfs dfs -cat output/*
6   dfs.audit.logger
4   dfs.class
3   dfs.logger
3   dfs.server.namenode.
2   dfs.audit.log.maxfilesize
2   dfs.period
2   dfs.audit.log.maxbackupindex
1   dfsmetrics.log
1   dfsadmin
1   dfs.webhdfs.enabled
1   dfs.servers
1   dfs.replication
1   dfs.permissions.enabled
1   dfs.log
1   dfs.file
1   dfs.datanode.max.xcievers
1   dfs.namenode.secondary.http
2.3.3 HBase Installation and Deployment
Download the software from the official site: https://hbase.apache.org
2.3.1.11. Extract HBase and configure environment variables
Extract the tarball to the /opt/ directory, then configure environment variables under /opt/hbase-2.2.4/conf/:
# hbase-env.sh
export JAVA_HOME=/usr/lib/jvm/java-1.8.0-openjdk-amd64
export HADOOP_HOME=/opt/hadoop-2.8.5
export HBASE_MANAGES_ZK=true
2.3.1.12. Modify the related configuration files
# Edit hbase-site.xml
hbase.rootdir                              hdfs://master:9000/hbase
hbase.cluster.distributed                  true
hbase.zookeeper.quorum                     master,slave1,slave2
hbase.zookeeper.property.dataDir           /var/log/zookeeper/data
hbase.master                               master
zookeeper.znode.parent                     /hbase
hbase.unsafe.stream.capability.enforce     false
# Edit regionservers: remove localhost and add
master
slave1
slave2
# Create a backup-masters file containing
slave1
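As with the Hadoop files, each pair above goes into a <property> block in hbase-site.xml; a minimal sketch with the first two values from the list above (the remaining properties follow the same pattern):
<configuration>
  <property>
    <name>hbase.rootdir</name>
    <value>hdfs://master:9000/hbase</value>
  </property>
  <property>
    <name>hbase.cluster.distributed</name>
    <value>true</value>
  </property>
</configuration>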
2.3.1.13. Configure HBase environment variables in ~/.bashrc
export HBASE_HOME=/opt/hbase-2.2.4
export PATH=$PATH:$JAVA_HOME/bin:$HBASE_HOME/bin:$HBASE_HOME/sbin
source ~/.bashrc
2.3.1.14. Copy the configuration and the whole directory to the other two nodes
scp -r hbase-2.2.4 root@slave1:/opt/
scp -r hbase-2.2.4 root@slave2:/opt/
scp -r /root/.bashrc root@slave1:/root/
scp -r /root/.bashrc root@slave2:/root/
2.3.1.15. Start HBase
With Hadoop already running, start HBase:
start-hbase.sh
The output looks like the following:
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/opt/hadoop-2.8.5/share/hadoop/common/lib/slf4j-log4j12-1.7.10.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/opt/hbase-2.2.4/lib/client-facing-thirdparty/slf4j-log4j12-1.7.25.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory]
running master, logging to /opt/hbase-2.2.4/logs/hbase-root-master-master.out
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in
--- some output omitted ---
slave1: SLF4J: Found binding in [jar:file:/opt/hbase-2.2.4/lib/client-facing-thirdparty/slf4j-log4j12-1.7.25.jar!/org/slf4j/impl/StaticLoggerBinder.class]
slave1: SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
slave1: SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory]
2.3.1.16. Check the status of each node
Use the jps command to check the status of each node.
jps
# master, the output looks like the following:
21232 Jps
10755 NameNode
10917 SecondaryNameNode
11049 ResourceManager
19673 HMaster
19785 HRegionServer
20717 QuorumPeerMain
If HMaster did not come up, run cp $HBASE_HOME/lib/client-facing-thirdparty/htrace-core-3.1.0-incubating.jar $HBASE_HOME/lib/, then stop HBase, start it again and re-check with jps.
# on the slave nodes:
9921 DataNode
20694 HRegionServer
20790 HMaster      # this is slave1; because backup-masters was configured above, slave2 does not have this process
20588 HQuorumPeer
21036 Jps
10013 NodeManager
2.3.1.17. Enter the HBase shell
hbase shell
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/opt/hadoop-2.8.5/share/hadoop/common/lib/slf4j-log4j12-1.7.10.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/opt/hbase-2.2.4/lib/client-facing-thirdparty/slf4j-log4j12-1.7.25.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory]
HBase Shell
Use "help" to get list of supported commands.
Use "exit" to quit this interactive shell.
For Reference, please visit: http://hbase.apache.org/2.0/book.html#shell
Version 2.2.4, r67779d1a325a4f78a468af3339e73bf075888bac, Wed Mar 11 12:57:39 CST 2020
Took 0.0024 seconds
hbase(main):001:0> list
TABLE
0 row(s)
Took 0.4137 seconds
=> []
hbase(main):002:0> create 'hello','world'
Created table hello
Took 1.3559 seconds
=> Hbase::Table - hello
hbase(main):003:0> list
TABLE
hello
1 row(s)
Took 0.0109 seconds
=> ["hello"]
2.3.1.18. Open the HBase web UI
The HBase master status page shows all kinds of information: http://172.16.96.51:16010/master-status
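To confirm that reads and writes work, you can put and get a cell in the 'hello' table created above (a quick sanity check; the row key 'row1', the column 'world:msg' and the value are arbitrary examples):
hbase(main):004:0> put 'hello', 'row1', 'world:msg', 'hi'
hbase(main):005:0> get 'hello', 'row1'
hbase(main):006:0> scan 'hello'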
2.3.4 Hive Installation and Deployment
Download the package from the official site: https://hive.apache.org
2.3.1.19. Install MySQL as the Hive metastore database
# install MySQL
apt install mysql-server-5.7 -y
mysqld --initialize
systemctl restart mysql.service
grep 'password' /var/log/mysql/error.log
The output looks like the following:
2021-02-27T18:04:51.766751Z 4 [Note] Access denied for user 'root'@'localhost' (using password: YES)
2021-02-27T18:04:55.296064Z 5 [Note] Access denied for user 'root'@'localhost' (using password: YES)
2021-02-27T18:05:09.316319Z 6 [Note] Access denied for user 'root'@'localhost' (using password: YES)
2021-02-27T18:05:21.250506Z 7 [Note] Access denied for user 'root'@'localhost' (using password: YES)
2021-02-27T18:05:39.867338Z 8 [Note] Access denied for user 'root'@'localhost' (using password: YES)
2021-02-27T18:06:43.216458Z 1 [Note] A temporary password is generated for root@localhost: SkWzvjuVa9*g    # the initial database password
2.3.1.20. MySQL configuration
mysqladmin -uroot -p password
Enter password:
New password:
Confirm new password:
Warning: Since password will be sent to server in plain text, use ssl connection to ensure password safety.
2.3.1.21. Create the Hive database
mysql -puosuos
mysql: [Warning] Using a password on the command line interface can be insecure.
Welcome to the MySQL monitor.  Commands end with ; or \g.
Your MySQL connection id is 10
Server version: 5.7.26-1 (Uos)
Copyright (c) 2000, 2019, Oracle and/or its affiliates. All rights reserved.
Oracle is a registered trademark of Oracle Corporation and/or its affiliates. Other names may be trademarks of their respective owners. Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.
mysql> show databases;
+--------------------+
| Database           |
+--------------------+
| information_schema |
| mysql              |
| performance_schema |
| sys                |
+--------------------+
4 rows in set (0.00 sec)
mysql> create database hive;
Query OK, 1 row affected (0.01 sec)
mysql> grant all privileges on hive.* to 'root'@'localhost' identified by 'uosuos';
Query OK, 0 rows affected, 1 warning (0.00 sec)
mysql> flush privileges;
Query OK, 0 rows affected (0.00 sec)
2.3.1.22. Install Hive
Extract the Hive package to /opt and create a symlink:
ln -s apache-hive-2.3.4-bin/ hive
2.3.1.23. Configure environment variables
echo -e '##################HIVE environment variables#############\nexport HIVE_HOME=/opt/hive\nexport PATH=$HIVE_HOME/bin:$PATH' >> ~/.bashrc && source ~/.bashrc && tail -3 ~/.bashrc
##################HIVE environment variables#############
export HIVE_HOME=/opt/hive
export PATH=$HIVE_HOME/bin:$PATH
2.3.1.24. Configure Hive
Under hive/conf/, create hive-site.xml with the following properties:
javax.jdo.option.ConnectionUserName      root
javax.jdo.option.ConnectionPassword      uosuos
javax.jdo.option.ConnectionURL           jdbc:mysql://127.0.0.1:3306/hive?useSSL=false
javax.jdo.option.ConnectionDriverName    com.mysql.cj.jdbc.Driver
hive.support.concurrency                 true
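In hive-site.xml these pairs also go into <property> blocks; a minimal sketch with the connection URL and driver from the list above (the user name, password and concurrency settings follow the same pattern):
<configuration>
  <property>
    <name>javax.jdo.option.ConnectionURL</name>
    <value>jdbc:mysql://127.0.0.1:3306/hive?useSSL=false</value>
  </property>
  <property>
    <name>javax.jdo.option.ConnectionDriverName</name>
    <value>com.mysql.cj.jdbc.Driver</value>
  </property>
</configuration>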
# Copy the MySQL connector into Hive's lib directory; UOS uses mysql-connector-java_8.0.21-1debian10_all.deb
dpkg -i mysql-connector-java_8.0.21-1debian10_all.deb
cp /usr/share/java/mysql-connector-java-8.0.21.jar /opt/hive/lib/
2.3.1.25. Initialize the metastore
Initialize Hive's metastore (backed by the MySQL database):
schematool -initSchema -dbType mysql
The output looks like the following:
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/opt/apache-hive-2.3.4-bin/lib/log4j-slf4j-impl-2.6.2.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/opt/hadoop-2.8.5/share/hadoop/common/lib/slf4j-log4j12-1.7.10.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.apache.logging.slf4j.Log4jLoggerFactory]
Metastore connection URL:        jdbc:mysql://127.0.0.1:3306/hive?useSSL=false
Metastore Connection Driver :    com.mysql.cj.jdbc.Driver
Metastore connection User:       root
Starting metastore schema initialization to 2.3.0
Initialization script hive-schema-2.3.0.mysql.sql
Initialization script completed
schemaTool completed
2.3.1.26. Enter the Hive CLI
hive
The output looks like the following:
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/opt/apache-hive-2.3.4-bin/lib/log4j-slf4j-impl-2.6.2.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/opt/hadoop-2.8.5/share/hadoop/common/lib/slf4j-log4j12-1.7.10.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.apache.logging.slf4j.Log4jLoggerFactory]
Logging initialized using configuration in jar:file:/opt/apache-hive-2.3.4-bin/lib/hive-common-2.3.4.jar!/hive-log4j2.properties Async: true
Hive-on-MR is deprecated in Hive 2 and may not be available in the future versions. Consider using a different execution engine (i.e. spark, tez) or using Hive 1.X releases.
hive>
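A quick smoke test from the prompt (a minimal sketch; the table name t1 and the sample row are arbitrary, the insert launches a MapReduce job, and by default the table data ends up under /user/hive/warehouse in HDFS):
hive> create table t1 (id int, name string);
hive> show tables;
hive> insert into t1 values (1, 'hello');
hive> select * from t1;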
2.3.5 Spark Installation and Deployment
Spark download: http://spark.apache.org/releases/spark-release-2-4-0.html
Scala download: https://downloads.lightbend.com/scala/2.13.0/scala-2.13.0.tgz
2.3.1.27. Extract Scala
Extract the Scala package to /usr/local, then configure environment variables in ~/.bashrc:
export SCALA_HOME=/usr/local/scala-2.13.0
export SCALA_CLASSPATH=$SCALA_HOME
export PATH=$PATH:$SCALA_HOME/bin
source ~/.bashrc
scala -version
The output looks like the following:
Scala code runner version 2.13.0 -- Copyright 2002-2019, LAMP/EPFL and Lightbend, Inc.
2.3.1.28. Extract Spark
Extract the Spark package to /opt, then configure environment variables:
vim ~/.bashrc
export SPARK_HOME=/opt/spark-2.3.3-bin-hadoop2.7
export SPARK_CLASSPATH=$SPARK_HOME
export PATH=$PATH:$SPARK_HOME/bin
source ~/.bashrc
2.3.1.29. Modify the Spark configuration files
Modify the Spark configuration files in the conf directory under the installation directory:
cp spark-env.sh.template spark-env.sh
Edit spark-env.sh:
SPARK_LOCAL_IP=master                      # IP or hostname of this machine
SPARK_MASTER_IP=master                     # IP or hostname of the master node
export HADOOP_CONF_DIR=/opt/hadoop-2.8.5   # Hadoop configuration path
export YARN_CONF_DIR=/opt/hadoop-2.8.5     # YARN configuration path
Edit slaves:
cp slaves.template slaves
slave1
slave2
Distribute the packages and configuration to the other nodes (repeat the scala and spark copies for slave2 as well):
scp -r /usr/local/scala-2.13.0/ root@slave1:/usr/local
scp -r /opt/spark-2.3.3-bin-hadoop2.7/ root@slave1:/opt/
scp -r /root/.bashrc root@slave1:/root/
scp -r /root/.bashrc root@slave2:/root/
2.3.1.30. Adjust each node
On each node, change the SPARK_LOCAL_IP entry in spark-env.sh:
vim /opt/spark-2.3.3-bin-hadoop2.7/conf/spark-env.sh
SPARK_LOCAL_IP=slave1                      # IP or hostname of this machine
SPARK_MASTER_IP=master                     # IP or hostname of the master node
export HADOOP_CONF_DIR=/opt/hadoop-2.8.5   # Hadoop configuration path
export YARN_CONF_DIR=/opt/hadoop-2.8.5     # YARN configuration path
# To change the default 8080 port of the Spark web UI, edit start-master.sh under $SPARK_HOME/sbin, search for 8080 and change it to the port you want
2.3.1.31. Start the services
# run from $SPARK_HOME/sbin
./start-all.sh
The output looks like the following:
starting org.apache.spark.deploy.master.Master, logging to /opt/spark-2.3.3-bin-hadoop2.7/logs/spark-root-org.apache.spark.deploy.master.Master-1-master.out
slave1: starting org.apache.spark.deploy.worker.Worker, logging to /opt/spark-2.3.3-bin-hadoop2.7/logs/spark-root-org.apache.spark.deploy.worker.Worker-1-slave1.out
slave2: starting org.apache.spark.deploy.worker.Worker, logging to /opt/spark-2.3.3-bin-hadoop2.7/logs/spark-root-org.apache.spark.deploy.worker.Worker-1-slave2.out
2.3.1.32. Check service status
# check the status with the jps command; the output looks like the following:
34176 Master            # Master on the master node; the slave nodes show Worker instead
10755 NameNode
10917 SecondaryNameNode
34264 Jps
11049 ResourceManager
19673 HMaster
19785 HRegionServer
20717 QuorumPeerMain
Spark's default port 8080 was taken by another service; after changing it as described above, the Spark web UI can be opened at http://172.16.96.51:8081/
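To confirm that the cluster actually runs jobs, you can submit the SparkPi example that ships with Spark (a minimal sketch; the master URL assumes the standalone master listens on its default port 7077, and the examples jar path is matched with a glob because the exact file name depends on the Scala build):
spark-submit --class org.apache.spark.examples.SparkPi \
  --master spark://master:7077 \
  $SPARK_HOME/examples/jars/spark-examples_*.jar 100
The job should finish with a line like "Pi is roughly 3.14...".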