
    Open Source Big Data Hadoop + HBase + Hive + MySQL + Spark Deployment Manual

    Chapter 1 Solution Overview

    This solution builds a data-processing platform on a Hadoop distributed system as the base architecture, with HBase providing distributed storage. Spark, the big data analytics engine, connects to the Hive data warehouse to read data and stores the computed results in MySQL. The goal is a highly available setup that improves capabilities in visual analytics, semantic processing, big data collection, access, quality, management, and processing.

    Chapter 2 Solution Planning

    2.1. Hardware Information

    Hostname   CPU (cores)   Memory (GB)   IP             Disk
    master     2             4             172.16.96.51   sda: 70G
    slave1     2             4             172.16.96.52   sda: 70G
    slave2     2             4             172.16.96.53   sda: 70G

    2.2. Software Information

    Software     Version   Notes
    UOS Server   V20
    Hadoop       2.8.5
    HBase        2.2.4
    Hive         2.3.4
    MySQL        5.7
    Spark        2.4.0

    2.3. Installation and Deployment

    2.3.1. Environment Preparation Before Cluster Deployment

    2.3.1.1. Add Host Mappings to /etc/hosts on Every Node

    172.16.96.51 master
    172.16.96.52 slave1
    172.16.96.53 slave2

    2.3.1.2. Passwordless SSH Login Between All Nodes

    master

    ssh-keygen -t rsa -C "master"

    slave1

    ssh-keygen -t rsa -C "slave1"

    slave2

    ssh-keygen -t rsa -C "slave2"

    # On every node, copy the public key to all three hosts:
    ssh-copy-id -i ~/.ssh/id_rsa.pub master
    ssh-copy-id -i ~/.ssh/id_rsa.pub slave1
    ssh-copy-id -i ~/.ssh/id_rsa.pub slave2
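    A quick way to confirm that passwordless login works in every direction is a small loop like the sketch below (an optional check, not part of the original steps); each call should print the remote hostname without prompting for a password:

    # run on each of the three nodes
    for h in master slave1 slave2; do
        ssh -o BatchMode=yes "$h" hostname
    done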

    2.3.1.3. Time Synchronization

    # master
    apt install chrony -y
    vim /etc/chrony/chrony.conf
    allow 172.16.96.0/24    # which clients are allowed to synchronize time from this host
    local stratum 10        # add this line: this host does not synchronize from any other host and acts as the time source
    systemctl restart chronyd
    netstat -antulp | grep chronyd

    The command prints output similar to:

    udp        0      0 127.0.0.1:323        0.0.0.0:*                        6395/chronyd
    udp        0      0 172.16.96.51:40048   162.159.200.123:123  ESTABLISHED 6395/chronyd
    udp        0      0 0.0.0.0:123          0.0.0.0:*                        6395/chronyd
    udp6       0      0 ::1:323              :::*                             6395/chronyd
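    As an optional sanity check (not in the original procedure), chronyc can report whether the master is synchronized and, once the slaves start syncing, which clients have connected; the exact fields vary by chrony version:

    chronyc tracking    # stratum and synchronization status of this host
    chronyc clients     # lists NTP clients such as slave1 and slave2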

    # The same steps apply to slave1 and slave2

    apt install chrony -y
    vim /etc/chrony/chrony.conf

    #pool 2.debian.pool.ntp.org    # comment out this line

    server 172.16.96.51 iburst    # point the time source at our own server; iburst sends a burst of packets when the server is unreachable, to speed up the initial synchronization check
    systemctl restart chronyd
    chronyc sources -v

    The command prints output similar to:

    210 Number of sources = 8
    MS Name/IP address              Stratum Poll Reach LastRx Last sample
    ^- master                            4    6    17     11    +32ms[ -666us] +/-  145ms
    ^- undefined.hostname.local>         2    6    17     10    +32ms[ -185us] +/-  127ms
    ^* 202.118.1.130                     1    6   113      5  +3021us[  -29ms] +/-   11ms
    ^- time.cloudflare.com               3    6    17     11    +33ms[ +794us] +/-  139ms
    ^? chl.la                            0    6     0      -     +0ns[  +0ns] +/-    0ns
    ^? ntp6.flashdance.cx                0    6     0      -     +0ns[  +0ns] +/-    0ns
    ^? tock.ntp.infomaniak.ch            0    6     0      -     +0ns[  +0ns] +/-    0ns
    ^? time.cloudflare.com               0    6     0      -     +0ns[  +0ns] +/-    0ns

    2.3.1.4. Install OpenJDK 8 on All Three Nodes

    apt install -y openjdk-8-jdk

    # Configure the environment variables

    vim ~/.bashrc
    export JAVA_HOME=/usr/lib/jvm/java-1.8.0-openjdk-amd64
    export PATH=$PATH:$JAVA_HOME/bin

    source ~/.bashrc
    java -version

    The command prints the following:

    openjdk version "1.8.0_212" OpenJDK Runtime Environment (build 1.8.0_212-8u212-b01-1~deb9u1-b01) OpenJDK 64-Bit Server VM (build 25.212-b01, mixed mode)

    2.3.2. Hadoop Installation and Deployment

    Download the Hadoop release from the official site: https://hadoop.apache.org/releases.html

    2.3.2.1. Extract the Hadoop Package

    Extract hadoop-2.8.5.tar.gz into the /opt directory.

    2.3.2.2. Modify the Hadoop Configuration Files

    Edit the configuration files under /opt/hadoop-2.8.5/etc/hadoop.

    # Some of these xml files already exist, while others ship only as .template or .default versions; copy those to the file names given below before editing. The <property> entries listed for each file go inside its <configuration> element.

    # Edit core-site.xml

    <property>
        <name>fs.defaultFS</name>
        <value>hdfs://master:9000</value>
    </property>
    <property>
        <name>hadoop.tmp.dir</name>
        <value>/home/fay/tmp</value>
    </property>

    # Edit hdfs-site.xml

    <property>
        <name>dfs.replication</name>
        <value>2</value>
    </property>
    <property>
        <name>dfs.permissions.enabled</name>
        <value>false</value>
    </property>
    <property>
        <name>dfs.datanode.max.xcievers</name>
        <value>4096</value>
        <description>Upper limit on the number of files a DataNode serves at the same time; at least 4096.</description>
    </property>
    <property>
        <name>dfs.namenode.secondary.http-address</name>
        <value>master:9001</value>
    </property>
    <property>
        <name>dfs.webhdfs.enabled</name>
        <value>true</value>
    </property>

    # Edit mapred-site.xml

    <property>
        <name>mapreduce.framework.name</name>
        <value>yarn</value>
    </property>
    <property>
        <name>mapreduce.jobhistory.address</name>
        <value>master:10020</value>
    </property>
    <property>
        <name>mapreduce.jobhistory.webapp.address</name>
        <value>master:19888</value>
    </property>

    # Edit yarn-site.xml

    <property>
        <name>yarn.nodemanager.aux-services</name>
        <value>mapreduce_shuffle</value>
    </property>
    <property>
        <name>yarn.log-aggregation-enable</name>
        <value>true</value>
    </property>
    <property>
        <name>yarn.log-aggregation.retain-seconds</name>
        <value>86400</value>
    </property>
    <property>
        <name>yarn.resourcemanager.address</name>
        <value>master:8032</value>
    </property>
    <property>
        <name>yarn.resourcemanager.scheduler.address</name>
        <value>master:8030</value>
    </property>
    <property>
        <name>yarn.resourcemanager.resource-tracker.address</name>
        <value>master:8031</value>
    </property>
    <property>
        <name>yarn.resourcemanager.admin.address</name>
        <value>master:8033</value>
    </property>
    <property>
        <name>yarn.resourcemanager.webapp.address</name>
        <value>master:8088</value>
    </property>

    # Edit yarn-env.sh and hadoop-env.sh: set
    #     export JAVA_HOME=/usr/lib/jvm/java-1.8.0-openjdk-amd64
    # at the place where JAVA_HOME is defined.

    # Edit the slaves file: remove localhost and list the worker nodes
    slave1
    slave2
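    Before distributing the configuration, it can be worth checking that none of the edited XML files is malformed; a minimal sketch using xmllint (from the libxml2-utils package, not part of the original steps):

    cd /opt/hadoop-2.8.5/etc/hadoop
    for f in core-site.xml hdfs-site.xml mapred-site.xml yarn-site.xml; do
        xmllint --noout "$f" && echo "$f OK"
    done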

    2.3.2.3. Add Hadoop to the Environment Variables

    Add Hadoop to the environment variables by editing ~/.bashrc:

    export JAVA_HOME=/usr/lib/jvm/java-1.8.0-openjdk-amd64
    export HADOOP_HOME=/opt/hadoop-2.8.5
    export HADOOP_COMMON_LIB_NATIVE_DIR=$HADOOP_HOME/lib/native
    export HADOOP_OPTS="-Djava.library.path=$HADOOP_HOME/lib/native"
    export PATH=$PATH:$JAVA_HOME/bin:$HADOOP_HOME/bin:$HADOOP_HOME/sbin

    source ~/.bashrc

    # The other two machines are simple: just copy everything over
    scp -r /opt root@slave1:/
    scp -r /opt root@slave2:/
    scp ~/.bashrc root@slave1:/root
    scp ~/.bashrc root@slave2:/root

    2.3.2.4. Initialize the NameNode

    On the master node, format the NameNode:

    hdfs namenode -format

    The command prints the following:

    21/02/27 22:03:55 INFO namenode.NameNode: STARTUP_MSG:
    /************************************************************
    STARTUP_MSG: Starting NameNode
    STARTUP_MSG:   user = root
    STARTUP_MSG:   host = master/172.16.96.51
    STARTUP_MSG:   args = [-format]
    STARTUP_MSG:   version = 2.8.5
    STARTUP_MSG:   classpath = /opt/hadoop-2.8.5/etc/hadoop:/opt/hadoop-2.8.5/share/hadoop/common/lib/jersey-server-1.9.jar:/opt/hadoop-2.8.5/share/hadoop/common/lib/commons-compres
    --- part of the output omitted ---
    21/02/27 22:03:55 INFO namenode.NNStorageRetentionManager: Going to retain 1 images with txid >= 0
    21/02/27 22:03:55 INFO util.ExitUtil: Exiting with status 0
    21/02/27 22:03:55 INFO namenode.NameNode: SHUTDOWN_MSG:
    /************************************************************
    SHUTDOWN_MSG: Shutting down NameNode at master/172.16.96.51
    ************************************************************/


    2.3.2.5. Start Hadoop

    # On the master node
    start-dfs.sh

    The command prints the following:

    Starting namenodes on [master]
    master: starting namenode, logging to /opt/hadoop-2.8.5/logs/hadoop-root-namenode-master.out
    slave2: starting datanode, logging to /opt/hadoop-2.8.5/logs/hadoop-root-datanode-slave2.out
    slave1: starting datanode, logging to /opt/hadoop-2.8.5/logs/hadoop-root-datanode-slave1.out
    Starting secondary namenodes [master]
    master: starting secondarynamenode, logging to /opt/hadoop-2.8.5/logs/hadoop-root-secondarynamenode-master.out

    start-yarn.sh

    The command prints the following:

    starting yarn daemons
    starting resourcemanager, logging to /opt/hadoop-2.8.5/logs/yarn-root-resourcemanager-master.out
    slave1: starting nodemanager, logging to /opt/hadoop-2.8.5/logs/yarn-root-nodemanager-slave1.out
    slave2: starting nodemanager, logging to /opt/hadoop-2.8.5/logs/yarn-root-nodemanager-slave2.out

    Running jps on the master node shows the following:

    10755 NameNode
    11300 Jps
    10917 SecondaryNameNode
    11049 ResourceManager

    Running jps on the two slave nodes shows the following:

    9921 DataNode
    10137 Jps
    10013 NodeManager
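    As an additional, optional check before moving on (not part of the original steps), the HDFS report should list both DataNodes as live:

    hdfs dfsadmin -report | grep -E "Live datanodes|Name:"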

    2.3.2.6. Test Hadoop

    Hadoop is now essentially installed. To test it, open 172.16.96.51:8088 (the YARN web UI configured above) in a browser.

    2.3.2.6.1. Run the MapReduce Example that Ships with Hadoop

    # master (run from /opt/hadoop-2.8.5 so the relative paths resolve)
    hdfs dfs -mkdir /user
    hdfs dfs -mkdir /user/root
    hdfs dfs -put etc/hadoop input
    hadoop jar share/hadoop/mapreduce/hadoop-mapreduce-examples-2.8.5.jar grep input output 'dfs[a-z.]+'

    The command prints the following:

    21/02/27 22:54:39 INFO client.RMProxy: Connecting to ResourceManager at master/172.16.96.51:8032
    21/02/27 22:54:39 INFO input.FileInputFormat: Total input files to process : 29

    --- part of the output omitted ---

        Map input records=17
        Map output records=17
        Map output bytes=428
        Map output materialized bytes=468
        Input split bytes=127
        Combine input records=0
        Combine output records=0
        Reduce input groups=5
        Reduce shuffle bytes=468
        Reduce input records=17
        Reduce output records=17
        Spilled Records=34
        Shuffled Maps =1
        Failed Shuffles=0
        Merged Map outputs=1
        GC time elapsed (ms)=98
        CPU time spent (ms)=600
        Physical memory (bytes) snapshot=431075328
        Virtual memory (bytes) snapshot=3865038848
        Total committed heap usage (bytes)=285736960
    Shuffle Errors
        BAD_ID=0
        CONNECTION=0
        IO_ERROR=0
        WRONG_LENGTH=0
        WRONG_MAP=0
        WRONG_REDUCE=0
    File Input Format Counters 
        Bytes Read=650
    File Output Format Counters 
        Bytes Written=326
        
    

    If no Java errors are reported, the job succeeded. Check the result in the output directory:

    hdfs dfs -cat output/*

    6 dfs.audit.logger
    4 dfs.class
    3 dfs.logger
    3 dfs.server.namenode.
    2 dfs.audit.log.maxfilesize
    2 dfs.period
    2 dfs.audit.log.maxbackupindex
    1 dfsmetrics.log
    1 dfsadmin
    1 dfs.webhdfs.enabled
    1 dfs.servers
    1 dfs.replication
    1 dfs.permissions.enabled
    1 dfs.log
    1 dfs.file
    1 dfs.datanode.max.xcievers
    1 dfs.namenode.secondary.http
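    If you want a second smoke test, the same examples jar also contains a wordcount job; a minimal sketch (the output path output2 is just an example name, not part of the original procedure):

    hadoop jar share/hadoop/mapreduce/hadoop-mapreduce-examples-2.8.5.jar wordcount input output2
    hdfs dfs -cat output2/* | head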

    2.3.3. HBase Installation and Deployment

    Download the release from the official site: https://hbase.apache.org

    2.3.3.1. Extract HBase and Configure Environment Variables

    Extract the tarball into /opt/ and configure the environment under /opt/hbase-2.2.4/conf/:

    # hbase-env.sh
    export JAVA_HOME=/usr/lib/jvm/java-1.8.0-openjdk-amd64
    export HADOOP_HOME=/opt/hadoop-2.8.5
    export HBASE_MANAGES_ZK=true

    2.3.3.2. Modify the Related Configuration Files

    # Edit hbase-site.xml

    <property>
        <name>hbase.rootdir</name>
        <value>hdfs://master:9000/hbase</value>
    </property>
    <property>
        <name>hbase.cluster.distributed</name>
        <value>true</value>
    </property>
    <property>
        <name>hbase.zookeeper.quorum</name>
        <value>master,slave1,slave2</value>
    </property>
    <property>
        <name>hbase.zookeeper.property.dataDir</name>
        <value>/var/log/zookeeper/data</value>
    </property>
    <property>
        <name>hbase.master</name>
        <value>master</value>
    </property>
    <property>
        <name>zookeeper.znode.parent</name>
        <value>/hbase</value>
    </property>
    <property>
        <name>hbase.unsafe.stream.capability.enforce</name>
        <value>false</value>
    </property>

    # Edit regionservers: remove localhost and list all nodes
    master
    slave1
    slave2

    # Create a backup-masters file containing
    slave1

    2.3.3.3. Configure the HBase Environment Variables in ~/.bashrc

    export HBASE_HOME=/opt/hbase-2.2.4
    export PATH=$PATH:$JAVA_HOME/bin:$HBASE_HOME/bin:$HBASE_HOME/sbin

    source ~/.bashrc

    2.3.3.4. Copy the Configuration and the Whole Directory to the Other Two Nodes

    scp -r hbase-2.2.4 root@slave1:/opt/
    scp -r hbase-2.2.4 root@slave2:/opt/
    scp -r /root/.bashrc root@slave1:/root/
    scp -r /root/.bashrc root@slave2:/root/

    2.3.3.5. Start HBase

    With Hadoop already running, start HBase:

    start-hbase.sh

    The command prints the following:

    SLF4J: Class path contains multiple SLF4J bindings.
    SLF4J: Found binding in [jar:file:/opt/hadoop-2.8.5/share/hadoop/common/lib/slf4j-log4j12-1.7.10.jar!/org/slf4j/impl/StaticLoggerBinder.class]
    SLF4J: Found binding in [jar:file:/opt/hbase-2.2.4/lib/client-facing-thirdparty/slf4j-log4j12-1.7.25.jar!/org/slf4j/impl/StaticLoggerBinder.class]
    SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
    SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory]
    running master, logging to /opt/hbase-2.2.4/logs/hbase-root-master-master.out
    SLF4J: Class path contains multiple SLF4J bindings.
    SLF4J: Found binding in

    --- part of the output omitted ---

    slave1: SLF4J: Found binding in [jar:file:/opt/hbase-2.2.4/lib/client-facing-thirdparty/slf4j-log4j12-1.7.25.jar!/org/slf4j/impl/StaticLoggerBinder.class]
    slave1: SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
    slave1: SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory]

    2.3.3.6. Check the Status of Each Node

    Use the jps command to check the status of each node.

    # master: jps shows the following
    21232 Jps
    10755 NameNode
    10917 SecondaryNameNode
    11049 ResourceManager
    19673 HMaster
    19785 HRegionServer
    20717 QuorumPeerMain

    If HMaster does not come up, run cp $HBASE_HOME/lib/client-facing-thirdparty/htrace-core-3.1.0-incubating.jar $HBASE_HOME/lib/, then stop and start HBase again and re-check with jps.

    # On the slave nodes jps shows
    9921 DataNode
    20694 HRegionServer
    20790 HMaster    # only on slave1, because backup-masters was configured above; slave2 does not have this process
    20588 HQuorumPeer
    21036 Jps
    10013 NodeManager

    2.3.3.7. Enter the HBase Shell

    hbase shell

    SLF4J: Class path contains multiple SLF4J bindings.
    SLF4J: Found binding in [jar:file:/opt/hadoop-2.8.5/share/hadoop/common/lib/slf4j-log4j12-1.7.10.jar!/org/slf4j/impl/StaticLoggerBinder.class]
    SLF4J: Found binding in [jar:file:/opt/hbase-2.2.4/lib/client-facing-thirdparty/slf4j-log4j12-1.7.25.jar!/org/slf4j/impl/StaticLoggerBinder.class]
    SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
    SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory]
    HBase Shell
    Use "help" to get list of supported commands.
    Use "exit" to quit this interactive shell.
    For Reference, please visit: http://hbase.apache.org/2.0/book.html#shell
    Version 2.2.4, r67779d1a325a4f78a468af3339e73bf075888bac, Wed Mar 11 12:57:39 CST 2020
    Took 0.0024 seconds

    hbase(main):001:0> list
    TABLE
    0 row(s)
    Took 0.4137 seconds
    => []

    hbase(main):002:0> create 'hello','world'
    Created table hello
    Took 1.3559 seconds
    => Hbase::Table - hello

    hbase(main):003:0> list
    TABLE
    hello
    1 row(s)
    Took 0.0109 seconds
    => ["hello"]
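    To go one step further than create/list, below is a hedged sketch (not part of the original steps) that writes and reads one cell in the 'hello' table created above, using its 'world' column family; the row key, qualifier, and value are arbitrary examples, and the commands are piped into the shell non-interactively:

    echo -e "put 'hello','row1','world:msg','hi from hbase'\nget 'hello','row1'\nscan 'hello'" | hbase shell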

    2.3.3.8. Open the HBase Web UI

    The HBase master web UI shows all kinds of status information: http://172.16.96.51:16010/master-status

    2.3.4. Hive Installation and Deployment

    Download the release from the official site: https://hive.apache.org

    2.3.4.1. Install MySQL as the Hive Metastore Database

    # Install MySQL
    apt install mysql-server-5.7 -y
    mysqld --initialize
    systemctl restart mysql.service
    grep 'password' /var/log/mysql/error.log

    The last command prints the following:

    2021-02-27T18:04:51.766751Z 4 [Note] Access denied for user 'root'@'localhost' (using password: YES)
    2021-02-27T18:04:55.296064Z 5 [Note] Access denied for user 'root'@'localhost' (using password: YES)
    2021-02-27T18:05:09.316319Z 6 [Note] Access denied for user 'root'@'localhost' (using password: YES)
    2021-02-27T18:05:21.250506Z 7 [Note] Access denied for user 'root'@'localhost' (using password: YES)
    2021-02-27T18:05:39.867338Z 8 [Note] Access denied for user 'root'@'localhost' (using password: YES)
    2021-02-27T18:06:43.216458Z 1 [Note] A temporary password is generated for root@localhost: SkWzvjuVa9*g    # initial database password

    2.3.4.2. MySQL Configuration

    mysqladmin -uroot -p password
    Enter password:
    New password:
    Confirm new password:
    Warning: Since password will be sent to server in plain text, use ssl connection to ensure password safety.

    2.3.4.3. Create the Hive Database

    mysql -puosuos
    mysql: [Warning] Using a password on the command line interface can be insecure.
    Welcome to the MySQL monitor.  Commands end with ; or \g.
    Your MySQL connection id is 10
    Server version: 5.7.26-1 (Uos)

    Copyright (c) 2000, 2019, Oracle and/or its affiliates. All rights reserved.

    Oracle is a registered trademark of Oracle Corporation and/or its affiliates. Other names may be trademarks of their respective owners.

    Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.

    mysql> show databases;
    +--------------------+
    | Database           |
    +--------------------+
    | information_schema |
    | mysql              |
    | performance_schema |
    | sys                |
    +--------------------+
    4 rows in set (0.00 sec)

    mysql> create database hive;
    Query OK, 1 row affected (0.01 sec)

    mysql> grant all privileges on hive.* to 'root'@'localhost' identified by 'uosuos';
    Query OK, 0 rows affected, 1 warning (0.00 sec)

    mysql> flush privileges;
    Query OK, 0 rows affected (0.00 sec)
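    A quick, optional way to confirm from the shell that the metastore database exists and that the credentials used later in hive-site.xml (root / uosuos, as set above) work:

    mysql -uroot -puosuos -e "SHOW DATABASES LIKE 'hive';"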

    2.3.4.4. Install Hive

    Install Hive and do the related configuration. Extract the Hive package into /opt and create a symlink:

    ln -s apache-hive-2.3.4-bin/ hive

    2.3.4.5. Configure the Environment Variables

    echo -e '##################HIVE environment variables#############\nexport HIVE_HOME=/opt/hive\nexport PATH=$HIVE_HOME/bin:$PATH' >> ~/.bashrc && source ~/.bashrc && tail -3 ~/.bashrc

    ##################HIVE environment variables#############
    export HIVE_HOME=/opt/hive
    export PATH=$HIVE_HOME/bin:$PATH

    2.3.4.6. Configure Hive

    Configure Hive under hive/conf/ by creating hive-site.xml with the following properties:

    <property>
        <name>javax.jdo.option.ConnectionUserName</name>
        <value>root</value>
    </property>
    <property>
        <name>javax.jdo.option.ConnectionPassword</name>
        <value>uosuos</value>
    </property>
    <property>
        <name>javax.jdo.option.ConnectionURL</name>
        <value>jdbc:mysql://127.0.0.1:3306/hive?useSSL=false</value>
    </property>
    <property>
        <name>javax.jdo.option.ConnectionDriverName</name>
        <value>com.mysql.cj.jdbc.Driver</value>
    </property>
    <property>
        <name>hive.support.concurrency</name>
        <value>true</value>
    </property>

    # Copy the MySQL connector into Hive's lib directory. UOS ships mysql-connector-java_8.0.21-1debian10_all.deb.

    dpkg -i mysql-connector-java_8.0.21-1debian10_all.deb
    cp /usr/share/java/mysql-connector-java-8.0.21.jar /opt/hive/lib/
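    A trivial optional check that the driver jar really ended up on Hive's classpath before initializing the metastore:

    ls /opt/hive/lib/ | grep mysql-connector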

    2.3.4.7. Initialize the Metastore

    Initialize Hive's metastore (stored in MySQL):

    schematool -initSchema -dbType mysql

    The command prints the following:

    SLF4J: Class path contains multiple SLF4J bindings.
    SLF4J: Found binding in [jar:file:/opt/apache-hive-2.3.4-bin/lib/log4j-slf4j-impl-2.6.2.jar!/org/slf4j/impl/StaticLoggerBinder.class]
    SLF4J: Found binding in [jar:file:/opt/hadoop-2.8.5/share/hadoop/common/lib/slf4j-log4j12-1.7.10.jar!/org/slf4j/impl/StaticLoggerBinder.class]
    SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
    SLF4J: Actual binding is of type [org.apache.logging.slf4j.Log4jLoggerFactory]
    Metastore connection URL:        jdbc:mysql://127.0.0.1:3306/hive?useSSL=false
    Metastore Connection Driver :    com.mysql.cj.jdbc.Driver
    Metastore connection User:       root
    Starting metastore schema initialization to 2.3.0
    Initialization script hive-schema-2.3.0.mysql.sql
    Initialization script completed
    schemaTool completed

    2.3.4.8. Enter the Hive Command Line

    hive

    The command prints the following:

    SLF4J: Class path contains multiple SLF4J bindings.
    SLF4J: Found binding in [jar:file:/opt/apache-hive-2.3.4-bin/lib/log4j-slf4j-impl-2.6.2.jar!/org/slf4j/impl/StaticLoggerBinder.class]
    SLF4J: Found binding in [jar:file:/opt/hadoop-2.8.5/share/hadoop/common/lib/slf4j-log4j12-1.7.10.jar!/org/slf4j/impl/StaticLoggerBinder.class]
    SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
    SLF4J: Actual binding is of type [org.apache.logging.slf4j.Log4jLoggerFactory]

    Logging initialized using configuration in jar:file:/opt/apache-hive-2.3.4-bin/lib/hive-common-2.3.4.jar!/hive-log4j2.properties Async: true
    Hive-on-MR is deprecated in Hive 2 and may not be available in the future versions. Consider using a different execution engine (i.e. spark, tez) or using Hive 1.X releases.
    hive>
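    As a hedged smoke test (not part of the original steps), you can create and list a throwaway table straight from the shell; the table name demo_tbl is just an example:

    hive -e "CREATE TABLE IF NOT EXISTS demo_tbl (id INT, name STRING); SHOW TABLES;"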

    2.3.5. Spark Installation and Deployment

    Spark download page: http://spark.apache.org/releases/spark-release-2-4-0.html
    Scala download: https://downloads.lightbend.com/scala/2.13.0/scala-2.13.0.tgz

    2.3.5.1. Extract Scala

    Extract the Scala package into /usr/local and configure the environment variables in ~/.bashrc:

    export SCALA_HOME=/usr/local/scala-2.13.0
    export SCALA_CLASSPATH=$SCALA_HOME
    export PATH=$PATH:$SCALA_HOME/bin

    source ~/.bashrc
    scala -version

    The command prints the following:

    Scala code runner version 2.13.0 -- Copyright 2002-2019, LAMP/EPFL and Lightbend, Inc.

    2.3.5.2. Extract Spark

    Extract the Spark package into /opt and configure the environment variables:

    vim ~/.bashrc
    export SPARK_HOME=/opt/spark-2.3.3-bin-hadoop2.7
    export SPARK_CLASSPATH=$SPARK_HOME
    export PATH=$PATH:$SPARK_HOME/bin
    source ~/.bashrc

    2.3.5.3. Modify the Spark Configuration Files

    Edit the Spark configuration (in the conf folder of the installation directory):

    cp spark-env.sh.template spark-env.sh

    # Edit spark-env.sh
    SPARK_LOCAL_IP=master                       # IP or hostname of this machine
    SPARK_MASTER_IP=master                      # IP or hostname of the master node
    export HADOOP_CONF_DIR=/opt/hadoop-2.8.5    # Hadoop configuration path
    export YARN_CONF_DIR=/opt/hadoop-2.8.5      # YARN configuration path

    # Edit slaves
    cp slaves.template slaves
    slave1
    slave2

    # Distribute the packages and configuration to the other nodes (slave2 needs the same packages as slave1)
    scp -r /usr/local/scala-2.13.0/ root@slave1:/usr/local
    scp -r /opt/spark-2.3.3-bin-hadoop2.7/ root@slave1:/opt/
    scp -r /usr/local/scala-2.13.0/ root@slave2:/usr/local
    scp -r /opt/spark-2.3.3-bin-hadoop2.7/ root@slave2:/opt/
    scp -r /root/.bashrc root@slave1:/root/
    scp -r /root/.bashrc root@slave2:/root/

    2.3.5.4. Adjust Each Node

    On each slave node, change the SPARK_LOCAL_IP entry in spark-env.sh to that node's own name:

    vim /opt/spark-2.3.3-bin-hadoop2.7/conf/spark-env.sh
    SPARK_LOCAL_IP=slave1                       # IP or hostname of this machine
    SPARK_MASTER_IP=master                      # IP or hostname of the master node
    export HADOOP_CONF_DIR=/opt/hadoop-2.8.5    # Hadoop configuration path
    export YARN_CONF_DIR=/opt/hadoop-2.8.5      # YARN configuration path

    # To change the default 8080 port of the Spark web UI, edit start-master.sh under $SPARK_HOME/sbin, search for 8080, and replace it with the port you want.
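    Instead of editing each node by hand, the per-node change can also be scripted from the master; a small sketch (it assumes spark-env.sh on each slave already contains a SPARK_LOCAL_IP line, as configured above):

    for h in slave1 slave2; do
        ssh "$h" "sed -i 's/^SPARK_LOCAL_IP=.*/SPARK_LOCAL_IP=$h/' /opt/spark-2.3.3-bin-hadoop2.7/conf/spark-env.sh"
    done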

    2.3.5.5. Start the Services

    # Run from $SPARK_HOME/sbin on the master node
    ./start-all.sh

    The command prints the following:

    starting org.apache.spark.deploy.master.Master, logging to /opt/spark-2.3.3-bin-hadoop2.7/logs/spark-root-org.apache.spark.deploy.master.Master-1-master.out
    slave1: starting org.apache.spark.deploy.worker.Worker, logging to /opt/spark-2.3.3-bin-hadoop2.7/logs/spark-root-org.apache.spark.deploy.worker.Worker-1-slave1.out
    slave2: starting org.apache.spark.deploy.worker.Worker, logging to /opt/spark-2.3.3-bin-hadoop2.7/logs/spark-root-org.apache.spark.deploy.worker.Worker-1-slave2.out

    2.3.5.6. Check the Service Status

    # Check the status with jps. The master node shows a Master process (the slaves show Worker processes):

    34176 Master
    10755 NameNode
    10917 SecondaryNameNode
    34264 Jps
    11049 ResourceManager
    19673 HMaster
    19785 HRegionServer
    20717 QuorumPeerMain

    Spark's default port 8080 was already taken by another service; after changing it as described above, the Spark web UI can be opened at http://172.16.96.51:8081/
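    As a final, optional smoke test of the standalone cluster (not part of the original steps), you can submit the SparkPi example that ships with Spark. The sketch below assumes the default standalone master port 7077 and the example jar path of this Spark build; the exact jar version suffix may differ on your installation:

    cd /opt/spark-2.3.3-bin-hadoop2.7
    ./bin/spark-submit --master spark://master:7077 \
        --class org.apache.spark.examples.SparkPi \
        examples/jars/spark-examples_2.11-2.3.3.jar 100

    To realize the Chapter 1 pipeline (Spark reading from Hive and writing results to MySQL), you would additionally copy hive-site.xml into $SPARK_HOME/conf and pass the MySQL connector jar to spark-submit via --jars; the details depend on your job and are not covered by this manual.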