
Hadoop - Setup ENV

Published: 2024-11-24, by the 热门IT资讯网 editors. Last updated: 2024-11-24.

Environment:

Xshell: 5

Xftp: 4

Virtual Box: 5.16

Linux: CentOS-7-x86_64-Minimal-1511

Vim: yum -y install vim-enhanced

JDK: 8

Hadoop: 2.7.3.tar.gz


After installing Linux in VirtualBox, set the network interface to start automatically.

Check the machine's network interfaces:

nmcli d

You should see one interface: enp0s3


Open the interface configuration file with vi:

vi /etc/sysconfig/network-scripts/ifcfg-enp0s3


Change the last line from ONBOOT=no to ONBOOT=yes.
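If you prefer not to open vi, the same change can be made with one sed command. This is a sketch that assumes the stock CentOS 7 path; it is demonstrated here on a scratch copy so the edit can be inspected before touching /etc:

```shell
# Stand-in for /etc/sysconfig/network-scripts/ifcfg-enp0s3:
CFG=/tmp/ifcfg-enp0s3
printf 'DEVICE=enp0s3\nBOOTPROTO=dhcp\nONBOOT=no\n' > "$CFG"

# Flip ONBOOT in place:
sed -i 's/^ONBOOT=no$/ONBOOT=yes/' "$CFG"
grep '^ONBOOT=' "$CFG"
```

On the real system, point CFG at /etc/sysconfig/network-scripts/ifcfg-enp0s3 and restart networking (systemctl restart network) so the change takes effect.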


For reference, the fields in an ifcfg file:

DEVICE=eth0

Device alias for the interface; in the file ifcfg-eth0 it is eth0.

BOOTPROTO=static

How the interface obtains its IP address: static (manually assigned), dhcp (via the DHCP protocol), or bootp (via the BOOTP protocol).

BROADCAST=192.168.0.255

Broadcast address of the subnet.

HWADDR=00:07:E9:05:E8:B4

Physical (MAC) address of the interface.

IPADDR=192.168.1.2

When BOOTPROTO=static, the IP address assigned to the interface.

IPV6INIT=no

Enable (yes) or disable (no) IPv6.

IPV6_AUTOCONF=no

Enable (yes) or disable (no) IPv6 autoconfiguration.

NETMASK=255.255.255.0

Network mask for the interface.

NETWORK=192.168.1.0

Network address for the interface.

ONBOOT=yes

Whether to bring this interface up at boot; yes activates the device at system startup.
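Putting the fields together, a complete static-IP ifcfg-enp0s3 might look like the following. The addresses are illustrative only (the VM later in this article lives on 192.168.0.x), so adjust them to your own subnet:

```shell
# /etc/sysconfig/network-scripts/ifcfg-enp0s3 -- illustrative values
DEVICE=enp0s3
BOOTPROTO=static
IPADDR=192.168.1.2
NETMASK=255.255.255.0
NETWORK=192.168.1.0
BROADCAST=192.168.1.255
IPV6INIT=no
ONBOOT=yes
```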




Install Hadoop

[root@centosmaster opt]# tar zxf hadoop-2.7.3.tar.gz
[root@centosmaster opt]# cd hadoop-2.7.3
[root@centosmaster opt]# cd /opt/hadoop-2.7.3/etc/hadoop

core-site.xml

<configuration>
  <property>
    <name>fs.defaultFS</name>
    <value>hdfs://CentOS_105:9000</value>
  </property>
  <property>
    <name>hadoop.tmp.dir</name>
    <value>file:/opt/hadoop-2.7.3/current/tmp</value>
  </property>
  <property>
    <name>fs.trash.interval</name>
    <value>8</value>
  </property>
</configuration>

hdfs-site.xml

<configuration>
  <property>
    <name>dfs.namenode.name.dir</name>
    <value>/opt/hadoop-2.7.3/current/dfs/name</value>
  </property>
  <property>
    <name>dfs.datanode.data.dir</name>
    <value>/opt/hadoop-2.7.3/current/data</value>
  </property>
  <property>
    <name>dfs.replication</name>
    <value>1</value>
  </property>
  <property>
    <name>dfs.webhdfs.enabled</name>
    <value>true</value>
  </property>
  <property>
    <name>dfs.permissions.superusergroup</name>
    <value>staff</value>
  </property>
  <property>
    <name>dfs.permissions.enabled</name>
    <value>false</value>
  </property>
</configuration>

yarn-site.xml

<configuration>
  <property>
    <name>yarn.resourcemanager.hostname</name>
    <value>centosmaster</value>
  </property>
  <property>
    <name>yarn.nodemanager.aux-services</name>
    <value>mapreduce_shuffle</value>
  </property>
  <property>
    <name>yarn.nodemanager.aux-services.mapreduce.shuffle.class</name>
    <value>org.apache.hadoop.mapred.ShuffleHandler</value>
  </property>
  <property>
    <name>yarn.resourcemanager.address</name>
    <value>centosmaster:18040</value>
  </property>
  <property>
    <name>yarn.resourcemanager.scheduler.address</name>
    <value>centosmaster:18030</value>
  </property>
  <property>
    <name>yarn.resourcemanager.resource-tracker.address</name>
    <value>centosmaster:18025</value>
  </property>
  <property>
    <name>yarn.resourcemanager.admin.address</name>
    <value>centosmaster:18141</value>
  </property>
  <property>
    <name>yarn.resourcemanager.webapp.address</name>
    <value>centosmaster:18088</value>
  </property>
  <property>
    <name>yarn.log-aggregation-enable</name>
    <value>true</value>
  </property>
  <property>
    <name>yarn.log-aggregation.retain-seconds</name>
    <value>86400</value>
  </property>
  <property>
    <name>yarn.log-aggregation.retain-check-interval-seconds</name>
    <value>86400</value>
  </property>
  <property>
    <name>yarn.nodemanager.remote-app-log-dir</name>
    <value>/tmp/logs</value>
  </property>
  <property>
    <name>yarn.nodemanager.remote-app-log-dir-suffix</name>
    <value>logs</value>
  </property>
</configuration>

mapred-site.xml

<configuration>
  <property>
    <name>mapreduce.framework.name</name>
    <value>yarn</value>
  </property>
  <property>
    <name>mapreduce.jobtracker.http.address</name>
    <value>centosmaster:50030</value>
  </property>
  <property>
    <name>mapreduce.jobhistory.address</name>
    <value>centosmaster:10020</value>
  </property>
  <property>
    <name>mapreduce.jobhistory.webapp.address</name>
    <value>centosmaster:19888</value>
  </property>
  <property>
    <name>mapreduce.jobhistory.done-dir</name>
    <value>/jobhistory/done</value>
  </property>
  <property>
    <name>mapreduce.jobhistory.intermediate-done-dir</name>
    <value>/jobhistory/one_intermediate</value>
  </property>
  <property>
    <name>mapreduce.job.ubertask.enable</name>
    <value>true</value>
  </property>
</configuration>

Add the local hostname to the slaves file to designate this machine as a slave node:

centosmaster


Point Hadoop at the Java JDK

vim hadoop-env.sh

# The java implementation to use.
export JAVA_HOME=/usr/java/jdk1.8.0_111/
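A quick way to double-check the value before pasting it into hadoop-env.sh is to strip the trailing /bin/java from the path of the java binary. The path below is the one used in this article; in practice you would resolve it with readlink -f "$(command -v java)" to follow the alternatives symlink:

```shell
# Derive JAVA_HOME from the java binary path (article's path as a stand-in):
JAVA_BIN=/usr/java/jdk1.8.0_111/bin/java
JAVA_HOME=${JAVA_BIN%/bin/java}   # shell suffix-strip, no external tools
echo "export JAVA_HOME=$JAVA_HOME"
```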

Format the HDFS filesystem

[root@centosmaster ~]# hdfs namenode -format
************************************************************/
16/10/23 08:58:31 INFO namenode.NameNode: registered UNIX signal handlers for [TERM, HUP, INT]
16/10/23 08:58:31 INFO namenode.NameNode: createNameNode [-format]
16/10/23 08:58:31 WARN common.Util: Path /opt/hadoop-2.7.3/current/dfs/name should be specified as a URI in configuration files. Please update hdfs configuration.
16/10/23 08:58:31 WARN common.Util: Path /opt/hadoop-2.7.3/current/dfs/name should be specified as a URI in configuration files. Please update hdfs configuration.
Formatting using clusterid: CID-1294bdbb-d45c-49f3-b5c5-3d26934e084f
16/10/23 08:58:32 INFO namenode.FSNamesystem: No KeyProvider found.
16/10/23 08:58:32 INFO namenode.FSNamesystem: fsLock is fair:true
16/10/23 08:58:32 INFO blockmanagement.DatanodeManager: dfs.block.invalidate.limit=1000
16/10/23 08:58:32 INFO blockmanagement.DatanodeManager: dfs.namenode.datanode.registration.ip-hostname-check=true
16/10/23 08:58:32 INFO blockmanagement.BlockManager: dfs.namenode.startup.delay.block.deletion.sec is set to 000:00:00:00.000
16/10/23 08:58:32 INFO blockmanagement.BlockManager: The block deletion will start around 2016 Oct 23 08:58:32
16/10/23 08:58:32 INFO util.GSet: Computing capacity for map BlocksMap
16/10/23 08:58:32 INFO util.GSet: VM type       = 64-bit
16/10/23 08:58:32 INFO util.GSet: 2.0% max memory 966.7 MB = 19.3 MB
16/10/23 08:58:32 INFO util.GSet: capacity      = 2^21 = 2097152 entries
16/10/23 08:58:32 INFO blockmanagement.BlockManager: dfs.block.access.token.enable=false
16/10/23 08:58:32 INFO blockmanagement.BlockManager: defaultReplication         = 1
16/10/23 08:58:32 INFO blockmanagement.BlockManager: maxReplication             = 512
16/10/23 08:58:32 INFO blockmanagement.BlockManager: minReplication             = 1
16/10/23 08:58:32 INFO blockmanagement.BlockManager: maxReplicationStreams      = 2
16/10/23 08:58:32 INFO blockmanagement.BlockManager: replicationRecheckInterval = 3000
16/10/23 08:58:32 INFO blockmanagement.BlockManager: encryptDataTransfer        = false
16/10/23 08:58:32 INFO blockmanagement.BlockManager: maxNumBlocksToLog          = 1000
16/10/23 08:58:32 INFO namenode.FSNamesystem: fsOwner             = root (auth:SIMPLE)
16/10/23 08:58:32 INFO namenode.FSNamesystem: supergroup          = staff
16/10/23 08:58:32 INFO namenode.FSNamesystem: isPermissionEnabled = false
16/10/23 08:58:32 INFO namenode.FSNamesystem: HA Enabled: false
16/10/23 08:58:32 INFO namenode.FSNamesystem: Append Enabled: true
16/10/23 08:58:32 INFO util.GSet: Computing capacity for map INodeMap
16/10/23 08:58:32 INFO util.GSet: VM type       = 64-bit
16/10/23 08:58:32 INFO util.GSet: 1.0% max memory 966.7 MB = 9.7 MB
16/10/23 08:58:32 INFO util.GSet: capacity      = 2^20 = 1048576 entries
16/10/23 08:58:32 INFO namenode.FSDirectory: ACLs enabled? false
16/10/23 08:58:32 INFO namenode.FSDirectory: XAttrs enabled? true
16/10/23 08:58:32 INFO namenode.FSDirectory: Maximum size of an xattr: 16384
16/10/23 08:58:32 INFO namenode.NameNode: Caching file names occuring more than 10 times
16/10/23 08:58:32 INFO util.GSet: Computing capacity for map cachedBlocks
16/10/23 08:58:32 INFO util.GSet: VM type       = 64-bit
16/10/23 08:58:32 INFO util.GSet: 0.25% max memory 966.7 MB = 2.4 MB
16/10/23 08:58:32 INFO util.GSet: capacity      = 2^18 = 262144 entries
16/10/23 08:58:32 INFO namenode.FSNamesystem: dfs.namenode.safemode.threshold-pct = 0.9990000128746033
16/10/23 08:58:32 INFO namenode.FSNamesystem: dfs.namenode.safemode.min.datanodes = 0
16/10/23 08:58:32 INFO namenode.FSNamesystem: dfs.namenode.safemode.extension     = 30000
16/10/23 08:58:32 INFO metrics.TopMetrics: NNTop conf: dfs.namenode.top.window.num.buckets = 10
16/10/23 08:58:32 INFO metrics.TopMetrics: NNTop conf: dfs.namenode.top.num.users = 10
16/10/23 08:58:32 INFO metrics.TopMetrics: NNTop conf: dfs.namenode.top.windows.minutes = 1,5,25
16/10/23 08:58:32 INFO namenode.FSNamesystem: Retry cache on namenode is enabled
16/10/23 08:58:32 INFO namenode.FSNamesystem: Retry cache will use 0.03 of total heap and retry cache entry expiry time is 600000 millis
16/10/23 08:58:32 INFO util.GSet: Computing capacity for map NameNodeRetryCache
16/10/23 08:58:32 INFO util.GSet: VM type       = 64-bit
16/10/23 08:58:32 INFO util.GSet: 0.029999999329447746% max memory 966.7 MB = 297.0 KB
16/10/23 08:58:32 INFO util.GSet: capacity      = 2^15 = 32768 entries
16/10/23 08:58:32 INFO namenode.FSImage: Allocated new BlockPoolId: BP-1532573559-192.168.0.105-1477184312651
16/10/23 08:58:32 INFO common.Storage: Storage directory /opt/hadoop-2.7.3/current/dfs/name has been successfully formatted.
16/10/23 08:58:32 INFO namenode.FSImageFormatProtobuf: Saving image file /opt/hadoop-2.7.3/current/dfs/name/current/fsimage.ckpt_0000000000000000000 using no compression
16/10/23 08:58:32 INFO namenode.FSImageFormatProtobuf: Image file /opt/hadoop-2.7.3/current/dfs/name/current/fsimage.ckpt_0000000000000000000 of size 346 bytes saved in 0 seconds.
16/10/23 08:58:32 INFO namenode.NNStorageRetentionManager: Going to retain 1 images with txid >= 0
16/10/23 08:58:32 INFO util.ExitUtil: Exiting with status 0
16/10/23 08:58:32 INFO namenode.NameNode: SHUTDOWN_MSG:
/************************************************************
SHUTDOWN_MSG: Shutting down NameNode at CentOS_105/192.168.0.105
************************************************************/

The log output shows the format succeeded:

INFO common.Storage: Storage directory /opt/hadoop-2.7.3/current/dfs/name has been successfully formatted.

There is a warning about the HDFS path; fix it in hdfs-site.xml by writing the value as a URI:

dfs.namenode.name.dir

Before: /opt/hadoop-2.7.3/current/dfs/name
After:  file:///opt/hadoop-2.7.3/current/dfs/name


Format again:

hdfs namenode -format
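One caveat worth noting (this is general HDFS behavior, not something shown in the article's log): each format assigns a fresh clusterID, so if a DataNode still has data directories from a previous format it will refuse to register with the new NameNode. Clearing the directories configured in hdfs-site.xml before reformatting avoids this. Demonstrated on scratch directories that stand in for /opt/hadoop-2.7.3/current:

```shell
# Stand-in for the dirs configured in hdfs-site.xml:
BASE=/tmp/hadoop-demo/current
mkdir -p "$BASE/dfs/name/current" "$BASE/data/current"   # leftovers from an old format

# Remove the stale metadata and block data before re-running the format:
rm -rf "$BASE/dfs/name" "$BASE/data"
ls "$BASE"
```

On the real node you would remove /opt/hadoop-2.7.3/current/dfs/name and /opt/hadoop-2.7.3/current/data, then run `hdfs namenode -format` again.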

Check the hostname:

hostnamectl

Change the hostname:

[root@centosmaster ~]# hostnamectl set-hostname "centosmaster"


Start Hadoop:

[root@centosmaster hadoop-2.7.3]# sbin/start-all.sh
This script is Deprecated. Instead use start-dfs.sh and start-yarn.sh
Starting namenodes on [localhost]
localhost: starting namenode, logging to /opt/hadoop-2.7.3/logs/hadoop-root-namenode-centosmaster.out
centosmaster: starting datanode, logging to /opt/hadoop-2.7.3/logs/hadoop-root-datanode-centosmaster.out
Starting secondary namenodes [Centosmaster]
Centosmaster: starting secondarynamenode, logging to /opt/hadoop-2.7.3/logs/hadoop-root-secondarynamenode-centosmaster.out
starting yarn daemons
starting resourcemanager, logging to /opt/hadoop-2.7.3/logs/yarn-root-resourcemanager-centosmaster.out
centosmaster: starting nodemanager, logging to /opt/hadoop-2.7.3/logs/yarn-root-nodemanager-centosmaster.out

Use jps to see which daemons are running:

[root@centosmaster hadoop]# jps
2546 NodeManager
3090 SecondaryNameNode
3348 Jps
2201 DataNode
2109 NameNode
2447 ResourceManager
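For this single-node layout, all five daemons should appear. A small sketch of an automated check follows; in practice you would capture live output with out=$(jps), but it is canned here so the logic can be followed without a running cluster:

```shell
# Canned jps output (stand-in for: out=$(jps)):
out='2546 NodeManager
3090 SecondaryNameNode
3348 Jps
2201 DataNode
2109 NameNode
2447 ResourceManager'

# grep -w matches whole words, so "NameNode" does not false-match
# inside "SecondaryNameNode":
for d in NameNode DataNode SecondaryNameNode ResourceManager NodeManager; do
  echo "$out" | grep -qw "$d" && echo "OK: $d" || echo "MISSING: $d"
done
```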

Stop Hadoop:

sbin/stop-all.sh


Verification:


Problem 1: Permissions

[root@CentOS_105 jdk1.8.0_111]# java -version
bash: /usr/java/jdk1.8.0_111//bin/java: Permission denied

Fix: chmod 777 /usr/java/jdk1.8.0_111/bin/java
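A narrower fix is to grant the execute bit without also making the binary world-writable, which 777 does. Demonstrated on a stand-in file rather than the real JDK path:

```shell
# Stand-in for /usr/java/jdk1.8.0_111/bin/java:
f=/tmp/java-perms-demo
touch "$f"
chmod 644 "$f"    # reproduce the broken state: readable but not executable
chmod 755 "$f"    # rwxr-xr-x: owner can write, everyone can execute
[ -x "$f" ] && echo "executable"
```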


Problem 2: Configuration

[root@centos_1 hadoop-2.7.3]# sbin/start-all.sh
This script is Deprecated. Instead use start-dfs.sh and start-yarn.sh
Incorrect configuration: namenode address dfs.namenode.servicerpc-address or dfs.namenode.rpc-address is not configured.
Starting namenodes on []

Fix: add this property to etc/hadoop/core-site.xml:

<property>
  <name>fs.default.name</name>
  <value>hdfs://127.0.0.1:9000</value>
</property>


Problem 3: Hostname

Does not contain a valid host:port authority:

Cause: certain special characters in the hostname break Hadoop's XML configuration parsing.

Fix: the machine's hostname is invalid; change it to one containing no illegal characters such as '.', '/' or '_'.
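The underlying rule (RFC 952/1123, stated here as background rather than taken from the article) is that hostname labels may contain only letters, digits and hyphens, and may not begin or end with a hyphen. A small validator sketch:

```shell
# Validate a single hostname label per RFC 952/1123:
valid() { echo "$1" | grep -Eq '^[A-Za-z0-9]([A-Za-z0-9-]*[A-Za-z0-9])?$'; }

valid centosmaster && echo "centosmaster: ok"
valid CentOS_105   || echo "CentOS_105: rejected (underscore)"
valid centos_1     || echo "centos_1: rejected (underscore)"
```

This explains why the hosts named centos_1 and CentOS_105 earlier in this article trigger the "Does not contain a valid host:port authority" error, while centosmaster works.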


References

Network interface configuration: http://www.krizna.com/centos/setup-network-centos-7/

JDK installation walkthrough: http://www.cnblogs.com/wangfajun/p/5257899.html

