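[Hadoop] Hive Installation Walkthrough
1. Download the Hive release package:
Official download page: http://hive.apache.org/downloads.html
2. Upload the Hive tarball and extract it:
Keeping it at the same level as the Hadoop directory makes later use easier.
Extract (the target directory must already exist): tar -zxvf apache-hive-1.2.1-bin.tar.gz -C /home/hadoop/hive
Rename the extracted directory (run inside /home/hadoop/hive): mv apache-hive-1.2.1-bin hive-1.2.1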
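The extract-and-rename step can be wrapped in a small shell function. This is only a sketch: the function name install_hive is made up for illustration, and it assumes the tarball unpacks into a folder named after the archive without the .tar.gz suffix, as apache-hive-1.2.1-bin does above.

```shell
# Extract a Hive release tarball into dest_dir and rename the unpacked folder.
# Arguments: tarball path, destination directory, new folder name.
# install_hive is a hypothetical helper name, not part of Hive.
install_hive() {
  tarball=$1
  dest_dir=$2
  new_name=$3
  unpacked=$(basename "$tarball" .tar.gz)   # e.g. apache-hive-1.2.1-bin

  mkdir -p "$dest_dir" || return 1          # tar -C fails if the directory is missing
  tar -zxf "$tarball" -C "$dest_dir" || return 1
  mv "$dest_dir/$unpacked" "$dest_dir/$new_name"
}

# Example call with the paths used in this article:
# install_hive apache-hive-1.2.1-bin.tar.gz /home/hadoop/hive hive-1.2.1
```

Creating the destination with mkdir -p first avoids the most common failure of this step, which is tar refusing to change into a directory that does not exist yet.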
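3. Install MySQL:
MySQL stores Hive's metadata (see the earlier article for installation steps).
4. Edit the configuration file; the main task is to configure how the metastore (metadata store) is persisted.
4.1 vi /home/hadoop/hive/hive-1.2.1/conf/hive-site.xml (storage options: embedded Derby, local MySQL, remote MySQL)
4.2 Paste the following content: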
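<configuration>
  <property>
    <name>javax.jdo.option.ConnectionURL</name>
    <value>jdbc:mysql://localhost:3306/hive?createDatabaseIfNotExist=true</value>
    <description>JDBC connect string for a JDBC metastore</description>
  </property>
  <property>
    <name>javax.jdo.option.ConnectionDriverName</name>
    <value>com.mysql.jdbc.Driver</value>
    <description>Driver class name for a JDBC metastore</description>
  </property>
  <property>
    <name>javax.jdo.option.ConnectionUserName</name>
    <value>root</value>
    <description>username to use against metastore database</description>
  </property>
  <property>
    <name>javax.jdo.option.ConnectionPassword</name>
    <value>root</value>
    <description>password to use against metastore database</description>
  </property>
</configuration>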
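After saving hive-site.xml it can help to read a property back and confirm the file contains what was intended. The function below is a rough sketch using awk; hive_property is a hypothetical helper name, and it assumes each property is written as a name element followed by a value element, which is the usual layout of this file. It is not a real XML parser.

```shell
# Print the value of one property from a hive-site.xml style file.
# Arguments: path to the XML file, property name.
# hive_property is a hypothetical helper, not a Hive command.
hive_property() {
  awk -v name="$2" '
    # Remember that the wanted <name> element has been seen.
    index($0, "<name>" name "</name>") { found = 1 }
    # The first <value> at or after that point belongs to the property.
    found && /<value>/ {
      sub(/.*<value>/, "")
      sub(/<\/value>.*/, "")
      print
      exit
    }
  ' "$1"
}

# Example call with the path used in this article:
# hive_property /home/hadoop/hive/hive-1.2.1/conf/hive-site.xml javax.jdo.option.ConnectionURL
```

If the function prints nothing, the property name is missing or misspelled in the file, which is worth checking before starting Hive.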
5. Copy the JAR:
Copy the MySQL JDBC driver JAR into Hive's lib directory.
Download link: https://pan.baidu.com/s/17iHOIjt4XZbRAngGFf_GgA
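The copy itself is a single cp, but a quick check afterwards saves a confusing driver-not-found error later. A minimal sketch follows; install_jdbc_driver is a made-up helper name and mysql-connector-java.jar is a placeholder, since the exact file name depends on the driver version downloaded.

```shell
# Copy a MySQL JDBC driver jar into Hive's lib directory and confirm it arrived.
# Arguments: path to the driver jar, Hive installation directory.
# install_jdbc_driver is a hypothetical helper name.
install_jdbc_driver() {
  jar=$1
  hive_home=$2
  cp "$jar" "$hive_home/lib/" || return 1
  # ls fails (non-zero exit) if the jar is not where Hive will look for it.
  ls "$hive_home/lib/$(basename "$jar")"
}

# Example call; the jar file name is a placeholder:
# install_jdbc_driver mysql-connector-java.jar /home/hadoop/hive/hive-1.2.1
```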
6. Start Hive:
(1) Start the Hadoop cluster before starting Hive.
(2) Run as the hadoop user.
Start command: /home/hadoop/hive/hive-1.2.1/bin/hive
The following prompt indicates a successful start:
hive>
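One way to confirm that Hadoop is up before launching Hive is to look for its daemons in the output of jps, the JDK tool that lists running Java processes. The sketch below assumes Hive is started on the host that runs the NameNode and that HIVE_HOME points at the Hive installation directory; daemons_running is a hypothetical helper name.

```shell
# Read `jps` output on stdin and succeed only if every daemon named in the
# arguments appears in it. daemons_running is a hypothetical helper name.
daemons_running() {
  listing=$(cat)
  for daemon in "$@"; do
    # jps prints "<pid> <class name>", so compare against the second column.
    printf '%s\n' "$listing" |
      awk -v d="$daemon" '$2 == d { ok = 1 } END { exit !ok }' || return 1
  done
}

# Example (HIVE_HOME is an assumed variable, e.g. /home/hadoop/hive/hive-1.2.1):
# jps | daemons_running NameNode && "$HIVE_HOME/bin/hive"
```

On a multi-node cluster the DataNode processes live on other machines, so this only checks what is visible locally.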
7. Verify that Hive works: after starting Hive, enter the following commands:
hive> show databases;
OK
default
test_db
Time taken: 0.567 seconds, Fetched: 2 row(s)
hive> use default;
OK
Time taken: 0.068 seconds
hive> show tables;
OK
Time taken: 0.086 seconds
8. Create a database; its data files are stored in HDFS under /user/hive/warehouse/test_db.db
hive> create database test_db;
OK
Time taken: 0.505 seconds
9. Create a table in test_db; the table's data files are stored in HDFS under /user/hive/warehouse/test_db.db/flat1_test,
and the fields in the data files are separated by "|":
use test_db;
create table flat1_test (mobile string,opr_type string,lastupdatetime string,monthly string,sp_code string,oper_code string,unknown string,subtime string)
row format delimited
fields terminated by '|';
10. Upload the data file to the target HDFS directory, which is the Hive table's data directory:
hadoop fs -put hivefile1.txt /user/hive/warehouse/test_db.db/flat1_test
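For a dry run without real data, a small test file in the expected format can be generated and checked before uploading. Everything in this sketch is illustrative: the file name sample_hivefile.txt and the two records are made up, and the only thing taken from the article is that each line needs 8 fields separated by "|" to match the columns of flat1_test.

```shell
# Write two made-up records, 8 fields each, separated by "|".
cat > sample_hivefile.txt <<'EOF'
13800000001|1|2018-01-01 10:00:00|Y|SP001|OP001|x|2018-01-01 09:59:00
13800000002|2|2018-01-02 11:30:00|N|SP002|OP002|x|2018-01-02 11:29:00
EOF

# Count lines whose field count differs from the 8 columns of flat1_test.
bad_lines=$(awk -F'|' 'NF != 8 { n++ } END { print n + 0 }' sample_hivefile.txt)
echo "lines with wrong field count: $bad_lines"

# Upload only when the Hadoop client is available on this machine.
if command -v hadoop >/dev/null 2>&1; then
  hadoop fs -put sample_hivefile.txt /user/hive/warehouse/test_db.db/flat1_test
fi
```

Lines with too few fields do not make Hive fail; the missing columns simply come back as NULL, so checking the count up front makes such problems visible early.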
11. Query the data with SQL:
hive> select * from flat1_test;
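The statements from steps 8, 9 and 11 can also be kept in a script file and run non-interactively with hive -f. The sketch below is one possible arrangement, not the procedure followed above: the file name create_flat1_test.hql is made up, IF NOT EXISTS is added so the script can be rerun safely, and LIMIT 10 is added only to keep the output short.

```shell
# Write the statements from steps 8, 9 and 11 into a script file.
cat > create_flat1_test.hql <<'EOF'
CREATE DATABASE IF NOT EXISTS test_db;
USE test_db;
CREATE TABLE IF NOT EXISTS flat1_test (
  mobile string, opr_type string, lastupdatetime string, monthly string,
  sp_code string, oper_code string, unknown string, subtime string
)
ROW FORMAT DELIMITED
FIELDS TERMINATED BY '|';
SELECT * FROM flat1_test LIMIT 10;
EOF

# Run it only if the hive launcher is on PATH; otherwise leave the file in place.
if command -v hive >/dev/null 2>&1; then
  hive -f create_flat1_test.hql
else
  echo "hive not found on PATH; script saved as create_flat1_test.hql"
fi
```

Keeping the DDL in a file makes it easy to recreate the same table on another cluster.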
12. Inspect Hive's metadata by querying it in MySQL:
mysql> show databases;
+--------------------+
| Database           |
+--------------------+
| information_schema |
| hive               |
| mysql              |
| performance_schema |
| test               |
+--------------------+
5 rows in set (0.00 sec)

mysql> use hive;
Reading table information for completion of table and column names
You can turn off this feature to get a quicker startup with -A

Database changed
mysql> show tables;
+---------------------------+
| Tables_in_hive            |
+---------------------------+
| BUCKETING_COLS            |
| CDS                       |
| COLUMNS_V2                |
| DATABASE_PARAMS           |
| DBS                       |
| FUNCS                     |
| FUNC_RU                   |
| GLOBAL_PRIVS              |
| IDXS                      |
| INDEX_PARAMS              |
| PARTITIONS                |
| PARTITION_KEYS            |
| PARTITION_KEY_VALS        |
| PARTITION_PARAMS          |
| PART_COL_PRIVS            |
| PART_COL_STATS            |
| PART_PRIVS                |
| ROLES                     |
| SDS                       |
| SD_PARAMS                 |
| SEQUENCE_TABLE            |
| SERDES                    |
| SERDE_PARAMS              |
| SKEWED_COL_NAMES          |
| SKEWED_COL_VALUE_LOC_MAP  |
| SKEWED_STRING_LIST        |
| SKEWED_STRING_LIST_VALUES |
| SKEWED_VALUES             |
| SORT_COLS                 |
| TABLE_PARAMS              |
| TAB_COL_STATS             |
| TBLS                      |
| TBL_COL_PRIVS             |
| TBL_PRIVS                 |
| VERSION                   |
+---------------------------+
35 rows in set (0.01 sec)

mysql> select * from DBS;
+-------+-----------------------+-------------------------------------------------------+---------+------------+------------+
| DB_ID | DESC                  | DB_LOCATION_URI                                       | NAME    | OWNER_NAME | OWNER_TYPE |
+-------+-----------------------+-------------------------------------------------------+---------+------------+------------+
|     1 | Default Hive database | hdfs://XXXXXXXXXX:9000/user/hive/warehouse            | default | public     | ROLE       |
|     6 | NULL                  | hdfs://XXXXXXXXXX:9000/user/hive/warehouse/test_db.db | test_db | hadoop     | USER       |
+-------+-----------------------+-------------------------------------------------------+---------+------------+------------+
2 rows in set (0.00 sec)

mysql>