Hive 2 Installation

Prerequisites

A compatible version of Hadoop is installed, and HDFS and YARN can be brought up successfully.
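
A quick way to confirm this (a sketch assuming a single-node setup where the daemons run locally):

$ jps                     # expect NameNode, DataNode, ResourceManager and NodeManager in the output
$ hdfs dfsadmin -report   # HDFS should report at least one live datanode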

Extract the Archive

$ sudo mv apache-hive-2.3.3-bin.tar.gz /usr/local/
$ cd /usr/local/
$ sudo tar -zxvf apache-hive-2.3.3-bin.tar.gz
$ sudo mv apache-hive-2.3.3-bin hive
$ sudo chown -R hadoop:hadoop hive

Configure Environment Variables

$ vim ~/.bashrc
export HIVE_HOME=/usr/local/hive
export PATH=$PATH:$HIVE_HOME/bin
export CLASSPATH=$CLASSPATH:/usr/local/hadoop/lib/*:.
export CLASSPATH=$CLASSPATH:/usr/local/hive/lib/*:.

$ source ~/.bashrc
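
A quick check that the variables took effect:

$ echo $HIVE_HOME    # should print /usr/local/hive
$ hive --version     # should report Hive 2.3.3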

Configure Hive

$ cd /usr/local/hive
$ cp conf/hive-env.sh.template conf/hive-env.sh
$ vim conf/hive-env.sh
HADOOP_HOME=/usr/local/hadoop
export HIVE_CONF_DIR=/usr/local/hive/conf

$ cp conf/hive-default.xml.template conf/hive-site.xml
$ vim conf/hive-site.xml
  • system:java.io.tmpdir & system:user.name — the template-derived hive-site.xml references ${system:java.io.tmpdir}/${system:user.name}, which Hive cannot expand at startup, so define both explicitly (create /usr/local/hive/tmp beforehand)

    <property>
        <name>system:java.io.tmpdir</name>
        <value>/usr/local/hive/tmp</value>
    </property>
    <property>
        <name>system:user.name</name>
        <value>jokers</value>
    </property>
    
  • MySQL connection settings (the hive account must already exist on the MySQL server; a grant sketch follows this list)

    <property>
        <name>javax.jdo.option.ConnectionDriverName</name>
        <value>com.mysql.jdbc.Driver</value>
        <description>Driver class name for a JDBC metastore</description>
    </property>
    
    <property>
        <name>javax.jdo.option.ConnectionURL</name>
        <value>jdbc:mysql://xx.xxx.xx.xx:3306/hive?createDatabaseIfNotExist=true</value>
        <description>JDBC connect string for a JDBC metastore</description>
    </property>
    
    <property>
        <name>javax.jdo.option.ConnectionUserName</name>
        <value>hive</value>
        <description>Username to use against metastore database</description>
    </property>
    
    <property>
        <name>javax.jdo.option.ConnectionPassword</name>
        <value>hive</value>
        <description>password to use against metastore database</description>
    </property>
    
  • Hive warehouse storage locations; note the configured defaults

    <property>
        <name>hive.metastore.warehouse.dir</name>
        <value>/user/hive/warehouse</value>
        <description>location of default database for the warehouse</description>
    </property>
    
    <property>
        <name>hive.exec.scratchdir</name>
        <value>/tmp/hive</value>
        <description>HDFS root scratch dir for Hive jobs which gets created with write all (733) permission. For each connecting user, an HDFS scratch dir: ${hive.exec.scratchdir}/&lt;username&gt; is created, with ${hive.scratch.dir.permission}.</description>
    </property>
    
    <property>
        <name>hive.querylog.location</name>
        <value>${system:java.io.tmpdir}/${system:user.name}</value>
        <description>Location of Hive run time structured log file</description>
    </property>
    
  • Create the Hive warehouse directories in HDFS

    $ $HADOOP_HOME/bin/hadoop fs -mkdir -p /tmp/hive
    $ $HADOOP_HOME/bin/hadoop fs -mkdir -p /user/hive/warehouse
    $ $HADOOP_HOME/bin/hadoop fs -chmod g+w /tmp/hive
    $ $HADOOP_HOME/bin/hadoop fs -chmod g+w /user/hive/warehouse
    
  • Install the MySQL JDBC connector (mysql-connector-java.jar) and copy it into Hive's lib directory

    $ sudo yum install mysql-connector-java.noarch
    $ cp /usr/share/java/mysql-connector-java.jar $HIVE_HOME/lib/mysql-connector-java.jar
    
  • Initialize the metastore schema in MySQL

    $ schematool -dbType mysql -initSchema
    
  • Start Hive

    $ hive
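
The JDBC settings above assume that a MySQL account hive (password hive) already exists on the metastore host. A minimal sketch of creating it, run as the MySQL root user (the '%' host wildcard is an assumption; tighten it for production):

$ mysql -u root -p
mysql> CREATE USER 'hive'@'%' IDENTIFIED BY 'hive';
mysql> GRANT ALL PRIVILEGES ON hive.* TO 'hive'@'%';
mysql> FLUSH PRIVILEGES;

The hive database itself is created by createDatabaseIfNotExist=true in the connection URL when schematool runs. Afterwards, hive -e 'show databases;' is a quick smoke test that the metastore is reachable.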
    

Importing MySQL Data into Hive

  • Install Sqoop

  • Configure mysqldump (needed when using --direct)

    ln -fs /usr/local/mysql/bin/mysqldump /usr/bin
    
  • Add the Hive configuration and libraries to Hadoop's classpath

    $ vim /usr/local/hadoop/etc/hadoop/hadoop-env.sh
    export HADOOP_CLASSPATH=$HADOOP_CLASSPATH:/usr/local/hive/conf/
    export HADOOP_CLASSPATH=$HADOOP_CLASSPATH:/usr/local/hive/lib/
    # Do not write these with wildcards, i.e. not /usr/local/hive/conf/* or /usr/local/hive/lib/*
    
  • Import example (a verification sketch follows this list)

    sqoop import --connect jdbc:mysql://xx.xx.x.x:3306/databases \
        --table table --username hive --password hive --direct \
        --fields-terminated-by '|' --warehouse-dir /user/hive/warehouse \
        --hive-import
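
One way to verify the import (a sketch; 'table' is the placeholder table name from the command above):

$ hadoop fs -ls /user/hive/warehouse   # the imported data files should appear here
$ hive -e 'show tables;'               # the imported table should be listed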
    

Connecting to HiveServer2 Remotely

  • Configure hive-site.xml

    <property>
        <name>hive.server2.thrift.client.user</name>
        <value>username</value>
        <description>Username to use against thrift client</description>
    </property>
    <property>
        <name>hive.server2.thrift.client.password</name>
        <value>password</value>
        <description>Password to use against thrift client</description>
    </property>
    <property>
        <name>hive.server2.transport.mode</name>
        <value>binary</value>
        <description>
            Expects one of [binary, http].
            Transport mode of HiveServer2.
        </description>
    </property>
    <property>
        <name>hive.server2.thrift.port</name>
        <value>10000</value>
        <description>Port number of HiveServer2 Thrift interface when hive.server2.transport.mode is 'binary'.</description>
    </property>
    <property>
        <name>hive.server2.thrift.bind.host</name>
        <value>0.0.0.0</value>
        <description>Bind host on which to run the HiveServer2 Thrift service.</description>
    </property>
    
  • Configure core-site.xml under Hadoop (the hadoop segment in hadoop.proxyuser.hadoop.* is the OS user that runs HiveServer2)

    <property>
        <name>hadoop.proxyuser.hadoop.hosts</name>
        <!--value>master</value-->
        <value>*</value>
    </property>
    
    <property>
        <name>hadoop.proxyuser.hadoop.groups</name>
        <!--value>hadoop</value-->
        <value>*</value>
    </property>
    
  • Start the HiveServer2 service. Clients can use the beeline command line, or a GUI tool such as SQL Developer, which requires the Hive JDBC driver; a beeline sketch follows this list.

    nohup hiveserver2 >hive.log 2>&1 &
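
With the service running, a minimal beeline session (a sketch; replace <host> with the server address, and use the username/password values configured in hive-site.xml above):

$ beeline -u jdbc:hive2://<host>:10000 -n username -p password

GUI tools such as SQL Developer connect with the same host, port and credentials once the Hive JDBC driver is installed.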
    

Troubleshooting

Sqoop's Hive import may fail with: hive.HiveConfig: Could not load org.apache.hadoop.hive.conf.HiveConf. Make sure HIVE_CONF_DIR is set correctly

$ vim ~/.bashrc
export HADOOP_CLASSPATH=$HADOOP_CLASSPATH:/usr/local/hive/lib/*
$ source ~/.bashrc
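
To confirm that the Hive jars are now on Hadoop's effective classpath (hadoop classpath prints it, including HADOOP_CLASSPATH entries):

$ hadoop classpath | tr ':' '\n' | grep hive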