Build hive clusters based on different versions of Hadoop (with configuration files)


This tutorial covers two scenarios:

One is Hive 1.2.1 on Hadoop 2.6.5.

The other is building Hive on Hadoop 3.x.

 

Let's start with the first scenario: Hive 1.2.1 on Hadoop 2.6.5.

1. Local (embedded derby)

step

This mode stores the metastore in an embedded Derby database, so no separate database server is needed. Configure it as follows.

 

Unpack the Hive installation package.

Take hive-default.xml.template under the conf folder of the installation package, rename it to hive-site.xml, and modify it.
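As a rough sketch (the tarball name and install directory /opt/software are illustrative; adjust them to your environment), the unpack and rename might look like:

tar -zxvf apache-hive-1.2.1-bin.tar.gz -C /opt/software
mv /opt/software/apache-hive-1.2.1-bin /opt/software/hive
cd /opt/software/hive/conf
cp hive-default.xml.template hive-site.xml    # keep the template, work on the copy

The renamed hive-site.xml then gets the following contents: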

<configuration>
    <property>
        <name>javax.jdo.option.ConnectionURL</name>
        <value>jdbc:derby:;databaseName=metastore_db;create=true</value>
    </property>
    <property>
        <name>javax.jdo.option.ConnectionDriverName</name>
        <value>org.apache.derby.jdbc.EmbeddedDriver</value>
    </property>
    <property>
        <name>hive.metastore.local</name>
        <value>true</value>
    </property>
    <property>
        <name>hive.metastore.warehouse.dir</name>
        <value>/user/hive/warehouse</value>
    </property>
</configuration>

 

 

Copy the jline jar package from the hive/lib directory into Hadoop's YARN lib directory, and invalidate the old jar package there by deleting or renaming it. Otherwise, a version mismatch error will be reported.
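For reference, a hedged sketch of this step (the jar names and version numbers are illustrative; check what actually ships in your hive/lib and Hadoop YARN lib directories):

cp $HIVE_HOME/lib/jline-2.12.jar $HADOOP_HOME/share/hadoop/yarn/lib/
# invalidate the old jline that Hadoop ships by renaming it
mv $HADOOP_HOME/share/hadoop/yarn/lib/jline-0.9.94.jar $HADOOP_HOME/share/hadoop/yarn/lib/jline-0.9.94.jar.bak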

When using Derby storage, running hive generates Derby files (e.g., derby.log) and a metastore_db directory in the current directory. The disadvantage of this storage method is that only one hive client can use the database in the same directory at a time; additional users will fail to log in. (This is a limitation of the Derby database.)

2. Local mode (mysql)

This mode requires a MySQL server running locally; configure it as follows.

 

step

Install a mysql database

yum  install mysql-server -y

Copy the MySQL driver package to the $HIVE_HOME/lib directory.
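For example (the connector jar name and version are illustrative; use whichever driver jar you downloaded):

cp mysql-connector-java-5.1.47.jar $HIVE_HOME/lib/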

Modify hive-site.xml

<configuration>
    <property>
        <name>hive.metastore.warehouse.dir</name>
        <value>/user/hive_remote/warehouse</value>
    </property>
    <property>
        <name>hive.metastore.local</name>
        <value>true</value>
    </property>
    <property>
        <name>javax.jdo.option.ConnectionURL</name>
        <value>jdbc:mysql://localhost/hive_meta?createDatabaseIfNotExist=true</value>
    </property>
    <property>
        <name>javax.jdo.option.ConnectionDriverName</name>
        <value>com.mysql.jdbc.Driver</value>
    </property>
    <property>
        <name>javax.jdo.option.ConnectionUserName</name>
        <value>hive</value>
    </property>
    <property>
        <name>javax.jdo.option.ConnectionPassword</name>
        <value>123</value>
    </property>
</configuration>

Start the MySQL service

service mysqld start

Enable MySQL to start at boot

chkconfig mysqld on

Modify root user permissions

(1) Log on to mysql

mysql -uroot

(2) Modify permissions

GRANT ALL PRIVILEGES ON *.* TO 'root'@'%' IDENTIFIED BY '123' WITH GRANT OPTION;

(3) Refresh

flush privileges;

Create hive_meta database

create database hive_meta;

Add users and modify permissions

(1) Create a hive user and password

CREATE USER 'hive'@'%' IDENTIFIED BY '123';

(2) Grant privileges

 

grant all privileges on hive_meta.* to hive@"%" identified by '123';
flush privileges;

Remove redundant permissions

1. Enter mysql database

 

use mysql;

2. View users

select host,user,password from user;

3. Delete the empty-password user entries that interfere with permissions

delete from user where password = '';

 

Copy the jline jar package from the hive/lib directory to Hadoop's YARN lib directory (the same step as in the Derby setup above).

Configure HIVE_HOME and start hive.
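A minimal sketch, assuming Hive is unpacked under /opt/software/hive (the path is illustrative); append the exports to /etc/profile or ~/.bashrc, reload, then start hive:

export HIVE_HOME=/opt/software/hive      # illustrative install path
export PATH=$PATH:$HIVE_HOME/bin

source /etc/profile
hive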

 

3. Remote mode

When multiple clients use hive, each with its own local hive and MySQL, the metadata can become inconsistent across the ends, which is hard to manage and causes problems. Instead, start a metastore service on one hive instance backed by the local MySQL database, acting as the server; other clients connect to that metastore over the thrift protocol and thus share the same metadata in MySQL.

 

step

Start the metastore service as a server on a node where mysql-based hive is set up

hive --service metastore 

Unzip the installation package on the client and modify hive-site.xml

<configuration>
    <property>
        <name>hive.metastore.warehouse.dir</name>
        <value>/user/hive/warehouse</value>
    </property>
    <property>
        <name>hive.metastore.local</name>
        <value>false</value>
    </property>
    <property>
        <name>hive.metastore.uris</name>
        <value>thrift://192.168.23.134:9083</value>
    </property>
</configuration>

Note: started this way, the metastore service runs in the foreground. This can be avoided with:

 

hive --service metastore >> meta.log 2>&1 &

 

This writes the logs to meta.log and redirects the error stream (2) into the standard output stream (1).

 

The trailing & runs the command in the background.

 

In this way, the server only provides metadata, and each client runs its own hive.

 

 

Now the second scenario: Hive on Hadoop 3.x.

This part describes the differences between Hive 3.x and earlier versions. The embedded (Derby) mode is used less often than ever, so we start directly with local mode; remote mode is no different from earlier versions and will not be repeated in this article.

 

1. Local mode

1. Modify hive-site.xml

 

<configuration>
    <property>
        <name>hive.metastore.warehouse.dir</name>
        <value>/user/hive_remote/warehouse</value>
    </property>
    <property>
        <name>hive.exec.scratchdir</name>
        <value>/tmp/hive</value>
        <description>HDFS root scratch dir for Hive jobs which gets created with write all (733) permission. For each connecting user, an HDFS scratch dir: ${hive.exec.scratchdir}/&lt;username&gt; is created, with ${hive.scratch.dir.permission}.</description>
    </property>
    <property>
        <name>hive.exec.local.scratchdir</name>
        <value>/opt/software/hive/temp/root</value>
    </property>
    <property>
        <name>hive.downloaded.resources.dir</name>
        <value>/opt/software/hive/temp/${hive.session.id}_resources</value>
    </property>
    <property>
        <name>hive.server2.logging.operation.log.location</name>
        <value>/opt/software/hive/temp/root/operation_logs</value>
    </property>
    <property>
        <name>hive.querylog.location</name>
        <value>/opt/software/hive/temp/root</value>
    </property>
    <property>
        <name>hive.metastore.local</name>
        <value>true</value>
    </property>
    <property>
        <name>javax.jdo.option.ConnectionURL</name>
        <value>jdbc:mysql://localhost/hive_meta?createDatabaseIfNotExist=true</value>
    </property>
    <property>
        <name>javax.jdo.option.ConnectionDriverName</name>
        <value>com.mysql.jdbc.Driver</value>
    </property>
    <property>
        <name>javax.jdo.option.ConnectionUserName</name>
        <value>hive</value>
    </property>
    <property>
        <name>javax.jdo.option.ConnectionPassword</name>
        <value>123</value>
    </property>
</configuration>

2. Modify hive-env.sh

export HADOOP_HOME=/opt/software/hadoop
export HIVE_CONF_DIR=/opt/software/hive/conf
export HIVE_AUX_JARS_PATH=/opt/software/hive/lib

3. Copy guava-xx.jar from hadoop/share/hadoop/common/lib into hive/lib, and delete hive's own guava-xx.jar.
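A sketch of this swap (the wildcard stands in for the actual guava version numbers, which differ between Hive and Hadoop releases; check the real file names under both directories):

# remove hive's own (older) guava, then copy the one shipped with Hadoop
rm $HIVE_HOME/lib/guava-*.jar
cp $HADOOP_HOME/share/hadoop/common/lib/guava-*.jar $HIVE_HOME/lib/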

 

4. You're done. Try it!
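As a quick sanity check (the query is just an example), you can run a statement straight from the shell:

hive -e "show databases;"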
