Hive: using pyhs2 as a client to connect to Hive and load local data into Hive


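1. Check the installation environment

Install gcc-c++, cyrus-sasl, and python-devel.

Install cyrus-sasl-devel, for example cyrus-sasl-devel-2.1.23-15.el6.i686.rpm; choose the package that matches your system version.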

2. Install pyhs2

Use Python's pip command to install pyhs2:

pip install pyhs2

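If no error is reported, the installation succeeded. If it fails, the cause is a package dependency: install the other packages named in the error message and run the command again.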
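
One quick way to confirm the result is to try importing the module. The sketch below is only an illustration: the helper name is_importable is made up for this example and is not part of pyhs2.

```python
def is_importable(module_name):
    """Return True if the named module can be imported, else False."""
    try:
        __import__(module_name)
    except ImportError:
        # Raised when the module, or something it imports, is missing.
        return False
    return True
```

After a successful install, is_importable("pyhs2") should return True. A False result usually points back to a missing dependency from step 1.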

3. Write the script that connects to Hive

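#!/usr/local/bin/python
import os

import pyhs2

DATA_DIR = "/home/vmuser/code/data"

# Every entry in the data directory is treated as a file to load.
listfile = os.listdir(DATA_DIR)

# Connect to the Hive service with plain user name/password authentication.
conn = pyhs2.connect(host='192.168.98.11',
                     port=10001,
                     authMechanism="PLAIN",
                     user='vmuser',
                     password='vmuser',
                     database='default')
cur = conn.cursor()
print(cur.getDatabases())

# Empty the target table so that rerunning the script does not duplicate rows.
cur.execute("truncate table aaa")

# Load each local file into table aaa.
for filename in listfile:
    statement = ("load data local inpath '" + os.path.join(DATA_DIR, filename)
                 + "' into table aaa")
    print(statement)
    cur.execute(statement)

# Read the table back to verify the load.
cur.execute("select * from aaa")
print(cur.getSchema())
for row in cur.fetch():
    print(row)

# Release the cursor and the connection.
cur.close()
conn.close()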
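
The script passes every directory entry straight to Hive, including subdirectories and hidden files. The following is a minimal sketch of a stricter way to build the statements, not part of the original script: the function name build_load_statements and its filtering rules are assumptions made for illustration.

```python
import os


def build_load_statements(data_dir, table):
    """Return one LOAD DATA LOCAL INPATH statement per regular file in data_dir."""
    statements = []
    for name in sorted(os.listdir(data_dir)):
        path = os.path.join(data_dir, name)
        # Skip subdirectories and hidden files such as editor swap files.
        if name.startswith(".") or not os.path.isfile(path):
            continue
        # A single quote in the path would end the HiveQL string literal early,
        # so refuse such names rather than guess at an escaping rule.
        if "'" in path:
            raise ValueError("unsupported character in path: %r" % path)
        statements.append(
            "load data local inpath '%s' into table %s" % (path, table))
    return statements
```

Each returned string can then be passed to cur.execute() in place of the concatenation inside the loop. Sorting the names keeps the load order the same from run to run.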

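4. Test the script

Before testing, start the Hadoop cluster.

Start the Hive shell and create the table aaa.

In the specified directory /home/vmuser/code/data/, create files whose format matches the table.

Start the Hive service (pyhs2 is a client for HiveServer2), then run the test script.

The script loads the files in that directory into the Hive table.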
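
The article does not give the schema of aaa, so the following is only a sketch of how a test file could be produced. It assumes the table was created without a custom ROW FORMAT clause, in which case Hive text tables separate fields with Ctrl-A (\x01) and rows with newlines. The helper name write_sample_file and the sample rows are hypothetical.

```python
# Hive's default field delimiter for text tables is Ctrl-A (\x01).
# Assumption: table aaa uses that default; change it if the table differs.
FIELD_DELIMITER = "\x01"


def write_sample_file(path, rows, delimiter=FIELD_DELIMITER):
    """Write rows (sequences of values) as one delimited line per row."""
    with open(path, "w") as handle:
        for row in rows:
            handle.write(delimiter.join(str(value) for value in row))
            handle.write("\n")
    return path
```

For example, write_sample_file("/home/vmuser/code/data/sample.txt", [(1, "alice"), (2, "bob")]) would write two rows for a hypothetical two-column table.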
