Help with a Spark on Hive installation problem

Software versions:
JDK 1.8
Hadoop 2.8
Hive 2.1.1
Spark 1.6.3
Scala 2.12.2
MySQL 5.7.17

Two hosts: node 1 is the namenode and a datanode, node 2 is a datanode. After installing Hadoop, Hive, and MySQL, I used the MR engine to load an external table and query it, and the queries completed normally.

After installing Spark, I tested it with the run-example SparkPi command, and it computed the value of Pi.
I then changed Hive's default engine to Spark and ran a select; it failed with the following error:
Failed to execute spark task, with exception 'org.apache.hadoop.hive.ql.metadata.HiveException(Failed to create spark client.)'
FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.spark.SparkTask

What could be causing this? Many thanks!
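
For reference, the engine switch mentioned above goes through Hive's standard hive.execution.engine property; a minimal sketch (the property name is standard Hive, set either per session or in hive-site.xml):

set hive.execution.engine=spark;   -- per session, in the Hive CLI

<property>
  <!-- or permanently, in hive-site.xml -->
  <name>hive.execution.engine</name>
  <value>spark</value>
</property>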

5 Answers

美国队长  R&D Engineer, Alibaba

The title should be "Hive on Spark". Also, this exception is most likely a version mismatch between Hive and Spark. Check the pom.xml in your Hive source tree to see which Spark version it depends on.
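
A quick way to check, assuming the source tarball is unpacked beside you as apache-hive-2.1.1-src (path is just an example):

grep "spark.version" apache-hive-2.1.1-src/pom.xml

This prints the <spark.version> property that this Hive release was built and tested against.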

2017-06-20
  • Hello, I'm using apache-hive-2.1.1-bin.tar.gz. I also downloaded apache-hive-2.1.1-src.tar.gz and searched its pom.xml for "spark", and found <spark.version>1.6.0</spark.version>. Does that mean the matching Spark should be 1.6.0? Thanks also for correcting the title.
    2017-06-20
  • Yes.
    2017-06-26
Raymond_Pan  Database Architect, HNA

I ran the following command and checked the log:
hive --hiveconf hive.root.logger=DEBUG,console -e "select count(*) from orders;"
It contains error messages; here is an excerpt:
2017-06-20T02:18:45,902 DEBUG [main] metastore.HiveMetaStore: admin role already exists
InvalidObjectException(message:Role admin already exists.)

2017-06-20T02:18:45,905 DEBUG [main] metastore.HiveMetaStore: public role already exists
InvalidObjectException(message:Role public already exists.)

2017-06-20T02:18:45,920 DEBUG [main] metastore.HiveMetaStore: Failed while granting global privs to admin
InvalidObjectException(message:All is already granted by admin)

2017-06-20T02:19:54,104 ERROR [main] client.SparkClientImpl: Error while waiting for client to connect.
java.util.concurrent.ExecutionException: java.lang.RuntimeException: Cancel client '3e695a7a-0da3-4f37-a133-17fc3509f95b'. Error: Child process exited before connecting back with error log Warning: Ignoring non-spark config property: hive.spark.client.server.connect.timeout=90000
Warning: Ignoring non-spark config property: hive.spark.client.rpc.threads=8
Warning: Ignoring non-spark config property: hive.spark.client.connect.timeout=1000
Warning: Ignoring non-spark config property: hive.spark.client.secret.bits=256
Warning: Ignoring non-spark config property: hive.spark.client.rpc.max.size=52428800
17/06/20 02:18:50 INFO client.RMProxy: Connecting to ResourceManager at hn1/192.168.1.100:8032
17/06/20 02:18:50 INFO yarn.Client: Requesting a new application from cluster with 2 NodeManagers
17/06/20 02:18:50 INFO yarn.Client: Verifying our application has not requested more than the maximum memory capability of the cluster (8192 MB per container)
17/06/20 02:18:50 INFO yarn.Client: Will allocate AM container, with 1408 MB memory including 384 MB overhead
17/06/20 02:18:50 INFO yarn.Client: Setting up container launch context for our AM
17/06/20 02:18:50 INFO yarn.Client: Setting up the launch environment for our AM container
17/06/20 02:18:50 INFO yarn.Client: Preparing resources for our AM container
17/06/20 02:18:51 INFO yarn.Client: Uploading resource file:/usr/local/spark/lib/spark-assembly-1.6.3-hadoop2.6.0.jar -> hdfs://hn1:9000/user/hadoop/.sparkStaging/application_1497456141824_0014/spark-assembly-1.6.3-hadoop2.6.0.jar
17/06/20 02:18:53 INFO yarn.Client: Uploading resource file:/usr/local/hive/lib/hive-exec-2.1.1.jar -> hdfs://hn1:9000/user/hadoop/.sparkStaging/application_1497456141824_0014/hive-exec-2.1.1.jar
17/06/20 02:18:53 INFO yarn.Client: Uploading resource file:/tmp/spark-ba9f34e5-1da9-4d64-966c-79b605eab553/spark_conf7167375722585760915.zip -> hdfs://hn1:9000/user/hadoop/.sparkStaging/application_1497456141824_0014/spark_conf7167375722585760915.zip
17/06/20 02:18:53 INFO spark.SecurityManager: Changing view acls to: hadoop
17/06/20 02:18:53 INFO spark.SecurityManager: Changing modify acls to: hadoop
17/06/20 02:18:53 INFO spark.SecurityManager: SecurityManager: authentication disabled; ui acls disabled; users with view permissions: Set(hadoop); users with modify permissions: Set(hadoop)
17/06/20 02:18:53 INFO yarn.Client: Submitting application 14 to ResourceManager
17/06/20 02:18:53 INFO impl.YarnClientImpl: Submitted application application_1497456141824_0014
17/06/20 02:19:53 INFO yarn.Client: Application report for application_1497456141824_0014 (state: FAILED)
17/06/20 02:19:53 INFO yarn.Client:
client token: N/A
diagnostics: Application application_1497456141824_0014 failed 1 times (global limit =2; local limit is =1) due to AM Container for appattempt_1497456141824_0014_000001 exited with exitCode: 15
Failing this attempt.Diagnostics: Exception from container-launch.
Container id: container_1497456141824_0014_01_000001
Exit code: 15
Stack trace: ExitCodeException exitCode=15:
at org.apache.hadoop.util.Shell.runCommand(Shell.java:972)
at org.apache.hadoop.util.Shell.run(Shell.java:869)
at org.apache.hadoop.util.Shell$ShellCommandExecutor.execute(Shell.java:1170)
at org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor.launchContainer(DefaultContainerExecutor.java:236)
at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:305)
at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:84)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:748)
Container exited with a non-zero exit code 15
For more detailed output, check the application tracking page: http://hn1:8088/cluster/app/application_1497456141824_0014 Then click on links to logs of each attempt.
. Failing the application.
ApplicationMaster host: N/A
ApplicationMaster RPC port: -1
queue: default
start time: 1497896333728
final status: FAILED
tracking URL: http://hn1:8088/cluster/app/application_1497456141824_0014
user: hadoop
Exception in thread "main" org.apache.spark.SparkException: Application application_1497456141824_0014 finished with failed status
at org.apache.spark.deploy.yarn.Client.run(Client.scala:1034)
at org.apache.spark.deploy.yarn.Client$.main(Client.scala:1081)
at org.apache.spark.deploy.yarn.Client.main(Client.scala)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.apache.spark.deploy.SparkSubmit$.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:731)
at org.apache.spark.deploy.SparkSubmit$.doRunMain$1(SparkSubmit.scala:181)
at org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:206)
at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:121)
at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
17/06/20 02:19:53 INFO util.ShutdownHookManager: Shutdown hook called
17/06/20 02:19:53 INFO util.ShutdownHookManager: Deleting directory /tmp/spark-ba9f34e5-1da9-4d64-966c-79b605eab553
at io.netty.util.concurrent.AbstractFuture.get(AbstractFuture.java:37)
at org.apache.hive.spark.client.SparkClientImpl.<init>(SparkClientImpl.java:106)
at org.apache.hive.spark.client.SparkClientFactory.createClient(SparkClientFactory.java:80)
at org.apache.hadoop.hive.ql.exec.spark.RemoteHiveSparkClient.createRemoteClient(RemoteHiveSparkClient.java:99)
at org.apache.hadoop.hive.ql.exec.spark.RemoteHiveSparkClient.<init>(RemoteHiveSparkClient.java:95)
at org.apache.hadoop.hive.ql.exec.spark.HiveSparkClientFactory.createHiveSparkClient(HiveSparkClientFactory.java:69)
at org.apache.hadoop.hive.ql.exec.spark.session.SparkSessionImpl.open(SparkSessionImpl.java:62)
at org.apache.hadoop.hive.ql.exec.spark.session.SparkSessionManagerImpl.getSession(SparkSessionManagerImpl.java:114)
at org.apache.hadoop.hive.ql.exec.spark.SparkUtilities.getSparkSession(SparkUtilities.java:136)
at org.apache.hadoop.hive.ql.exec.spark.SparkTask.execute(SparkTask.java:89)
at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:197)
at org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:100)
at org.apache.hadoop.hive.ql.Driver.launchTask(Driver.java:2073)
at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:1744)
at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1453)
at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1171)
at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1161)
at org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:232)
at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:183)
at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:399)
at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:335)
at org.apache.hadoop.hive.cli.CliDriver.executeDriver(CliDriver.java:742)
at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:714)
at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:641)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.apache.hadoop.util.RunJar.run(RunJar.java:221)
at org.apache.hadoop.util.RunJar.main(RunJar.java:136)
Caused by: java.lang.RuntimeException: Cancel client '3e695a7a-0da3-4f37-a133-17fc3509f95b'. Error: Child process exited before connecting back with error log Warning: Ignoring non-spark config property: hive.spark.client.server.connect.timeout=90000
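
The diagnostics say the child process exited before connecting back, so the actual cause is in the AM container's own stderr rather than in this client-side log. A next step worth trying, using the standard YARN CLI and the application id from the log above:

yarn logs -applicationId application_1497456141824_0014

or open the tracking URL shown in the diagnostics and click through to the attempt's logs.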

2017-06-20
haichuan0227  Project Manager, Sina Cloud Computing

It's probably that your Spark version doesn't match your Hive.

The pom.xml in the hive 2.1.1 source declares Spark 1.6.0, while yours is 1.6.3 (screenshot of the pom: 000.jpeg).

Try Spark 1.6.0 and see.

2017-06-20
阳海  IT Consultant, senior technical manager of a platform architecture department

From what I found online, this can happen when the memory settings are too small. Check these two parameters: yarn.scheduler.maximum-allocation-mb and yarn.nodemanager.resource.memory-mb, and try increasing them.
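
If memory is the culprit, those two properties live in yarn-site.xml on the cluster nodes. A sketch with illustrative sizes (the property names are standard YARN; the values are assumptions to adapt to your hosts):

<property>
  <name>yarn.nodemanager.resource.memory-mb</name>
  <!-- total memory YARN may hand out on this NodeManager; illustrative value -->
  <value>8192</value>
</property>
<property>
  <name>yarn.scheduler.maximum-allocation-mb</name>
  <!-- largest single container the scheduler will grant; illustrative value -->
  <value>8192</value>
</property>

Restart the ResourceManager and NodeManagers after changing them.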

2017-06-20
bendsha  R&D Engineer, Shanghai Eisoo Information Technology Co., Ltd.

Hello. To use Hive on Spark, your Spark build must not contain the Hive jars; the Hive on Spark documentation states: "Note that you must have a version of Spark which does not include the Hive jars". The prebuilt Spark distributions downloadable from the Spark website all bundle Hive, so you need to download the source and build Spark yourself, without the Hive profile.
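
For Spark 1.x, the Hive on Spark getting-started guide builds such a distribution with make-distribution.sh from the Spark source root; a sketch along those lines (the profile list is an example and may need adjusting for Hadoop 2.8):

./make-distribution.sh --name "hadoop2-without-hive" --tgz "-Pyarn,hadoop-provided,hadoop-2.6,parquet-provided"

The resulting tarball contains a Spark assembly without Hive classes; point Hive at it via SPARK_HOME or the spark.home property in hive-site.xml.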

2017-06-25
