Internet Services · Hadoop · Spark · Hive

Seeking help with a "spark on hive" installation problem

Software versions:
jdk 1.8
Hadoop 2.8
hive 2.1.1
spark 1.6.3
scala 2.12.2
mysql 5.7.17

There are two hosts: node 1 is the namenode and a datanode, node 2 is a datanode. After installing Hadoop, Hive, and MySQL, I used the MR engine to load an external table and query it; the query completed normally.

After installing Spark, I tested with the run-example SparkPi command, and it computed the value of Pi correctly.
I then changed Hive's default engine to Spark. Running a select fails with the following error:
Failed to execute spark task, with exception 'org.apache.hadoop.hive.ql.metadata.HiveException(Failed to create spark client.)'
FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.spark.SparkTask

Where might the problem be? Many thanks!
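
For context, the engine switch mentioned above is normally done either per session with set hive.execution.engine=spark; in the Hive CLI, or globally in hive-site.xml. A minimal sketch of the global form (standard property; placement is illustrative):

    <!-- hive-site.xml: switch Hive's default execution engine from mr to spark -->
    <property>
      <name>hive.execution.engine</name>
      <value>spark</value>
    </property>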


5 answers

bendsha · System Architect, Shanghai Eisoo Information Technology Co., Ltd.

Hello. To use Hive on Spark, the Spark build you use must not include the Hive jars. The Hive on Spark wiki says: "Note that you must have a version of Spark which does not include the Hive jars." The prebuilt Spark binaries downloaded from the Spark website all bundle Hive, so you need to download the source and build it yourself, leaving Hive out of the build.
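
As a sketch, for Spark 1.6.x the build command suggested by the Hive on Spark "Getting Started" wiki looks like the following, run from the Spark source root (treat the exact Maven profiles as an assumption to adjust for your Hadoop version):

    # Build a Spark distribution without the Hive jars.
    # hadoop-provided keeps Hadoop's jars out of the assembly;
    # pick the -Phadoop profile that matches your cluster.
    ./make-distribution.sh --name "hadoop2-without-hive" --tgz \
      "-Pyarn,hadoop-provided,hadoop-2.6,parquet-provided"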

Software Development · 2017-06-25
Raymond_Pan · Database Architect, HNA

I ran the following command and checked the log:

    hive --hiveconf hive.root.logger=DEBUG,console -e "select count(*) from orders;"

Here is an excerpt of the error output:
2017-06-20T02:18:45,902 DEBUG [main] metastore.HiveMetaStore: admin role already exists
InvalidObjectException(message:Role admin already exists.)

2017-06-20T02:18:45,905 DEBUG [main] metastore.HiveMetaStore: public role already exists
InvalidObjectException(message:Role public already exists.)

2017-06-20T02:18:45,920 DEBUG [main] metastore.HiveMetaStore: Failed while granting global privs to admin
InvalidObjectException(message:All is already granted by admin)

2017-06-20T02:19:54,104 ERROR [main] client.SparkClientImpl: Error while waiting for client to connect.
java.util.concurrent.ExecutionException: java.lang.RuntimeException: Cancel client '3e695a7a-0da3-4f37-a133-17fc3509f95b'. Error: Child process exited before connecting back with error log Warning: Ignoring non-spark config property: hive.spark.client.server.connect.timeout=90000
Warning: Ignoring non-spark config property: hive.spark.client.rpc.threads=8
Warning: Ignoring non-spark config property: hive.spark.client.connect.timeout=1000
Warning: Ignoring non-spark config property: hive.spark.client.secret.bits=256
Warning: Ignoring non-spark config property: hive.spark.client.rpc.max.size=52428800
17/06/20 02:18:50 INFO client.RMProxy: Connecting to ResourceManager at hn1/192.168.1.100:8032
17/06/20 02:18:50 INFO yarn.Client: Requesting a new application from cluster with 2 NodeManagers
17/06/20 02:18:50 INFO yarn.Client: Verifying our application has not requested more than the maximum memory capability of the cluster (8192 MB per container)
17/06/20 02:18:50 INFO yarn.Client: Will allocate AM container, with 1408 MB memory including 384 MB overhead
17/06/20 02:18:50 INFO yarn.Client: Setting up container launch context for our AM
17/06/20 02:18:50 INFO yarn.Client: Setting up the launch environment for our AM container
17/06/20 02:18:50 INFO yarn.Client: Preparing resources for our AM container
17/06/20 02:18:51 INFO yarn.Client: Uploading resource file:/usr/local/spark/lib/spark-assembly-1.6.3-hadoop2.6.0.jar -> hdfs://hn1:9000/user/hadoop/.sparkStaging/application_1497456141824_0014/spark-assembly-1.6.3-hadoop2.6.0.jar
17/06/20 02:18:53 INFO yarn.Client: Uploading resource file:/usr/local/hive/lib/hive-exec-2.1.1.jar -> hdfs://hn1:9000/user/hadoop/.sparkStaging/application_1497456141824_0014/hive-exec-2.1.1.jar
17/06/20 02:18:53 INFO yarn.Client: Uploading resource file:/tmp/spark-ba9f34e5-1da9-4d64-966c-79b605eab553/__spark_conf__7167375722585760915.zip -> hdfs://hn1:9000/user/hadoop/.sparkStaging/application_1497456141824_0014/__spark_conf__7167375722585760915.zip
17/06/20 02:18:53 INFO spark.SecurityManager: Changing view acls to: hadoop
17/06/20 02:18:53 INFO spark.SecurityManager: Changing modify acls to: hadoop
17/06/20 02:18:53 INFO spark.SecurityManager: SecurityManager: authentication disabled; ui acls disabled; users with view permissions: Set(hadoop); users with modify permissions: Set(hadoop)
17/06/20 02:18:53 INFO yarn.Client: Submitting application 14 to ResourceManager
17/06/20 02:18:53 INFO impl.YarnClientImpl: Submitted application application_1497456141824_0014
17/06/20 02:19:53 INFO yarn.Client: Application report for application_1497456141824_0014 (state: FAILED)
17/06/20 02:19:53 INFO yarn.Client:

     client token: N/A
     diagnostics: Application application_1497456141824_0014 failed 1 times (global limit =2; local limit is =1) due to AM Container for appattempt_1497456141824_0014_000001 exited with  exitCode: 15

Failing this attempt.Diagnostics: Exception from container-launch.
Container id: container_1497456141824_0014_01_000001
Exit code: 15
Stack trace: ExitCodeException exitCode=15:

    at org.apache.hadoop.util.Shell.runCommand(Shell.java:972)
    at org.apache.hadoop.util.Shell.run(Shell.java:869)
    at org.apache.hadoop.util.Shell$ShellCommandExecutor.execute(Shell.java:1170)
    at org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor.launchContainer(DefaultContainerExecutor.java:236)
    at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:305)
    at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:84)
    at java.util.concurrent.FutureTask.run(FutureTask.java:266)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
    at java.lang.Thread.run(Thread.java:748)

Container exited with a non-zero exit code 15
For more detailed output, check the application tracking page: http://hn1:8088/cluster/app/application_1497456141824_0014 Then click on links to logs of each attempt.
. Failing the application.

     ApplicationMaster host: N/A
     ApplicationMaster RPC port: -1
     queue: default
     start time: 1497896333728
     final status: FAILED
     tracking URL: http://hn1:8088/cluster/app/application_1497456141824_0014
     user: hadoop

Exception in thread "main" org.apache.spark.SparkException: Application application_1497456141824_0014 finished with failed status

    at org.apache.spark.deploy.yarn.Client.run(Client.scala:1034)
    at org.apache.spark.deploy.yarn.Client$.main(Client.scala:1081)
    at org.apache.spark.deploy.yarn.Client.main(Client.scala)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:498)
    at org.apache.spark.deploy.SparkSubmit$.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:731)
    at org.apache.spark.deploy.SparkSubmit$.doRunMain$1(SparkSubmit.scala:181)
    at org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:206)
    at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:121)
    at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)

17/06/20 02:19:53 INFO util.ShutdownHookManager: Shutdown hook called
17/06/20 02:19:53 INFO util.ShutdownHookManager: Deleting directory /tmp/spark-ba9f34e5-1da9-4d64-966c-79b605eab553

    at io.netty.util.concurrent.AbstractFuture.get(AbstractFuture.java:37)
    at org.apache.hive.spark.client.SparkClientImpl.<init>(SparkClientImpl.java:106)
    at org.apache.hive.spark.client.SparkClientFactory.createClient(SparkClientFactory.java:80)
    at org.apache.hadoop.hive.ql.exec.spark.RemoteHiveSparkClient.createRemoteClient(RemoteHiveSparkClient.java:99)
    at org.apache.hadoop.hive.ql.exec.spark.RemoteHiveSparkClient.<init>(RemoteHiveSparkClient.java:95)
    at org.apache.hadoop.hive.ql.exec.spark.HiveSparkClientFactory.createHiveSparkClient(HiveSparkClientFactory.java:69)
    at org.apache.hadoop.hive.ql.exec.spark.session.SparkSessionImpl.open(SparkSessionImpl.java:62)
    at org.apache.hadoop.hive.ql.exec.spark.session.SparkSessionManagerImpl.getSession(SparkSessionManagerImpl.java:114)
    at org.apache.hadoop.hive.ql.exec.spark.SparkUtilities.getSparkSession(SparkUtilities.java:136)
    at org.apache.hadoop.hive.ql.exec.spark.SparkTask.execute(SparkTask.java:89)
    at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:197)
    at org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:100)
    at org.apache.hadoop.hive.ql.Driver.launchTask(Driver.java:2073)
    at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:1744)
    at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1453)
    at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1171)
    at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1161)
    at org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:232)
    at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:183)
    at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:399)
    at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:335)
    at org.apache.hadoop.hive.cli.CliDriver.executeDriver(CliDriver.java:742)
    at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:714)
    at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:641)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:498)
    at org.apache.hadoop.util.RunJar.run(RunJar.java:221)
    at org.apache.hadoop.util.RunJar.main(RunJar.java:136)

Caused by: java.lang.RuntimeException: Cancel client '3e695a7a-0da3-4f37-a133-17fc3509f95b'. Error: Child process exited before connecting back with error log Warning: Ignoring non-spark config property: hive.spark.client.server.connect.timeout=90000

Internet Services · 2017-06-20
haichuan0227 · Project Manager, Sina Cloud Computing

It is probably a mismatch between your Spark version and your Hive.

The Spark version pinned in the hive 2.1.1 source pom.xml is 1.6.0, while yours is 1.6.3. See the screenshot:

[screenshot: hive 2.1.1 pom.xml, showing <spark.version>1.6.0</spark.version>]

Try Spark 1.6.0.
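
A quick way to confirm what the Hive build pins, assuming you have the source tarball unpacked (the path below is illustrative):

    # Print the Spark version declared in the Hive 2.1.1 root pom
    grep -m1 '<spark.version>' apache-hive-2.1.1-src/pom.xml
    # expected: <spark.version>1.6.0</spark.version>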

Internet Services · 2017-06-20
杨博 · IT Consultant, a technology company

From what I found online, this can be caused by memory limits that are set too small. Check these two parameters: yarn.scheduler.maximum-allocation-mb and yarn.nodemanager.resource.memory-mb, and try increasing them.
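
For illustration, both limits live in yarn-site.xml on the cluster nodes; a sketch with placeholder values (size them to your actual hosts, then restart YARN):

    <!-- yarn-site.xml: illustrative values only -->
    <property>
      <name>yarn.nodemanager.resource.memory-mb</name>
      <value>8192</value> <!-- total memory YARN may allocate on this node -->
    </property>
    <property>
      <name>yarn.scheduler.maximum-allocation-mb</name>
      <value>8192</value> <!-- largest single container the scheduler will grant -->
    </property>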

Internet Services · 2017-06-20
美国队长 · R&D Engineer, Alibaba

The title should be "hive on spark". Also, this exception is likely caused by a version mismatch between Hive and Spark. I suggest checking which Spark version the pom.xml in your Hive source tree depends on.

Internet Services · 2017-06-20
  • Hello, I am using apache-hive-2.1.1-bin.tar.gz. I also downloaded apache-hive-2.1.1-src.tar.gz and searched its pom.xml for the keyword spark, finding <spark.version>1.6.0</spark.version>. Does this mean the matching Spark should be 1.6.0? Thanks for correcting the title.
    2017-06-20
  • Yes.
    2017-06-26

Asker

Raymond_Pan
Database Architect, HNA
