Spark application development on a local cluster with IntelliJ
I have tried many things to execute the application on a local cluster, but nothing worked.
I am using CDH 5.7 with Spark 1.6, and I am trying to create a DataFrame from Hive on CDH 5.7.
If I use spark-shell, the code works well. However, I have no idea how to set up the IntelliJ configuration for an efficient development environment.
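For reference, the project needs spark-core, spark-sql, and spark-hive on the classpath so that HiveContext is available when running from IntelliJ. A minimal build.sbt sketch (the Scala version and exact artifact versions are assumptions matching CDH 5.7 / Spark 1.6):

    scalaVersion := "2.10.6"

    libraryDependencies ++= Seq(
      "org.apache.spark" %% "spark-core" % "1.6.0",
      "org.apache.spark" %% "spark-sql"  % "1.6.0",
      "org.apache.spark" %% "spark-hive" % "1.6.0"  // provides HiveContext
    )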
Here is the code:

    package org.corus.spark.example

    import org.apache.spark.{SparkConf, SparkContext}

    object DataFrame {
      def main(args: Array[String]): Unit = {
        println("Hello DataFrame")

        val conf = new SparkConf()       // skip loading external settings
          .setMaster("local")            // "local[4]" for 4 threads
          .setAppName("DataFrame-example")
          .set("spark.logConf", "true")

        val sc = new SparkContext(conf)
        sc.setLogLevel("WARN")
        println(s"Running Spark version ${sc.version}")

        val sqlContext = new org.apache.spark.sql.hive.HiveContext(sc)
        sqlContext.sql("FROM src SELECT key, value").collect().foreach(println)
      }
    }
When I run the program in IntelliJ, I get the following error messages:

    Hello DataFrame
    Using Spark's default log4j profile: org/apache/spark/log4j-defaults.properties
    16/05/29 11:30:57 INFO Slf4jLogger: Slf4jLogger started
    Running Spark version 1.6.0
    16/05/29 11:31:02 WARN ObjectStore: Version information not found in metastore. hive.metastore.schema.verification is not enabled so recording the schema version 1.1.0
    16/05/29 11:31:02 WARN ObjectStore: Failed to get database default, returning NoSuchObjectException
    Exception in thread "main" java.lang.reflect.InvocationTargetException
        at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
        at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
        at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
        at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
        at org.apache.spark.sql.hive.client.IsolatedClientLoader.createClient(IsolatedClientLoader.scala:249)
        at org.apache.spark.sql.hive.HiveContext.metadataHive$lzycompute(HiveContext.scala:329)
        at org.apache.spark.sql.hive.HiveContext.metadataHive(HiveContext.scala:239)
        at org.apache.spark.sql.hive.HiveContext$$anon$2.<init>(HiveContext.scala:459)
        at org.apache.spark.sql.hive.HiveContext.catalog$lzycompute(HiveContext.scala:459)
        at org.apache.spark.sql.hive.HiveContext.catalog(HiveContext.scala:458)
        at org.apache.spark.sql.hive.HiveContext$$anon$3.<init>(HiveContext.scala:475)
        at org.apache.spark.sql.hive.HiveContext.analyzer$lzycompute(HiveContext.scala:475)
        at org.apache.spark.sql.hive.HiveContext.analyzer(HiveContext.scala:474)
        at org.apache.spark.sql.execution.QueryExecution.assertAnalyzed(QueryExecution.scala:34)
        at org.apache.spark.sql.DataFrame.<init>(DataFrame.scala:133)
        at org.apache.spark.sql.DataFrame$.apply(DataFrame.scala:52)
        at org.apache.spark.sql.SQLContext.sql(SQLContext.scala:817)
        at org.corus.spark.example.DataFrame$.main(DataFrame.scala:25)
        at org.corus.spark.example.DataFrame.main(DataFrame.scala)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:498)
        at com.intellij.rt.execution.application.AppMain.main(AppMain.java:144)
    Caused by: java.lang.RuntimeException: java.lang.RuntimeException: The root scratch dir: /tmp/hive on HDFS should be writable. Current permissions are: rwx------
        at org.apache.hadoop.hive.ql.session.SessionState.start(SessionState.java:539)
        at org.apache.spark.sql.hive.client.ClientWrapper.<init>(ClientWrapper.scala:194)
        ... 24 more
    Caused by: java.lang.RuntimeException: The root scratch dir: /tmp/hive on HDFS should be writable. Current permissions are: rwx------
        at org.apache.hadoop.hive.ql.session.SessionState.createRootHDFSDir(SessionState.java:624)
        at org.apache.hadoop.hive.ql.session.SessionState.createSessionDirs(SessionState.java:573)
        at org.apache.hadoop.hive.ql.session.SessionState.start(SessionState.java:517)
        ... 25 more

    Process finished with exit code 1
Is there a known solution? Thanks.
I found several resources on this problem, but none of them worked:
https://www.linkedin.com/pulse/develop-apache-spark-apps-intellij-idea-windows-os-samuel-yee
https://blog.cloudera.com/blog/2014/06/how-to-create-an-intellij-idea-project-for-apache-hadoop/
Thanks, all. I solved the problem myself. The problem was that the local Spark (the Maven version) did not know the information about Hive on our cluster.
The solution is simple.
Just add the following lines to the source code:

    conf.set("spark.sql.hive.thriftServer.singleSession", "true")
    System.setProperty("hive.metastore.uris", "thrift://hostname:serviceport")
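Putting it together, the relevant part of main() then looks roughly like this (a sketch; "hostname:serviceport" is a placeholder for your cluster's Hive metastore Thrift endpoint, which usually listens on port 9083):

    val conf = new SparkConf()
      .setMaster("local")
      .setAppName("DataFrame-example")
      .set("spark.logConf", "true")
      .set("spark.sql.hive.thriftServer.singleSession", "true")

    // Point the embedded Hive client at the cluster's metastore.
    // Replace hostname:serviceport with your actual metastore address (placeholder).
    System.setProperty("hive.metastore.uris", "thrift://hostname:serviceport")

    val sc = new SparkContext(conf)
    val sqlContext = new org.apache.spark.sql.hive.HiveContext(sc)
    sqlContext.sql("FROM src SELECT key, value").collect().foreach(println)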
It works! Let's play with Spark.