I'm new to Spark and have run into the following problem: when I try to import SQLContext:
import org.apache.spark.sql.SQLContext;
or try to initialize an SQLContext variable explicitly:
SQLContext sqlContext = new org.apache.spark.sql.SQLContext(sc);
I get an error from Eclipse:

The import org.apache.spark.sql.SQLContext cannot be resolved

I added Spark to my dependency file, and everything works fine except SQLContext. The full code:
package main.java;

import java.io.Serializable;
import org.apache.spark.SparkConf;
import org.apache.spark.api.java.JavaSparkContext;
import org.apache.spark.sql.SQLContext;

public class SparkTests {
    public static void main(String[] args) {
        SparkConf conf = new SparkConf().setAppName("SparkMain");
        JavaSparkContext sc = new JavaSparkContext(conf);
        SQLContext sqlContext = new org.apache.spark.sql.SQLContext(sc);
        //DataFrame df = sqlContext
        System.out.println("\n\n\nHello world!\n\n\n");
    }
}
When I try to compile it with mvn package, I get the compilation error:

package org.apache.spark.sql does not exist

Any idea why the SQL package cannot be found?
EDIT:

The dependency file pom.xml:
<project xmlns="http://maven.apache.org/POM/4.0.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
    <groupId>edu.berkeley</groupId>
    <artifactId>simple-project</artifactId>
    <modelVersion>4.0.0</modelVersion>
    <name>Simple Project</name>
    <packaging>jar</packaging>
    <version>1.0</version>
    <dependencies>
        <dependency> <!-- Spark dependency -->
            <groupId>org.apache.spark</groupId>
            <artifactId>spark-core_2.10</artifactId>
            <version>1.6.1</version>
        </dependency>
    </dependencies>
</project>
If you want to use Spark SQL or DataFrames in your project, you have to add the spark-sql artifact as a dependency. In this particular case:
<dependency>
    <groupId>org.apache.spark</groupId>
    <artifactId>spark-sql_2.10</artifactId> <!-- matching Scala version -->
    <version>1.6.1</version> <!-- matching Spark Core version -->
</dependency>
should do the trick.
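Once spark-sql_2.10 is on the classpath, the import resolves and the SQLContext can actually be used. A minimal sketch for Spark 1.6.x, building on the code from the question (the "people.json" path is a placeholder, not something from the original post):

import org.apache.spark.SparkConf;
import org.apache.spark.api.java.JavaSparkContext;
import org.apache.spark.sql.DataFrame;
import org.apache.spark.sql.SQLContext;

public class SparkSqlExample {
    public static void main(String[] args) {
        SparkConf conf = new SparkConf().setAppName("SparkSqlExample");
        JavaSparkContext sc = new JavaSparkContext(conf);

        // Resolves now that spark-sql provides org.apache.spark.sql
        SQLContext sqlContext = new SQLContext(sc);

        // Load a JSON file into a DataFrame ("people.json" is a placeholder path)
        DataFrame df = sqlContext.read().json("people.json");
        df.printSchema();
        df.show();

        sc.stop();
    }
}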