This is a bit strange: when running something as simple as sparkContext.parallelize(List("1","2","3")), I get the following error:
java.lang.VerifyError: class com.fasterxml.jackson.module.scala.ser.ScalaIteratorSerializer overrides final method withResolved.(Lcom/fasterxml/jackson/databind/BeanProperty;Lcom/fasterxml/jackson/databind/jsontype/TypeSerializer;Lcom/fasterxml/jackson/databind/JsonSerializer;)Lcom/fasterxml/jackson/databind/ser/std/AsArraySerializerBase;
I suspect there is some conflict among the library dependencies. My build.sbt looks like this:
scalaVersion := "2.11.7"
//Library repositories
resolvers ++= Seq(
  Resolver.mavenLocal,
  "Scala-Tools Maven2 Repository" at "http://scala-tools.org/repo-releases",
  "Java.net repository" at "http://download.java.net/maven/2",
  "GeoTools" at "http://download.osgeo.org/webdav/geotools",
  "Apache" at "https://repository.apache.org/service/local/repositories/releases/content",
  "Cloudera" at "https://repository.cloudera.com/artifactory/cloudera-repos/",
  "OpenGeo Maven Repository" at "http://repo.opengeo.org",
  "Typesafe" at "https://repo.typesafe.com/typesafe/releases/",
  "Spray Repository" at "http://repo.spray.io"
)
//Library versions
val geotools_version = "13.2"
val accumulo_version = "1.6.0-cdh5.1.4"
val hadoop_version = "2.6.0-cdh5.4.5"
val hadoop_client_version = "2.6.0-mr1-cdh5.4.5"
val geowave_version = "0.9.0-SNAPSHOT"
val akka_version = "2.4.0"
val spray_version = …
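This VerifyError usually means two incompatible Jackson versions ended up on the classpath: Spark bundles its own jackson-databind, and another dependency (likely the Akka/Spray stack here) pulls in a mismatched jackson-module-scala whose ScalaIteratorSerializer no longer lines up with it. Below is a minimal sketch of one common fix for build.sbt, assuming sbt 0.13-style dependencyOverrides; the version "2.4.4" is an assumption and should be aligned with whatever jackson-databind your Spark release actually ships.

// Force a single, mutually compatible Jackson version across the whole build.
// "2.4.4" is an assumption; match it to Spark's bundled jackson-databind.
dependencyOverrides ++= Set(
  "com.fasterxml.jackson.core"    %  "jackson-databind"     % "2.4.4",
  "com.fasterxml.jackson.module" %% "jackson-module-scala"  % "2.4.4"
)

If the sbt-dependency-graph plugin is installed, its dependencyTree task helps confirm which library is dragging in the conflicting version.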
I want to kill a specific thread, but can't come up with a way to make it work. I'm including all the information below, even if it doesn't seem relevant:
I'm using ActionBarSherlock and want to kill a thread from an action button event. So I have:
Thread myThread;
myThread = new Thread(new Runnable() {
    public void run() {
        functionX();
    }
});
myThread.start();
This is a long-running thread, and functionX() also spawns some new threads of its own. I want to kill that thread when the following happens:
public boolean onOptionsItemSelected(MenuItem item) {
    switch (item.getItemId()) {
        case android.R.id.home:
            myThread.interrupt();
            break;
    }
    return super.onOptionsItemSelected(item);
}
I tried using an ExecutorService, submitting the Runnable to get a Future and calling Future.cancel(), but that doesn't seem to work either. I should also mention that functionX() fetches data from a JSON service using HTTP GET requests.
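For what it's worth, Thread.interrupt() does not kill anything by itself; it only sets a flag (and wakes blocking calls with an InterruptedException), so the thread has to cooperate by checking the flag and returning. Here is a minimal sketch, assuming the work in functionX() can be broken into a loop; fetchNextJsonChunk() is a hypothetical stand-in for the real HTTP GET:

public class Worker implements Runnable {
    @Override
    public void run() {
        try {
            // interrupt() only raises this flag; the loop must check it itself
            while (!Thread.currentThread().isInterrupted()) {
                fetchNextJsonChunk(); // one unit of the long-running work
                Thread.sleep(1000);   // blocking calls throw InterruptedException
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // restore the flag for any caller
        }
        // returning from run() is what actually ends the thread
    }

    private void fetchNextJsonChunk() {
        // hypothetical placeholder for the HTTP GET against the JSON service
    }
}

Any child threads that functionX() spawns need the same treatment (and need to be interrupted in turn). Also note that a socket blocked inside an HTTP read generally does not react to interrupt() at all; with Apache HttpClient you would additionally abort the in-flight request. This is also why Future.cancel(true) appeared to do nothing: it delivers the same interrupt.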
I have a very simple SparkSQL setup connecting to a Postgres database, and I'm trying to get a DataFrame from a table; the DataFrame should have some X number of partitions (say 2). The code is as follows:
Map<String, String> options = new HashMap<String, String>();
options.put("url", DB_URL);
options.put("driver", POSTGRES_DRIVER);
options.put("dbtable", "select ID, OTHER from TABLE limit 1000");
options.put("partitionColumn", "ID");
options.put("lowerBound", "100");
options.put("upperBound", "500");
options.put("numPartitions","2");
DataFrame housingDataFrame = sqlContext.read().format("jdbc").options(options).load();
For some reason, one partition of the DataFrame ends up containing almost all of the rows.
As far as I can tell, lowerBound/upperBound are the parameters for fine-tuning this. The SparkSQL documentation (Spark 1.4.0 - spark-sql_2.11) says they are used to define the stride, not to filter/range the partition column. But that raises several questions.
I can't seem to find a clear answer to these questions, and I'm wondering whether someone could clear this up for me, because it is currently hurting my cluster's performance: when processing X million rows, all the heavy lifting falls on a single executor.
Cheers, and thanks for your time.
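My understanding of how the stride works (worth verifying against your Spark version): Spark computes stride = (upperBound - lowerBound) / numPartitions, so with the values above that is (500 - 100) / 2 = 200, and the two generated queries are roughly WHERE ID < 300 (plus NULLs) and WHERE ID >= 300. Nothing outside [100, 500] is filtered out; rows below the lower bound land in the first partition and rows above it in the last one, which would produce exactly the skew described. A hedged sketch of the usual workaround, reusing the setup from the snippet above: look up the column's real min/max first and use them as the bounds (minId/maxId are hypothetical values you would query yourself).

// Use the column's actual MIN/MAX as bounds so the stride splits rows evenly.
// minId/maxId are hypothetical, e.g. from "SELECT min(ID), max(ID) FROM TABLE".
long minId = 1L;
long maxId = 1000000L;

Map<String, String> options = new HashMap<String, String>();
options.put("url", DB_URL);
options.put("driver", POSTGRES_DRIVER);
options.put("dbtable", "(select ID, OTHER from TABLE) as t"); // aliased subquery
options.put("partitionColumn", "ID");
options.put("lowerBound", String.valueOf(minId));
options.put("upperBound", String.valueOf(maxId));
options.put("numPartitions", "2");

DataFrame evenDataFrame = sqlContext.read().format("jdbc").options(options).load();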