小编Min*_*nie的帖子

使用SparkR将csv文件读入Rstudio时清空输出

我是SparkR的新用户.我正在尝试使用SparkR将csv文件加载到R中.

Sys.setenv(SPARK_HOME="/usr/local/bin/spark-1.5.1-bin-hadoop2.6")
.libPaths(c(file.path(Sys.getenv("SPARK_HOME"), "R", "lib"), .libPaths()))

library(SparkR)

sc <- sparkR.init(master="local", sparkPackages="com.databricks:spark-csv_2.11:1.0.3")
sqlContext <- sparkRSQL.init(sc)
Run Code Online (Sandbox Code Playgroud)

我使用了nyc飞行数据集的子集进行测试.它只有4行4列:gyear month day dep_time 2013 1 1 517 2013 1 1 533 2013 1 1 542 2013 1 1 544

n5 <- read.df(sqlContext, "/users/zhiyi.zhang/Downloads/n5.csv", "com.databricks.spark.csv", header="true")
head(n5)
Run Code Online (Sandbox Code Playgroud)

然后,当我想查看数据时,我看到了这些错误:

`15/11/03 13:45:53 ERROR CsvRelation$: Exception while parsing line: 2013,1,1,517. 

java.lang.ClassCastException: java.lang.String cannot be cast to org.apache.spark.unsafe.types.UTF8String

at org.apache.spark.sql.catalyst.expressions.BaseGenericInternalRow$class.getUTF8String(rows.scala:45)
at org.apache.spark.sql.catalyst.expressions.GenericMutableRow.getUTF8String(rows.scala:247)
at org.apache.spark.sql.catalyst.expressions.BoundReference.eval(BoundAttribute.scala:49)
at org.apache.spark.sql.catalyst.expressions.UnaryExpression.eval(Expression.scala:247)
at org.apache.spark.sql.catalyst.expressions.InterpretedMutableProjection.apply(Projection.scala:82)
at org.apache.spark.sql.catalyst.expressions.InterpretedMutableProjection.apply(Projection.scala:61)
at com.databricks.spark.csv.CsvRelation$$anonfun$com$databricks$spark$csv$CsvRelation$$parseCSV$1.apply(CsvRelation.scala:150)
at com.databricks.spark.csv.CsvRelation$$anonfun$com$databricks$spark$csv$CsvRelation$$parseCSV$1.apply(CsvRelation.scala:130)
at scala.collection.Iterator$$anon$13.hasNext(Iterator.scala:371)
at scala.collection.Iterator$$anon$11.hasNext(Iterator.scala:327)
at scala.collection.Iterator$$anon$11.hasNext(Iterator.scala:327) …
Run Code Online (Sandbox Code Playgroud)

csv r apache-spark sparkr

4
推荐指数
1
解决办法
566
查看次数

标签 统计

apache-spark ×1

csv ×1

r ×1

sparkr ×1