我想知道镶木地板文件是否可以像文件流和目录流一样提供给Spark Streaming.那么一个新的镶木地板生成的时候火花应该得到它的线?
我有一个在集群模式下运行的简单 Spark 应用程序。
val funcGSSNFilterHeader = (x: String) => {
println(!x.contains("servedMSISDN")
!x.contains("servedMSISDN")
}
val ssc = new StreamingContext(sc, Seconds(batchIntervalSeconds))
val ggsnFileLines = ssc.fileStream[LongWritable, Text, TextInputFormat]("C:\\Users\\Mbazarganigilani\\Documents\\RA\\GGSN\\Files1", filterF, false)
val ggsnArrays = ggsnFileLines
.map(x => x._2.toString()).filter(x => funcGSSNFilterHeader(x))
ggsnArrays.foreachRDD(s => {println(x.toString()})
Run Code Online (Sandbox Code Playgroud)
我需要在 map 函数中打印 !x.contains("servedMSISDN") 以进行调试,但这不会在控制台上打印
我正在尝试启动 Redshift 集群。我创建了自己的 VPC,(180.18.16.0/16)包含两个子网:
180.18.16.0/20
180.18.0.0/20
Run Code Online (Sandbox Code Playgroud)
这些都在俄亥俄地区。然后,我尝试在同一区域创建 Redshift 集群。但是,当我尝试列出 VPC 时,它们已列出,但已被禁用,并且我无法选择任何 VPC(无论是默认 VPC 还是我自己的 VPC)。

我究竟做错了什么?
我编写了一个简单的代码来将控制台的输出打印到一个 BytreArrayOutputStream。我正在使用 JDK 1.7。但是,当我想要缓冲区为字符串时,我不能使用方法 BytreArrayOutputStream.ToString (String Charset)..
它没有这个功能。我正在使用 JDk1.7,应该支持它。我在 windows 7 中使用 Netbean。
PrintStream co1=new PrintStream(new java.io.ByteArrayOutputStream());
System.setOut(co1);
StatsUtil.submit(command);
co1.flush();
co1.close();
co1.toString();//acceptted this but it doesn't give me the stream content
String t=co1.toString("UTF-8");//the compliers give me errors the method doesn't get any string parameter
Run Code Online (Sandbox Code Playgroud)
任何帮助将不胜感激。
我试图向 Scala 添加一个元素 HashMap
val c2 = new collection.mutable.HashMap[String,Int]()
c2 += ("hh",1)
Run Code Online (Sandbox Code Playgroud)
但上面给了我一个编译错误。
[error] found : String("hh")
[error] required: (String, Int)
[error] c2 += ("hh", 1)
[error] ^
[error] /scalathing/Main.scala:5: type mismatch;
[error] found : Int(1)
[error] required: (String, Int)
[error] c2 += ("hh", 1)
[error] ^
[error] two errors found
[error] (compile:compileIncremental) Compilation failed
[error] Total time: 3 s, completed Sep 1, 2016 1:22:52 AM
Run Code Online (Sandbox Code Playgroud)
我添加的一对似乎是HashMap. 为什么会出现编译错误?
我正在运行一个Spark应用程序,但总是遇到内存不足的异常。
Exception in thread "main" java.lang.OutOfMemoryError: unable to create new native thread
Run Code Online (Sandbox Code Playgroud)
我在Linux上的节点集群中的local [5]下运行程序,但是它仍然给我这个错误。
我有一个案例课
case class AGG_RECON(SUBSCRIBER_ID:String , ChargingID:String ,NodeID:String, START_TIME:String, DXE_First_Report_Time:String, DXE_Last_Report_Time:String, DXE_Session_Start_Time:String, DXE_Bearer_Creation_Time:String, DXE_IMSI:String, DXE_MSISDN:String, DXE_RAT_Type:String,
DXE_Subscriber_Type:String, DXE_VPMN:String, DXE_ROAM_TYPE:String, DXE_APN:String, DXE_APN_Category:String, DXE_Charging_Characteristics:String, DXE_CDR_Count:String,
NW_First_Report_Time:String, NW_Last_Report_Time:String, NW_Session_Start_Time:String, NW_IMSI:String, NW_MSISDN:String, NW_RAT_Type:String, NW_ROAM_TYPE:String, NW_APN:String, NW_APN_Category:String, NW_Charging_Characteristics:String, NW_CDR_Count:String,
CHG_First_Report_Time:String, CHG_Last_Report_Time:String, CHG_Session_Start_Time:String, CHG_IMSI:String, CHG_MSISDN:String, CHG_ROAM_TYPE:String, CHG_APN:String,
CHG_APN_Category:String, CHG_Charging_Characteristics:String, CHG_Rate_Plan:String, CHG_Rating_Group:String, CHG_CDR_Count:String, VOL_PROBE_UL_VOL:String, VOL_PROBE_DL_VOL:String, VOL_PROBE_FREE_VOL:String, VOL_PROBE_TOT_VOL:String, VOL_NW_UL_VOL:String,VOL_NW_DL_VOL:String, VOL_NW_FREE_VOL:String, VOL_NW_TOT_VOL:String, VOL_CHG_UL_VOL:String,
VOL_CHG_DL_VOL:String, VOL_CHG_FREE_VOL:String, VOL_CHG_TOT_VOL:String, VOL_DXE_Session_End_Time:String, VOL_NW_Session_End_Time:String,
VOL_CHG_Session_End_Time:String, VOL_Session_Closed_Time:String, VOL_DXE_Is_Completed:String, VOL_NW_Is_Completed:String, VOL_CHG_Is_Completed:String, VOL_Is_Closed:String, VOL_Session_Category:String)
{
override def toString(): String = {
val result = s"AGG_RECON_SUBSCRIBER_ID=${SUBSCRIBER_ID}, AGG_RECON_ChargingID=${ChargingID} ,AGG_RECON_NodeID=${NodeID}, AGG_RECON_START_TIME=${START_TIME}" + …Run Code Online (Sandbox Code Playgroud)