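I am new to Scala and am trying to understand it better by converting equivalent Java code to Scala.
How do I convert Java 8 map, filter, and streams to Scala?
I have the following Java 8 code that I am trying to convert to Scala: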
public Set<String> getValidUsages(String itemId, long sNo, Date timeOfAccess) {
    Set<String> itemSet = Sets.newHashSet();
    TestWindows testWindows = items.get(itemId).getTestWindows();
    final boolean isTV = existsEligibleTestWindow(testWindows.getTV(), timeOfAccess);
    if (isTV) {
        itemSet.add(TV);
    } else {
        final boolean isCableUseable = existsEligibleTestWindow(testWindows.getCableUse(), timeOfAccess);
        final boolean isWifi = existsEligibleTestWindow(testWindows.getWifi(), timeOfAccess);
        if (isCableUseable || isWifi) {
            itemSet.add(MOVIE);
        }
    }
    if (testWindows.getUsageIds() != null) {
        itemSet.addAll(testWindows.getUsageIds()
            .entrySet()
            .stream()
            .filter(entry -> existsEligibleTestWindow(entry.getValue(), timeOfAccess))
            .map(Map.Entry::getKey)
            .collect(Collectors.toSet()));
    }
    return itemSet;
}
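One way the method above might look in Scala is sketched here. It is not a drop-in replacement: `TestWindow`, `TestWindows`, their field names, and the body of `existsEligibleTestWindow` are hypothetical stand-ins (the question does not show them), and the `items.get(itemId)` lookup is left out so the example stays self-contained. The point is the shape: `if`/`else` as an expression, `Option` instead of a null check, and `collect` in place of `stream().filter().map()`.

```scala
import java.util.Date

// Hypothetical stand-ins for the question's types; field names are assumptions.
final case class TestWindow(start: Date, end: Date)
final case class TestWindows(
  tv: List[TestWindow],
  cableUse: List[TestWindow],
  wifi: List[TestWindow],
  usageIds: Option[Map[String, List[TestWindow]]] // Option replaces the Java null check
)

object UsageRules {
  val TV = "TV"
  val MOVIE = "MOVIE"

  // Placeholder rule (assumption): a window is eligible when it contains timeOfAccess.
  // The real method body is elided in the question.
  def existsEligibleTestWindow(windows: List[TestWindow], timeOfAccess: Date): Boolean =
    windows.exists(w => !timeOfAccess.before(w.start) && !timeOfAccess.after(w.end))

  def getValidUsages(testWindows: TestWindows, timeOfAccess: Date): Set[String] = {
    def eligible(ws: List[TestWindow]): Boolean = existsEligibleTestWindow(ws, timeOfAccess)

    // if/else is an expression, so no mutable set is needed
    val base: Set[String] =
      if (eligible(testWindows.tv)) Set(TV)
      else if (eligible(testWindows.cableUse) || eligible(testWindows.wifi)) Set(MOVIE)
      else Set.empty[String]

    // collect = filter + map in one pass over the (key, value) pairs
    val fromUsageIds: Set[String] =
      testWindows.usageIds
        .getOrElse(Map.empty[String, List[TestWindow]])
        .collect { case (usageId, windows) if eligible(windows) => usageId }
        .toSet

    base ++ fromUsageIds
  }
}
```

If the data really comes from Java collections, converting them first (for example with `scala.collection.JavaConverters` and `.asScala`) would let the same `collect` call be used.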
private boolean existsEligibleTestWindow(List<TestWindow> windows, Date timeOfAccess) …

I am trying to use SQL on a Spark DataFrame, but one column of the DataFrame holds a string value that is really a JSON-like structure:
I saved the DataFrame to a temporary table: TestTable
When I run desc on it:
col_name data_type
requestId string
name string
features string
But the features value is JSON:
{"places":11,"movies":2,"totalPlacesVisited":0,"totalSpent":13,"SpentMap":{"Movie":2,"Park Visit":11},"benefits":{"freeTime":13}}
I only want to query the rows of TestTable where totalSpent > 10. Can someone tell me how to do this?
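Because features is stored as a string, one option is to pull the field out with `get_json_object` and cast it before comparing. The sketch below assumes Spark 2.x or later on the classpath with a local `SparkSession`; the object and method names are made up for illustration, and the second sample row is invented to show a non-matching case.

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}
import org.apache.spark.sql.functions.{col, get_json_object}

object TotalSpentFilter {
  // DataFrame API: extract $.totalSpent from the JSON string, cast to long, compare
  def byApi(df: DataFrame): DataFrame =
    df.filter(get_json_object(col("features"), "$.totalSpent").cast("long") > 10)

  // Same filter in SQL against the temp view named TestTable
  def bySql(spark: SparkSession): DataFrame =
    spark.sql(
      """SELECT requestId, name, features
        |FROM TestTable
        |WHERE CAST(get_json_object(features, '$.totalSpent') AS BIGINT) > 10""".stripMargin)

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder().master("local[*]").appName("TotalSpentFilter").getOrCreate()
    import spark.implicits._

    val df = Seq(
      // first row uses the features value from the question
      ("232323", "ravi",
        """{"places":11,"movies":2,"totalPlacesVisited":0,"totalSpent":13,"SpentMap":{"Movie":2,"Park Visit":11},"benefits":{"freeTime":13}}"""),
      // second row is invented sample data
      ("232324", "sample", """{"places":1,"movies":0,"totalSpent":4}""")
    ).toDF("requestId", "name", "features")
    df.createOrReplaceTempView("TestTable")

    byApi(df).show(false)
    bySql(spark).show(false)
    spark.stop()
  }
}
```

`get_json_object` returns a string (or null when the path is missing), which is why the cast is there; rows without a totalSpent field simply drop out of the filter.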
My JSON file looks like this:
{
"requestId": 232323,
"name": "ravi",
"features": "{\"places\":11,\"movies\":2,\"totalPlacesVisited\":0,\"totalSpent\":13,\"SpentMap\":{\"Movie\":2,\"Park Visit\":11},\"benefits\":{\"freeTime\":13}}"
}
features is a string. I only need totalSpent. I tried:
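import org.apache.spark.sql.types.{LongType, StringType, StructField, StructType}

// Schema for the fields wanted from the nested "features" object
val features = StructType(Array(
  StructField("totalSpent", LongType, true),
  StructField("movies", LongType, true)
))
// Top-level schema (the trailing comma after the last field is removed)
val schema = StructType(Array(
  StructField("requestId", StringType, true),
  StructField("name", StringType, true),
  StructField("features", features, true)
))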
val records = sqlContext.read.schema(schema).json(filePath)
because features in every request is a JSON string. But this gives me an error.
When I tried
val records = sqlContext.jsonFile(filePath)
records.printSchema
it shows me:
root
|-- requestId: string (nullable = true)
|-- features: string (nullable = true)
|-- name: string (nullable = true)
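Since the printed schema reports features as a plain string, one possible approach is to read it as a string and parse it in a second step with `from_json`, reusing the two-field struct from the attempt above. This is only a sketch: it assumes Spark 2.1 or later (where `from_json` is available) rather than the `sqlContext` API used in the question, and the object, method, and `parsed` column names are made up for illustration.

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}
import org.apache.spark.sql.functions.{col, from_json}
import org.apache.spark.sql.types.{LongType, StructField, StructType}

object FeaturesParser {
  // Only the fields of interest; other keys in the JSON string are ignored
  val featuresSchema: StructType = StructType(Array(
    StructField("totalSpent", LongType, true),
    StructField("movies", LongType, true)
  ))

  // Parse the string column into a struct, then flatten the wanted fields
  def withParsedFeatures(df: DataFrame): DataFrame =
    df.withColumn("parsed", from_json(col("features"), featuresSchema))
      .select(
        col("requestId"),
        col("name"),
        col("parsed.totalSpent").as("totalSpent"),
        col("parsed.movies").as("movies"))

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder().master("local[*]").appName("FeaturesParser").getOrCreate()
    import spark.implicits._

    // In-memory stand-in for the file; features is a string, as in the printed schema
    val records = Seq(
      ("232323", "ravi",
        """{"places":11,"movies":2,"totalPlacesVisited":0,"totalSpent":13,"SpentMap":{"Movie":2,"Park Visit":11},"benefits":{"freeTime":13}}""")
    ).toDF("requestId", "name", "features")

    withParsedFeatures(records).show(false)
    spark.stop()
  }
}
```

Declaring features as a `StructType` in the read schema only fits when the file holds a nested object; here the file holds a quoted string, so parsing after the read matches the data more closely.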
Can I use parallelize inside a StructField when creating the schema? I tried:
I first tried with : …

My data looks like this:
[null,223433,WrappedArray(),null,460036382,0,home,home,home]
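How can I query in Spark SQL whether col3 is empty? I tried explode, but when I do that, the rows with empty arrays disappear. Can someone suggest a way to do this?
I tried: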
val homeSet = result.withColumn("subscriptionProvider", explode($"subscriptionProvider"))
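Here subscriptionProvider (WrappedArray()) is a column holding an array of values, but some of the arrays can be empty. I need to get the rows where subscriptionProvider is null (empty), and also the rows where the subscriptionProvider array contains "Comcast".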
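One way to keep the empty-array rows is to skip explode entirely and filter on the array column itself with `size` and `array_contains`. The sketch below assumes Spark 2.x or later with a local `SparkSession`; the `id` column, the "Dish" value, and the object and method names are invented for illustration.

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}
import org.apache.spark.sql.functions.{array_contains, col, size}

object ProviderFilter {
  // Keep rows whose array is null, empty, or contains "Comcast"
  def emptyOrComcast(df: DataFrame): DataFrame = {
    val provider = col("subscriptionProvider")
    df.filter(provider.isNull || size(provider) === 0 || array_contains(provider, "Comcast"))
  }

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder().master("local[*]").appName("ProviderFilter").getOrCreate()
    import spark.implicits._

    // Invented sample rows: empty array, matching array, non-matching array, null
    val result = Seq[(Int, Option[Seq[String]])](
      (1, Some(Seq.empty[String])),
      (2, Some(Seq("Comcast"))),
      (3, Some(Seq("Dish"))),
      (4, None)
    ).toDF("id", "subscriptionProvider")

    emptyOrComcast(result).show(false)
    spark.stop()
  }
}
```

If one row per array element is still needed, `explode_outer` (available from Spark 2.2) keeps null and empty arrays as a single row with a null element, whereas `explode` drops them.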