I have a factory (registry design pattern) initialization class:
public class GenericFactory extends AbstractFactory {
    public GenericFactory() {
        factory.put("Test",
            defaultSupplier(() -> new Test()));
        factory.put("TestWithArgs",
            defaultSupplier(() -> new TestWithArgs(2,4)));
    }
}
interface Validation
Test implements Validation
TestWithArgs implements Validation
And in AbstractFactory:
protected Supplier<Validation> defaultSupplier(Class<? extends Validation> validationClass) {
    return () -> {
        try {
            return validationClass.newInstance();
        } catch (InstantiationException | IllegalAccessException e) {
            throw new RuntimeException("Unable to create instance of " + validationClass, e);
        }
    };
}
But I keep getting a "cannot infer functional interface type" error. What am I doing wrong here?
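The error most likely comes from the call sites: defaultSupplier is declared to take a Class, but it is being handed a lambda, and Class is not a functional interface, so the compiler has no target type for the lambda. A minimal sketch of one way around it follows. It passes Test.class where a no-argument constructor exists and registers a plain Supplier lambda where arguments are needed. The outer class SupplierRegistrySketch, the create helper and the type of the factory map are assumptions made for illustration, and getDeclaredConstructor().newInstance() is swapped in for the deprecated Class.newInstance().

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Supplier;

// Hypothetical wrapper class; the nested types mirror the names used in the question.
class SupplierRegistrySketch {

    interface Validation {}

    static class Test implements Validation {}

    static class TestWithArgs implements Validation {
        final int a;
        final int b;

        TestWithArgs(int a, int b) {
            this.a = a;
            this.b = b;
        }
    }

    abstract static class AbstractFactory {
        // Assumed shape of the registry: name -> supplier of a fresh instance.
        protected final Map<String, Supplier<Validation>> factory = new HashMap<>();

        // Reflection-based supplier; only suitable for classes with a no-arg constructor.
        protected Supplier<Validation> defaultSupplier(Class<? extends Validation> validationClass) {
            return () -> {
                try {
                    return validationClass.getDeclaredConstructor().newInstance();
                } catch (ReflectiveOperationException e) {
                    throw new RuntimeException("Unable to create instance of " + validationClass, e);
                }
            };
        }

        // Hypothetical lookup helper, not part of the question.
        Validation create(String name) {
            return factory.get(name).get();
        }
    }

    static class GenericFactory extends AbstractFactory {
        GenericFactory() {
            // Pass the Class object, which is what defaultSupplier expects.
            factory.put("Test", defaultSupplier(Test.class));
            // A lambda is already a Supplier, so it can be registered directly.
            factory.put("TestWithArgs", () -> new TestWithArgs(2, 4));
        }
    }

    public static void main(String[] args) {
        GenericFactory f = new GenericFactory();
        System.out.println(f.create("Test").getClass().getSimpleName());
        System.out.println(f.create("TestWithArgs").getClass().getSimpleName());
    }
}
```

Registering the lambda directly keeps constructor arguments out of the reflection path, which only knows how to call a no-argument constructor.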
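I have two DataFrames and I am trying to find the differences between them. Both DataFrames contain an array of structs. I do not need one of the keys in that struct, so I first drop it and then convert the array to a JSON string. When comparing, I need to know how many elements in that array (JSON) have changed. Is there a way to do this?
Both base_data_set and target_data_set contain ID and KEY. KEY is an array<Struct>: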
root
 |-- id: string (nullable = true)
 |-- result: array (nullable = true)
 |    |-- element: struct (containsNull = true)
 |    |    |-- key1: integer (nullable = true)
 |    |    |-- key3: string (nullable = false)
 |    |    |-- key2: string (nullable = true)
 |    |    |-- key4: string (nullable = true)
val temp_base = base_data_set
  .withColumn("base_result", explode(base_data_set(RESULT)))
  .withColumn("base",
    struct($"base_result.key1", $"base_result.key2", $"base_result.key3"))
  .groupBy(ID)
  .agg(to_json(collect_list("base")).as("base_picks"))
val temp_target …
How do I add all the entries using Java 8?
processeditemList is a Map<Integer, Map<Item, Boolean>>
Right now I am doing:
List<Item> itemList = Lists.newLinkedList();
for (Map<Item, Boolean> entry : processeditemList.values()) {
    itemList.addAll(entry.keySet());
}
return itemList;
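The loop above can probably be expressed with the Java 8 Stream API by flattening the key sets of the inner maps. The sketch below is generic over the key type so that it does not depend on the Item class from the question. The class name CollectKeysSketch and the method name allKeys are hypothetical, and LinkedList is kept only to match the original code.

```java
import java.util.LinkedHashMap;
import java.util.LinkedList;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

// Hypothetical helper class for illustration.
class CollectKeysSketch {

    // Collects the keys of every inner map into one list, in iteration order.
    static <K> List<K> allKeys(Map<Integer, Map<K, Boolean>> processed) {
        return processed.values().stream()
                .flatMap(inner -> inner.keySet().stream())
                .collect(Collectors.toCollection(LinkedList::new));
    }

    public static void main(String[] args) {
        Map<String, Boolean> first = new LinkedHashMap<>();
        first.put("a", true);
        first.put("b", false);
        Map<String, Boolean> second = new LinkedHashMap<>();
        second.put("c", true);

        Map<Integer, Map<String, Boolean>> processed = new LinkedHashMap<>();
        processed.put(1, first);
        processed.put(2, second);

        System.out.println(allKeys(processed)); // prints [a, b, c]
    }
}
```

If the concrete list type does not matter, Collectors.toList() would be the shorter collector.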
Right now I am doing:
Map<Item, Boolean> processedItem = processedItemMap.get(i);
Map.Entry<Item, Boolean> entrySet = getNextPosition(processedItem);
Item key = entrySet.getKey();
Boolean value = entrySet.getValue();
public static Map.Entry<Item, Boolean> getNextPosition(Map<Item, Boolean> processedItem) {
    return processedItem.entrySet().iterator().next();
}
Is there a cleaner way to do this with Java 8?
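One possible Java 8 variant is to take the first entry through a stream and return an Optional. Unlike iterator().next(), this does not throw NoSuchElementException on an empty map. The sketch below is generic so it does not need the Item class, and the names FirstEntrySketch and firstEntry are hypothetical. Note that "first" is only well defined for maps with a stable iteration order, such as LinkedHashMap or TreeMap.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Optional;

// Hypothetical helper class for illustration.
class FirstEntrySketch {

    // Returns the first entry in iteration order, or an empty Optional for an empty map.
    static <K, V> Optional<Map.Entry<K, V>> firstEntry(Map<K, V> map) {
        return map.entrySet().stream().findFirst();
    }

    public static void main(String[] args) {
        Map<String, Boolean> processedItem = new LinkedHashMap<>();
        processedItem.put("x", true);
        processedItem.put("y", false);

        // Key and value are read only if an entry is present.
        firstEntry(processedItem).ifPresent(e ->
                System.out.println(e.getKey() + " -> " + e.getValue())); // prints x -> true
    }
}
```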
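I am migrating my system from Java to Scala. In my Java code I used the registry pattern to get an implementation from a string. Is there anything similar in Scala? I am new to Scala; could someone point me to the right reference?
My Java code: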
public class ItemRegistry {
    private final Map<String, ItemFactory> factoryRegistry;
    public ItemRegistry() {
        this.factoryRegistry = new HashMap<>();
    }
    public ItemRegistry(List<ItemFactory> factories) {
        factoryRegistry = new HashMap<>();
        for (ItemFactory factory : factories) {
            registerFactory(factory);
        }
    }
    public void registerFactory(ItemFactory factory) {
        Set<String> aliases = factory.getRegisteredItems();
        for (String alias : aliases) {
            factoryRegistry.put(alias, factory);
        }
    }
    public Item newInstance(String itemName) throws ItemException {
        ItemFactory factory = factoryRegistry.get(itemName);
        if (factory == null) {
            throw new ItemException("Unable to find factory containing alias " …
I am trying to query 15 days of data from S3. When I query the days individually (one day at a time), it works fine. It also works fine for 14 days. But when I query all 15 days, the job keeps running (hangs) and the task count does not update.
My setup:
I am using a 51-node cluster of r3.4xlarge instances with dynamic allocation and maximum resource allocation turned on.
All I am doing is:
val startTime="2017-11-21T08:00:00Z"
val endTime="2017-12-05T08:00:00Z"
val start = DateUtils.getLocalTimeStamp( startTime )
val end = DateUtils.getLocalTimeStamp( endTime )
val days: Int = Days.daysBetween( start, end ).getDays
val files: Seq[String] = (0 to days)
  .map( start.plusDays )
  .map( d => s"$input_path${DateTimeFormat.forPattern( "yyyy/MM/dd" ).print( d )}/*/*" )
sqlSession.sparkContext.textFile( files.mkString( "," ) ).count
When I run the same job for 14 days, I get a count of 197337380, and when I run the 15th day on its own I get 27676788. But when I query all 15 days together, the job hangs.
Update:
This works fine:
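var df = sqlSession.createDataFrame(sc.emptyRDD[Row], schema)
for (n <- files) {
  val tempDF = sqlSession.read.schema( schema ).json(n)
  // Assumed intent: the original `df(tempDF)` does not compile; union appends each day's rows.
  df = df.union(tempDF)
}
df.count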
But can someone explain why it works now when it did not before?
Update: after setting mapreduce.input.fileinputformat.split.minsize to 256 GB, it now works fine.
How can I compare these two strings:
val a = "fit bit versa"
val b = "fitbit"
Another example:
val a = "go pro hero 6"
val b = "gopro"
Another example:
val a = "hero go pro 6"
val b = "gopro"
Another example:
val a = "hero 6 go pro"
val b = "gopro"
For all of the comparisons above I want to get "true", but not for this one:
val a = "vegan protein powder"
val b = "vega"
This should be false.
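One rule that would satisfy all of the examples above is: b matches if some run of consecutive words in a, joined without spaces, equals b (ignoring case). The sketch below implements that rule in Java. It is an assumption about the intended semantics rather than something stated in the question, and the class name TokenRunMatchSketch is hypothetical.

```java
// Hypothetical helper class for illustration.
class TokenRunMatchSketch {

    // True if some run of consecutive words in a, concatenated, equals b (case-insensitive).
    static boolean matchPattern(String a, String b) {
        String[] tokens = a.trim().toLowerCase().split("\\s+");
        String target = b.toLowerCase().replaceAll("\\s+", "");
        for (int start = 0; start < tokens.length; start++) {
            StringBuilder joined = new StringBuilder();
            for (int end = start; end < tokens.length; end++) {
                joined.append(tokens[end]);
                if (joined.length() > target.length()) {
                    break; // already longer than the target, no match from this start
                }
                if (joined.toString().equals(target)) {
                    return true;
                }
            }
        }
        return false;
    }

    public static void main(String[] args) {
        System.out.println(matchPattern("fit bit versa", "fitbit"));      // prints true
        System.out.println(matchPattern("hero 6 go pro", "gopro"));       // prints true
        System.out.println(matchPattern("vegan protein powder", "vega")); // prints false
    }
}
```

Because whole words are compared, "vega" is not matched by "vegan", whereas a plain substring check on the space-stripped string would wrongly accept it.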
Currently I am doing:
def matchPattern(a: String, b: String): Boolean =
{
  val dd = a.split(" ")
  val result = dd.map(_.toLowerCase())
  if (result contains b.toLowerCase) true
  else …
I have a Seq and a DataFrame. The DataFrame contains a column of array type. I am trying to remove the elements in the Seq from that column.
For example:
val stop_words = Seq("a", "and", "for", "in", "of", "on", "the", "with", "s", "t")
+---------------------------------------------------+
|sorted_items |
+---------------------------------------------------+
|[flannel, and, for, s, shirts, sleeve, warm] |
|[3, 5, kitchenaid, s] |
|[5, 6, case, flip, inch, iphone, on, xs] |
|[almonds, chocolate, covered, dark, joe, s, the] |
|null |
|[] |
|[animation, book] |
Expected output:
+---------------------------------------------------+
|sorted_items |
+---------------------------------------------------+
|[flannel, shirts, sleeve, warm] |
|[3, 5, kitchenaid] |
|[5, 6, case, flip, inch, …
How do I get a substring of this string:
"s3n://bucket/test/files/*/*"
I want to get just s3n://bucket/test/files. I tried split:
"s3n://bucket/test/files/*/*".split("/*/*"), but this gives me an array of strings with one element per character.
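I have a cache that loads all customer details from the database every day. But before loading the daily customer details, I need to remove all previous entries from the cache.
Currently I am doing: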
public enum PeriodicUpdater {
    TIMER;
    private final AtomicBoolean isPublishing = new AtomicBoolean(false);
    private final long period = TimeUnit.DAYS.toMillis(1);
    @Autowired
    @Qualifier("TestUtils") @Setter
    private TestUtils testUtils;
    public synchronized boolean initialize() {
        return initialize(period, period);
    }
    boolean initialize(long delay, long period) {
        if (isPublishing.get()) {
            return false;
        }
        TimerTask task = new TimerTask() {
            @Override public void run() {
                try {
                    String path = getFile();
                    if(TestUtils.getFileNameCache().getIfPresent(path) == null) {
                        TestUtils.setFileNameCache(testUtils.buildFileCache(path));
                    }
                } catch (Exception e) {
                    log.warn("Failed!", e);
                }
            }
        }; …