I am trying to get the Link field from my transformation below in a User Defined Java Class step.

This is the code I wrote in the User Defined Java Class:
private String link;

public boolean processRow(StepMetaInterface smi, StepDataInterface sdi) throws KettleException
{
    Object[] r = getRow();
    if (r == null) {
        setOutputDone();
        return false;
    }
    if (first) {
        link = getParameter("Link");
        first = false;
    }
    String linkField = get(Fields.In, link).getString(r);
    logBasic("link:" + link);
    return true;
}
When I run the code above, this is the error I get in the User Defined Java Class step:
2016/06/28 11:26:57 - User Defined Java Class.0 - ERROR (version 5.4.0.1-130, build 1 from 2015-06-14_12-34-55 by buildguy) : Unexpected error
2016/06/28 …

I am new to boilerpipe and am trying the following basic code:
package contentExtraction;

import java.net.URL;
import de.l3s.boilerpipe.extractors.ArticleExtractor;

public class ContentExtractor {
    public static void main(String[] args) throws Exception {
        final URL url = new URL(
            // "http://www.l3s.de/web/page11g.do?sp=page11g&link=ln104g&stu1g.LanguageISOCtxParam=en"
            "http://www.dn.se/nyheter/vetenskap/annu-godare-choklad-med-hjalp-av-dna-teknik"
        );
        System.out.println(ArticleExtractor.INSTANCE.getText(url));
    }
}
But I get the following error when trying to run the code above:
Exception in thread "main" java.lang.NoClassDefFoundError: org/apache/xerces/parsers/AbstractSAXParser
at java.lang.ClassLoader.defineClass1(Native Method)
at java.lang.ClassLoader.defineClass(Unknown Source)
at java.security.SecureClassLoader.defineClass(Unknown Source)
at java.net.URLClassLoader.defineClass(Unknown Source)
at java.net.URLClassLoader.access$100(Unknown Source)
at java.net.URLClassLoader$1.run(Unknown Source)
at java.net.URLClassLoader$1.run(Unknown Source)
at java.security.AccessController.doPrivileged(Native Method)
at java.net.URLClassLoader.findClass(Unknown Source)
at java.lang.ClassLoader.loadClass(Unknown Source)
at sun.misc.Launcher$AppClassLoader.loadClass(Unknown Source)
at java.lang.ClassLoader.loadClass(Unknown Source)
at de.l3s.boilerpipe.sax.BoilerpipeSAXInput.getTextDocument(BoilerpipeSAXInput.java:51)
at …

I am trying to perform POS tagging using ClassifierBasedPOSTagger with classifier_builder=MaxentClassifier.train. Here is the code:
from nltk.tag.sequential import ClassifierBasedPOSTagger
from nltk.classify import MaxentClassifier
from nltk.corpus import brown
brown_tagged_sents = brown.tagged_sents(categories='news')
size = int(len(brown_tagged_sents) * 0.9)
train_sents = brown_tagged_sents[:size]
test_sents = brown_tagged_sents[size:]
me_tagger = ClassifierBasedPOSTagger(train=train_sents, classifier_builder=MaxentClassifier.train)
print(me_tagger.evaluate(test_sents))
But after running the code for an hour, I see that it is still initializing ClassifierBasedPOSTagger(train=train_sents, classifier_builder=MaxentClassifier.train). In the output, I can see the following training log:
==> Training (100 iterations)
Iteration Log Likelihood Accuracy
---------------------------------------
1 -5.35659 0.007
2 -0.85922 0.953
3 -0.56125 0.986
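I think the number of iterations will reach 100 before the classifier is ready to tag parts of speech for any input, and I guess that will take a whole day. Why does it take so much time? Would reducing the number of iterations make this code somewhat practical (meaning less time while still being useful enough), and if so, how do I reduce the iterations?
EDIT
After 1.5 hours, I get the following output: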
==> Training (100 iterations)
Iteration Log Likelihood Accuracy
---------------------------------------
1 -5.35659 0.007
2 -0.85922 …

I would like to know which approach is more efficient when comparing two classes.
Approach 1:
a = '123'
a.class.name == 'String'
Approach 2:
a = '123'
a.kind_of? String
Any pointers would be appreciated. Thanks!
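
One way to look at the two approaches is to wrap each in a small method and time them with Ruby's standard Benchmark module. This is only a rough sketch: the helper names string_by_name? and string_by_kind?, the Label subclass, and the iteration count are all made up for illustration, and the timings will vary by machine and Ruby version. Note that the two checks are not equivalent, since kind_of? also accepts subclasses while the name comparison matches the exact class only.

```ruby
require 'benchmark'

# Hypothetical subclass, used only to show where the two checks disagree.
class Label < String; end

# Approach 1: compare the class name as a string (matches the exact class only).
def string_by_name?(obj)
  obj.class.name == 'String'
end

# Approach 2: ask whether the object is a String or an instance of a subclass.
def string_by_kind?(obj)
  obj.kind_of?(String)
end

plain = '123'
label = Label.new('123')

puts string_by_name?(plain)  # true
puts string_by_kind?(plain)  # true
puts string_by_name?(label)  # false, because label.class.name is "Label"
puts string_by_kind?(label)  # true, because Label inherits from String

# Rough timing comparison; the iteration count is arbitrary.
n = 200_000
Benchmark.bm(12) do |x|
  x.report('class.name') { n.times { string_by_name?(plain) } }
  x.report('kind_of?')   { n.times { string_by_kind?(plain) } }
end
```

If the exact class matters, a.class == String or a.instance_of?(String) expresses that directly without going through a string comparison.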