I have a set of files named:
Friends - 6x03 - Tow Ross' Denial.srt
Friends - 6x20 - Tow Mac and C.H.E.E.S.E..srt
Friends - 6x05 - Tow Joey's Porshe.srt
I want to rename them as follows:
S06E03.srt
S06E20.srt
S06E05.srt
How can I do this in a Linux terminal? I have installed rename, but I get an error when I use the following:
rename -n 's/(\w+) - (\d{1})x(\d{2})*$/S0$2E$3\.srt/' *.srt
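A likely reason the expression above does not match: `(\d{2})*$` requires the name to end right after the episode digits, while these names continue with ` - Tow ...srt`. Below is a minimal sketch of the intended mapping, written in Java rather than as a rename expression. The class and method names (`EpisodeRename`, `newName`) are made up for illustration, and the pattern assumes every name has the form `<show> - <season>x<episode> - <title>.srt`.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper: maps "Friends - 6x03 - Title.srt" to "S06E03.srt".
class EpisodeRename {
    // Group 1 = season digits, group 2 = episode digits; the title part is ignored.
    private static final Pattern NAME =
            Pattern.compile("^.* - (\\d+)x(\\d+) - .*\\.srt$");

    static String newName(String oldName) {
        Matcher m = NAME.matcher(oldName);
        if (!m.matches()) {
            return oldName; // leave names that do not fit the assumed form untouched
        }
        int season = Integer.parseInt(m.group(1));
        int episode = Integer.parseInt(m.group(2));
        // %02d zero-pads to two digits, so season 6 becomes "06"
        return String.format("S%02dE%02d.srt", season, episode);
    }
}
```

The same capture groups and zero-padding idea carry over to a rename expression once the pattern is allowed to consume the rest of the file name.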

I want an array of lists. In C++ I would write:
List<int> a[100];
This is an array of 100 lists, and each list can hold many elements. I don't know how to do this in C#. Can anyone help me?

I have a table in my database, and I want to run a query like:
SELECT column1, column2 FROM my_table WHERE my_condition;
But I want MySQL to return column2 in utf8 encoding. Is there a function in MySQL for this? If so, what is it?
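
I have indexed a set of documents with Lucene, and I also stored the term vector (DocumentTermVector) for each document's contents. I wrote a program that gets the term-frequency vector of each document, but how can I get the tf-idf vector of each document?
Here is my code, which prints the term frequencies in each document: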
Directory dir = FSDirectory.open(new File(indexDir));
IndexReader ir = IndexReader.open(dir);
for (int docNum = 0; docNum < ir.numDocs(); docNum++) {
    System.out.println(ir.document(docNum).getField("filename").stringValue());
    TermFreqVector tfv = ir.getTermFreqVector(docNum, "contents");
    if (tfv == null) {
        // ignore empty fields
        continue;
    }
    String[] terms = tfv.getTerms();
    int termCount = terms.length;
    int[] freqs = tfv.getTermFrequencies();
    for (int t = 0; t < termCount; t++) {
        System.out.println(terms[t] + " " + freqs[t]);
    }
}
Is there any built-in functionality in Lucene that lets me do this?
Nobody helped, so I did it myself:
Directory dir = FSDirectory.open(new File(indexDir));
IndexReader ir = IndexReader.open(dir);
int docNum;
for (docNum = 0; docNum<ir.numDocs(); …
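The tf-idf weighting asked about above can be sketched independently of Lucene. This is a minimal sketch, assuming the textbook formula tf * ln(N / df) and assuming the per-document term frequencies have already been collected into maps (for instance from the term vectors printed in the question). `TfIdfSketch` and `tfIdf` are hypothetical names, and the formula is not Lucene's own scoring formula. With a real index, the document frequency could presumably come from `IndexReader.docFreq(Term)` instead of being recounted.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper: turns per-document term frequencies into tf-idf weights.
class TfIdfSketch {
    /** docs holds one term -> frequency map per document; the result has the same shape. */
    static List<Map<String, Double>> tfIdf(List<Map<String, Integer>> docs) {
        // Document frequency: in how many documents each term occurs.
        Map<String, Integer> docFreq = new HashMap<>();
        for (Map<String, Integer> doc : docs) {
            for (String term : doc.keySet()) {
                docFreq.merge(term, 1, Integer::sum);
            }
        }
        int n = docs.size();
        List<Map<String, Double>> result = new ArrayList<>();
        for (Map<String, Integer> doc : docs) {
            Map<String, Double> weights = new HashMap<>();
            for (Map.Entry<String, Integer> e : doc.entrySet()) {
                // Assumed weighting: raw term frequency times ln(N / df).
                double idf = Math.log((double) n / docFreq.get(e.getKey()));
                weights.put(e.getKey(), e.getValue() * idf);
            }
            result.add(weights);
        }
        return result;
    }
}
```

A term that occurs in every document gets weight 0 under this formula, which is why smoothed variants of idf are often preferred in practice.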
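
I am trying to build an index in Lucene using multiple threads, so I wrote the following code. First I find the files, and for each file I create a thread to index it. After that, I join the threads and optimize the index. It works, but I'm not sure: can I trust it at scale? Is there any way to improve it?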
import java.io.File;
import java.io.FileFilter;
import java.io.FileReader;
import java.io.IOException;
import java.io.BufferedReader;
import org.apache.lucene.index.IndexWriter;
import org.apache.lucene.document.Field;
import org.apache.lucene.document.Document;
import org.apache.lucene.store.RAMDirectory;
import org.apache.lucene.analysis.standard.StandardAnalyzer;
import org.apache.lucene.analysis.StopAnalyzer;
import org.apache.lucene.index.IndexReader;
import org.apache.lucene.store.Directory;
import org.apache.lucene.store.FSDirectory;
import org.apache.lucene.util.Version;
import org.apache.lucene.index.TermFreqVector;
public class mIndexer extends Thread {
    private File ifile;
    private static IndexWriter writer;

    public mIndexer(File f) {
        ifile = f.getAbsoluteFile();
    }

    public static void main(String args[]) throws Exception {
        System.out.println("here...");
        String indexDir;
        String dataDir;
        if (args.length != 2) {
            dataDir = new String("/home/omid/Ranking/docs/"); …
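
I want to write some strings to a file, so I used the BufferedWriter class. Since many threads write to that file, I want to know whether the write and writeLine methods are atomic.
Also, I want the program's output to be split across multiple files, with 100 lines per file (e.g. file.txt0, file.txt1, ...). For example: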
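import java.io.BufferedWriter;
import java.io.FileWriter;
import java.io.IOException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

public class Main {
    static ExecutorService exec = Executors.newFixedThreadPool(5);
    static BufferedWriter bw; // static so that main() and the tasks share it

    public static class myWriter implements Runnable {
        String str;

        myWriter(String str) {
            this.str = str;
        }

        public void run() {
            try {
                bw.write(str);
                bw.newLine(); // BufferedWriter has newLine(), not writeLine()
            } catch (IOException e) {
                e.printStackTrace();
            }
        }
    }

    public static void main(String[] args) throws Exception {
        bw = new BufferedWriter(new FileWriter("train.txt"));
        for (String arg : args)
            exec.execute(new myWriter(arg));
        exec.shutdown(); // stop accepting tasks so awaitTermination can return early
        exec.awaitTermination(100000, TimeUnit.MILLISECONDS);
        bw.close(); // flush buffered output to the file
    }
}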
Can anyone help me? If they are not atomic, how can I make them atomic and avoid collisions?
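Whatever a single call guarantees, writing the text and then the line separator is two separate calls, so another thread can get in between them. One way to approach both requirements is to funnel all writes through a single synchronized method that also counts lines and starts a new file when the limit is reached. The sketch below assumes that approach. `RotatingLineWriter` and its `writeLine` method are hypothetical names, not JDK classes, and the file naming follows the file.txt0, file.txt1 pattern from the question.

```java
import java.io.BufferedWriter;
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical helper: one shared writer that starts a new file every maxLines lines.
class RotatingLineWriter implements AutoCloseable {
    private final Path dir;
    private final String baseName;
    private final int maxLines;
    private BufferedWriter out; // current file, opened lazily
    private int linesInFile;    // lines written to the current file
    private int fileIndex;      // numeric suffix of the next file to open

    RotatingLineWriter(Path dir, String baseName, int maxLines) {
        this.dir = dir;
        this.baseName = baseName;
        this.maxLines = maxLines;
    }

    /** Writes the text and its line separator as one unit; synchronized so lines cannot interleave. */
    synchronized void writeLine(String line) throws IOException {
        if (out == null || linesInFile == maxLines) {
            if (out != null) {
                out.close(); // finish the full file before opening the next one
            }
            out = Files.newBufferedWriter(dir.resolve(baseName + fileIndex));
            fileIndex++;
            linesInFile = 0;
        }
        out.write(line);
        out.newLine();
        linesInFile++;
    }

    @Override
    public synchronized void close() throws IOException {
        if (out != null) {
            out.close();
            out = null;
        }
    }
}
```

Each task would then call `writeLine(str)` on one shared instance instead of touching a BufferedWriter directly, and the instance is closed once the executor has terminated.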

I am trying to run the Nutch 2 crawler on my system, but I get the following error:
Exception in thread "main" org.apache.gora.util.GoraException: java.io.IOException: java.sql.SQLTransientConnectionException: java.net.ConnectException: Connection refused
at org.apache.gora.store.DataStoreFactory.createDataStore(DataStoreFactory.java:167)
at org.apache.gora.store.DataStoreFactory.createDataStore(DataStoreFactory.java:135)
at org.apache.nutch.storage.StorageUtils.createWebStore(StorageUtils.java:69)
at org.apache.nutch.crawl.InjectorJob.run(InjectorJob.java:243)
at org.apache.nutch.crawl.Crawler.runTool(Crawler.java:68)
at org.apache.nutch.crawl.Crawler.run(Crawler.java:136)
at org.apache.nutch.crawl.Crawler.run(Crawler.java:250)
at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
at org.apache.nutch.crawl.Crawler.main(Crawler.java:257)
Caused by: java.io.IOException: java.sql.SQLTransientConnectionException: java.net.ConnectException: Connection refused
at org.apache.gora.sql.store.SqlStore.getConnection(SqlStore.java:747)
at org.apache.gora.sql.store.SqlStore.initialize(SqlStore.java:160)
at org.apache.gora.store.DataStoreFactory.initializeDataStore(DataStoreFactory.java:102)
at org.apache.gora.store.DataStoreFactory.createDataStore(DataStoreFactory.java:161)
... 8 more
Caused by: java.sql.SQLTransientConnectionException: java.net.ConnectException: Connection refused
at org.hsqldb.jdbc.Util.sqlException(Unknown Source)
at org.hsqldb.jdbc.Util.sqlException(Unknown Source)
at org.hsqldb.jdbc.JDBCConnection.<init>(Unknown Source)
at org.hsqldb.jdbc.JDBCDriver.getConnection(Unknown Source)
at org.hsqldb.jdbc.JDBCDriver.connect(Unknown Source)
at java.sql.DriverManager.getConnection(DriverManager.java:620)
at java.sql.DriverManager.getConnection(DriverManager.java:200)
at org.apache.gora.sql.store.SqlStore.getConnection(SqlStore.java:739)
... 11 more
Caused …

I have indexed a set of documents with Lucene (fields: content, category). Each document has its own category, but some of them are labeled as uncategorized. Is there an easy way to classify these documents in Java?
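
I have a Java class with several members, and I want to write a custom cast for it. Is that possible, and if so, how?
Let's assume the class is as follows: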
class Person {
    private int age;
    private float weight;
    // getters and setters and etc
}
I want a cast to int to return the object's age member, and a cast to float to return the object's weight member.
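I would also like to know whether the reverse is possible. In particular, casting an int to Person would return a Person instance whose age is set to that value, and similarly for float.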
I know this question may not have an answer, but since my search turned up nothing useful, I decided to ask.
P.S. I know that for a String, the toString method handles case 1.
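
I am using Lucene 3.5.0 and I want to output the term vector of each document. For example, I want to know the frequency of a term across all documents and in each specific document. My indexing code is: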
import java.io.FileFilter;
import java.io.FileReader;
import java.io.IOException;
import java.io.File;
import java.io.BufferedReader;
import org.apache.lucene.index.IndexWriter;
import org.apache.lucene.document.Field;
import org.apache.lucene.document.Document;
import org.apache.lucene.store.RAMDirectory;
import org.apache.lucene.analysis.standard.StandardAnalyzer;
import org.apache.lucene.store.Directory;
import org.apache.lucene.store.FSDirectory;
import org.apache.lucene.util.Version;
public class Indexer {
    public static void main(String[] args) throws Exception {
        if (args.length != 2) {
            throw new IllegalArgumentException("Usage: java " + Indexer.class.getName() + " <index dir> <data dir>");
        }
        String indexDir = args[0];
        String dataDir = args[1];
        long start = System.currentTimeMillis();
        Indexer indexer = new Indexer(indexDir);
        int numIndexed; …