Posts by rom*_*mbi

Efficiently computing the edit distance between two strings

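I have a string S of length 1000 and a query string Q of length 100. I want to compute the edit distance between Q and every length-100 substring of S. A naive approach is to run the dynamic-programming edit distance computation independently for each substring, i.e. edDist(q, s[0:100]), edDist(q, s[1:101]), edDist(q, s[2:102]), ..., edDist(q, s[900:1000]).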

from numpy import zeros

def edDist(x, y):
    """ Calculate edit distance between sequences x and y using
        matrix dynamic programming.  Return distance. """
    # D[i, j] holds the edit distance between x[:i] and y[:j]
    D = zeros((len(x)+1, len(y)+1), dtype=int)
    D[0, 1:] = range(1, len(y)+1)
    D[1:, 0] = range(1, len(x)+1)
    for i in range(1, len(x)+1):
        for j in range(1, len(y)+1):
            delt = 1 if x[i-1] != y[j-1] else 0
            D[i, j] = min(D[i-1, j-1]+delt, …
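For reference, the naive baseline described in the question can be sketched as follows. This is not a reconstruction of the truncated code above: it is a minimal, self-contained version of the standard Levenshtein recurrence (unit cost for substitution, insertion and deletion) that keeps only two rows of the table instead of the full matrix. The names `edit_distance` and `window_distances` are made up for this illustration.

```python
def edit_distance(x, y):
    """Levenshtein distance between x and y using two-row dynamic programming."""
    # prev[j] is the distance between the processed prefix of x and y[:j].
    prev = list(range(len(y) + 1))
    for i in range(1, len(x) + 1):
        curr = [i] + [0] * len(y)
        for j in range(1, len(y) + 1):
            delt = 1 if x[i - 1] != y[j - 1] else 0
            curr[j] = min(prev[j - 1] + delt,  # match or substitution
                          prev[j] + 1,         # drop x[i-1]
                          curr[j - 1] + 1)     # insert y[j-1]
        prev = curr
    return prev[len(y)]


def window_distances(q, s, width=None):
    """Naive baseline: distance from q to every window of s of the given width."""
    if width is None:
        width = len(q)  # assumption: windows are as long as the query
    return [edit_distance(q, s[k:k + width])
            for k in range(len(s) - width + 1)]
```

With len(S) = 1000 and len(Q) = 100 this evaluates 901 windows of 100 × 100 cells each, about nine million cell updates, and each window is computed from scratch, which is the redundancy the question is asking how to avoid.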

algorithm edit-distance

5 votes, 1 answer, 1465 views

Stanford Parser out of memory

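I am trying to run the Stanford Parser on Ubuntu from Python code. The text file I am trying to parse is 500 MB, and I have 32 GB of RAM. I am increasing the JVM size, but I don't know whether it is actually increasing, because I get this error every time. Please help.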

***  WARNING!! OUT OF MEMORY! THERE WAS NOT ENOUGH  ***
***  MEMORY TO RUN ALL PARSERS.  EITHER GIVE THE    ***
***  JVM MORE MEMORY, SET THE MAXIMUM SENTENCE      ***
***  LENGTH WITH -maxLength, OR PERHAPS YOU ARE     ***
***  HAPPY TO HAVE THE PARSER FALL BACK TO USING    ***
***  A SIMPLER PARSER FOR VERY LONG SENTENCES.      ***
Sentence has no parse using PCFG grammar (or no PCFG fallback).  Skipping...
Exception in thread "main" edu.stanford.nlp.parser.common.NoSuchParseException
    at edu.stanford.nlp.parser.lexparser.LexicalizedParserQuery.getBestParse(LexicalizedParserQuery.java:398)
    at edu.stanford.nlp.parser.lexparser.LexicalizedParserQuery.getBestParse(LexicalizedParserQuery.java:370) …
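The banner in the log names two remedies: give the JVM more memory, or cap sentence length with -maxLength. A minimal sketch of passing both explicitly from Python is shown below. It assumes the parser is launched as a separate `java` process; the classpath directory, the default heap size and the helper names `build_parser_command` and `run_parser` are hypothetical and need to be adapted to the actual installation. `-Xmx` is the standard JVM maximum-heap option, and `-maxLength` is the option named in the error message.

```python
import subprocess


def build_parser_command(input_path, heap="8g", max_length=100,
                         classpath="stanford-parser-full/*",
                         model="edu/stanford/nlp/models/lexparser/englishPCFG.ser.gz"):
    """Assemble a java command line for the lexicalized parser.

    classpath is a hypothetical install location; heap and max_length
    are illustrative defaults, not recommended values.
    """
    return [
        "java",
        f"-Xmx{heap}",                      # maximum JVM heap size
        "-cp", classpath,                   # java expands the trailing /* itself
        "edu.stanford.nlp.parser.lexparser.LexicalizedParser",
        "-maxLength", str(max_length),      # skip sentences longer than this
        model,
        input_path,
    ]


def run_parser(input_path, **kwargs):
    """Run the parser and raise CalledProcessError if java exits non-zero."""
    cmd = build_parser_command(input_path, **kwargs)
    return subprocess.run(cmd, check=True)
```

Printing the list returned by `build_parser_command` before running it is a simple way to confirm that the heap option really reaches the `java` process. If a wrapper library starts the JVM on its own, the heap setting has to be passed through that wrapper instead.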

python java ubuntu jvm stanford-nlp

3 votes, 1 answer, 3431 views

Tag statistics

algorithm ×1

edit-distance ×1

java ×1

jvm ×1

python ×1

stanford-nlp ×1

ubuntu ×1