Fuzzy string comparison in Python, confused about which library to use

I want to do fuzzy string comparison, but I am confused about which library to use.

Option 1:

import Levenshtein
Levenshtein.ratio('hello world', 'hello')

Result: 0.625

Option 2:

import difflib
difflib.SequenceMatcher(None, 'hello world', 'hello').ratio()

Result: 0.625

In this example, both give the same answer. But I prefer to use __CODE__. Any suggestions from experts would be appreciated. Thanks.
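The matching 0.625 can be checked by hand. The sketch below is not taken from either library's source. It assumes that `Levenshtein.ratio` normalizes an insert/delete-only edit distance (a substitution counts as 2) by the combined length of both strings, and it relies on the documented `difflib` formula `2 * M / T`. The helper names `indel_distance` and `indel_ratio` are made up for illustration.

```python
import difflib


def indel_distance(a: str, b: str) -> int:
    """Edit distance using only insertions and deletions (hypothetical helper)."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            if ca == cb:
                # Matching characters cost nothing.
                curr.append(prev[j - 1])
            else:
                # Otherwise delete from a or insert from b.
                curr.append(1 + min(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


def indel_ratio(a: str, b: str) -> float:
    """Assumed normalization: (len(a) + len(b) - distance) / (len(a) + len(b))."""
    total = len(a) + len(b)
    if total == 0:
        return 1.0
    return (total - indel_distance(a, b)) / total


a, b = "hello world", "hello"
difflib_ratio = difflib.SequenceMatcher(None, a, b).ratio()
print(indel_ratio(a, b), difflib_ratio)  # 0.625 0.625
```

Here the distance is 6 (delete " world") and the combined length is 16, so the ratio is 10/16. `difflib` finds 5 matching characters, giving 2 * 5 / 16. The two scores need not agree on other inputs, since `SequenceMatcher` builds its matches from longest contiguous blocks rather than from a true edit distance.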

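I am doing clinical message normalization (spell checking), in which I check each given word against a 900,000-word medical dictionary. I am more concerned about time complexity and performance.

Do you think both perform alike in this case?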
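One way to answer the performance question for a specific dictionary is to measure it. The sketch below uses only the standard library, so it times `difflib` alone. The word list is a tiny hypothetical stand-in for the 900,000-word dictionary, and `best_match` and the 0.8 cutoff are illustrative choices, not anything from the question.

```python
import difflib
import timeit

# Hypothetical stand-in for the 900,000-word medical dictionary.
dictionary = ["aspirin", "ibuprofen", "insulin", "penicillin", "warfarin"] * 200


def best_match(word, words, cutoff=0.8):
    """Return the closest dictionary entry, or None if nothing clears the cutoff."""
    matches = difflib.get_close_matches(word, words, n=1, cutoff=cutoff)
    return matches[0] if matches else None


print(best_match("asprin", dictionary))  # aspirin

# Time a full linear scan of the dictionary for one misspelled word.
runs = 5
elapsed = timeit.timeit(lambda: best_match("asprin", dictionary), number=runs)
print(f"{elapsed / runs:.4f} s per lookup over {len(dictionary)} entries")
```

Either library still compares the query against every entry, so the per-word cost grows linearly with dictionary size. Swapping the scoring call inside `best_match` for `Levenshtein.ratio` and rerunning the same timing would show the difference on real data.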

python string-matching difflib levenshtein-distance

Mag*_*gie

2019 06-25

119
votes

2
answers

60k
views

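Matching strings in one numpy array to another numpy array

Hello, I am working in Python 3. I have been facing this problem for a while now and I can't seem to figure it out.

I have 2 numpy arrays containing strings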

import numpy as np

array_one = np.array(['alice', 'in', 'a', 'wonder', 'land', 'alice in', 'in a', 'a wonder', 'wonder land', 'alice in a', 'in a wonder', 'a wonder land', 'alice in a wonder', 'in a wonder land', 'alice in a wonder land'])

If you notice, array_one is actually an array containing the 1-grams, 2-grams, 3-grams, 4-grams and 5-grams of the sentence alice in a wonder land.
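For reference, an array like this can be generated rather than typed out. This is a minimal sketch in plain Python; `word_ngrams` is a hypothetical helper name, and the result could be wrapped in `np.array(...)` to get the same array as above.

```python
def word_ngrams(sentence: str, max_n: int) -> list[str]:
    """Return all word n-grams for n = 1..max_n, shortest first."""
    words = sentence.split()
    return [
        " ".join(words[i:i + n])
        for n in range(1, max_n + 1)
        for i in range(len(words) - n + 1)
    ]


ngrams = word_ngrams("alice in a wonder land", 5)
print(len(ngrams))  # 15
print(ngrams[:6])   # ['alice', 'in', 'a', 'wonder', 'land', 'alice in']
```

The output order (all 1-grams, then all 2-grams, and so on) matches the order of array_one.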

I have deliberately treated wonderland as two words, wonder and land.

Now I have another numpy array that contains some locations and names.

array_two = np.array(['new york', 'las vegas', 'wonderland', 'florida']) …
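The excerpt is cut off here, so the intended matching rule is not stated. Purely as an assumption, the sketch below guesses that the goal is to find n-grams in array_one that equal an entry of array_two once spaces are ignored, which would pair 'wonder land' with 'wonderland'. It uses plain lists to stay self-contained; `strip_spaces` is a hypothetical helper, and iterating over the numpy arrays would work the same way.

```python
array_one = [
    'alice', 'in', 'a', 'wonder', 'land',
    'alice in', 'in a', 'a wonder', 'wonder land',
    'alice in a', 'in a wonder', 'a wonder land',
    'alice in a wonder', 'in a wonder land',
    'alice in a wonder land',
]
array_two = ['new york', 'las vegas', 'wonderland', 'florida']


def strip_spaces(text: str) -> str:
    """Assumed normalization: compare strings with all spaces removed."""
    return text.replace(" ", "")


# Map each normalized place name back to its original spelling.
targets = {strip_spaces(name): name for name in array_two}

# Keep every n-gram whose normalized form is a known place name.
matches = [
    (gram, targets[strip_spaces(gram)])
    for gram in array_one
    if strip_spaces(gram) in targets
]
print(matches)  # [('wonder land', 'wonderland')]
```

The dictionary lookup keeps this at one pass over each array. If near matches are wanted instead of exact ones, a fuzzy scorer would have to replace the lookup.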

python numpy

iam*_*rot

2018 02-25

5
votes

1
answer

2773
views
