Concordance for a phrase using NLTK in Python

Question

Is it possible to get a concordance for a phrase in NLTK?

import nltk
from nltk.corpus import PlaintextCorpusReader

corpus_loc = "c://temp//text//"
files = r".*\.txt"  # raw string so the regex backslash is not treated as an escape
read_corpus = PlaintextCorpusReader(corpus_loc, files)
corpus = nltk.Text(read_corpus.words())
test = nltk.TextCollection(corpus_loc)

corpus.concordance("claim")


For example, the above returns:

on okay okay okay i can give you the claim number and my information and
 decide on the shop okay okay so the claim number is xxxx - xx - xxxx got


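Now, if I try corpus.concordance("claim number"), it doesn't work. I do have code that does this using the .partition() method plus some further coding along the same lines, but I'd like to know whether the same can be done with concordance.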

Answer 1

b30*_*000 6

According to this issue, searching for multiple words with the concordance() function is not possible.
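As a workaround, a phrase concordance can be built by hand over the token list. The following is only a minimal sketch, not part of NLTK: `phrase_concordance` and its `context` parameter are hypothetical names made up for illustration, and the sample tokens are taken from the output shown in the question. It matches whole tokens case-insensitively and prints a few tokens of context on each side of every hit.

```python
def phrase_concordance(tokens, phrase, context=5):
    """Return one context line per occurrence of `phrase` in `tokens`.

    Hypothetical helper (not an NLTK function). `phrase` is a
    whitespace-separated string; matching is case-insensitive and is done
    on whole tokens rather than on raw characters.
    """
    target = [w.lower() for w in phrase.split()]
    n = len(target)
    lines = []
    if n == 0:
        return lines  # an empty phrase matches nothing
    lowered = [t.lower() for t in tokens]
    for i in range(len(tokens) - n + 1):
        if lowered[i:i + n] == target:
            left = " ".join(tokens[max(0, i - context):i])
            match = " ".join(tokens[i:i + n])
            right = " ".join(tokens[i + n:i + n + context])
            lines.append(f"{left} [{match}] {right}".strip())
    return lines


# Sample tokens based on the concordance output in the question.
tokens = ("okay i can give you the claim number and my information "
          "okay so the claim number is xxxx").split()

for line in phrase_concordance(tokens, "claim number", context=3):
    print(line)
```

With the corpus from the question, the token list could presumably be obtained with `list(read_corpus.words())` and passed in place of the sample `tokens`.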

Answer 2

ale*_*xis 5

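If you read the discussion on the issue that @b3000 dug up, you'll find that, oddly enough, multi-word concordance is actually available, but only in the graphical concordance tool, which you can launch like this: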

>>> from nltk.app import concordance
>>> concordance()

