Get all right-branching word pairs from a sentence

bry*_*bee 4 python nlp nltk python-itertools

Suppose I have a string like this:

 'velvet evening purse bags'

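How can I get all of the word pairs? In other words, every 2-word combination in which the first word appears before the second in the string: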

'velvet evening'
'velvet purse'
'velvet bags'
'evening purse'
'evening bags'
'purse bags'

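I know that Python's nltk package can produce bigrams, but those only cover adjacent words, and I am looking for something beyond that. Or do I have to write my own custom function in Python?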

MrG*_*eek 7

You can use itertools.combinations for this:

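from itertools import combinations

from nltk import word_tokenize  # needs the NLTK tokenizer data to be downloaded

s = 'velvet evening purse bags'

# Split the sentence into word tokens
words = word_tokenize(s)

# combinations() keeps input order, so each pair is (earlier word, later word)
pairs = [' '.join(comb) for comb in combinations(words, 2)]

print(pairs)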

Output:

['velvet evening', 'velvet purse', 'velvet bags', 'evening purse', 'evening bags', 'purse bags']
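If you would rather not depend on nltk for tokenizing, the same idea can be sketched with plain str.split. This is only a minimal sketch: the function names word_pairs and adjacent_pairs are made up for illustration, and it assumes that splitting on whitespace is good enough for your input (punctuation will stay attached to the words). The second function shows the adjacent-only pairs that bigrams give you, for comparison.

```python
from itertools import combinations


def word_pairs(sentence):
    """Return every 2-word combination, keeping the sentence's word order."""
    # Assumption: whitespace splitting is an acceptable tokenizer here.
    words = sentence.split()
    return [' '.join(pair) for pair in combinations(words, 2)]


def adjacent_pairs(sentence):
    """Return only neighbouring word pairs (bigrams), for comparison."""
    words = sentence.split()
    return [' '.join(pair) for pair in zip(words, words[1:])]


print(word_pairs('velvet evening purse bags'))
print(adjacent_pairs('velvet evening purse bags'))
```

For a sentence of n words, combinations yields n * (n - 1) / 2 pairs (6 for the four-word example), while the adjacent version yields only n - 1.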