Joh*_*erz 0 python dictionary list wordnet
So I have this text (wordnet) file made up of numbers and words, for example like this:
"09807754 18 n 03 aristocrat 0 blue_blood 0 patrician"
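I want to read the words that follow into a dictionary (or list) named after the first number. The layout never changes: it is always an 8-digit key, followed by a two-digit number, a single letter, and another two-digit number. That last two-digit number (03) says how many words (three in this case) are associated with the 8-digit key.
My idea was to look at position 14 in the string and use that number to run a loop that picks out all the words associated with the key.
So I think it would look something like this: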
with open('nouns.txt','r') as f:
    for line in f:
        words = range(14,15)
        numOfWords = int(words)
        while i =< numOfWords
            #here is where the problem arises,
            #i want to search for words after the spaces 3 (numOfWords) times
            #and put them into a dictionary(or list) associated with the key
            range(0,7) = {word(i+1), word(i+2)}
Technically, I'm looking for whichever of these makes more sense:
09807754 = { 'word1':aristocrat, 'word2':blue_blood , 'word3':patrician }
or
09807754 = ['aristocrat', 'blue_blood', 'patrician']
Obviously this won't run, but if anyone could give me any pointers it would be much appreciated.
>>> L = "09807754 18 n 03 aristocrat 0 blue_blood 0 patrician".split()
>>> L[0], L[4::2]
('09807754', ['aristocrat', 'blue_blood', 'patrician'])
>>> D = {}
>>> D.update({L[0]: L[4::2]})
>>> D
{'09807754': ['aristocrat', 'blue_blood', 'patrician']}
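For the extra line mentioned in the comments, some additional logic is needed: use the count in the fourth field to limit the slice, so the trailing fields are not picked up as words.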
>>> L = "09827177 18 n 03 aristocrat 0 blue_blood 0 patrician 0 013 @ 09646208 n 0000".split()
>>> D.update({L[0]: L[4:4 + 2 * int(L[3]):2]})
>>> D
{'09807754': ['aristocrat', 'blue_blood', 'patrician'], '09827177': ['aristocrat', 'blue_blood', 'patrician']}
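Putting the slicing above into a loop over the whole file might look like the sketch below. The names parse_line, load_synsets and count_base are hypothetical, introduced only for illustration. The count is parsed as decimal by default, matching the int(L[3]) call above; WordNet's own database file documentation describes that field as hexadecimal, so count_base=16 may be the safer choice for real data files, but that is an assumption about the input, since the question describes it simply as a two-digit number.

```python
def parse_line(line, count_base=10):
    """Split one record into (key, words).

    Assumes the layout from the question: key, two-digit number,
    letter, word count, then word/number pairs.
    """
    fields = line.split()
    key = fields[0]
    # Fourth field says how many words belong to this key.
    # count_base is a hypothetical knob: 10 matches int(L[3]) above,
    # 16 would treat the field as hexadecimal.
    count = int(fields[3], count_base)
    # Words sit at every second position starting at index 4.
    words = fields[4:4 + 2 * count:2]
    return key, words


def load_synsets(lines, count_base=10):
    """Build {key: [words]} from any iterable of record lines."""
    synsets = {}
    for line in lines:
        if not line.strip():
            continue  # skip blank lines
        key, words = parse_line(line, count_base)
        synsets[key] = words
    return synsets


sample = [
    "09807754 18 n 03 aristocrat 0 blue_blood 0 patrician",
    "09827177 18 n 03 aristocrat 0 blue_blood 0 patrician 0 013 @ 09646208 n 0000",
]
result = load_synsets(sample)
print(result)
```

Because load_synsets accepts any iterable of lines, an open file object can be passed directly, for example load_synsets(f) inside with open('nouns.txt') as f:, so the file is never read into memory all at once.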