I'm just curious about how gensim's dictionary is implemented. I have the following code:
from gensim import corpora

def build_dictionary(documents):
    dictionary = corpora.Dictionary(documents)
    dictionary.save('/tmp/deerwester.dict')  # store the dictionary
    return dictionary
I looked at the file deerwester.dict, and it looks like this:
8002 6367 656e 7369 6d2e 636f 7270 6f72
612e 6469 6374 696f 6e61 7279 0a44 6963
7469 6f6e 6172 790a 7101 2981 7102 7d71
0328 5508 6e75 6d5f 646f 6373 7104 4b09
5508 ...
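As a hedged aside, the bytes in the dump above look consistent with a Python pickle stream (protocol 2) rather than a plain-text format. The sketch below only decodes the bytes quoted in the dump and leaves out the elided tail; the names `HEX_DUMP` and `raw` are made up for illustration.

```python
# The first four lines of the dump, copied verbatim (the "5508 ..." tail is left out).
HEX_DUMP = """
8002 6367 656e 7369 6d2e 636f 7270 6f72
612e 6469 6374 696f 6e61 7279 0a44 6963
7469 6f6e 6172 790a 7101 2981 7102 7d71
0328 5508 6e75 6d5f 646f 6373 7104 4b09
"""

# Drop all whitespace and turn the hex digits back into raw bytes.
raw = bytes.fromhex("".join(HEX_DUMP.split()))

# 0x80 is the pickle PROTO opcode; the byte after it is the protocol number.
has_pickle_header = raw[0] == 0x80
protocol = raw[1]

# The next opcode is 'c' (GLOBAL), followed by "module\nclassname\n".
module_name, class_name = raw[3:].split(b"\n")[:2]

print(has_pickle_header, protocol)
print(module_name.decode("ascii"), class_name.decode("ascii"))
print(b"num_docs" in raw)  # an attribute name stored as a short string
```

If that reading is right, the file is a serialized `Dictionary` object, which would explain why it is unreadable in a text editor while loading it still yields a normal `token2id` mapping.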
However, the following code
my_dict = dictionary.load('/tmp/deerwester.dict')
print my_dict.token2id #view dictionary
produces this:
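{'minors': 30, 'generation': 22, 'testing': 16, 'iv': 29, 'engineering': 15, 'computer': 2, 'relation': 20, 'human': 3, 'measurement': 18, 'unordered': 25, 'binary': 21, 'abc': …

Please take a look at the following code snippet. I'm getting a NullReferenceException at "this.directories.Add(new directory(s));". The recursion seems to work until it "unwinds", at which point the "new directory" appears to be null. I'm not sure why it behaves this way; I thought there might be special rules because the recursion happens inside a constructor. Please help.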
using System.Collections;
using System.IO;

namespace AnalyzeDir
{
    class directory
    {
        public string[] files;
        public ArrayList directories;
        public string mypath;
        public string myname;

        public directory(string mp)
        {
            mypath = mp;
            myname = mypath.Substring(mypath.LastIndexOf("\\"));
            files = Directory.GetFiles(mypath);
            fillDirectoriesRescursive();
        }

        public void fillDirectoriesRescursive()
        {
            string[] dirpaths = Directory.GetDirectories(mypath);
            if (dirpaths != null && (dirpaths.Length > 0))
            {
                foreach (string s in dirpaths)
                {
                    // The NullReferenceException is reported on this line.
                    this.directories.Add(new directory(s));
                }
            }
        }
    }
}
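One possible reading of the exception: in the code shown, the `directories` field is declared but never assigned, so it is still null when `Add` is first called, and recursion in a constructor is not itself the problem. The sketch below is a rough Python analogue of the same structure, not the original C#; the class name `DirectoryNode` and the temporary test tree are assumptions made for illustration. It creates the child list before the recursive fill starts. The corresponding C# change would presumably be assigning `directories = new ArrayList();` in the constructor before calling `fillDirectoriesRescursive()`.

```python
import os
import tempfile


class DirectoryNode:
    """Hypothetical Python counterpart of the `directory` class above."""

    def __init__(self, path):
        self.mypath = path
        self.myname = os.path.basename(path)
        entries = sorted(os.listdir(path))
        self.files = [e for e in entries if os.path.isfile(os.path.join(path, e))]
        # The container exists before any recursion tries to append to it.
        self.directories = []
        self.fill_directories_recursive()

    def fill_directories_recursive(self):
        for entry in sorted(os.listdir(self.mypath)):
            child = os.path.join(self.mypath, entry)
            if os.path.isdir(child):
                # Each child builds its own subtree inside its constructor.
                self.directories.append(DirectoryNode(child))


# Build a small throwaway tree: root/a/f.txt and root/a/b/
with tempfile.TemporaryDirectory() as root:
    os.makedirs(os.path.join(root, "a", "b"))
    open(os.path.join(root, "a", "f.txt"), "w").close()
    tree = DirectoryNode(root)

print([d.myname for d in tree.directories])
```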