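I am building word-level representations of sentences. The words in each sentence are looked up in the file "vectors.txt" to get their embedding vectors. Once I have the vector for every word in a sentence, I average those word vectors. Here is my code: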
import nltk
import numpy as np
from nltk import FreqDist
from nltk.corpus import brown
news = brown.words(categories='news')
news_sents = brown.sents(categories='news')
fdist = FreqDist(w.lower() for w in news)
vocabulary = [word for word, _ in fdist.most_common(10)]
num_sents = len(news_sents)
def averageEmbeddings(sentenceTokens, embeddingLookupTable):
    listOfEmb = []
    for token in sentenceTokens:
        embedding = embeddingLookupTable[token]
        listOfEmb.append(embedding)
    return sum(np.asarray(listOfEmb)) / float(len(listOfEmb))

embeddingVectors = {}
with open("D:\\Embedding\\vectors.txt") as file:
    for line in file:
        (key, *val) = line.split()
        embeddingVectors[key] = val
for i in range(num_sents):
    features = {} …

I am using Anaconda with gdsCAD and get an error even though all packages were installed correctly, as described here: http://pythonhosted.org/gdsCAD/
TypeError: ufunc 'add' did not contain a loop with signature matching types dtype('S32') dtype('S32') dtype('S32')
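For context on the message above: `dtype('S32')` denotes a fixed-width byte-string type, so NumPy was asked to add strings rather than numbers. The following is a minimal sketch, not taken from gdsCAD and not a diagnosis of this particular traceback, of how string-typed arrays arise and how an explicit conversion makes arithmetic possible again. The sample values are hypothetical.

```python
import numpy as np

# Hypothetical data: numbers that were read as text (for example via
# line.split()) and never converted to a numeric type.
raw = np.asarray(["0.5", "1.5", "2.0"])

# NumPy stores these as fixed-width strings: kind 'U' for str on Python 3,
# kind 'S' for byte strings (as in the dtype('S32') of the error message).
print(raw.dtype.kind)

# An explicit conversion yields a float array that supports sum and mean.
values = raw.astype(float)
total = values.sum()
mean = values.mean()
print(values.dtype, total, mean)
```

Checking `some_array.dtype` just before the failing operation is usually the quickest way to see whether string data has slipped into a numeric computation.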
My imports look like this (in the end I imported everything):
import numpy as np
from gdsCAD import *
import matplotlib.pyplot as plt
My example code looks like this:
something = core.Elements()
box = shapes.Box((5, 5), (1, 5), 0.5)
core.default_layer = 1
core.default_colors = 2
something.add(box)
something.show()
My error message looks like this:
---------------------------------------------------------------------------
TypeError Traceback (most recent call last)
<ipython-input-5-2f90b960c1c1> in <module>()
31 puffer_wafer = shapes.Circle((0.,0.), puffer_wafer_radius, puffer_line_thickness)
32 bp.add(puffer_wafer)
---> 33 bp.show()
34 wafer = shapes.Circle((0.,0.), wafer_radius, wafer_line_thickness)
35 bp.add(wafer)
C:\Users\rpilz\AppData\Local\Continuum\Anaconda2\lib\site-packages\gdscad-0.4.5-py2.7.egg\gdsCAD\core.pyc in _show(self)
80 …