Posts by Jes*_*ABI

Is there a way to get the top k values of each row of a numpy array (Python)?

Given a numpy array of the following form:

x = [[4.,3.,2.,1.,8.],[1.2,3.1,0.,9.2,5.5],[0.2,7.0,4.4,0.2,1.3]]

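Is there a way in Python to keep the top 3 values in each row and set the others to zero, without an explicit loop? For the example above, the result would be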

x = [[4.,3.,0.,0.,8.],[0.,3.1,0.,9.2,5.5],[0.0,7.0,4.4,0.0,1.3]]

Code for a single row:

import numpy as np

arr = np.array([1.2, 3.1, 0., 9.2, 5.5, 3.2])
# Indices of the 3 largest values, largest first
indexes = arr.argsort()[-3:][::-1]
# Every index that is not among the top 3 gets zeroed out
zero_ind = set(range(len(arr))) - set(indexes)
arr[list(zero_ind)] = 0

Output:

array([0. , 0. , 0. , 9.2, 5.5, 3.2])

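The above is my sample code for a one-dimensional numpy array; the real array has many rows. Looping over every row of the numpy array and repeating the same computation would be very expensive. Is there a simpler way?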
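One possible loop-free approach, offered as a minimal sketch rather than the accepted answer: np.argpartition can find the indices of the k largest entries of every row at once, and fancy indexing can then copy only those entries into an array of zeros. The function name keep_top_k_per_row is made up for illustration, and the sketch assumes a 2-D float array with k no larger than the number of columns. When a row has ties at the cut-off, which of the tied entries survives is arbitrary.

```python
import numpy as np

def keep_top_k_per_row(x, k):
    """Return a copy of x in which only the k largest values of each row are kept."""
    x = np.asarray(x, dtype=float)
    # Column indices of the k largest entries in each row (in no particular order)
    top_idx = np.argpartition(x, -k, axis=1)[:, -k:]
    # Row indices shaped (n_rows, 1) so they broadcast against top_idx
    rows = np.arange(x.shape[0])[:, None]
    out = np.zeros_like(x)
    out[rows, top_idx] = x[rows, top_idx]
    return out

x = np.array([[4., 3., 2., 1., 8.],
              [1.2, 3.1, 0., 9.2, 5.5],
              [0.2, 7.0, 4.4, 0.2, 1.3]])
result = keep_top_k_per_row(x, 3)
print(result)
```

argpartition only partially sorts each row, so it tends to be cheaper than a full argsort when k is small relative to the row length.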

python loops numpy python-2.7 python-3.x

6
Votes
1
Answers
3716
Views

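Remove short overlapping strings from a list of strings

I have a list of strings: mylist = ["Hanks", "Tom Hanks","Tom","Tom Can"], and I need to remove the shorter strings that are substrings of another string in the list.

For example, in the case above, the output should be: ["Tom Hanks", "Tom Can"].

What I did in Python: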

mylist = ["Hanks", "Tom Hanks", "Tom", "Tom Can"]
newlst = []
for x in mylist:
    # Keep x only if it is not a substring of any other string in the list
    noexist = True
    for j in mylist:
        if x == j:
            continue
        noexist = noexist and x not in j
    if noexist:
        newlst.append(x)
print(newlst)

The code works fine. How can I make it more efficient?
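One way to cut down the comparisons, as a sketch and not necessarily the accepted answer: a string can only be contained in a string that is at least as long, so the candidates can be visited from longest to shortest and each one compared only against the strings already kept. The helper name remove_substrings is hypothetical. Unlike the loop above, this version collapses exact duplicates into a single entry, which is an assumption about the desired behavior. The worst case is still quadratic when nothing can be removed.

```python
def remove_substrings(strings):
    """Drop every string that is contained in another string of the list."""
    kept = []
    # Longest first: anything that could contain s has already been considered
    for s in sorted(set(strings), key=len, reverse=True):
        if not any(s in longer for longer in kept):
            kept.append(s)
    kept_set = set(kept)
    # Restore the original order, with duplicates collapsed
    return [s for s in dict.fromkeys(strings) if s in kept_set]

mylist = ["Hanks", "Tom Hanks", "Tom", "Tom Can"]
newlst = remove_substrings(mylist)
print(newlst)
```

Comparing only against kept strings is safe because a string contained in a discarded string is also contained in whatever kept string caused that discard.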

python string list

4
Votes
1
Answers
911
Views

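Replace personal pronouns with the previously mentioned person (noisy coref)

I want to build a noisy solution in which a given personal pronoun is replaced by the previous (nearest) person mentioned.

For example: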

Alex is looking at buying a U.K. startup for $1 billion. He is very confident that this is going to happen. Sussan is also in the same situation. However, she has lost hope.

The output is:

Alex is looking at buying a U.K. startup for $1 billion. Alex is very confident that this is going to happen. Sussan is also in the same situation. However, Sussan has lost hope.

Another example:

Peter is a friend of Gates. But Gates does not …
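The "nearest previous person" rule described above can be sketched in plain Python. This is only an illustration under loud assumptions: the person_names argument stands in for a real named-entity step (for example the PERSON entities of an NLP library), the function name replace_pronouns is made up, and only the subject pronouns "he" and "she" are handled. Gender is not checked, which is part of what makes the approach noisy.

```python
import re

# Assumption: only subject pronouns are replaced in this sketch
PRONOUNS = {"he", "she"}

def replace_pronouns(text, person_names):
    """Replace he/she with the most recently seen name from person_names."""
    last_person = None
    pieces = []
    # Alternate runs of word and non-word characters so the text rejoins unchanged
    for token in re.findall(r"\w+|\W+", text):
        if token in person_names:
            last_person = token
            pieces.append(token)
        elif token.lower() in PRONOUNS and last_person is not None:
            pieces.append(last_person)
        else:
            pieces.append(token)
    return "".join(pieces)

text = ("Alex is looking at buying a U.K. startup for $1 billion. "
        "He is very confident that this is going to happen. "
        "Sussan is also in the same situation. However, she has lost hope.")
resolved = replace_pronouns(text, {"Alex", "Sussan"})
print(resolved)
```

Object and possessive pronouns ("him", "her", "his") would need extra handling, since the replacement form differs from the bare name.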

python nlp python-3.x spacy coreference-resolution

4
Votes
1
Answers
1425
Views

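Extracting noun phrases using Stanza and CoreNLPClient

I am trying to extract noun phrases from sentences using Stanza (with Stanford CoreNLP). This can only be done through the CoreNLPClient module in Stanza.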

# Import client module
from stanza.server import CoreNLPClient
# Construct a CoreNLPClient with some basic annotators, a memory allocation of 4GB, and port number 9001
client = CoreNLPClient(annotators=['tokenize','ssplit','pos','lemma','ner', 'parse'], memory='4G', endpoint='http://localhost:9001')

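Here is an example sentence. I am using the tregex function in the client to get all the noun phrases. The tregex function returns a dict of dicts in Python, so I need to process its output before passing it to the Tree.fromstring function in NLTK in order to extract the noun phrases correctly as strings.

pattern = 'NP'
text = "Albert Einstein was a German-born theoretical physicist. He developed the theory of relativity."
matches = client.tregex(text, pattern)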

So I came up with a method, stanza_phrases, which has to loop through NLTK …

python nlp stanford-nlp stanford-stanza

3
Votes
1
Answers
1129
Views