Posts by Jes*_*ABI

Is there a way to get the top k values of each row of a numpy array (Python)?

Given a numpy array of the following form:

x = [[4.,3.,2.,1.,8.],[1.2,3.1,0.,9.2,5.5],[0.2,7.0,4.4,0.2,1.3]]

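Is there a way in Python to keep the 3 largest values in each row and set the other values to zero, without an explicit loop? For the example above, the result would be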

x = [[4.,3.,0.,0.,8.],[0.,3.1,0.,9.2,5.5],[0.0,7.0,4.4,0.0,1.3]]

Code for a single row:

import numpy as np
arr = np.array([1.2, 3.1, 0., 9.2, 5.5, 3.2])
# indices of the 3 largest values, largest first
indexes = arr.argsort()[-3:][::-1]
# every index that is not among the top 3
zero_ind = set(range(len(arr))) - set(indexes)
arr[list(zero_ind)] = 0

Output:

array([0. , 0. , 0. , 9.2, 5.5, 3.2])

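The above is my sample code for a one-dimensional numpy array; my real array has many rows. Looping over every row of the numpy array and repeating the same computation would be very expensive. Is there a simpler way?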
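One possible loop-free approach, sketched here under the assumption that the input is a 2-D float array and that 1 <= k <= number of columns, is to combine `np.argpartition` with `np.take_along_axis` and `np.put_along_axis`. The helper name `keep_top_k_per_row` is made up for this illustration.

```python
import numpy as np

def keep_top_k_per_row(x, k):
    """Return a copy of x in which only the k largest values of each row survive."""
    x = np.asarray(x, dtype=float)
    # Column indices of the k largest entries in every row.
    # argpartition does not sort inside the top-k group, which is fine here.
    top_idx = np.argpartition(x, -k, axis=1)[:, -k:]
    out = np.zeros_like(x)
    # Copy the selected values into the zero-filled result, row by row.
    np.put_along_axis(out, top_idx, np.take_along_axis(x, top_idx, axis=1), axis=1)
    return out

x = [[4., 3., 2., 1., 8.], [1.2, 3.1, 0., 9.2, 5.5], [0.2, 7.0, 4.4, 0.2, 1.3]]
print(keep_top_k_per_row(x, 3))
```

For the `x` from the question and k = 3 this gives the rows listed in the expected result. If several values tie at the k-th position, exactly k entries are kept per row and which of the tied ones survive is arbitrary.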

python loops numpy python-2.7 python-3.x

Jes*_*ABI

2019-12-19

Score: 6
Answers: 1
Views: 3716

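Remove short overlapping strings from a list of strings

I have a list of strings: mylist = ["Hanks", "Tom Hanks","Tom","Tom Can"], and I need to remove the shorter strings that are substrings of another string in the list.

For example, in the case above, the output should be: ["Tom Hanks", "Tom Can"].

What I did in Python: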

mylist = ["Hanks", "Tom Hanks", "Tom", "Tom Can"]
newlst = []
for x in mylist:
    # keep x only if it is not a substring of any other string in the list
    noexist = True
    for j in mylist:
        if x == j:
            continue
        noexist = noexist and x not in j
    if noexist:
        newlst.append(x)
print(newlst)

The code works fine. How can I make it more efficient?
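One way to cut down the number of comparisons, offered as a sketch rather than a definitive answer, is to visit the strings from longest to shortest and test each one only against the strings already kept. The helper name `drop_contained_strings` is hypothetical. Unlike the loop above, this version also collapses exact duplicates, which is an assumption about what is wanted.

```python
def drop_contained_strings(strings):
    """Remove every string that is a substring of another string in the list."""
    kept = []
    # Longest first: a string can only be contained in one at least as long.
    for s in sorted(set(strings), key=len, reverse=True):
        # Comparing against kept strings is enough: anything dropped earlier
        # is itself contained in some kept string.
        if not any(s in longer for longer in kept):
            kept.append(s)
    kept_set = set(kept)
    # Restore the original order (dict.fromkeys removes duplicates, keeps order).
    return [s for s in dict.fromkeys(strings) if s in kept_set]

mylist = ["Hanks", "Tom Hanks", "Tom", "Tom Can"]
print(drop_contained_strings(mylist))  # ['Tom Hanks', 'Tom Can']
```

The worst case is still quadratic (for example when no string contains another), but each candidate is compared only with the surviving strings instead of with the whole list.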

python string list

Jes*_*ABI

2020-07-09

Score: 4
Answers: 1
Views: 911

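Replace personal pronouns with the previously mentioned person (noisy coref)

I want to build a noisy solution in which each personal pronoun is replaced by the nearest previously mentioned person.

For example: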

Alex is looking at buying a U.K. startup for $1 billion. He is very confident that this is going to happen. Sussan is also in the same situation. However, she has lost hope.

The output is:

Alex is looking at buying a U.K. startup for $1 billion. Alex is very confident that this is going to happen. Sussan is also in the same situation. However, Sussan has lost hope.

Another example:

Peter is a friend of Gates. But Gates does not …
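The nearest-previous-person rule described in the question can be sketched in plain Python as follows. This is a deliberately noisy illustration, not a real coreference resolver: the set `person_names` is a hypothetical stand-in for the output of an NER step, only single-token names are recognized, and only the subject pronouns "he" and "she" are replaced. The names `PRONOUNS` and `replace_subject_pronouns` are invented for this example.

```python
import re

# Assumption: only subject pronouns are handled in this sketch.
PRONOUNS = {"he", "she"}

def replace_subject_pronouns(text, person_names):
    """Replace 'he'/'she' with the most recently seen name from person_names."""
    last_person = None
    pieces = []
    # Split into alternating word and non-word runs so that joining the
    # pieces reproduces the original spacing and punctuation.
    for piece in re.findall(r"\w+|\W+", text):
        if piece in person_names:
            last_person = piece
            pieces.append(piece)
        elif piece.lower() in PRONOUNS and last_person is not None:
            pieces.append(last_person)
        else:
            pieces.append(piece)
    return "".join(pieces)

text = ("Alex is looking at buying a U.K. startup for $1 billion. "
        "He is very confident that this is going to happen. "
        "Sussan is also in the same situation. However, she has lost hope.")
print(replace_subject_pronouns(text, {"Alex", "Sussan"}))
```

Because the rule ignores gender and grammatical role, it will happily substitute the wrong person whenever another name appears between a pronoun and its true antecedent.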

python nlp python-3.x spacy coreference-resolution

Jes*_*ABI

2020-10-10

Score: 4
Answers: 1
Views: 1425

Extracting noun phrases with Stanza and CoreNLPClient

I am trying to extract noun phrases from sentences using Stanza (with Stanford CoreNLP). This can only be done through the CoreNLPClient module in Stanza.

# Import client module
from stanza.server import CoreNLPClient
# Construct a CoreNLPClient with some basic annotators, a memory allocation of 4GB, and port number 9001
client = CoreNLPClient(annotators=['tokenize','ssplit','pos','lemma','ner', 'parse'], memory='4G', endpoint='http://localhost:9001')

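Here is an example sentence. I am using the tregex function of the client to get all noun phrases. In Python, the tregex function returns a dict of dicts, so I need to process the tregex output before passing it to the Tree.fromstring function in NLTK in order to extract the noun phrases correctly as strings.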

pattern = 'NP'
text = "Albert Einstein was a German-born theoretical physicist. He developed the theory of relativity."
matches = client.tregex(text, pattern)

So I came up with a method, stanza_phrases, which has to loop over the NLTK …
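As a rough illustration of the post-processing step, the sketch below flattens a dict-of-dicts result into plain phrase strings using only the standard library. The shape of `sample` (a "sentences" list whose items map match indices to dicts with a "match" tree string) is an assumption about the tregex output, and the data is hand-written for the example, not real server output. The helper name `noun_phrase_strings` is hypothetical and is not the truncated method from the question.

```python
import re

def noun_phrase_strings(matches):
    """Collect the words of every matched subtree as a space-joined string."""
    phrases = []
    for sentence_matches in matches.get("sentences", []):
        for match in sentence_matches.values():
            # In a bracketed tree, each leaf is the token right before a ')'.
            leaves = re.findall(r"([^\s()]+)\)", match["match"])
            phrases.append(" ".join(leaves))
    return phrases

# Hand-written sample in the assumed output shape (illustrative only).
sample = {
    "sentences": [
        {"0": {"match": "(NP (NNP Albert) (NNP Einstein))"}},
        {"0": {"match": "(NP (PRP He))"},
         "1": {"match": "(NP (DT the) (NN theory))"}},
    ]
}
print(noun_phrase_strings(sample))  # ['Albert Einstein', 'He', 'the theory']
```

With NLTK installed, `Tree.fromstring(s).leaves()` would yield the same words for each tree string; the regular expression is used here only to keep the sketch free of dependencies.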

python nlp stanford-nlp stanford-stanza

Jes*_*ABI

2020-05-19

Score: 3
Answers: 1
Views: 1129