Given a numpy array of the following form:
x = [[4.,3.,2.,1.,8.],[1.2,3.1,0.,9.2,5.5],[0.2,7.0,4.4,0.2,1.3]]
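Is there a way in Python to keep the 3 largest values in each row and set the others to zero, without an explicit loop? The result for the example above would be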
x = [[4.,3.,0.,0.,8.],[0.,3.1,0.,9.2,5.5],[0.0,7.0,4.4,0.0,1.3]]
Code for a single example:
import numpy as np
arr = np.array([1.2,3.1,0.,9.2,5.5,3.2])
indexes=arr.argsort()[-3:][::-1]
a = list(range(6))
A=set(indexes); B=set(a)
zero_ind=(B.difference(A))
arr[list(zero_ind)]=0
Output:
array([0. , 0. , 0. , 9.2, 5.5, 3.2])
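The above is my sample code for a 1-D numpy array, and it already takes quite a few lines. Looping over every row of a 2-D numpy array and repeating the same computation would be very expensive. Is there a simpler way?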
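One possible loop-free approach, as a minimal sketch: use `np.argpartition` to find the column indices of the 3 largest values in every row, then copy only those values into a zero-filled array. The helper name `keep_top_k_per_row` is made up for illustration, and the sketch assumes a 2-D float array with `k` no larger than the number of columns. When values tie at the cut-off, which of the tied entries survives is not specified.

```python
import numpy as np

def keep_top_k_per_row(x, k):
    """Return a copy of x in which all but the k largest entries of each row are zero."""
    x = np.asarray(x, dtype=float)
    # argpartition puts the indices of the k largest values of each row
    # into the last k columns (in no particular order).
    top_idx = np.argpartition(x, -k, axis=1)[:, -k:]
    out = np.zeros_like(x)
    # Copy only the selected values into the zero-filled result.
    np.put_along_axis(out, top_idx, np.take_along_axis(x, top_idx, axis=1), axis=1)
    return out

x = np.array([[4., 3., 2., 1., 8.],
              [1.2, 3.1, 0., 9.2, 5.5],
              [0.2, 7.0, 4.4, 0.2, 1.3]])
result = keep_top_k_per_row(x, 3)
print(result)
```

`argpartition` only does a partial sort, so it is usually cheaper than a full `argsort` along each row when `k` is small compared with the row length.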
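I have a list of strings: mylist = ["Hanks", "Tom Hanks","Tom","Tom Can"], and I need to remove the shorter strings that are substrings of another string in the list.
For example, in the case above the output should be: ["Tom Hanks", "Tom Can"].
What I did in Python: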
mylist = ["Hanks", "Tom Hanks","Tom","Tom Can"]
newlst = []
for x in mylist:
    noexist = True
    for j in mylist:
        if x==j:continue
        noexist = noexist and not(x in j)
    if (noexist==True):
        newlst.append(x)
print(newlst)
The code works fine. How can I make it more efficient?
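One way to cut down the number of comparisons, as a rough sketch rather than a definitive answer: process the strings from longest to shortest and compare each one only against the strings already kept. A string can only be contained in a strictly longer one, and anything contained in a dropped string is also contained in the kept string that covers it. The function name `drop_contained_strings` is hypothetical. Unlike the loop above, this sketch also collapses exact duplicates, and the worst case is still quadratic when few strings are dropped.

```python
def drop_contained_strings(strings):
    """Keep only the strings that are not substrings of another string in the list."""
    # Drop exact duplicates while keeping the first-seen order.
    unique = list(dict.fromkeys(strings))
    kept = []
    # Longest first: each string is compared only with kept strings,
    # all of which are at least as long as it.
    for s in sorted(unique, key=len, reverse=True):
        if not any(s in longer for longer in kept):
            kept.append(s)
    kept_set = set(kept)
    # Report the survivors in their original order.
    return [s for s in unique if s in kept_set]

mylist = ["Hanks", "Tom Hanks", "Tom", "Tom Can"]
result = drop_contained_strings(mylist)
print(result)  # ['Tom Hanks', 'Tom Can']
```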
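I want to do a noisy (approximate) resolution such that, given a personal pronoun, the pronoun is replaced by the previous (nearest) person.
For example: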
Alex is looking at buying a U.K. startup for $1 billion. He is very confident that this is going to happen. Sussan is also in the same situation. However, she has lost hope.
The output is:
Alex is looking at buying a U.K. startup for $1 billion. Alex is very confident that this is going to happen. Sussan is also in the same situation. However, Sussan has lost hope.
Another example:
Peter is a friend of Gates. But Gates does not …
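The "replace each pronoun with the nearest preceding person" idea can be sketched in plain Python. This is only an illustration under strong assumptions: the set of person names is passed in by hand (in practice it would presumably come from a named-entity recognizer), names are single tokens, and only "he" and "she" are treated as pronouns. The function name `replace_pronouns` and its parameters are hypothetical.

```python
import re

def replace_pronouns(text, person_names, pronouns=("he", "she")):
    """Replace each listed pronoun with the most recently seen person name."""
    last_person = None
    out = []
    # Split into runs of word characters and runs of everything else,
    # so joining the pieces reproduces the original spacing and punctuation.
    for token in re.findall(r"\w+|\W+", text):
        if token in person_names:
            last_person = token          # remember the nearest person so far
            out.append(token)
        elif token.lower() in pronouns and last_person is not None:
            out.append(last_person)      # substitute the pronoun
        else:
            out.append(token)
    return "".join(out)

text = ("Alex is looking at buying a U.K. startup for $1 billion. "
        "He is very confident that this is going to happen. "
        "Sussan is also in the same situation. However, she has lost hope.")
resolved = replace_pronouns(text, {"Alex", "Sussan"})
print(resolved)
```

Being purely positional, this will pick the wrong referent whenever the nearest name is not the one the pronoun refers to, which is the "noisy" part of the approach.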
I am trying to extract noun phrases from sentences using Stanza (with Stanford CoreNLP). This can only be done through the CoreNLPClient module in Stanza.
# Import client module
from stanza.server import CoreNLPClient
# Construct a CoreNLPClient with some basic annotators, a memory allocation of 4GB, and port number 9001
client = CoreNLPClient(annotators=['tokenize','ssplit','pos','lemma','ner', 'parse'], memory='4G', endpoint='http://localhost:9001')
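Here is an example with a sentence, where I use the client's tregex function to get all the noun phrases. In Python, the tregex function returns a dict of dicts. So I need to process the output of tregex before passing it to the Tree.fromstring function in NLTK in order to extract the noun phrases correctly as strings.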
pattern = 'NP'
text = "Albert Einstein was a German-born theoretical physicist. He developed the theory of relativity."
matches = client.tregex(text, pattern)
So I came up with a method, stanza_phrases, which has to loop through the NLTK …