tee*_*tee 7 python arrays numpy vectorization
How can I merge the following two arrays by looking up the values of array B in array A?
Array A:
array([['GG', 'AB', IPv4Network('1.2.3.41/26')],
       ['GG', 'AC', IPv4Network('1.2.3.42/25')],
       ['GG', 'AD', IPv4Network('1.2.3.43/24')],
       ['GG', 'AE', IPv4Network('1.2.3.47/23')],
       ['GG', 'AF', IPv4Network('1.2.3.5/24')]],
      dtype=object)
And array B:
array([['123456', 'A1', IPv4Address('1.2.3.5'), nan],
       ['987654', 'B1', IPv4Address('1.2.3.47'), nan]],
      dtype=object)
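The goal is to create array C by looking up each IPv4Address from array B in array A, comparing them, and appending the second value of the matching row of A to the row of B:
Array C: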
array([['123456', 'A1', IPv4Address('1.2.3.5'), nan, 'AF'],
       ['987654', 'B1', IPv4Address('1.2.3.47'), nan, 'AE']],
      dtype=object)
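The IP addresses are of this type: https://docs.python.org/3/library/ipaddress.html#ipaddress.ip_network
How can I do this?
Note that the merge depends on the IP match, so the resulting array C will have the same number of rows as array B, but each row will have one more value. The suggested duplicate link does not answer the same question.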
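This should do what you asked for (at least the output is exactly what you wanted). I made a few small assumptions to deal with your #dummydata, but that shouldn't matter much.
Code: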
import numpy as np
import ipaddress as ip

array_A = np.array([['GG', 'AB', ip.ip_network('192.168.0.0/32')],
                    ['GG', 'AC', ip.ip_network('192.168.0.0/31')],
                    ['GG', 'AD', ip.ip_network('192.168.0.0/30')],
                    ['GG', 'AE', ip.ip_network('192.168.0.0/29')],
                    ['GG', 'AF', ip.ip_network('192.168.0.0/28')]],
                   dtype=object)
array_B = np.array([['123456', 'A1', ip.ip_network('192.168.0.0/28'), np.nan],
                    ['987654', 'B1', ip.ip_network('192.168.0.0/29'), np.nan]],
                   dtype=object)

def merge_by_ip(A, B):
    # initialize an empty array with len(B) rows and 5 columns for the values you want to save in it
    C = np.empty([len(B), 5], dtype=object)
    for n in range(len(B)):
        for a in A:
            # check the condition: the network in a equals the network in b
            if a[2] == B[n][2]:
                # add the entry of b with the second value of a to the new array C
                C[n] = np.append(B[n], a[1])
    return C

print(merge_by_ip(array_A, array_B))
Output:
[['123456' 'A1' IPv4Network('192.168.0.0/28') nan 'AF']
['987654' 'B1' IPv4Network('192.168.0.0/29') nan 'AE']]
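The code above compares networks for equality because the dummy data stores networks in both arrays. The question's array B holds IPv4Address values instead, and entries such as '1.2.3.47/23' in array A have host bits set, which `ip_network` rejects by default. A minimal sketch for that case, assuming (this is not confirmed by the question) that array A can be built from `ip_interface` objects, and using a hypothetical helper name `merge_by_host_address`:

```python
import ipaddress as ip
import numpy as np

# Assumption: the third column of A is stored as interfaces (address plus
# prefix length), because values like '1.2.3.47/23' carry host bits.
array_A = np.array([['GG', 'AB', ip.ip_interface('1.2.3.41/26')],
                    ['GG', 'AC', ip.ip_interface('1.2.3.42/25')],
                    ['GG', 'AD', ip.ip_interface('1.2.3.43/24')],
                    ['GG', 'AE', ip.ip_interface('1.2.3.47/23')],
                    ['GG', 'AF', ip.ip_interface('1.2.3.5/24')]],
                   dtype=object)
array_B = np.array([['123456', 'A1', ip.ip_address('1.2.3.5'), np.nan],
                    ['987654', 'B1', ip.ip_address('1.2.3.47'), np.nan]],
                   dtype=object)

def merge_by_host_address(A, B):
    # One extra column for the label taken from A; cells start out as None.
    C = np.empty((len(B), B.shape[1] + 1), dtype=object)
    for n, b in enumerate(B):
        C[n, :-1] = b
        for a in A:
            # Compare the host address of the interface with the address in B.
            if a[2].ip == b[2]:
                C[n, -1] = a[1]
                break  # keep the first match only
    return C

print(merge_by_host_address(array_A, array_B))
```

Rows of B without a matching address keep None in the last column. If the intent is containment rather than an exact address match, the condition could instead be `b[2] in a[2].network`, but with the question's data several rows of A would then match each address.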
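Note:
This solution has O(m * n) complexity, which is unnecessary; there are many out-of-the-box (Pandas) and custom (e.g. using a dict) ways to perform the merge with lower complexity.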
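As one possible illustration of the dict approach mentioned in the note, the sketch below builds a lookup table from array A once and then resolves each row of array B with a single dictionary access, for roughly O(m + n) work overall. The function name `merge_by_ip_dict` is hypothetical, the data is the same dummy data as in the answer, and the approach relies on the fact that `ipaddress` network objects are hashable.

```python
import ipaddress as ip
import numpy as np

array_A = np.array([['GG', 'AB', ip.ip_network('192.168.0.0/32')],
                    ['GG', 'AC', ip.ip_network('192.168.0.0/31')],
                    ['GG', 'AD', ip.ip_network('192.168.0.0/30')],
                    ['GG', 'AE', ip.ip_network('192.168.0.0/29')],
                    ['GG', 'AF', ip.ip_network('192.168.0.0/28')]],
                   dtype=object)
array_B = np.array([['123456', 'A1', ip.ip_network('192.168.0.0/28'), np.nan],
                    ['987654', 'B1', ip.ip_network('192.168.0.0/29'), np.nan]],
                   dtype=object)

def merge_by_ip_dict(A, B):
    # Map each network in A to its label (second column); built once.
    label_by_network = {row[2]: row[1] for row in A}
    # Look up every row of B; None marks rows without a match in A.
    labels = [label_by_network.get(row[2]) for row in B]
    # Append the labels to B as a new last column.
    return np.column_stack([B, np.array(labels, dtype=object)])

print(merge_by_ip_dict(array_A, array_B))
```

If the same network appears more than once in A, the dict keeps the label of the last occurrence, which differs from stopping at the first match; whether that matters depends on the data.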