thy*_*yde 5 python csv numpy adjacency-matrix
My data is in the following format:
eventid mnbr
20 1
26 1
12 2
14 2
15 3
14 3
10 3
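eventid is an event that a member attended. The data is laid out as a panel, so each member can attend multiple events and multiple members can attend the same event. My goal is to create an adjacency matrix that shows: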
mnbr 1 2 3
1 1 0 0
2 0 1 1
3 0 1 1
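There is a 1 whenever two members attended the same event. I managed to read the columns of the csv file into 2 separate 1D numpy arrays. From here, however, I am not sure how to proceed. How do I create the matrix using column 2, and how do I then use column 1 to fill in the values? I know I have not posted any code and I do not expect a full solution, but I would greatly appreciate any ideas on how to approach the problem efficiently. I have roughly 3 million observations, so creating too many external variables would be a problem. Thanks in advance. I received a notification that my question may be a duplicate, but my problem is parsing the data rather than creating the adjacency matrix.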
Here is a solution. It does not give you the requested adjacency matrix directly, but it gives you what you need to build it yourself.
# Assume every line of the input is stored as a tuple (eventid, mnbr).
observations = [(20, 1), (26, 1), (12, 2), (14, 2), (15, 3), (14, 3), (10, 3)]

# Build an event link dictionary, i.e. something that links every event to all its mnbrs.
eventLinks = {}
for (eventid, mnbr) in observations:
    # If this event has never been encountered, create a new entry in eventLinks.
    if eventid not in eventLinks:
        eventLinks[eventid] = []
    eventLinks[eventid].append(mnbr)

# Collect the mnbrs.
mnbrs = set(mnbr for (eventid, mnbr) in observations)

# Build a member link dictionary. It links each mnbr to the mnbrs connected to it.
mnbrLinks = {mnbr: set() for mnbr in mnbrs}
for mnbrList in eventLinks.values():
    # For each mnbr, add all the mnbrs involved in the same event.
    for mnbr in mnbrList:
        mnbrLinks[mnbr] = mnbrLinks[mnbr].union(set(mnbrList))

print(mnbrLinks)
Running this code gives the following result:
{1: {1}, 2: {2, 3}, 3: {2, 3}}
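This is a dictionary in which every mnbr has an associated set of adjacent mnbrs. It is in fact an adjacency list, i.e. a compressed adjacency matrix. You can expand it and build the matrix you asked for, using the dictionary keys and values as row and column indices.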
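As a rough illustration of that expansion, here is a minimal sketch. The helper name `adjacency_list_to_matrix` is hypothetical and not part of the original answer. It assumes every neighbour also appears as a key of the dictionary (true for the dictionary built above) and maps the sorted labels to row and column positions, so the member numbers do not have to run from 1 to n.

```python
# Sketch only: expand an adjacency list (dict of sets) into a dense 0/1 matrix.
# `adjacency_list_to_matrix` is a hypothetical helper, not from the original answer.

def adjacency_list_to_matrix(adjacency):
    """Return (labels, matrix); matrix[i][j] is 1 when labels[i] and labels[j] are adjacent."""
    labels = sorted(adjacency)
    # Map each label to its row/column position, so labels need not be 1..n.
    index = {label: position for position, label in enumerate(labels)}
    size = len(labels)
    matrix = [[0] * size for _ in range(size)]
    for member, neighbours in adjacency.items():
        for neighbour in neighbours:
            # Assumes every neighbour is also a key of `adjacency`.
            matrix[index[member]][index[neighbour]] = 1
    return labels, matrix


mnbrLinks = {1: {1}, 2: {2, 3}, 3: {2, 3}}
labels, matrix = adjacency_list_to_matrix(mnbrLinks)
print(labels)
for label, row in zip(labels, matrix):
    print(label, row)
```

A dense matrix needs n*n cells, so with many distinct members it may not fit in memory; in that case staying with the adjacency list is probably the safer choice.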
Hope this helps. Arthur.
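Edit: I provided an approach based on an adjacency list and left building the adjacency matrix to you. If your data is sparse, though, you should consider actually using this data structure. See http://en.wikipedia.org/wiki/Adjacency_list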
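For a large file, one way to stay with the adjacency list is to stream the rows and never build a matrix at all. The sketch below uses only the standard library. The function names `read_observations` and `build_member_links` are hypothetical, and it assumes a comma-delimited file with a single header row, which may not match the real file (the sample in the question looks whitespace-separated).

```python
import csv
import io
from collections import defaultdict


def read_observations(handle):
    """Yield (eventid, mnbr) pairs from a CSV handle.

    Assumption: comma-delimited, one header row, two integer columns.
    """
    reader = csv.reader(handle)
    next(reader)  # skip the header row
    for eventid, mnbr in reader:
        yield int(eventid), int(mnbr)


def build_member_links(rows):
    """Build the adjacency list {mnbr: set of mnbrs sharing at least one event}."""
    event_members = defaultdict(set)
    for eventid, mnbr in rows:
        event_members[eventid].add(mnbr)
    member_links = defaultdict(set)
    for members in event_members.values():
        for mnbr in members:
            member_links[mnbr].update(members)
    return dict(member_links)


# In-memory stand-in for the real file, using the data from the question.
sample = io.StringIO("eventid,mnbr\n20,1\n26,1\n12,2\n14,2\n15,3\n14,3\n10,3\n")
links = build_member_links(read_observations(sample))

# Adjacency queries work directly on the sets.
print(3 in links[2])
print(2 in links[1])
```

With a real file, `open(path, newline="")` would take the place of the `io.StringIO` object. Note that an event with very many attendees makes the second loop quadratic in that event's attendance.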
Edit 2: added code to convert the adjacencyList to a small, smart adjacencyMatrix
adjacencyList = {1: {1}, 2: {2, 3}, 3: {2, 3}}


class AdjacencyMatrix():

    def __init__(self, adjacencyList, label=""):
        """
        Instantiation method of the class.
        Create an adjacency matrix from an adjacencyList.
        Graph vertices are assumed to be labeled with numbers from 1 to n.
        """
        self.matrix = []
        self.label = label

        # Create an empty (all-zero) n x n matrix.
        for i in range(len(adjacencyList.keys())):
            self.matrix.append([0] * (len(adjacencyList.keys())))

        # Set a 1 for every (key, value) pair; labels are 1-based, indices 0-based.
        for key in adjacencyList.keys():
            for value in adjacencyList[key]:
                self[key-1][value-1] = 1

    def __str__(self):
        # return self.__repr__() is another possibility that just prints the list of lists;
        # see the Python docs about the difference between __str__ and __repr__.

        # Label the first line.
        string = self.label + "\t"
        for i in range(len(self.matrix)):
            string += str(i+1) + "\t"
        string += "\n"

        # For each matrix line:
        for row in range(len(self.matrix)):
            string += str(row+1) + "\t"
            for column in range(len(self.matrix)):
                string += str(self[row][column]) + "\t"
            string += "\n"
        return string

    def __repr__(self):
        return str(self.matrix)

    def __getitem__(self, index):
        """ Allow access to a matrix element using the matrix[index][index] syntax """
        return self.matrix.__getitem__(index)

    def __setitem__(self, index, item):
        """ Allow replacing a matrix row using the matrix[index] = row syntax """
        return self.matrix.__setitem__(index, item)

    def areAdjacent(self, i, j):
        return self[i-1][j-1] == 1


m = AdjacencyMatrix(adjacencyList, label="mbr")
print(m)
print("m.areAdjacent(1,2) :", m.areAdjacent(1, 2))
print("m.areAdjacent(2,3) :", m.areAdjacent(2, 3))
This code gives the following result:
mbr 1 2 3
1 1 0 0
2 0 1 1
3 0 1 1
m.areAdjacent(1,2) : False
m.areAdjacent(2,3) : True