Roc*_*Owl 5 python networkx python-3.x pytorch pytorch-geometric
Question: How can node order/labels be preserved when converting a graph from networkx to PyTorch Geometric?
Code: (run in Google Colab)
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import networkx as nx
import torch
from torch.nn import Linear
import torch.nn.functional as F
torch.__version__
# install pytorch geometric
!pip install torch-scatter torch-sparse torch-cluster torch-spline-conv torch-geometric -f https://data.pyg.org/whl/torch-1.10.0+cpu.html
from torch_geometric.nn import GCNConv
from torch_geometric.utils.convert import to_networkx, from_networkx
# Make the networkx graph
G = nx.Graph()
# Add some cars
G.add_nodes_from([
('Ford', {'y': 0, 'Name': 'Ford'}),
('Lexus', {'y': 1, 'Name': 'Lexus'}),
('Peugot', {'y': 2, 'Name': 'Peugot'}),
('Mitsubushi', {'y': 3, 'Name': 'Mitsubishi'}),
('Mazda', {'y': 4, 'Name': 'Mazda'}),
])
# Relabel the nodes
remapping = {x[0]: i for i, x in enumerate(G.nodes(data = True))}
G = nx.relabel_nodes(G, remapping, copy=False)
# Add some edges --> A = [(0, 1, 0, 1, 1), (1, 0, 1, 1, 0), (0, 1, 0, 0, 1), (1, 1, 0, 0, 0), (1, 0, 1, 0, 0)] as the adjacency matrix
G.add_edges_from([
(0, 1), (0, 3), (0, 4),
(1, 2), (1, 3),
(2, 1), (2, 4),
(3, 0), (3, 1),
(4, 0), (4, 2)
])
# Convert the graph into PyTorch geometric
pyg_graph = from_networkx(G)
pyg_graph.edge_index
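When I print the edge index on the last line of the code, I get a different answer on every run. Most importantly, I want to consistently get the same (correct) answer, one that preserves each node number from networkx: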
tensor([[0, 0, 1, 1, 1, 2, 2, 3, 3, 4, 4, 4],
[4, 2, 4, 2, 3, 0, 1, 1, 4, 0, 1, 3]])
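In this edge index tensor, the first row lists the source nodes and the second row lists the target nodes.
For the node ids to be preserved, node 0 should appear three times in the first (source) list, not just twice.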
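As a quick sanity check (not part of the original post), the expected number of appearances of each node in the source row can be derived from the edge list alone, because every undirected edge is stored in both directions. The sketch below uses only the standard library; the names `expected`, `observed`, and `mismatched` are illustrative.

```python
from collections import Counter

# Edge list exactly as passed to G.add_edges_from in the question.
edges = [
    (0, 1), (0, 3), (0, 4),
    (1, 2), (1, 3),
    (2, 1), (2, 4),
    (3, 0), (3, 1),
    (4, 0), (4, 2),
]

# An undirected graph keeps each pair once, whichever direction it was given in.
undirected = {frozenset(e) for e in edges}

# In edge_index each undirected edge appears in both directions, so a node
# shows up in the source row once per incident edge (its degree).
expected = Counter(node for edge in undirected for node in edge)

# Source row of the edge_index tensor printed in the question.
observed_row = [0, 0, 1, 1, 1, 2, 2, 3, 3, 4, 4, 4]
observed = Counter(observed_row)

# Nodes whose count in the printed tensor differs from their degree.
mismatched = sorted(n for n in expected if expected[n] != observed[n])

print(dict(sorted(expected.items())))
print(dict(sorted(observed.items())))
print(mismatched)
```

The counts disagree for nodes 0 and 4, which is consistent with the node ids having been permuted during conversion rather than edges being lost.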
Is there a way to force PyTorch Geometric to copy the node ids?
Thanks
[Edit] One possible workaround I found is the following code, which builds the edge index and edge weight tensors for PyTorch Geometric directly:
# Create a dictionary of the mappings from company --> node id
mapping_dict = {x: i for i, x in enumerate(list(G.nodes()))}
# Get the number of nodes
num_nodes = len(mapping_dict)
# Now create a source, target, and edge list for PyTorch geometric graph
edge_source_list = []
edge_target_list = []
edge_weight_list = []
# iterate through all the edges
for e in G.edges():
    # first element of tuple is appended to source edge list
    edge_source_list.append(mapping_dict[e[0]])
    # last element of tuple is appended to target edge list
    edge_target_list.append(mapping_dict[e[1]])
    # add the edge weight to the edge weight list
    edge_weight_list.append(1)
# now create full edge lists for pytorch geometric - undirected edges need to be defined in both directions
full_source_list = edge_source_list + edge_target_list # full source list
full_target_list = edge_target_list + edge_source_list # full target list
full_weight_list = edge_weight_list + edge_weight_list # full edge weight list
print(len(edge_source_list), len(edge_target_list), len(full_source_list))
# now convert these to torch tensors
edge_index_tensor = torch.LongTensor( np.concatenate([ [np.array(full_source_list)], [np.array(full_target_list)]] ))
edge_weight_tensor = torch.FloatTensor(np.array(full_weight_list))
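It looks like this was already resolved in the comments (the solution proposed by @Sparky05 is to use copy=True, which is the default for nx.relabel_nodes), but here is an explanation of why the node order changes.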
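A minimal sketch of the copy=True fix, assuming networkx is installed. It rebuilds only the nodes from the question (edges and the PyTorch Geometric conversion are left out), and the variable names `H`, `node_order`, and `labels` are illustrative.

```python
import networkx as nx

# Same nodes as in the question, keyed by name, with 'y' as the intended id.
G = nx.Graph()
G.add_nodes_from([
    ("Ford", {"y": 0}),
    ("Lexus", {"y": 1}),
    ("Peugot", {"y": 2}),
    ("Mitsubushi", {"y": 3}),
    ("Mazda", {"y": 4}),
])

# Map each name to its position in the graph's node iteration order.
remapping = {name: i for i, name in enumerate(G.nodes())}

# copy=True (the default) builds a new graph by walking G's nodes in order,
# instead of removing and re-adding nodes in place.
H = nx.relabel_nodes(G, remapping, copy=True)

node_order = list(H.nodes())
labels = [H.nodes[n]["y"] for n in node_order]
print(node_order)
print(labels)
```

With the copy, the integer label of each node should match its 'y' attribute, so converting `H` afterwards starts from a graph whose node order is the one that was intended.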
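When copy=False is passed, nx.relabel_nodes re-adds the nodes to the graph in the order in which they appear in the set of keys of the remapping dictionary. The relevant lines in the code are here: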
def _relabel_inplace(G, mapping):
    old_labels = set(mapping.keys())
    new_labels = set(mapping.values())
    if len(old_labels & new_labels) > 0:
        # skip codes for labels sets that overlap
        ...
    else:
        # non-overlapping label sets
        nodes = old_labels
    # skip lines
    for old in nodes:  # this is now in the set order
        ...
Using a set changes the order of the nodes, so to preserve the order, the non-overlapping label set branch should instead read:
    else:
        # non-overlapping label sets
        nodes = mapping.keys()
A related PR has been submitted here.
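The difference between the two iteration orders can be seen without networkx. The sketch below is only an illustration of the explanation above: a dict iterates in insertion order, while a set of strings iterates in an order that depends on string hashes, which Python randomizes per process unless PYTHONHASHSEED is fixed. That is why the result can change from run to run.

```python
# Node names in the order they were added to the graph in the question.
names = ["Ford", "Lexus", "Peugot", "Mitsubushi", "Mazda"]
remapping = {name: i for i, name in enumerate(names)}

# dict keys: insertion order, guaranteed by the language since Python 3.7.
keys_in_dict_order = list(remapping.keys())

# set of the same keys: same elements, but no guaranteed order for strings.
keys_in_set_order = list(set(remapping.keys()))

print(keys_in_dict_order)
print(keys_in_set_order)  # may differ between interpreter runs
```

Running the script several times in fresh interpreter sessions may show the second line changing while the first stays fixed.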