python r hierarchical-clustering matplotlib seaborn
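I am working with tumor image expression data for a cohort of patients; for each patient I have a list of extracted tumor image features. I clustered both the patients and the features with hierarchical agglomerative clustering (HAC) and plotted the result with Seaborn's .clustermap. This is what I have so far: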

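Now, each patient has a set of categorical information associated with them: cancer subtype (A, B, C, D), T stage (1, 2, 3, 4), N stage (0, 1, 2, 3), M stage (0, 1), and the HAC cluster they belong to (1, 2, 3, ...). In addition, each image feature also belongs to one of several classes. I would like to display this categorical information along each axis (I know about {row, col}_colors). Essentially I am trying to recreate the plot below, and I would like to know whether this is possible in Python with matplotlib/seaborn.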
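For reference, the inputs that {row, col}_colors expects can be prepared along the following lines. This is only a minimal sketch under assumptions: the data are random, the cut into 3 flat clusters is arbitrary, and `to_colors` is a hypothetical helper written for this illustration, not part of seaborn or matplotlib.

```python
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
from matplotlib.colors import to_hex
from scipy.cluster.hierarchy import fcluster, linkage

rng = np.random.default_rng(0)
n_patients, n_features = 12, 30
zfeatures = rng.standard_normal((n_patients, n_features))  # [patients, features]
subtype = rng.choice(list('ABCD'), size=n_patients)        # dummy subtypes
t_stage = rng.integers(1, 5, size=n_patients)              # dummy T stages 1..4

# linkage() clusters the rows of its input, so this is the patient linkage.
patient_links = linkage(zfeatures, method='ward', metric='euclidean')
# Cut the patient tree into at most 3 flat clusters (3 is an arbitrary choice).
hac_cluster = fcluster(patient_links, t=3, criterion='maxclust')


def to_colors(labels, cmap_name='tab10'):
    """Hypothetical helper: map each distinct label to one palette color."""
    labels = np.asarray(labels).tolist()
    palette = plt.get_cmap(cmap_name).colors
    levels = sorted(set(labels))
    lut = {level: to_hex(palette[i % len(palette)])
           for i, level in enumerate(levels)}
    return [lut[label] for label in labels]


# One column of colors per annotation, one row per patient.
col_colors = pd.DataFrame({
    'Subtype': to_colors(subtype),
    'T stage': to_colors(t_stage),
    'HAC cluster': to_colors(hac_cluster),
})
```

With patients on the columns of the heatmap, a frame like this could be passed as `col_colors=col_colors` to `sns.clustermap(zfeatures.T, ...)`; each column of the frame becomes one colored strip.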

Also, what do you think the author of this figure used to generate it back in 2014? R?
My code, with some random data:
import numpy as np
import scipy.cluster.hierarchy
import seaborn as sns

# Random dummy data
np_zfeatures = np.random.random((420, 1218))  # example matrix of z-scored features [patients, features]
patient_T_stage = np.random.randint(low=1, high=5, size=(420,))
patient_N_stage = np.random.randint(low=0, high=4, size=(420,))
patient_M_stage = np.random.randint(low=0, high=2, size=(420,))
patient_O_stage = np.random.randint(low=0, high=5, size=(420,))
patient_subtype = np.random.randint(low=0, high=5, size=(420,))
feature_class = np.random.randint(low=0, high=5, size=(1218,))  # There's 5 categories of features (first order, shape, textural, wavelet, LoG)

# HAC clustering (compute linkage matrices)
# linkage() clusters the rows of its input, so the [patients, features] matrix
# gives the patient linkage and its transpose gives the feature linkage.
method = 'ward'
patient_links = scipy.cluster.hierarchy.linkage(np_zfeatures, method=method, metric='euclidean')
feature_links = scipy.cluster.hierarchy.linkage(np_zfeatures.transpose(), method=method, metric='euclidean')

# Plot the re-ordered cluster map (rows = features, columns = patients)
cbar_kws = {
    'orientation': 'vertical',
    'label': 'feature Z-score',
    'extend': 'both',
    'extendrect': True,
}
arguments = {
    'row_cluster': True,
    'col_cluster': True,
    'row_linkage': feature_links,
    'col_linkage': patient_links,
}
cmap = 'Spectral_r'
cg = sns.clustermap(np_zfeatures.transpose(), **arguments, cmap=cmap, vmin=-2, vmax=2,
                    cbar_pos=(0.155, 0.644, 0.04, 0.15), cbar_kws=cbar_kws)
cg.ax_row_dendrogram.set_visible(False)
cg.ax_col_dendrogram.set_visible(True)
ax = cg.ax_heatmap
ax.set_xlabel('Patients', fontsize=16)
ax.set_ylabel('Radiomics Features', fontsize=16)
# Move the colorbar ticks and label to the left-hand side
cg.ax_cbar.yaxis.set_ticks_position('left')
cg.ax_cbar.yaxis.set_label_position('left')
cg.savefig(f'hierarchical cluster map - method: {method}')
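You will have to build the figure by hand; I don't think it is worth trying to hack seaborn's ClusterGrid to get the result you need. You can generate the dendrograms with scipy and draw the heatmap with imshow().
I can't spend the time to write an exact replica, but here is a quick mockup. Hopefully there are no errors in it; it is only meant to demonstrate that this is doable.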
import matplotlib.gridspec
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
import scipy.cluster.hierarchy

# Random dummy data
np.random.seed(1234)
Npatients = 10
Nfeatures = 20
np_zfeatures = np.random.random((Npatients, Nfeatures))  # example matrix of z-scored features [patients, features]
patient_T_stage = np.random.randint(low=1, high=5, size=(Npatients,))
patient_N_stage = np.random.randint(low=0, high=4, size=(Npatients,))
patient_M_stage = np.random.randint(low=0, high=2, size=(Npatients,))
patient_O_stage = np.random.randint(low=0, high=5, size=(Npatients,))
patient_subtype = np.random.randint(low=0, high=5, size=(Npatients,))
feature_class = np.random.randint(low=0, high=5, size=(Nfeatures,))  # There's 5 categories of features (first order, shape, textural, wavelet, LoG)
N_rows_patients = 5  # number of patient annotation strips below the heatmap
N_col_features = 1   # not used in this mockup

# HAC clustering (compute linkage matrices)
# linkage() clusters the rows of its input, so the [patients, features] matrix
# gives the patient linkage and its transpose gives the feature linkage.
method = 'ward'
patient_links = scipy.cluster.hierarchy.linkage(np_zfeatures, method=method, metric='euclidean')
feature_links = scipy.cluster.hierarchy.linkage(np_zfeatures.transpose(), method=method, metric='euclidean')

fig = plt.figure()
# Outer grid: dendrogram + heatmap on top, annotation strips at the bottom
gs0 = matplotlib.gridspec.GridSpec(2, 1, figure=fig,
                                   height_ratios=[8, 2], hspace=0.05)
gs1 = matplotlib.gridspec.GridSpecFromSubplotSpec(2, 1, subplot_spec=gs0[0],
                                                  height_ratios=[2, 8],
                                                  hspace=0)
ax_heatmap = fig.add_subplot(gs1[1])
ax_col_dendrogram = fig.add_subplot(gs1[0], sharex=ax_heatmap)

# Draw the patient dendrogram; only the leaf order is needed for the features
col_dendrogram = scipy.cluster.hierarchy.dendrogram(patient_links, ax=ax_col_dendrogram)
row_dendrogram = scipy.cluster.hierarchy.dendrogram(feature_links, no_plot=True)
ax_col_dendrogram.set_axis_off()

xind = col_dendrogram['leaves']  # patient order along x
yind = row_dendrogram['leaves']  # feature order along y

# Stretch the heatmap over the dendrogram's x-range so the columns line up with the leaves
xmin, xmax = ax_col_dendrogram.get_xlim()
data = pd.DataFrame(np_zfeatures)
ax_heatmap.imshow(data.iloc[xind, yind].T, aspect='auto', extent=[xmin, xmax, 0, 1], cmap='Spectral_r', vmin=-2, vmax=2)
ax_heatmap.yaxis.tick_right()
plt.setp(ax_heatmap.get_xticklabels(), visible=False)

# One thin strip per patient annotation, reordered like the heatmap columns
gs2 = matplotlib.gridspec.GridSpecFromSubplotSpec(N_rows_patients, 1, subplot_spec=gs0[1])
for i, (values, label) in enumerate(zip([patient_T_stage, patient_N_stage, patient_M_stage, patient_O_stage, patient_subtype],
                                        ['T-stage', 'N-stage', 'M-stage', 'Overall stage', 'Subtype'])):
    ax = fig.add_subplot(gs2[i], sharex=ax_heatmap)
    ax.imshow(np.vstack([values[xind], values[xind]]), aspect='auto', extent=[xmin, xmax, 0, 1], cmap='Blues')
    ax.set_yticks([])
    ax.set_ylabel(label, rotation=0, ha='right', va='center')
    # Axes.is_last_row() was removed in Matplotlib 3.6; ask the SubplotSpec instead
    if not ax.get_subplotspec().is_last_row():
        plt.setp(ax.get_xticklabels(), visible=False)

plt.show()
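The mockup colors every annotation strip with the continuous Blues colormap, which suggests an ordering that a category such as subtype does not have. A possible refinement, sketched here under assumptions, is to give each category level its own color and add a legend. The `t_stage` array, the `tab10` palette and the figure size are illustrative choices and not part of the original answer.

```python
import numpy as np
import matplotlib.pyplot as plt
from matplotlib.colors import BoundaryNorm, ListedColormap
from matplotlib.patches import Patch

rng = np.random.default_rng(1234)
t_stage = rng.integers(low=1, high=5, size=10)  # dummy T stages in 1..4

# Sorted distinct levels, and each value's position among them (0..k-1)
levels = np.unique(t_stage)
codes = np.searchsorted(levels, t_stage)

# One color per level; bin edges at -0.5, 0.5, ... so every code gets its own bin
colors = plt.get_cmap('tab10').colors[:len(levels)]
cmap = ListedColormap(colors)
norm = BoundaryNorm(np.arange(-0.5, len(levels)), cmap.N)

fig, ax = plt.subplots(figsize=(6, 1))
ax.imshow(codes[np.newaxis, :], aspect='auto', cmap=cmap, norm=norm)
ax.set_xticks([])
ax.set_yticks([])
ax.set_ylabel('T-stage', rotation=0, ha='right', va='center')

# Legend built from proxy patches, one per level
handles = [Patch(facecolor=color, label=str(level))
           for color, level in zip(colors, levels)]
ax.legend(handles=handles, ncol=len(levels), loc='upper center',
          bbox_to_anchor=(0.5, -0.2), frameon=False)
```

In the mockup, the same `cmap` and `norm` pair would replace `cmap='Blues'` in the strip's `imshow` call, with the values reordered by `xind` first. Call `plt.show()` or `fig.savefig(...)` to view the result.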