Swapping leaves of a Python scipy dendrogram/linkage

Question

dme*_*meu 5 python hierarchical-clustering matplotlib dendrogram scipy

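I generated a dendrogram for my data set, but I am not happy with how the splits are ordered at some levels. I am therefore looking for a way to swap the two branches (or leaves) of a single split.

If we look at the code and dendrogram at the bottom, two labels, 11 and 25, are split off from the rest of the big cluster. I am not happy with this and would like the branch with 11 and 25 to be the right branch of the split, with the rest of the cluster as the left branch. The distances shown would stay the same, so the data would not change; this is purely aesthetic.

Can this be done? How? I am specifically looking for manual intervention, because an optimal leaf ordering algorithm may not work well in this case.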

import numpy as np
import matplotlib.pyplot as plt  # needed for the plt.* calls below

# random data set with two clusters
np.random.seed(65)  # for repeatability of this tutorial
a = np.random.multivariate_normal([10, 0], [[3, 1], [1, 4]], size=[10,])
b = np.random.multivariate_normal([0, 20], [[3, 1], [1, 4]], size=[20,])
X = np.concatenate((a, b),)

# create linkage and plot dendrogram    
from scipy.cluster.hierarchy import dendrogram, linkage
Z = linkage(X, 'ward')

plt.figure(figsize=(15, 5))
plt.title('Hierarchical Clustering Dendrogram')
plt.xlabel('sample index')
plt.ylabel('distance')
dendrogram(
    Z,
    leaf_rotation=90.,  # rotates the x axis labels
    leaf_font_size=12.,  # font size for the x axis labels
)
plt.show()

Answer 1

小智 1

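I ran into a similar problem and solved it by using the optimal_ordering option of linkage. I have attached the code and the result for your case. It may not be exactly what you want, but to me it looks like a big improvement.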

import numpy as np
import matplotlib.pyplot as plt

# random data set with two clusters
np.random.seed(65)  # for repeatability of this tutorial
a = np.random.multivariate_normal([10, 0], [[3, 1], [1, 4]], size=[10,])
b = np.random.multivariate_normal([0, 20], [[3, 1], [1, 4]], size=[20,])
X = np.concatenate((a, b),)

# create linkage and plot dendrogram    
from scipy.cluster.hierarchy import dendrogram, linkage
Z = linkage(X, 'ward', optimal_ordering=True)

plt.figure(figsize=(15, 5))
plt.title('Hierarchical Clustering Dendrogram')
plt.xlabel('sample index')
plt.ylabel('distance')
dendrogram(
    Z,
    leaf_rotation=90.,  # rotates the x axis labels
    leaf_font_size=12.,  # font size for the x axis labels
    distance_sort=False,
    show_leaf_counts=True,
    count_sort=False
)
plt.show()

[Figure: result of using optimal_ordering in linkage]
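For the manual intervention the question asks about, one possible approach is to swap the two child columns of a single row of the linkage matrix. This is only a minimal sketch: it assumes that dendrogram draws the cluster in Z[i, 0] on the left and the one in Z[i, 1] on the right when count_sort and distance_sort are left at their defaults (False), which is how current scipy versions appear to behave. The helper swap_branches is a hypothetical name, not part of scipy, and the row used here (the top-level split) is just an example. The row that separates 11 and 25 would have to be looked up in Z.

```python
import numpy as np
from scipy.cluster.hierarchy import dendrogram, linkage


def swap_branches(Z, row):
    """Hypothetical helper: return a copy of Z with the two children
    of merge `row` swapped. Distances and counts are left untouched."""
    Z_swapped = Z.copy()
    Z_swapped[row, [0, 1]] = Z[row, [1, 0]]
    return Z_swapped


# same data as in the question
np.random.seed(65)
a = np.random.multivariate_normal([10, 0], [[3, 1], [1, 4]], size=[10,])
b = np.random.multivariate_normal([0, 20], [[3, 1], [1, 4]], size=[20,])
X = np.concatenate((a, b),)
Z = linkage(X, 'ward')

# example only: swap the two branches of the top-level split (last row of Z)
Z_swapped = swap_branches(Z, -1)

# no_plot=True only computes the layout; drop it to draw the figure
leaves_before = dendrogram(Z, no_plot=True)['leaves']
leaves_after = dendrogram(Z_swapped, no_plot=True)['leaves']
print(leaves_before)
print(leaves_after)
```

Row i of Z describes the merge that creates cluster n + i (with n the number of samples), so the row to swap is the one whose two children are the branches you want to exchange. Passing Z_swapped to dendrogram with the plotting arguments from the code above should then draw the same heights with those two branches mirrored.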