标签: dendrogram

python 如何从数据帧制作树状图

我试图找到使用 python 中的 PANDAS 包创建的树状图。下面显示了示例数据。

import numpy as np
from pandas import *
import matplotlib.pyplot as plt
from hcluster import pdist, linkage, dendrogram
from numpy.random import rand

Index= ['aaa','bbb','ccc','ddd','eee']
Cols = ['A', 'B', 'C','D']
df = DataFrame(abs(np.random.randn(5, 4)), index= Index, columns=Cols)


>>> df
            A         B         C         D
aaa  0.987415  0.192240  0.709559  0.317106
bbb  0.856932  0.252441  1.183127  0.712855
ccc  1.687198  0.462673  1.046469  0.159287
ddd  0.977152  2.657582  0.491975  0.027280
eee  0.120464  0.945034  0.142658  0.537024
>>> 

X = df.T.values #Transpose values 
Y = …
Run Code Online (Sandbox Code Playgroud)

python dendrogram multidimensional-array pandas

6
推荐指数
1
解决办法
7215
查看次数

R 如何删除树状图上的标签?

我怎样才能删除这个情节的所有标签?或者,甚至更好,我怎样才能使它可读?

我用这个命令创建了它:

plot(hclust(distance), main="Dissimilarity = 1 - Correlation", xlab= NA, sub=NA)
Run Code Online (Sandbox Code Playgroud)

我读了很多遍,实际上xlabsub应该删除标签,但它对我不起作用!

我的情节是这样的:

在此处输入图片说明

plot r labels dendrogram

6
推荐指数
2
解决办法
5413
查看次数

按组进行树状图着色

我使用seaborn clustermap基于spearman的相关矩阵创建了一个热图,如下所示:我想绘制树状图。我希望树状图看起来像这样: dendrogram but on the heatmap

我创建了一个颜色字典,如下所示,但出现错误:

def assign_tree_colour(name,val_dict,coding_names_df):
ret = None
if val_dict.get(name, '') == 'Group 1':
    ret = "(0,0.9,0.4)"   #green
elif val_dict.get(name, '') == 'Group 2':
    ret = "(0.6,0.1,0)"   #red
elif val_dict.get(name, '') == 'Group 3':
    ret = "(0.3,0.8,1)"   #light blue
elif val_dict.get(name, '') == 'Group 4':
    ret = "(0.4,0.1,1)"   #purple
elif val_dict.get(name, '') == 'Group 5':
    ret = "(1,0.9,0.1)"   #yellow
elif val_dict.get(name, '') == 'Group 6':
    ret = "(0,0,0)"   #black
else:
    ret = "(0,0,0)" …
Run Code Online (Sandbox Code Playgroud)

python hierarchical-clustering dendrogram seaborn

6
推荐指数
1
解决办法
2624
查看次数

将hcluster生成的ndarray转换为Newick字符串,以便与ete2包一起使用

我有一个通过运行创建的向量列表:

import hcluster
import numpy as np
from ete2 import Tree

vecs = [np.array(i) for i in document_list] 
Run Code Online (Sandbox Code Playgroud)

其中document_list是我正在分析的Web文档的集合.然后我执行分层聚类:

Z = hcluster.linkage(vecs, metric='cosine') 
Run Code Online (Sandbox Code Playgroud)

这会生成一个ndarray,例如:

[[ 12.          19.           0.           1.        ]
[ 15.          21.           0.           3.        ]
[ 18.          22.           0.           4.        ]
[  3.          16.           0.           7.        ]
[  8.          23.           0.           6.        ]
[  5.          27.           0.           6.        ]
[  1.          28.           0.           7.        ]
[  0.          21.           0.           2.        ]
[  5.          29.           0.18350472   2.        ]
[  2. …
Run Code Online (Sandbox Code Playgroud)

python numpy dendrogram hcluster etetoolkit

5
推荐指数
1
解决办法
2072
查看次数

从物种列表中制作简单的系统发育树状图(树)

我想为海洋生物学课程制作一个简单的系统发育树作为教育的例子.我有一个具有分类等级的物种清单:

    Group <- c("Benthos","Benthos","Benthos","Benthos","Benthos","Benthos","Zooplankton","Zooplankton","Zooplankton","Zooplankton",
"Zooplankton","Zooplankton","Fish","Fish","Fish","Fish","Fish","Fish","Phytoplankton","Phytoplankton","Phytoplankton","Phytoplankton")
Domain <- rep("Eukaryota", length(Group))
Kingdom <- c(rep("Animalia", 18), rep("Chromalveolata", 4))
Phylum <- c("Annelida","Annelida","Arthropoda","Arthropoda","Porifera","Sipunculida","Arthropoda","Arthropoda","Arthropoda",
"Arthropoda","Echinoidermata","Chorfata","Chordata","Chordata","Chordata","Chordata","Chordata","Chordata","Heterokontophyta",
"Heterokontophyta","Heterokontophyta","Dinoflagellata")
Class <- c("Polychaeta","Polychaeta","Malacostraca","Malacostraca","Demospongiae","NA","Malacostraca","Malacostraca",
"Malacostraca","Maxillopoda","Ophiuroidea","Actinopterygii","Chondrichthyes","Chondrichthyes","Chondrichthyes","Actinopterygii",
"Actinopterygii","Actinopterygii","Bacillariophyceae","Bacillariophyceae","Prymnesiophyceae","NA")
Order <- c("NA","NA","Amphipoda","Cumacea","NA","NA","Amphipoda","Decapoda","Euphausiacea","Calanioda","NA","Gadiformes",
"NA","NA","NA","NA","Gadiformes","Gadiformes","NA","NA","NA","NA")                     
Species <- c("Nephtys sp.","Nereis sp.","Gammarus sp.","Diastylis sp.","Axinella sp.","Ph. Sipunculida","Themisto abyssorum","Decapod larvae (Zoea)",
"Thysanoessa sp.","Centropages typicus","Ophiuroidea larvae","Gadus morhua eggs / larvae","Etmopterus spinax","Amblyraja radiata",
"Chimaera monstrosa","Clupea harengus","Melanogrammus aeglefinus","Gadus morhua","Thalassiosira sp.","Cylindrotheca closterium",
"Phaeocystis pouchetii","Ph. Dinoflagellata")   
dat <- data.frame(Group, Domain, Kingdom, Phylum, Class, Order, Species)
dat
Run Code Online (Sandbox Code Playgroud)

我想获得树状图(聚类分析)并使用Domain作为第一个切割点,Kindom作为第二个切割点,Phylum作为第三个切割点,等等.缺失值应该被忽略(没有切割点,而是直线).组应该用作标签的着色类别.

我有点不确定如何从这个数据帧制作距离矩阵.R有很多系统发育树包,他们似乎想要新的数据/ DNA /其他高级信息.因此,对此的帮助将不胜感激.

tree r dendrogram phylogeny

5
推荐指数
1
解决办法
2186
查看次数

将树形图导出为R中的表格

我想将R中的hclust-dendrogram导出到数据表中,以便随后将其导入另一个("自制")软件.str(unclass(fit))提供了树形图的文本概述,但我正在寻找的是一个数字表.我看过Bioconductor ctc软件包,但它产生的输出看起来有点密码.我想有类似这样的桌上的东西:http://stn.spotfire.com/spotfire_client_help/heat/heat_importing_exporting_dendrograms.htm 有没有办法让这出R中的hclust对象的?

r cluster-analysis dendrogram hclust

5
推荐指数
1
解决办法
5303
查看次数

在树形图中抑制叶标签

我不是集群分析方面的专家,因此并不熟悉所有"特殊"贡献的软件包.因此,我只是使用基本例程作为我在这里做的快速示例.

问题是我的数据集大约有7800个观测值,因此叶子标签太多了,整个x轴都很杂乱.那么,我该如何抑制标签的绘图呢?我猜测必须有一些"异国情调"的参数可通过par()控制这个?

plot r cluster-analysis dendrogram axis-labels

5
推荐指数
1
解决办法
2703
查看次数

R中漂亮的树状图?

我的树形图非常丑陋,处于不可读的边缘,通常看起来像这样:

在此输入图像描述

library(TraMineR)
library(cluster)
data(biofam)
lab <- c("P","L","M","LM","C","LC","LMC","D")
biofam.seq <- seqdef(biofam[1:500,10:25], states=lab)

ccost <- seqsubm(biofam.seq, method = "CONSTANT", cval = 2, with.missing=TRUE)
sequences.OM <- seqdist(biofam.seq, method = "OM", norm= TRUE, sm = ccost,     
with.missing=TRUE)

clusterward <- agnes(sequences.OM, diss = TRUE, method = "ward")
plot(clusterward, which.plots = 2)
Run Code Online (Sandbox Code Playgroud)

我想要创建的是类似下面的内容,意思是圆形树状图,可以仔细控制标签的大小,使它们实际可见:

在此输入图像描述

我怎样才能在R中实现这一目标?

r cluster-analysis dendrogram traminer

5
推荐指数
1
解决办法
1507
查看次数

5
推荐指数
1
解决办法
1375
查看次数

在R中切割树状图

我试图将此树形图切割成3组:(T24,T1,T17等),(T12,T15,T6等)和(T2,T8,T3,T9)

在此输入图像描述

我尝试过使用cutree(hc,k = 3,h = 400),但它继续制作相同的组.任何帮助是极大的赞赏.这是我的代码.

#temps must have date/time as column headers, not row headers
load(temps)
distMatrix <- dist(temps)
#create label colors
labelColors = c("#E41A1C", "#377EB8", "#4DAF4A", "#984EA3", "#FF7F00", "#FFFF33")
# cut dendrogram in 3 clusters
clusMember = cutree(hc, k=3, h=400)
colLab <- function(n) {
  if (is.leaf(n)) {
  a <- attributes(n)
  labCol <- labelColors[clusMember[which(names(clusMember) == a$label)]]
  attr(n, "nodePar") <- c(a$nodePar, lab.col = labCol)
  }
  n
}
hcd = as.dendrogram(hc)
clusDendro = dendrapply(hcd, colLab)
plot(clusDendro, main = "Cluster Analysis")
Run Code Online (Sandbox Code Playgroud)

r dendrogram dendextend

5
推荐指数
1
解决办法
3708
查看次数