Wil*_*car 5 tree recursion r igraph sqldf
我正在做家谱:
我已经根据sqldf https://www.r-bloggers.com/exploring-recursive-ctes-with-sqldf/改编了Bob Horton的示例
我的资料:
person father
Guillou Arthur NA
Cleach Marc NA
Guillou Eric Guillou Arthur
Guillou Jacques Guillou Arthur
Cleach Franck Cleach Marc
Cleach Leo Cleach Marc
Cleach Herbet Cleach Leo
Cleach Adele Cleach Herbet
Guillou Jean Guillou Eric
Guillou Alan Guillou Eric
Run Code Online (Sandbox Code Playgroud)
我的结果是,后代按“ Guillou Arthur”(没有父亲的头等人物)的等级排序:
name parent_name level
Guillou Arthur NA 1
Guillou Eric Guillou Arthur 2
Guillou Jacques Guillou Arthur 2
Guillou Alan Guillou Eric 3
Guillou Jean Guillou Eric 3
Run Code Online (Sandbox Code Playgroud)
您可以使用sqldf进行递归查询来构建此表:
数据 :
person <- c("Guillou Arthur",
"Cleach Marc",
"Guillou Eric",
"Guillou Jacques",
"Cleach Franck",
"Cleach Leo",
"Cleach Herbet",
"Cleach Adele",
"Guillou Jean",
"Guillou Alan" )
father <- c(NA, NA, "Guillou Arthur" , "Guillou Arthur", "Cleach Marc", "Cleach Marc", "Cleach Leo", "Cleach Herbet", "Guillou Eric", "Guillou Eric")
family <- data.frame(person, father)
Run Code Online (Sandbox Code Playgroud)
大到长格式转换:
library(tidyr)
long_family <- gather(family, parent, parent_name, -person)
long_family
Run Code Online (Sandbox Code Playgroud)
递归查询以查找“ Guillou Arthur”(没有父亲的头号人物)的后代:
library(sqldf)
descendants_sql <- "
WITH RECURSIVE descendants (name, parent_name, level) AS (
SELECT person, parent_name, 1 FROM long_family
WHERE person = '%s'
AND parent = '%s'
UNION ALL
SELECT F.person, F.parent_name, D.level + 1
FROM descendants D
JOIN long_family F
ON F.parent_name = D.name)
SELECT * FROM descendants ORDER BY level, name
"
fam <- sqldf(sprintf(descendants_sql, 'Guillou Arthur', 'father'))
fam
Run Code Online (Sandbox Code Playgroud)
我的问题:
如何直接用R(而不是sql)创建包括所有家族树的data.frame对象。每棵树都以先祖(无父)开头,例如“ Cleach Marc”。(使用R方法或sqldf方法)
我们构建了一个递归函数来获取父行,从那里一切都很容易。
首先,我们定义数据以stringsAsFactors = FALSE进行更平滑的重新格式化。
family <- data.frame(person, father,stringsAsFactors = FALSE)
Run Code Online (Sandbox Code Playgroud)
功能
father_line <- function(x){
dad <- subset(family,person==x)$father
if(is.na(dad)) return(x)
c(x,father_line(dad))
}
father_line ("Guillou Alan")
# [1] "Guillou Alan" "Guillou Eric" "Guillou Arthur"
Run Code Online (Sandbox Code Playgroud)
用它来获得等级和其他东西
family$father_line <- lapply(family$person,father_line)
family$level <- lengths(family$father_line)
family$patriarch <- sapply(family$father_line,tail,1)
# person father father_line level patriarch
# 1 Guillou Arthur <NA> Guillou Arthur 1 Guillou Arthur
# 2 Cleach Marc <NA> Cleach Marc 1 Cleach Marc
# 3 Guillou Eric Guillou Arthur Guillou Eric, Guillou Arthur 2 Guillou Arthur
# 4 Guillou Jacques Guillou Arthur Guillou Jacques, Guillou Arthur 2 Guillou Arthur
# 5 Cleach Franck Cleach Marc Cleach Franck, Cleach Marc 2 Cleach Marc
# 6 Cleach Leo Cleach Marc Cleach Leo, Cleach Marc 2 Cleach Marc
# 7 Cleach Herbet Cleach Leo Cleach Herbet, Cleach Leo, Cleach Marc 3 Cleach Marc
# 8 Cleach Adele Cleach Herbet Cleach Adele, Cleach Herbet, Cleach Leo, Cleach Marc 4 Cleach Marc
# 9 Guillou Jean Guillou Eric Guillou Jean, Guillou Eric, Guillou Arthur 3 Guillou Arthur
# 10 Guillou Alan Guillou Eric Guillou Alan, Guillou Eric, Guillou Arthur 3 Guillou Arthur
Run Code Online (Sandbox Code Playgroud)
例如要获得规定的预期输出:
subset(family,patriarch == "Guillou Arthur",select=c(person,father,level))
# person father level
# 1 Guillou Arthur <NA> 1
# 3 Guillou Eric Guillou Arthur 2
# 4 Guillou Jacques Guillou Arthur 2
# 9 Guillou Jean Guillou Eric 3
# 10 Guillou Alan Guillou Eric 3
Run Code Online (Sandbox Code Playgroud)
的tidyverse方式,它是这样的:
library(tidyverse)
family %>%
mutate(family_line = map(person,father_line),
level = lengths(family_line),
patriarch = map(family_line,last)) %>%
filter(patriarch == "Guillou Arthur") %>%
select(person,father,level)
# person father level
# 1 Guillou Arthur <NA> 1
# 2 Guillou Eric Guillou Arthur 2
# 3 Guillou Jacques Guillou Arthur 2
# 4 Guillou Jean Guillou Eric 3
# 5 Guillou Alan Guillou Eric 3
Run Code Online (Sandbox Code Playgroud)
| 归档时间: |
|
| 查看次数: |
498 次 |
| 最近记录: |